cDNA cloning and sequencing reveal the major horse allergen Equ c1 to be a glycoprotein member of the lipocalin superfamily.

The gene encoding the major horse allergen, designated Equus caballus allergen 1 (Equ c1), was cloned from total cDNA of sublingual salivary glands by reverse transcription-polymerase chain reaction using synthetic degenerate oligonucleotides deduced from N-terminal and internal peptide sequences of the glycosylated hair dandruff protein. A recombinant form of the protein, with a polyhistidine tail, was expressed in Escherichia coli and purified by immobilized metal affinity chromatography. The recombinant protein is able to induce a passive cutaneous anaphylaxis reaction in rat, and it behaves similarly to the native Equ c1 in several immunological tests with allergic patients' IgE antibodies, mouse monoclonal antibodies, or rabbit polyclonal IgG antibodies. Amino acid sequence identity of 49-51% with rodent urinary proteins from mice and rats suggests that Equ c1 is a new member of the lipocalin superfamily of hydrophobic ligand-binding proteins that includes several other major allergens. An RNA blot analysis demonstrates the expression of mRNA Equ c1 in liver and in sublingual and submaxillary salivary glands.

Exposure to animal danders, commonly present in the environment, is known to be a frequent cause of allergy. The inhalation of these potent animal dandruff allergens induces immunoglobulin E antibody (IgE) and subsequent development of asthma in atopic individuals. Among these allergens, a major allergen is defined to be the one that elicits an anaphylactic reaction in a majority of patients, presenting an immediate hypersensitivity response mediated by IgE against the basic raw material (1).
The reasons why a protein is allergenic are not clearly understood to date, although several authors favor the hypothesis of a possible relationship between the structure and the function of proteins and their allergenicity (2). The enzymatic activity of certain proteins has been assumed to enhance the IgE response (2). A family of proteins, the lipocalin superfamily, is known to include several allergens, such as the mouse major urinary protein mMUP 1 (3), the rat α-2-microglobulin (rA2U) (4), the bovine β-lactoglobulin (βlg) (5), the cockroach allergen Bla g4 (6), and the recently described bovine dander allergen Bos d2 (7). Based on this observation, Arruda et al. (6) suggested that lipocalins may contain a common structure that is able to induce the IgE response. Members of this superfamily, which bind or transport small hydrophobic molecules, are generally expressed in the liver and/or secretory glands. This is particularly true for the mMUP and rA2U proteins, which belong to multigene families of about 35-40 members in the case of mMUP (8) and about 25 in the case of rA2U (9,10). These members are differentially expressed in the liver as well as in salivary, lachrimal, and other secretory glands (11).
The major horse allergen, Equ c1, is a potent allergen responsible for about 80% of the anti-horse IgE antibody response in patients who are chronically exposed to horse allergens. Although much work has been carried out on the isolation and identification of the horse allergenic agents responsible for the human hypersensitivity response (12-16), the major horse allergen was only recently purified from hair and dandruff (17). A previous study by SDS-polyacrylamide gel electrophoresis (SDS-PAGE) and isoelectric focusing-PAGE showed that Equ c1 appears as a single polypeptide with a relative molecular mass of 21,500 daltons and a pI of 3.9. The purification of Equ c1 allowed the sequencing of the 27 N-terminal amino acids and of internal peptides (18).
To obtain more information on the structural and functional features of Equ c1, we have cloned the corresponding cDNA from the sublingual salivary gland (SLG). Here we report the molecular cloning and sequencing of this cDNA and expression of a recombinant allergen rSLG Equ c1 in a bacterial system. The recombinant protein was compared with natural Equ c1 for its recognition by antibodies raised against the natural Equ c1 in immunoblots and in inhibition/competition enzyme-linked immunosorbent assay (ELISA). We also show that the recombinant protein is able to elicit a rat mast cell degranulation by passive cutaneous anaphylaxis reaction.
Sequence comparisons reveal that Equ c1 is a new member of the lipocalin superfamily.

EXPERIMENTAL PROCEDURES
Materials-The horse salivary glands were obtained from a slaughterhouse and rapidly frozen in liquid nitrogen after dissection. They were stored at −80°C until protein and nucleic acid extractions were performed.
Protein Purification and N-terminal Sequencing-Equ c1 was puri-
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EBI Data Bank with accession number(s) U70823.
Tryptic proteolysis of Equ c1 was performed for 15 min at 37°C in a buffer containing 50 mM Tris-HCl, 1 mM CaCl2, pH 7.0, with an enzyme/protein ratio of 1:1000 (w/w). Sequencing was carried out, using the method described by Baw et al. (19), in the microsequencing laboratory of the Pasteur Institute. Protein assays were performed with the colorimetric method using Micro BCA protein assay reagent from Pierce, according to Smith et al. (20).
Equ c1 cDNA Cloning-cDNA first strand synthesis was performed on 5 μg of horse SLG total RNA for 1 h at 37°C in a total volume of 50 μl with 20 pmol of the primer adapter oligo(dT), 5′-AAC CCG GCT CGA GCG GCC GCT TTT TTT TTT TTT TT-3′, and 800 units of Moloney murine leukemia virus reverse transcriptase (Life Technologies, Inc.) in the manufacturer's buffer. The cDNAs so obtained were amplified by polymerase chain reaction (PCR) with the Opti Prime PCR optimization kit (Stratagene), with the oligomer 5′-GGY GAG TGG TAY TCY ATY TT-3′ as primer 1 and the oligomer 5′-GGY GAG TGG TAY AGY ATY TT-3′ as primer 2, both derived from the Gly35-Ser39 sequence, and the oligomer 5′-GTS AGP TCR ATR ATR TTY TC-3′ as primer 3, derived from the Glu165-Leu170 sequence. The letter Y represents a 50% mixture (w/w) of nucleotides T and C, S a mixture of G and C, and R a mixture of A and G. After a first denaturation cycle at 98°C for 2 min, 30 cycles of PCR consisting of a 30-s denaturation step at 94°C followed by annealing at 50°C for 35 s and elongation at 72°C for 30 s were carried out in a Hybaid thermocycler (Ceralabo, Aubervilliers, France). Each reaction contained 1 μl of cDNA reaction product, 0.2 mM dNTP, 2.4 units of Taq DNA polymerase, 68.8 pmol of primer 3, and 34.4 pmol of each of the other primers. The buffers of the kit vary in pH and in MgCl2 and KCl concentrations. The best amplification was obtained with buffer 6 (10 mM Tris-HCl, pH 8.8, 1.5 mM MgCl2, and 75 mM KCl) and buffer 12 (10 mM Tris-HCl, pH 9.2, 3.5 mM MgCl2, and 75 mM KCl). After separation by electrophoresis in a 1.2% agarose gel and purification, the PCR products were inserted in pMOS Blue T vector (Amersham Life Sciences). Sequencing was performed after alkaline denaturation by the dideoxy chain termination method (24) using Sequenase version 2.0 (U.S. Biochemical Corp.) and α-35S-labeled dATP.
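The degeneracy of the primers described above can be made concrete with a short sketch. It expands primer 1 into every unambiguous oligonucleotide it represents, using only the three mixture codes defined in the text (Y, S, and R). The function name `expand_primer` is an illustrative assumption and is not part of the original work.

```python
from itertools import product

# Mixture codes as defined in the text: Y = T/C, S = G/C, R = A/G.
DEGENERATE = {"Y": "TC", "S": "GC", "R": "AG"}

def expand_primer(primer: str) -> list:
    """Return every unambiguous sequence encoded by a degenerate primer.

    Letters that are not in DEGENERATE are passed through unchanged.
    """
    bases = primer.replace(" ", "").upper()
    choices = [DEGENERATE.get(base, base) for base in bases]
    return ["".join(combo) for combo in product(*choices)]

# Primer 1 as printed in the text (hypothetical variable name).
primer1 = "GGY GAG TGG TAY TCY ATY TT"
variants = expand_primer(primer1)
print(len(variants))  # 16, i.e. 2**4 for the four Y positions
```

Primers 1 and 2 differ only at the serine codon (TCY versus AGY), so together they cover both serine codon families at that position.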
Amplification of the cDNA Ends-The rapid amplification of cDNA ends (RACE) strategy was applied to clone the 3′ and 5′ cDNA extremities. For 5′ RACE, 12.5 μl of the first single strand cDNA (as described above) were directly used for dC tailing, for 5 min at 37°C, in 10 mM Tris-HCl, pH 8.4, 25 mM KCl, 1.25 mM MgCl2, 50 μg/ml BSA, and 10 units of terminal transferase. Reactions were stopped by increasing the temperature to 65°C for 10 min. The cDNA amplification was performed in the presence of 5 pmol of the oligomer 5′-GCG CCC AGT GTG CTG GCT GCA GGG GGG GGG GG-3′, complementary to the dC tail, and the oligomer 5′-CTT TTC CTT GAC GTC TGA AGC C-3′, corresponding to the nucleotide sequence G189-G210, as a specific primer (antisense). A 5-μl aliquot of dC-tailed cDNA was amplified by PCR in a 50-μl volume in 20 mM Tris-HCl, pH 8.4, 50 mM KCl, 2.5 mM MgCl2, 100 μg/ml BSA, and 0.2 mM each dNTP. The 35 cycles of PCR consisted of a 30-s denaturation step at 95°C followed by a 35-s annealing step at 60°C and a 30-s extension step at 72°C.
For cloning of the 3′ region, the same experimental conditions were applied to the PCR amplification using the specific primer 5′-GCC CGA GAA CCA GAT GTG AGT-3′, corresponding to the nucleotide sequence G481-T501, and the primer adapter oligo(dT). All amplified products were cloned in pMOS Blue vector and sequenced as described above.
Bacterial Expression of Recombinant Equ c1-A cDNA corresponding to the nearly complete Equ c1 sequence was amplified by PCR and cloned in a pET vector. Primers for PCR were designed to specifically hybridize with Equ c1 cDNA and contained EcoRI and XhoI sites. The primers used were 5′-CTT GAA TTC ATC GAG GGG AGA GAA AAC AGT GAT GTT GCG-3′ (5′ end primer) and 5′-CCA CTC GAG GAA GTA TTC ACT GTC-3′ (3′ end primer). In addition, the 5′ primer provides the recombinant protein with a new proteolytic cleavage site for factor Xa. PCR products were cloned into the EcoRI/XhoI sites of the plasmid pET 28 (a) under control of the T7 lac promoter (Fig. 1). This expression vector contains the kanamycin resistance gene and a His6 tag at the N terminus of the recombinant protein. Competent Escherichia coli XL1 cells were transformed, and supercoiled plasmid was sequenced and transfected in E. coli BL 21 (DE3). Induction was performed by adding isopropyl β-D-thiogalactopyranoside to the medium at a final concentration of 1 mM for 180 min at 37°C. Induction was controlled by taking aliquots every 30 min. Cells were then harvested by centrifugation and resuspended in 50 mM Tris-HCl, pH 7.0, containing 1% (v/v) Triton X-100 and 100 μg/ml lysozyme. The cells were incubated for 15 min at 30°C, and the DNA was disrupted by sonication. The supernatant obtained after centrifugation was filtered on a 0.2-μm membrane and dialyzed against phosphate-buffered saline (PBS) with 0.5 mM NaCl. The resulting product was used for chromatographic purification.
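The design of the two expression primers given above can be checked with a minimal sketch. It translates the 5′ end primer in the reading frame in which it is printed, using the standard genetic code, and looks for the EcoRI (GAATTC) and XhoI (CTCGAG) recognition sequences. The helper `translate` and the variable names are illustrative assumptions.

```python
# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna: str) -> str:
    """Translate a DNA string in frame 1, ignoring spaces and any trailing partial codon."""
    seq = dna.replace(" ", "").upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))

# Primers as printed in the text.
forward = "CTT GAA TTC ATC GAG GGG AGA GAA AAC AGT GAT GTT GCG"
reverse = "CCA CTC GAG GAA GTA TTC ACT GTC"

peptide = translate(forward)
print(peptide)                                 # LEFIEGRENSDVA
print("GAATTC" in forward.replace(" ", ""))    # True: EcoRI site
print("CTCGAG" in reverse.replace(" ", ""))    # True: XhoI site
```

The translated peptide contains the IEGR motif recognized by factor Xa, immediately followed by the ENSDVA start of the recombinant protein.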
Purification of the Recombinant Equ c1-An HR 5/5 column was packed with chelating Sepharose fast flow (Pharmacia Biotech, Inc.), washed according to the manufacturer's suggestions, and charged until saturation with metal ions from a 0.5% (w/v) copper(II) chloride solution. After thorough rinsing with water, the column was presaturated with buffer (PBS/0.5 mM NaCl) containing 10 mM imidazole (25). After equilibration of the column with the starting buffer (PBS/0.5 mM NaCl), 6 column volumes of supernatant were loaded, and the unbound material was collected. Competitive elution was carried out using imidazole at 40 and 120 mM (PBS/0.5 mM NaCl), pH 7.0, collecting 6 column volumes at each step (26). The whole process was controlled by an FPLC apparatus (Pharmacia). The fractions were concentrated using stirred cell ultrafiltration with a PM 10 membrane (Amicon) and dialyzed against the proteolysis buffer (50 mM Tris-HCl, pH 8.0, 100 mM NaCl, and 1 mM CaCl2). Digestion with factor Xa was performed overnight at 30°C. After proteolysis, the digest was dialyzed to remove the small peptides and lyophilized.
SDS-PAGE and Western Blots-All analyses of the different fractions were performed with the Adjustable Stab Gel kit ASG 400 (Prolabo) using 18% acrylamide/bisacrylamide (29:1) gels (27). Proteins were visualized with Coomassie Blue and/or silver nitrate staining. Electroblotting experiments were performed using nitrocellulose membrane (Schleicher & Schüll). For immunological detection, polyclonal antibodies from human and rabbit sera and mouse monoclonal antibody directed against Equ c1 were used.
The rabbit immunization was performed by intradermal injection of 100 μg of pure allergen. Sixteen patients with established allergy to natural Equ c1 were selected, and a pool from three nonallergic healthy donors was used as negative control. Bound IgE was detected using peroxidase conjugated to rabbit anti-human IgE. When mouse mAb anti-Equ c1 or the polyclonal rabbit IgG was used, the detection was performed with peroxidase conjugated to rabbit anti-mouse IgG or peroxidase conjugated to goat anti-rabbit IgG, respectively, using 3,3′-diaminobenzidine tetrahydrochloride as specific reagent.

FIG. 1. Plasmid construct for the bacterial expression of rSLG Equ c1 in E. coli. Equ c1 cDNA was inserted in pET 28 (a) after digestion with EcoRI and XhoI. The plasmid contains the lac operator, which allows induction, with 1 mM isopropyl β-D-thiogalactopyranoside, of the recombinant protein tailed at its N-terminal end. A factor Xa proteolytic site (LEFIEGR↓ENSDVA) was introduced between rSLG Equ c1 and the tail containing the polyhistidine tag.

The mouse anti-HD Equ c1 mAb were prepared in the Hybridolab of the Pasteur Institute according to the methods described by Köhler and Milstein (28).
Passive Cutaneous Anaphylaxis-Each mouse was immunized subcutaneously at day 0 and boosted at days 21 and 35 with 5 μg of antigen (purified HD Equ c1, protein extract from horse hair dandruff, horse serum albumin, or ovalbumin) in the presence of 4% (w/w) Al(OH)3 in a physiological solution. Each mouse was anesthetized and bled at day 42 by retro-orbital puncture in order to study the IgE immune response. The IgE antibody titers were determined by the passive cutaneous anaphylaxis reaction in rats (29).
Serum samples were diluted in a physiological solution, and 100-μl aliquots were inoculated intradermally on the shaved back of Lewis rats. Twenty-four hours later, each rat was challenged by intravenous inoculation in the tail of 1 ml of a physiological solution containing 50 μg of antigen and 0.5% Evans blue. Thirty minutes later, rats were killed, and skin was excised for examination. The reciprocal of the highest dilution giving a blueing reaction of 10-mm diameter was taken as the passive cutaneous anaphylaxis titer.
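The titer definition above can be sketched as a small function. The readings in the example are hypothetical values chosen for illustration only, and the sketch assumes that a reaction counts as positive when its diameter is at least 10 mm. The function name `pca_titer` is also an assumption.

```python
from typing import Dict, Optional

def pca_titer(diameters_mm: Dict[int, float], threshold_mm: float = 10.0) -> Optional[int]:
    """Return the reciprocal of the highest dilution giving a positive reaction.

    diameters_mm maps the reciprocal serum dilution (e.g. 80 for 1:80)
    to the diameter of the blueing reaction in millimetres.
    Assumption: a reaction is positive when diameter >= threshold_mm.
    """
    positive = [dilution for dilution, mm in diameters_mm.items() if mm >= threshold_mm]
    return max(positive) if positive else None

# Hypothetical readings, not data from the study.
readings = {10: 18.0, 20: 15.0, 40: 12.0, 80: 10.0, 160: 6.0}
print(pca_titer(readings))  # 80
```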
Inhibition/Competition Experiments-These experiments were performed using ELISA as follows. Each well of the assay plate (Maxisorb, Nunc, Roskilde, Denmark) was coated with 100 μl of highly purified HD Equ c1 or rSLG Equ c1, 10 μg/ml in 0.1 M carbonate/bicarbonate buffer, pH 9.6. After saturation of the unoccupied sites with 0.5% BSA in PBS and appropriate washing, mAbs, preincubated for 1 h at 37°C with different dilutions of competitor, were added in duplicate to the sample-coated wells and incubated for 1 h at 37°C. Bound mAb and rabbit antibodies were detected with peroxidase-conjugated rabbit anti-mouse IgG (Sigma) and peroxidase-conjugated goat anti-rabbit IgG, respectively, and revealed with o-phenylenediamine according to the manufacturer's recommendations.
Determination of Sugar Content-Equ c1 was deglycosylated using anhydrous trifluoromethane sulfonic acid, as described by Sojar and Bahl (30). Each dry sample was acid-treated with a mixture of trifluoromethane sulfonic acid and toluene for 4 h at −20°C. The trifluoromethane sulfonic acid was then neutralized by adding pyridine and ammonium bicarbonate to the reaction mixture, which was dialyzed against 50 mM Tris/HCl, pH 7.5, 100 mM NaCl. Each sample was submitted to electrophoresis in SDS-PAGE. Gels were stained with silver nitrate. Analysis of the saccharide composition of the HD Equ c1 and saliva Equ c1 was done using gas phase chromatography after acidic treatment, as described by Kamerling et al. (31).
RNA Analysis-Total mRNA was electrophoresed in an agarose/formaldehyde gel (32), transferred to a nylon membrane, and hybridized with the Equ c1 cDNA probe. The probe was the full-length cDNA insert labeled by the random priming method (33).
Searches for homologies between the deduced amino acid sequence of Equ c1 and the proteins of the Swiss-Prot data base, or between the Equ c1 cDNA and the GenBank™ nucleotide sequence data base, were done with the FASTP and FASTN programs, respectively, according to Altschul et al. (34).

RESULTS
Molecular Cloning of the Equ c1 cDNA-Tryptic fragments were generated from HD Equ c1 isolated and purified from horse hair dandruff extract by a combination of size exclusion chromatography and hydrophobic interaction chromatography. These fragments were microsequenced, and two of them (shown in boldface type in Fig. 2) were used to design three degenerate primers. The design of the primers took into consideration the codon usage in horse.
It was previously demonstrated by Dandeu et al. (17) that Equ c1 from different sources, i.e. saliva, urine, and hair dandruff extracts, are similarly recognized by antibodies. Salivary secretions contain the highest amount of Equ c1 protein; therefore, the salivary glands were chosen to clone Equ c1 cDNA. Among the tested salivary glands, the sublingual glands had the highest level of Equ c1 immunoreactivity and were selected to prepare mRNA.
The mRNAs so obtained were reverse-transcribed, and the Equ c1 cDNA was amplified by PCR using a mixture of the three primers. This reverse transcription-PCR resulted in a DNA fragment of about 400 base pairs in length that was cloned in pMOS Blue; several positive clones were sequenced. In a second step, the 5′ and 3′ ends of the SLG Equ c1 cDNA were obtained using a 5′ and 3′ RACE strategy. The two amplification products of 250 and 450 base pairs for the 5′ and 3′ RACE, respectively, were cloned and sequenced.
Sequence of the Equ c1 cDNA-The full-length sequence of Equ c1 cDNA and the deduced amino acid sequence are shown in Fig. 2. The SLG Equ c1 cDNA is 923 nucleotides long with an open reading frame of 560 nucleotides (excluding the stop codon), coding for a 187-amino acid protein. All peptides from HD Equ c1 can be localized in the SLG Equ c1 sequence and start after an arginine or a lysine residue, according to the tryptic proteolysis consensus sites. However, some differences in the amino acid sequence can be observed between rSLG Equ c1 from sublingual salivary gland and the tryptic peptides obtained from HD Equ c1. These differences are not PCR artifacts, because our nucleotide sequence results from the analysis of 12 clones from four independent PCR experiments.
Analysis of the deduced amino acid sequence revealed that the 5′ end of the coding region contains a typical signal sequence (35) (Fig. 3A). According to the Von Heijne weight matrix method (36), a favored putative signal peptidase cleavage site can be assigned between the Ala15 and Gln16 residues, generating a protein beginning with QQEENSDVAI. In contrast, the N-terminal end of the protein initially purified from hair dandruff (SDVAI) would result from a cleavage between Asn20 and Ser21, which is not predicted by Von Heijne's rules. Equ c1 was purified from saliva, and the microsequencing of its N-terminal peptide revealed a mixture of three sequences, one of them beginning at the predicted Gln16, but the others at Glu18 and Ser21, respectively (Fig. 3B). Whether these N-terminal ends are due to cleavage by signal peptidase at different sites or are generated by proteolytic processing of the secreted protein is not known. Such heterogeneous N-terminal ends were also reported for human tear albumin (37), another member of the lipocalin superfamily. Excluding the putative signal peptide, the protein contains two cysteine residues at positions 83 and 176.
In a previous study, we observed an increase in the apparent molecular mass of Equ c1 from 21,500 to 25,000 daltons in SDS-PAGE gels under reducing conditions, indicating that these two cysteines could form a disulfide bridge. Equ c1 is rich in charged and aromatic residues. The calculated pI is 4.57, a value close to that determined by Dandeu et al.
Glycosylation of Equ c1-Two putative N-glycosylation sites are present at positions Asn53 and Asn68. Glycosylation of HD and SLG Equ c1 was confirmed by gas phase chromatography, which revealed the presence of approximately 8.6% (w/w) of carbohydrates, representing 1,850 daltons. These results could explain the decrease in apparent molecular weight of Equ c1 in SDS-PAGE (Fig. 4) and the modification of the pI after deglycosylation.
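The carbohydrate mass quoted above follows from simple arithmetic, sketched here. The sketch assumes that the 8.6% (w/w) figure is taken relative to the apparent molecular mass of 21,500 daltons reported for the native protein; the variable names are illustrative.

```python
# Values taken from the text.
apparent_mass_da = 21_500       # relative molecular mass of native Equ c1 (SDS-PAGE)
carbohydrate_fraction = 0.086   # 8.6% (w/w) carbohydrate (gas phase chromatography)

# Assumption: the weight fraction applies to the 21,500-dalton apparent mass.
carbohydrate_mass_da = apparent_mass_da * carbohydrate_fraction
polypeptide_mass_da = apparent_mass_da - carbohydrate_mass_da

print(round(carbohydrate_mass_da))  # 1849, i.e. about 1,850 daltons
print(round(polypeptide_mass_da))   # 19651
```

The remaining polypeptide mass of roughly 19,650 daltons is of the same order as the 19.5-20 kDa band reported for the unglycosylated recombinant protein.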
Analysis of the sugar residue composition in Table I shows the presence of GalNAc, Gal, NeuAc, GlcNAc, and Man. Carbohydrates attached to proteins can be classified into two groups, N-glycans and O-glycans. All N-glycans contain a common structure, Manα1→6(Manα1→3)Manβ1→4GlcNAcβ1→4GlcNAc→Asn, called the trimannosyl core. Molecular ratio results (second column in Table I) indicate unambiguously the presence of this core and, therefore, the presence in the glucidic part of Equ c1 of one N-glycan of the biantennary complex type, which contains three mannose residues. One N-acetyllactosamine (Galβ1→4GlcNAc) is attached to each of the outer two α-mannose residues, followed by sialic acid residues (for Equ c1) or additional N-acetyllactosamines (38).
The presence of GalNAc, which is found only in O-glycan components except in several hormones (38), suggests that the protein is also O-glycosylated.
Expression of Equ c1 as a Recombinant Protein-A recombinant protein, starting at Glu19, was produced in a bacterial system, after cloning of the corresponding cDNA sequence in a pET 28 plasmid. This plasmid allows bacterial expression of a recombinant protein with a 40-amino acid polypeptide tail containing a polyhistidine tag at its N-terminal end (Fig. 1). To allow the production of a recombinant protein without any added amino acid, a factor Xa proteolytic site (LEFIEGR↓ENSDVA) was inserted between the tail and the recombinant protein.
Two recombinant clones were tested for rSLG Equ c1 expression. Optimal production was obtained after a 150-min induction by isopropyl β-D-thiogalactopyranoside. A protein determination assay showed that rSLG Equ c1 represents about 30% of the total bacterial protein. This protein was essentially present in the supernatant of the bacterial extracts. A single purification step by immobilized metal affinity chromatography was sufficient to obtain pure rSLG Equ c1, which migrates as a single band of 19.5-20 kDa in an 18% SDS-PAGE gel (Fig. 4, lane A) after cleavage by factor Xa. This molecular mass is compatible with the calculated mass of 19,469 daltons and rather similar to that of deglycosylated natural Equ c1.
Antigenicity of the Recombinant Protein-The recombinant protein was tested for its antigenic recognition by different antibodies raised against HD Equ c1, i.e. three mouse monoclonal antibodies (mAbs 118 and 197, which recognize two different linear epitopes, and mAb 220, which recognizes a conformational epitope), mouse and rabbit polyclonal antibodies (IgG), and human IgE from the sera of 16 patients suffering from horse allergic reactions (characterized in Ref. 17).
Immunoblot analysis after SDS-PAGE (Fig. 5), performed on the total bacterial extract, shows that the three mAbs bind a 24-kDa single band corresponding to the recombinant protein with the His tag. The tailed rSLG Equ c1 is also recognized by polyclonal anti HD Equ c1 antibodies from mouse and rabbit sera, although the latter also binds a contaminating band around 36 kDa. In contrast, rSLG Equ c1 is not recognized by rabbit or mouse control sera from animals immunized with horse serum albumin or ovalbumin.
In addition, rSLG Equ c1 is also recognized by the sera of allergic patients in Western blot experiments, suggesting that some or all of the HD Equ c1 epitopes recognized by human IgE are also present on the rSLG Equ c1. Fifteen other sera of allergic patients with established allergy to natural Equ c1 were tested. The same results were obtained with all of these antisera (data not shown). Sera from nonallergic patients failed to detect rSLG Equ c1.
Inhibition/competition experiments with the three different mAbs in an ELISA were performed using rSLG Equ c1, after purification and proteolysis by factor Xa, and using pure HD Equ c1. The results in Fig. 6A show that preincubation of mAb 220 with an adequate rSLG Equ c1 or HD Equ c1 concentration completely abolished its binding to natural HD Equ c1 coated on the plates. The IC50 (concentration of inhibitor giving a 50% inhibition) was obtained with the same concentration of rSLG Equ c1 and of HD Equ c1, approximately 100 ng/ml. Similar results were obtained when the plates were coated with the rSLG Equ c1 protein. Experiments using the two other mAbs reveal that rSLG Equ c1 and HD Equ c1 are similarly recognized (data not shown). No competition was observed when BSA was used as a competitor.
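The concentration giving 50% inhibition can be read from a competition curve by interpolation, as in the following minimal sketch. It assumes linear interpolation on a log10 concentration axis between the two measured points that bracket 50%, which is one common convention and not necessarily the procedure used in this study. The data points are hypothetical and serve only to illustrate the calculation.

```python
import math

def ic50(concs_ng_ml, inhibition_pct):
    """Estimate the inhibitor concentration giving 50% inhibition.

    Assumption: inhibition varies linearly with log10(concentration)
    between the two measured points that bracket 50%.
    Returns None when no pair of points brackets 50%.
    """
    points = sorted(zip(concs_ng_ml, inhibition_pct))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None

# Hypothetical competition curve, not data from the study.
concs = [10, 1000, 100000]      # competitor concentration, ng/ml
inhibition = [0, 100, 100]      # percent inhibition of antibody binding
estimate = ic50(concs, inhibition)
print(round(estimate))  # 100
```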
The inhibition/competition experiment performed with the polyclonal antibodies from rabbit sera raised against HD Equ c1 (Fig. 6B) reveals similar competition profiles when rSLG Equ c1 or HD Equ c1 is used as competitor; 100 and 50% inhibition are obtained with 20 μg/ml and 100 ng/ml, respectively, of either of them. This result suggests that the majority of the HD Equ c1 epitopes are present on the recombinant protein structure.
The biological activity of rSLG Equ c1 was also tested by passive cutaneous anaphylaxis on several rats as described under "Experimental Procedures." The mouse sera were harvested after animal immunization with HD Equ c1, hair dandruff extract, or control proteins (horse serum albumin or ovalbumin). The results in Table II show that rSLG Equ c1 elicits a positive reaction with the mouse anti-HD Equ c1 and the anti-horse hair dandruff sera. These positive reactions are obtained with rSLG Equ c1 and with HD Equ c1 at the same serum dilution. In the same conditions rSLG Equ c1 did not display any positive reaction with the control sera.

TABLE I Determination of monosaccharide composition
The sugar content was determined by gas phase chromatography on pure HD/SLG Equ c1 (31). The relative weight ratio is given for each monosaccharide. The molecular ratio (column 2) is compared with the theoretical ratio for one N-glycan of the biantennary complex type, given in parentheses.

Monosaccharide    Weight (%)    Molecular ratio
Man    1.

Homologies of Equ c1 with Proteins of the Lipocalin Superfamily-Homology searches in the sequence data bases show that Equ c1 has sequence similarities with other members of the lipocalin superfamily (Fig. 7). The best scores were obtained with the mouse major urinary proteins cLac1 MUP4 and cSmx1 MUP5 (cloned from lachrimal and submaxillary glands, respectively) and with rA2U, with 49-51% identity and 76% conservative substitutions.
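The identity figures quoted above are percentages of matching positions in a pairwise alignment. A minimal sketch of such a calculation follows; it assumes that identity is counted over aligned columns in which neither sequence has a gap, which is one of several conventions in use. The two short aligned strings are hypothetical toy sequences, not the Equ c1 or MUP alignment of Fig. 7.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumption: the denominator is the number of columns in which
    neither sequence has a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    matches = sum(1 for x, y in compared if x == y)
    return 100.0 * matches / len(compared)

# Hypothetical toy alignment for illustration only.
seq_a = "ENSDVAIRNF"
seq_b = "EASDV-IRNF"
print(round(percent_identity(seq_a, seq_b), 1))  # 88.9
```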
Sequence alignment shows that the two cysteines, Cys83 and Cys176, that form a disulfide bond in mMUP and rA2U, as well as in the majority of other lipocalins, are conserved (39). Only one potential N-glycosylation site, corresponding to position Asn53, is present in rA2U and is absent from the mMUP. The other site, at position Asn68, is specific to Equ c1 and is due to the insertion of a serine residue at position 69.
Three motifs, relatively well conserved among lipocalins, have been described by Flower et al. (40). Two of these motifs are found in Equ c1 (Fig. 7). The most highly conserved amino acid sequences with the lipocalin superfamily are Lys32 (41). The other conserved motif is TDY (structurally conserved region 2), while Phe109, Ile111, and Asp117 seem to be less conserved in the Equ c1 sequence. However, this motif is also absent from a number of true lipocalin members, such as the human tear albumin (37), von Ebner's gland protein (42), and hamster aphrodisin (43), and is less conserved in the bilin-binding protein (44), the α1-microglobulin (45), and rat odorant protein (46).
Tissue Expression of Equ c1 mRNA-To study the distribution of Equ c1 in the horse, total RNA was prepared from sublingual (SLG) and submaxillary (SMG) salivary glands as well as from the liver, and it was analyzed by RNA blot hybridization (Fig. 8). Equ c1 mRNA was detected in each tissue; however, the level in the SMG and liver is about 100 times lower than in the SLG. In addition, Equ c1 mRNA in liver seems to be slightly longer. Whether this is due to a true difference of size or to the presence of a longer poly(A) tail in liver Equ c1 mRNA was not investigated.

DISCUSSION
This paper reports the cloning, characterization, and expression in a bacterial system of the cDNA corresponding to a major horse allergen, Equ c1. This cDNA was cloned from the SLGs, and some differences were noted between its deduced amino acid sequence and peptides generated from a protein purified from horse hair dandruff extract (HD Equ c1). Indeed, 6 amino acids out of 79 are different between the two sequences. Some of these changes are conservative. One likely explanation of these differences is that HD Equ c1 and SLG Equ c1 belong to the same multigenic family, whose members are tissue-specifically expressed, as was reported for rodent urinary proteins from mouse and rat (47). During the cloning of SLG Equ c1, we obtained no evidence for another member of this family being expressed in salivary sublingual glands; however, we cannot exclude the possibility that the choice of primers for reverse transcription-PCR might have favored the cloning of one cDNA only. An RNA blot study revealed the presence of mRNAs hybridizing with SLG Equ c1 cDNA in submaxillary glands and in liver as well. Synthesis in the liver could explain the presence of Equ c1 in the horse's urine (18), as was reported for proteins of the MUP family in rat and mouse (48).
Despite the slight differences in their amino acid sequences and the absence of glycosylation in rSLG Equ c1, rSLG Equ c1 and HD Equ c1 are similarly recognized in our immunoblotting studies and inhibition/competition ELISA experiments. Moreover, the results obtained in inhibition/competition ELISA with three mAbs and with rabbit antibodies raised against HD Equ c1 suggest that all IgG epitopes of HD Equ c1 are also present in rSLG Equ c1, and thus in SLG Equ c1. In addition, at least some of the IgE epitopes are also present in rSLG Equ c1, since rSLG Equ c1 is recognized by IgE from allergic patients in immunoblot experiments and binds to mouse IgE in passive cutaneous anaphylaxis experiments, resulting in the induction of a specific immediate hypersensitivity response in rats presensitized with HD Equ c1. Together, these results suggest that neither the differences in amino acids nor the absence of glycosylation in the bacterially expressed protein affects the global conformation of the protein.
The search in the sequence data base revealed homology with members of the lipocalin superfamily, in particular with cLac1 MUP4 and cSmx1 MUP5. Members of this family share a common structure, as was shown by the x-ray crystal structures of retinol-binding protein (49), β-lactoglobulin (50), and MUP (51). The folding architecture of lipocalins consists of an eight-stranded β-barrel followed by a single α-helix and a short C-terminal β-strand (Fig. 9). The eight anti-parallel strands are arranged in two orthogonal β-sheets that leave a small hydrophobic cavity within the barrel (52). This pocket is in a highly apolar environment, appropriate for binding and transport of small hydrophobic molecules through a hydrophilic medium. The binding pocket is entirely formed by aliphatic and aromatic side chains from the inner faces of the two β-sheets (these positions are indicated by arrows in the alignment shown in Fig. 7).
A structural model of Equ c1 (Fig. 9) was constructed from the x-ray coordinates of the mouse MUP1 model by Böcskei et al. (51) using the program QUANTA (MSI). This modeling was facilitated by the absence of amino acid insertions and deletions between the two proteins, with two exceptions: the insertion of Asp22 at the N terminus and of Ser69 in the β-hairpin loop between the second and the third strands of Equ c1. At positions where the two proteins differed, the amino acid sequence was substituted, and the side chains were rebuilt using stereochemical criteria. The model was finally submitted to an overall energy minimization. As can be seen in Fig. 9, many of the amino acids of the presumed binding pocket (Ile63, Leu71, Phe109, Ile111, Leu124, Leu135, and Tyr139) are either strictly conserved or have conservative amino acid substitutions in SLG Equ c1 when compared with rA2U/mMUP. The most noticeable differences are the substitution of Ala73 in Equ c1 by Leu/Phe in rA2U/mMUP and the substitution of Phe90 in the adjacent β-strand of Equ c1 by alanine. Although the hydrophobic character of the binding pocket is maintained, these changes might modulate its shape and specificity.
In addition, the two possible N-glycosylation sites, which are not present in MUP1, are found in Equ c1 in exposed protein loops accessible to the solvent (Fig. 9), suggesting that the presence of an N-glycan does not interfere with the structure of the binding pocket. Moreover, the two cysteine residues that form a disulfide bridge linking the C-terminal part of the protein to the β-barrel (Fig. 9) in rA2U/mMUP (Fig. 7) and in the majority of other lipocalins are also conserved in Equ c1 (positions 83 and 176).
This structural model therefore suggests that Equ c1 could adopt the same tertiary structure as that described for other lipocalins. The exact physiological role of Equ c1 has not yet been established. Its presence in the urine of adult mares and stallions and its absence from the urine of yearlings (18) suggest that Equ c1 is synthesized only at sexual maturity. Thus, its physiological role could be similar to that of the rodent urinary proteins of mice and rats (pheromone-binding proteins) but not completely identical, since these two proteins are essentially produced in males.
Our results allow us to add Equ c1 to the list of lipocalins able to induce an IgE response, thus strengthening the hypothesis of Arruda et al. (6) that lipocalins could have an intrinsic property to stimulate IgE production. The reasons why some members of the lipocalin superfamily are allergenic are not clear to date. One reason could be their high concentration in secretions that come into contact with humans, which would facilitate the uptake of these allergens. Indeed, Equ c1 is highly concentrated in secretory fluids such as saliva and urine as well as in hair dandruff extract (17). In addition, lipocalins have a highly conserved structure that confers resistance to degradation. For example, βlg is able to resist acidic treatment and to pass through the stomach intact (5). It has been suggested that this resistance may be important for immunogenicity.
Alternatively, there could be a link between the allergenicity of lipocalins and their function of transporting small hydrophobic ligands. However, such a link has not yet been established. In fact, the nature of the bound ligand differs between lipocalins (retinol for βlg and several different pheromones for MUP and rA2U). The exact nature of the bound molecule is not known for a number of them, such as Bos d2, Bla g4, and Equ c1. Last, we cannot exclude the possibility that, because of their sequence and structure similarities, lipocalins may share common epitopes important for IgE recognition. However, the existence of such cross-reactivity remains to be clearly established.
In this context, where some members of the lipocalin superfamily may have an intrinsic property to stimulate IgE production, the availability of a recombinant wild-type protein and of suitable mutants that can induce a biological activity will be an important tool to study the determinants involved in allergic reactions. Moreover, rSLG Equ c1 may also help in the diagnosis of the allergic reaction to horses.