Species diversity in the structure of zonadhesin, a sperm-specific membrane protein containing multiple cell adhesion molecule-like domains.

A hallmark of gamete interactions at fertilization is relative or absolute species specificity. A pig sperm protein that binds to the extracellular matrix of the egg in a species-specific manner was recently identified and named zonadhesin (Hardy, D. M., and Garbers, D. L. (1995) J. Biol. Chem. 270, 26025-26028). We have now cloned a cDNA for mouse zonadhesin (16.4 kb), and it demonstrates a large species variation in the numbers and arrangements of domains. Expression of mouse zonadhesin mRNA is evident only within the testis, and the protein is found exclusively on the apical region of the sperm head. There are 20 partial D-domains, found as tandem repeats, inserted between two of the four full D-domains and an additional partial D-domain. These domains are homologous to the D-domains of von Willebrand factor and alpha-tectorin. A region at the N terminus of the mouse cDNA contains three tandem repeats homologous to MAM domains. These are domains composed of about 160 amino acids that are present in transmembrane proteins such as the meprins and receptor protein-tyrosine phosphatases, where they appear to function in cell/cell interactions. Additionally, mouse zonadhesin contains a mucin-like domain and a domain homologous to epidermal growth factor (EGF). A putative single transmembrane segment separates a short carboxyl tail from the extracellular region. The existence of MAM, mucin, D-, and EGF domains suggests that mouse zonadhesin functions in multiple cell adhesion processes, where binding to the extracellular matrix of the egg is but one of the functions of this sperm-specific membrane protein.

Prior to fertilization, the sperm cell interacts with multiple cells including fellow sperm cells, Sertoli cells, epithelial cells within the male reproductive tract or female reproductive tract, cells associated with the egg, and the egg itself. In some cases, the interactions are attractive in nature and in others repulsive. In an attempt to identify sperm proteins that bind to the egg extracellular matrix, Hardy and Garbers (1) solubilized sperm membranes from the pig and determined which proteins bound to the pig zona pellucida. Subsequently, the cDNA encoding a protein that bound to the zona pellucida in a relatively species-specific manner was cloned and named zonadhesin (2). Pig zonadhesin is a transmembrane protein with a very short intracellular region. The putative extracellular region contains a mucin-like domain and five tandem repeats homologous to the D-domains of prepro-von Willebrand factor (ppvWF) (3,4). The cloning of the pig sperm zonadhesin cDNA raised the questions of whether other species would contain homologs of zonadhesin and, if so, whether these would be variable within the above domain structure. Here, the mouse homolog of zonadhesin is cloned and shown to contain additional repetitive sequences within the D-domain region. Furthermore, three tandem repeats of a MAM domain were identified at the N terminus. The MAM domain is a module composed of about 160 amino acids including four conserved Cys; it is found in a number of functionally diverse transmembrane proteins such as the meprins and receptor protein-tyrosine phosphatases (5). Thus, zonadhesin displays a large species variation between pig and mouse and also contains multiple domains previously suggested as involved in cell adhesion processes.

MATERIALS AND METHODS
RNA Isolation-Total RNA was isolated from 40 mouse testes or similar amounts of other mouse tissues (including brain, heart, kidney, liver, lung, small intestine, and spleen) in guanidinium thiocyanate and N-lauroyl sarcosine, followed by extraction with acidic phenol/CHCl3 and precipitation with isopropyl alcohol (6). Poly(A)+ RNA was purified from total RNA by oligo(dT)-cellulose chromatography (7).
Northern Blots-30 µg of total RNA from various mouse tissues was loaded on each lane of a formaldehyde-agarose gel (8) and blotted overnight using 10× SSC as the transfer buffer. The blot was then fixed by UV light and hybridized with a random primer-labeled (Amersham Life Science, Inc.), 1.4-kb probe from the 3′-end of the mouse zonadhesin cDNA as described previously (1). After hybridization, the blot was washed with 0.2× SSC and 0.2% SDS at 65°C.
Screening of cDNA Libraries-The first mouse cDNA clone (#18) was obtained by a low stringency screening of a mouse testis cDNA library (Stratagene) using a cDNA fragment corresponding to the pig zonadhesin D4 domain as described previously (9). The same cDNA library was further screened with the mouse cDNA clone #18 as a probe under previously described conditions (1). The positive phage clones were converted to pBluescript plasmid by phage rescue and sequenced. The longest clone contained a 2.5-kb cDNA fragment.
Construction of a Plasmid cDNA Library-To obtain additional 5′-end sequence, 2 µg of mouse testis mRNA was used to construct an oligo(dT)-primed plasmid cDNA library.
* This work was supported in part by National Institutes of Health Grant HD10254 (to D. L. G.).
Inverse PCR and Anchor PCR-Inverse PCR was carried out with two gene-specific primers, a 5′-end antisense primer and a 3′-end sense primer. The Expand™ Long Template PCR kit (Boehringer Mannheim) was used with an Ericomp Twinblock System thermal cycler. The inverse PCR reaction was performed by denaturing at 95°C for 2 min followed by 30 cycles of denaturation at 92°C for 30 s, annealing at 60°C for 1 min, and extension at 68°C for 6 min. The PCR product was diluted 100-fold before use as a template for anchor PCR. Anchor PCR was performed using a sequence (5′-ATTACGCGCTCTAATACGACTCACTATAGG-3′) on the vector pSPORT1 as the sense primer and a gene-specific primer, upstream of the 5′-end antisense primer used in the inverse PCR, as the antisense primer. The anchor PCR reaction was performed using an air thermal cycler (Idaho Technologies) by denaturing at 94°C for 1 min, followed by 30 cycles of denaturation at 94°C for 0 s, annealing at 55°C for 0 s, and extension at 68°C for 1 min. A mixture of Taq and Pfu (9:1) was used as polymerases for the anchor PCR (10) in a standard PCR reaction buffer (10 mM Tris-HCl, pH 9.0, 50 mM KCl, 2 mM MgCl2, 0.1% Triton X-100, 0.2 µM primers, and 0.2 mM dNTPs). PCR products were then purified on a 1% agarose gel and subcloned into a TA cloning vector (Invitrogen). The combined inverse and anchor PCR procedure was repeated until no additional sequence could be obtained from the library. In total, an additional 6 kb of composite cDNA sequence was obtained.
Gene-specific RT and 5′-RACE-PCR-To obtain the full-length sequence of mouse zonadhesin, gene-specific reverse transcriptions (RT) and 5′-RACE (rapid amplification of cDNA ends) were performed using the Marathon™ cDNA amplification kit (CLONTECH). 2 µg of mRNA was used for synthesis of cDNA according to the manufacturer's instructions. Gene-specific primers 5′-AGCCCCATCGAACGTCAGGTAGTG-3′ and 5′-ATGCCATCTGACTTCAGGTTGTCG-3′ were used for the first and second gene-specific reverse transcriptions, respectively. RACE-PCR was carried out by nested PCR with two sets of primer pairs, each containing one gene-specific primer and one adapter primer of the Marathon™ cDNA amplification kit. The RACE-PCR product was diluted 100-fold before use as a template for the nested PCR. The Advantage™ PCR kit (CLONTECH) was used for the RACE-PCR reactions. RACE-PCR reactions were performed by denaturing at 94°C for 1 min, followed by 30 cycles of denaturation at 94°C for 0 s, annealing at 55°C for 0 s, and extension at 68°C for 1 min. PCR products were then purified on a 1% agarose gel and subcloned into a TA cloning vector (Invitrogen). An additional 7 kb of composite cDNA sequence was obtained by these methods.
Preparation and Purification of GST Fusion Proteins-The plasmid pGEX-KG (11) was used to express GST-D3c or GST-D3p17 fusion proteins, in which D3c represented the C terminus of the D3 domain while D3p17 was the 17th D3 partial domain of zonadhesin (see Fig. 1). Primer pairs containing an EcoRI site at the 5′-end of the sense primer and a HindIII site at the 3′-end of the antisense primer were used so that PCR products could be subcloned into the pGEX-KG vector in the correct reading frame. PCR reactions were performed using an air thermal cycler (Idaho Technologies) under the following conditions: 1 cycle of denaturation at 94°C for 1 min; 1 cycle of denaturation at 94°C for 0 s, annealing at 55°C for 0 s, and extension at 68°C for 1 min; 25 cycles of denaturation at 94°C for 0 s, annealing and extension at 68°C for 1 min; and 1 cycle at 72°C for 2 min. PCR products were then separated on a 1% agarose gel, and DNA bands were purified using the QIAEX method (Qiagen). The purified PCR products were digested overnight with a mixture of EcoRI and HindIII in the core buffer of Promega, prior to ligation into EcoRI/HindIII-linearized pGEX-KG vector and subsequent transformation into DH5α bacteria. Recombinants were selected on ampicillin/LB plates. The sequences of the subcloned D3c and D3p17 products were confirmed before purification of the GST-D3c and GST-D3p17 fusion proteins.
The purification of bacterially expressed GST fusion proteins was performed according to Li et al. (12). The fusion proteins, expressed in bacterial strain DH5α, were purified in brief as follows: 50 ml of overnight culture was inoculated into 1 liter of LB and incubated for 3 h at 37°C before adding isopropyl-1-thio-β-D-galactopyranoside (50 µM) and incubation at room temperature for another 8 h. The culture was centrifuged at 5000 × g for 30 min, and the bacterial pellet was resuspended in 10 ml of ice-cold PBS containing 10 mg/ml of lysozyme, 1% Triton X-100, and protease inhibitor mixture composed of 1 mM EDTA, 1 µM leupeptin, 1 µM pepstatin, 4 mM Pefabloc SC (Boehringer Mannheim) and lysed by sonication on ice. Sonication was performed using a microtip at output scale 7 for 3 × 30 s with 2-min intervals. After centrifugation at 20,000 × g for 0.5 h, the supernatant fluid was transferred to a 50-ml tube and incubated with 1 ml of 50% glutathione-agarose beads on a rocker overnight at 4°C. The beads were washed four times with 40 ml of cold PBS, and the GST fusion protein was eluted with 4 × 1 ml of freshly made 10 mM reduced glutathione (in 50 mM Tris, pH 8.0) (Sigma). The purity and concentrations of the purified proteins were examined by SDS-PAGE.
Production of Antiserum-15 ml of preimmune serum was collected from each rabbit before injection of antigens. 1 ml of 0.2 mg/ml GST-D3c or GST-D3p17 fusion protein was mixed with 1 ml of complete adjuvant (Difco). A single booster injection was given on Day 14 with a mixture of 500 µl of 0.2 mg/ml fusion protein and 500 µl of incomplete adjuvant (Sigma). 15 ml of blood was collected from each rabbit after another 2 weeks and stored at 4°C overnight. Antisera were separated from blood cells by centrifugation at 1000 × g for 10 min and were then stored at −80°C as 1-ml aliquots.
Preparation of Mouse Sperm Membrane Proteins-The preparation of mouse sperm membrane protein was based on Hardy et al. (2) and Bleil et al. (13) with modifications. 40 male mice (ICR) were used for one preparation. The caudal epididymides and vasa deferentia were dissected from the mice, transferred to 2 ml of pre-warmed (37°C) Earle's modified medium 199 (Life Technologies, Inc.) containing 4 mg/ml BSA, 30 µg/ml pyruvate, and 4 mM EGTA (M199-ME), covered with mineral oil, and then minced with scissors and placed in a cell culture incubator for 10 min. The spermatozoa were transferred to 30 ml of M199-ME and capacitated at 37°C for 1 h. The spermatozoa were then pelleted at 1500 × g for 15 min and resuspended in 25 ml of ice-cold HE/diisopropyl fluorophosphate (DFP) buffer (50 mM NaHEPES, 1 mM NaEDTA, 1 mM DFP, pH 7.5) before being transferred to a Parr Bomb (30 ml) for N2 cavitation at 0°C and 650 p.s.i. for 30 min. Disrupted cells were centrifuged at 1500 × g for 10 min, and the supernatant fluid was centrifuged again at 51,000 rpm (Beckman Ti-70) for 45 min. The crude membrane pellet was resuspended in 5 ml of ice-cold HE/DFP and washed twice by centrifugation at 100,000 rpm (TL-100.3 rotor) for 20 min. The membrane pellet was resuspended in 1 ml of ice-cold HE/DFP, sheared with 18- to 30-gauge needles, and stored at −20°C in 100-µl aliquots. The concentration of the isolated sperm membrane was determined as 4 mg/ml by the BCA assay.
Solubilization of Sperm Membranes-100 µl of 4 mg/ml sperm membrane protein was mixed with 100 µl of 2% CHAPS in 300 mM NaCl and 1 mM EDTA, 1 mM leupeptin, 1 mM pepstatin, and 4 mM Pefabloc SC and rocked at 4°C for 1 h before centrifugation at 100,000 rpm (TL-100.2 rotor) for 15 min. The supernatant fluid was transferred to a new tube and stored at −20°C.

SDS-PAGE and Western Blots-A 4-12% polyacrylamide gel (NOVEX™) was used for protein analysis. Solubilized sperm membrane proteins (5 µl) were mixed with equivalent amounts of modified 2× sample buffer (250 mM Tris-HCl, pH 6.8, 10% SDS, 10% glycerol, and 0.01% bromphenol blue) and heated at 65°C for 5 min before being loaded on a gel. Electrophoresis was performed at 60 volts for 5-10 h before blotting to nitrocellulose at 25 volts overnight at 4°C. The blot was then air dried and washed twice in TBST (20 mM Tris-HCl, pH 7.5, 150 mM NaCl, 0.1% Tween 20) and once in TBST containing 100 mg/ml BSA before incubation with 1:10000 diluted, purified antibody S117 (against GST-D3c) for 1 h. The blot was then washed three times in 10 ml of TBST for 15 min each and incubated with 1:20000 diluted horseradish peroxidase-conjugated goat anti-rabbit IgG (Biosource) for another 1 h followed by washing three times in TBST. The peroxidase activity was detected with enhanced chemiluminescence (ECL, Amersham Life Science, Inc.).
Cytoimmunofluorescence-Spermatozoa were collected in 200 µl of M199-ME medium (M199 with Earle's salts, 30 µg/ml pyruvic acid, 4 mg/ml BSA, and 1 mM EDTA) (13) from the caudal epididymis and washed twice with 15 ml of PBS by centrifugation at 1000 × g for 10 min. Sperm cells were then fixed either in a 4% paraformaldehyde/PBS solution or in 100% methanol. The fixation in 4% paraformaldehyde was performed on ice for 30 min followed by washing twice with 40 ml of PBS. After incubating in a 0.1 M glycine/PBS solution overnight at 4°C, samples were washed in PBS twice before being smeared on slides. When fixed in methanol, sperm cells were directly smeared on slides and air dried. The slides were then placed in ice-cold methanol for 30 min followed by washing in 100% ethanol for 10 min before the immunofluorescence assay. 300 µl of blocking reagent (10% goat serum and 1 mM EDTA in PBS) was incubated with spermatozoa on a slide at room temperature for 30 min. Antisera P448 against GST-D3p17 fusion protein or S117 against GST-D3c were diluted in 300 µl of blocking reagent at a ratio of 1:50 to 1:200 and then placed on the samples and incubated in a moisture chamber overnight. The slides were washed in 500 ml of PBS for 10 min before incubation with 300 µl of 1:200 to 1:500 diluted Texas Red-conjugated goat anti-rabbit antibody for 60 min at room temperature. The slides were washed in 500 ml of PBS for 10 min before being mounted with Fluoromount-G. Pictures were taken with double exposures of fluorescence and Nomarski.

RESULTS AND DISCUSSION
The finding of a pig sperm protein that bound to the egg extracellular matrix in an apparently species-specific manner raised the important questions of whether homologs of this protein would exist in other species and, if so, whether they would display significant species variation in structure. A mouse testis cDNA library was initially screened at low stringency with a 3′ fragment of the pig zonadhesin cDNA. Three clones were obtained, the largest being 1.2 kb. By use of the 1.2-kb fragment from mouse, a larger clone of 2.5 kb was then obtained from the same testis cDNA library (Fig. 1, Phage cDNA library).
To obtain the 5′-sequence, an oligo(dT)-primed plasmid cDNA library was synthesized as described above, and a combination of inverse and anchor PCRs was used to screen the plasmid cDNA library. A combination of inverse/anchor PCRs instead of nested/anchor PCRs was used since the two gene-specific primers in the inverse PCR significantly increased PCR reaction specificity. In many cases, multiple bands were detected on agarose gels after anchor PCR, and each band was cloned separately into a TA cloning vector. Sequence comparisons of these subclones showed that they shared the same 3′-sequence with different 5′-sequences. By these methods, the majority of the cDNA sequence was obtained (Fig. 1). Gene-specific reverse transcriptions and 5′-RACE were then performed to obtain the last 7 kb of mouse zonadhesin cDNA. We performed two reverse transcription reactions with the gene-specific primers described above, followed by nested 5′-RACE-PCR. The nested 5′-RACE-PCR was repeated until new sequence was undetectable from the products of reverse transcriptions. By the use of nested 5′-RACE-PCR, we obtained a 4-kb composite sequence of cDNA from the first reverse transcription product and a 3.5-kb sequence from the second. Again, we cloned each band on agarose gels from PCR reactions to confirm sequences.
The criteria we used to conclude that we had obtained the full-length cDNA were as follows. First, the predicted initiation Met coincided with a Kozak consensus sequence (14). Second, an apparent signal peptide followed the proposed Met (15). Third, all three reading frames upstream of the proposed initiation codon contained at least one stop codon. Fourth, as shown below, the size of the mRNA from Northern blots closely matched the size of the cDNA clone.
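The first and third criteria can be illustrated with a minimal sketch. The function names and the toy sequence are hypothetical, and the Kozak test is reduced to the two most conserved positions (a purine at -3 and G at +4); this is not the procedure the authors report using.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def frames_with_upstream_stop(cdna, atg_index):
    """Return the reading frames (0 = in frame with the ATG) that contain
    at least one stop codon lying entirely upstream of the ATG."""
    upstream = cdna[:atg_index].upper()
    frames = set()
    for frame in range(3):
        # First codon position in this frame, counted back from the ATG.
        start = (atg_index - frame) % 3
        for i in range(start, len(upstream) - 2, 3):
            if upstream[i:i + 3] in STOP_CODONS:
                frames.add(frame)
                break
    return frames


def has_core_kozak(cdna, atg_index):
    """Simplified Kozak check: purine at -3 and G at +4 relative to the A of ATG."""
    seq = cdna.upper()
    if atg_index < 3 or atg_index + 3 >= len(seq):
        return False
    return seq[atg_index - 3] in "AG" and seq[atg_index + 3] == "G"


# Toy sequence (not zonadhesin): stops in all three frames, then GCCACC ATG G.
toy = "TAACTAGCTGACGCCACCATGGCT"
atg = 18
all_frames_closed = frames_with_upstream_stop(toy, atg) == {0, 1, 2}
kozak_ok = has_core_kozak(toy, atg)
```

A candidate start codon would pass this sketch only if `all_frames_closed` and `kozak_ok` are both true; the signal peptide and Northern blot criteria require separate evidence.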
The 16,380-base composite sequence of cDNA contains a single major open reading frame of 16,131 bases with a deduced protein sequence of 5376 amino acids. A 17-amino acid putative signal peptide (15) is present at the N terminus of the deduced sequence (Fig. 2). The sequence also predicts a 5293-amino acid extracellular region, a single transmembrane segment, and a 39-amino acid intracellular C terminus with 30% basic residues. Cleavage of the putative signal peptide at Gly 17 would produce a 5359-amino acid mature polypeptide chain with a calculated molecular mass of ~578,000 Da (Fig. 2). Mouse zonadhesin has 40 potential N-linked glycosylation sites (NXT/S) and a large number of potential O-linked glycosylation sites.
Like pig zonadhesin, mRNA of mouse zonadhesin is only detectable in testis, but the 16-17-kb mouse mRNA is about twice as large as the pig transcript (Fig. 3A).
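The lengths quoted above can be cross-checked with simple arithmetic. The sketch below assumes that the 16,131-base open reading frame includes the stop codon and that the 5293-residue extracellular region excludes the signal peptide; under those assumptions the transmembrane length is an inference, not a value stated in the text.

```python
ORF_BASES = 16_131        # open reading frame, assumed to include the stop codon
PROTEIN_AA = 5_376        # deduced protein length
SIGNAL_AA = 17            # putative signal peptide
EXTRACELLULAR_AA = 5_293  # assumed to exclude the signal peptide
INTRACELLULAR_AA = 39     # C-terminal tail
MATURE_MASS_DA = 578_000  # approximate calculated mass of the mature chain

assert ORF_BASES % 3 == 0
coding_codons = ORF_BASES // 3 - 1      # subtract the stop codon
mature_aa = PROTEIN_AA - SIGNAL_AA      # chain length after signal cleavage

# Residues left over for the transmembrane segment (inferred, not reported).
inferred_tm_aa = PROTEIN_AA - SIGNAL_AA - EXTRACELLULAR_AA - INTRACELLULAR_AA

# Mean residue mass implied by the quoted molecular mass.
mean_residue_mass = MATURE_MASS_DA / mature_aa

print(coding_codons, mature_aa, inferred_tm_aa, round(mean_residue_mass, 1))
```

The codon count and mature length agree with the reported 5376 and 5359 residues, and the implied mean residue mass of roughly 108 Da is in the usual range for polypeptides.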
The localization of zonadhesin is of particular importance since 1) the predicted structure to be discussed below suggests zonadhesin is a transmembrane protein, and 2) zonadhesin contains multiple extracellular domains, suggesting it potentially functions in multiple cell/cell interactions. By the use of antibodies (S117 or P448) generated to zonadhesin fusion proteins, expression was demonstrated as exclusively on the apical region of the sperm head (Fig. 3B), regardless of whether methanol- or paraformaldehyde-fixed sperm cells were used. The localization is consistent with zonadhesin interacting with the extracellular matrix of the egg or with other cells and also supports the model of it as a transmembrane protein.
Zonadhesin was processed on mature and capacitated sperm cells as evidenced through detection with antiserum S117 on a Western blot (Fig. 3C). One band at 100 kDa and two major bands between 200 and ~250 kDa are seen on non-reducing SDS-PAGE. The reason that S117 recognized three bands on the gel is very likely due to the high sequence similarities between the D3 partial domains. Processing of the pig zonadhesin also occurred, yielding 45 and 105 kDa mass proteins on reducing SDS-PAGE (2). The processing of zonadhesin and its relationship to function is not yet known, but presumably the unprocessed forms function prior to encountering the egg.
Analysis of the sequence of mouse zonadhesin revealed that it contained the domains present in the pig zonadhesin, including MAM domains (see below), a mucin-like domain, D-domains, an EGF-like domain, a transmembrane segment, and a short intracellular region. The similarity of the primary amino acid sequence within each domain (see below) together with Northern blot and cytoimmunofluorescence data strongly indicate that the composite cDNA sequence we obtained from mouse testis is a homolog of pig zonadhesin.
Mouse zonadhesin contains three tandem repeats of a domain known as the MAM domain, a new finding that strongly suggests a role for this region of zonadhesin in cell-cell interactions. Subsequently, we found that pig zonadhesin also contains one full and one partial MAM domain (Fig. 4). A MAM domain contains about 160 amino acids and four conserved cysteine residues, as well as conserved hydrophobic and aromatic amino acids (Fig. 5). The MAM domain is suggested to function in cell adhesion and is found in a diverse number of membrane proteins including meprins (16), A5 protein (17), and receptor protein-tyrosine phosphatases (18-20). The MAM-bearing meprins are zinc-metalloproteases that contain α- and β-subunits (16,21). The MAM domain has been suggested to mediate the dimerization or oligomerization of meprin subunits (5). A5 protein is a developmentally regulated cell surface molecule involved in neurite outgrowth or axonal guidance, and its MAM domain is critical for cell-cell interactions (5). Some of the receptor protein-tyrosine phosphatase family members, such as RPTP, RPTP, RPTP, and PCP-2, contain MAM domains, where they have been suggested as essential for homophilic cell-cell interaction and the specificity of those interactions (5,18). Another MAM-containing protein (gene B product) from Xenopus has been reported by Brown et al. (22). The function of this MAM-containing protein is unknown, but the protein is induced by thyroid hormone during Xenopus laevis metamorphosis (22). Interestingly, there is much higher sequence similarity within the MAM domains of a given protein group, suggesting slightly different functions within the different protein groups. The function of the MAM domains of zonadhesin is not known, but since this region can be eliminated by proteolysis with retention of zona pellucida binding properties (1,2), it may function during sperm development, for example in interactions with Sertoli cells.
To the carboxyl side of the MAM domains is a mucin-like domain. The domain is very rich in Thr, Ser, Pro, Glu, and Val, containing 26, 8, 15, 17, and 14%, respectively. The mucin domain is composed of 7-amino acid imperfect repeats with a consensus sequence of PTE(E/V)(P/T)TV (Fig. 2). Similar heptapeptide repeats were also found in pig zonadhesin (2). It has been suggested that absolute length is not crucial to mucin function but rather that the core protein exists in an extended form as a scaffold for O-linked carbohydrate (23). The relatively short mucin-like domains present in some receptors have been suggested to lift the ligand-binding site above the glycocalyx (24). Additionally, membrane-associated mucins have been found in various instances to serve as ligands for cell surface receptor selectins (25). Therefore, the mucin-like domain in zonadhesin could serve to lift the MAM domain above the glycocalyx and thereby facilitate cell-cell interactions in the male reproductive tract and/or act as a repulsive barrier to prevent nonspecific interactions between spermatozoa and other cells in the male or female reproductive tract (e.g. adhesion of spermatozoa within the oviductal isthmus).
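The heptapeptide consensus PTE(E/V)(P/T)TV translates directly into a pattern that can be scanned against a protein sequence. The sketch below is illustrative only: the helper names and the toy sequence are hypothetical, and because the repeats are described as imperfect, a per-position mismatch count is included alongside the exact match.

```python
import re

# Exact form of the consensus PTE(E/V)(P/T)TV.
HEPTAD = re.compile(r"PTE[EV][PT]TV")

# Residues allowed at each of the seven consensus positions.
ALLOWED = ["P", "T", "E", "EV", "PT", "T", "V"]


def count_perfect_heptads(seq):
    """Count non-overlapping exact matches to the consensus."""
    return len(HEPTAD.findall(seq.upper()))


def heptad_mismatches(window):
    """Number of positions in a 7-residue window that deviate from the consensus."""
    if len(window) != 7:
        raise ValueError("window must be exactly 7 residues")
    return sum(aa not in ok for aa, ok in zip(window.upper(), ALLOWED))


# Toy sequence (not the zonadhesin mucin domain): two perfect repeats
# followed by one imperfect repeat with a single substitution.
toy = "PTEEPTVPTEVTTVPSEEPTV"
perfect = count_perfect_heptads(toy)
imperfect_score = heptad_mismatches(toy[14:21])
```

Scoring windows by mismatch count, instead of requiring an exact match, is one simple way to tolerate the imperfect repeats described above.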
Following the mucin-like domain, there is a partial D-domain (D0), three full D-domains (D1-D3), 20 tandem repeats of partial D3 domains (D3p1-20), and finally a D4 domain (Fig. 4). The D1, D2, D3, and D4 domains show relatively high similarity to the counterpart domains of pig zonadhesin (61, 68, 47, and 63%, respectively). However, unlike pig zonadhesin, which has only one partial D-domain (D0), mouse zonadhesin has at least 21 partial D-domains: a D0 domain and 20 tandem repeats of a D3 partial domain. The newly identified 20 tandem repeats of D3p1-20 make up more than 40% of the protein mass of mouse zonadhesin. Like the D0 domain, D3p1-20 are very rich in cysteine, with 18 cysteines within each 120-amino acid repeat (15%). Each cysteine is conserved, suggesting that these residues are critical for the integrity of the protein structure (Fig. 2). We designated these 120-amino acid repeats as D3 partial domains instead of D0 domains since they have higher similarities to the C terminus of the D3 domain (56-67%) than to the D0 domain (26-37%) (Fig. 2). These partial D-domains are homologous to the D′ domain of vWF, in which D′ appears to support a specific binding site for procoagulant Factor VIII (3).
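The proportions quoted for the D3 partial domains follow from the repeat dimensions. The short check below uses residue counts as a stand-in for protein mass, which is an approximation made here for illustration, and takes the 5376-residue deduced protein length as the denominator.

```python
REPEATS = 20          # D3p1-20
REPEAT_AA = 120       # residues per D3 partial domain
CYS_PER_REPEAT = 18   # conserved cysteines per repeat
PROTEIN_AA = 5_376    # deduced length of mouse zonadhesin

repeat_region_aa = REPEATS * REPEAT_AA
# Fraction of the chain occupied by the repeats (residue count as a mass proxy).
repeat_fraction = repeat_region_aa / PROTEIN_AA
cys_fraction = CYS_PER_REPEAT / REPEAT_AA
total_repeat_cys = REPEATS * CYS_PER_REPEAT

print(f"{repeat_region_aa} aa, {repeat_fraction:.1%} of chain, "
      f"{cys_fraction:.0%} Cys, {total_repeat_cys} Cys in repeats")
```

By residue count the repeats span 2400 amino acids, about 45% of the chain, consistent with the statement that they make up more than 40% of the protein.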
In addition to their presence in ppvWF, D-domains have recently been found in several functionally diverse extracellular proteins including some of the secreted mucins such as human, mouse, and rat intestinal mucin MUC2, mMuc2, and rMuc2 (23,26,27), Xenopus laevis integumentary mucin FIM-B.1 (28), insect humoral lectin hemocytin (29), mouse inner ear matrix protein α-tectorin (30) and SCO-spondin, a glycoprotein secreted from the subcommissural organ (31). In Fig. 4, the domain structures of some of the above proteins are illustrated. The arrangement of D-domains can be divided into two groups with zonadhesin and α-tectorin in one group and vWF, MUC2, hemocytin, and SCO-spondin (not shown) in another. Of particular interest are two proteins, hemocytin and α-tectorin. Hemocytin is a humoral lectin cloned from Bombyx mori that plays an important role in a nonspecific self-defense mechanism; it possesses hemagglutinating activity. The carbohydrate-recognition domain in hemocytin overlaps with the D3 domain, and thus D3 domains could serve as lectins (29). Mouse α-tectorin has been recently cloned from a mouse cochlea cDNA library (30), where it is one of the two major non-collagenous proteins of the mouse tectorial membrane of the inner ear. It interacts with another non-collagenous protein, β-tectorin, in the tectorial membrane. α-Tectorin contains the D-domain repeats in the same order as that of pig zonadhesin. α-Tectorin contains a zona pellucida (ZP) domain (32), which also exists in β-tectorin, and thus combines within the same molecule domains found in components of the egg extracellular matrix and in zonadhesin. The ZP domain may serve as a recognition site for filament formation.
In the tectorial membrane, there are two distinct filament types, a light and dark staining filament, suggesting that the α- and β-tectorins may form homomeric filaments via their ZP domain while the two filament types interact with one another to form a striated sheet matrix via D-domain interactions. Thus, in the inner ear, the ZP domain and the D-domains can exist and potentially interact within the same extracellular matrix, whereas during fertilization, this interaction occurs between the different germ cells.
There are a total of 28 full-length D-domains in the proteins discussed above. The sequence alignment among those 28 D-domains indicates that a majority of Cys and two 4-amino acid motifs, TFDG and GLCG, are highly conserved to preserve a proper domain structure, even though the average amino acid similarities between D-domains of zonadhesin and those of other, functionally different proteins are only around 30-40%. Like the D-domains in the pig, D1-D3 domains of mouse zonadhesin also contain a CXXC motif. The CXXC motif is responsible for self-oligomerization of the ppvWF (3) and MUC2 (27). Therefore, it is likely that the zonadhesin is also present as a multimer (1). The peptide WREPSFCALS in the bovine ppvWF D2 domain is able to inhibit binding of ppvWF to collagen (33). A similar sequence is also found in the mouse (WREPQFCPLV) and the pig (WRGPQFCPLA) zonadhesin D1 domains. Therefore, zonadhesin may also bind to collagen or collagen-like proteins.
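The resemblance among the three decapeptides can be quantified by position-by-position identity. The sketch below is a simple ungapped comparison of the peptides quoted above; the function name is hypothetical, and identity is only a rough proxy for the similarity measure used in the alignments.

```python
from itertools import combinations

PEPTIDES = {
    "bovine ppvWF D2": "WREPSFCALS",
    "mouse zonadhesin D1": "WREPQFCPLV",
    "pig zonadhesin D1": "WRGPQFCPLA",
}


def identical_positions(a, b):
    """Count positions with the same residue in two equal-length peptides."""
    if len(a) != len(b):
        raise ValueError("peptides must have equal length for ungapped comparison")
    return sum(x == y for x, y in zip(a, b))


# Pairwise identity counts out of 10 positions.
pairwise = {
    (n1, n2): identical_positions(PEPTIDES[n1], PEPTIDES[n2])
    for n1, n2 in combinations(PEPTIDES, 2)
}

for (n1, n2), n in pairwise.items():
    print(f"{n1} vs {n2}: {n}/10 identical")
```

The mouse and pig zonadhesin peptides match the bovine ppvWF peptide at 7 and 6 of 10 positions, respectively, and each other at 8 of 10, with the WR, P, FC, and L positions shared by all three.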
During the cloning, we have also found variant fragments that could result from alternative splicing or gene rearrangement events. There was one PCR product missing the D3 partial domains from the 9th to 12th and another missing the 12th to 15th. One extra D3 partial domain was found on one occasion between the D3 domain and the second D3 partial domain. These products could be the result of alternative splicing. Additionally, rare PCR products were obtained that contained the C terminus of the D2 domain fused to the N terminus of the mucin-like domain and the C terminus of the D4 domain fused to the N terminus of the 16th D3 partial domain. These products suggest possible occasional gene rearrangement events. However, we cannot rule out possible cloning artifacts among these rare PCR products.
Zonadhesin has an EGF-like domain at the N terminus of the transmembrane segment (Fig. 4). The EGF-like repeat is one of the most shuffled domains in various animal extracellular proteins and has been identified in more than 70 different vertebrate and invertebrate proteins (34). It has been suggested that the EGF repeat may have a particular affinity for proteases because of the recurring role of proteolysis in the excision of growth factor from its precursor and the activation of many of the secreted soluble proteins such as factors VII, IX, and X, proteins C and S, and t-PA and u-PA (31). Therefore, the EGF domain in zonadhesin may also play a role in facilitating protease binding to zonadhesin since ZP affinity-purified pig zonadhesin contains two disulfide-bonded fragments, p45 and p105, where p45 is composed of a D0 and D1 domain while p105 contains the D2-D4 domains (1). Mouse zonadhesin was also processed based on Western blots (Fig. 3C). There are several dibasic residues in mouse zonadhesin that could serve as cleavage sites for endoproteinases (35). Extracellular proteinases have been suggested to play an important role in mammalian fertilization (36).
Based on the protein domain structure, zonadhesin is a multifunctional mosaic protein, typical of an adhesion molecule. The domains present in zonadhesin appear to be present within the extracellular regions of various proteins. Furthermore, the MAM, mucin, D-, and EGF-like domains are known to be involved in cell-cell, cell-matrix, and/or protein-protein interactions in the other proteins possessing these domains. Thus, zonadhesin likely serves in multiple cell-cell interactions as a sperm-specific membrane protein.