Structure of Insoluble Rat Sperm Glyceraldehyde-3-phosphate Dehydrogenase (GAPDH) via Heterotetramer Formation with Escherichia coli GAPDH Reveals Target for Contraceptive Design*

Sperm glyceraldehyde-3-phosphate dehydrogenase has been shown to be a successful target for a non-hormonal contraceptive approach, but the agents tested to date have had unacceptable side effects. Obtaining the structure of the sperm-specific isoform to allow rational inhibitor design has therefore been a goal for a number of years but has proved intractable because of the insoluble nature of both native and recombinant protein. We have obtained soluble recombinant sperm glyceraldehyde-3-phosphate dehydrogenase as a heterotetramer with the Escherichia coli glyceraldehyde-3-phosphate dehydrogenase in a ratio of 1:3 and have solved the structure of the heterotetramer which we believe represents a novel strategy for structure determination of an insoluble protein. A structure was also obtained where glyceraldehyde 3-phosphate binds in the Ps pocket in the active site of the sperm enzyme subunit in the presence of NAD. Modeling and comparison of the structures of human somatic and sperm-specific glyceraldehyde-3-phosphate dehydrogenase revealed few differences at the active site and hence rebut the long presumed structural specificity of 3-chlorolactaldehyde for the sperm isoform. The contraceptive activity of α-chlorohydrin and its apparent specificity for the sperm isoform in vivo are likely to be due to differences in metabolism to 3-chlorolactaldehyde in spermatozoa and somatic cells. However, further detailed analysis of the sperm glyceraldehyde-3-phosphate dehydrogenase structure revealed sites in the enzyme that do show significant difference compared with published somatic glyceraldehyde-3-phosphate dehydrogenase structures that could be exploited by structure-based drug design to identify leads for novel male contraceptives.

GAPDS is the sperm-specific isoform of GAPDH (1-3) and the sole GAPDH enzyme in sperm. It is highly conserved between species, showing 94% identity between rat and mouse and 87% identity between rat and human. Within a particular species, GAPDS also shows significant sequence identity to its GAPDH paralogue: 70, 70, and 68% for rat, mouse, and human, respectively. The most striking difference between GAPDS and GAPDH is the presence of an N-terminal polyproline region in GAPDS, which is 97 residues in rat (accession number AJ297631), 105 in mouse (3), and 72 in human (2). GAPDS is restricted to the principal piece of the sperm flagellum (1, 2, 4), where it is localized to the fibrous sheath (5), an association proposed to be mediated via the N-terminal polyproline extension.
GAPDS first came to prominence as a contraceptive target during the 1970s (6-8). Investigations showed that treatment of sperm with α-chlorohydrin or a number of related compounds could inhibit GAPDS activity (9-11), sperm motility (9-13), and the fertilization of oocytes in vitro (14). The metabolite of these compounds, 3-chlorolactaldehyde (15-17), selectively inhibited GAPDS, having no effect on the activity of somatic cell GAPDH (18, 19), providing the specificity required for a potential contraceptive. Questions surrounding these particular compounds were raised when a number of side effects were evident from in vivo trials (7, 20-22); however, the design of small molecule inhibitors of GAPDS may provide a viable alternative. Its potential as a contraceptive target was supported by data from mice where GAPDS −/− males (23) were infertile because of defects in sperm motility.
Glyceraldehyde-3-phosphate dehydrogenases are tetrameric enzymes that catalyze the oxidative phosphorylation of D-glyceraldehyde 3-phosphate (Glc-3-P) into 1,3-diphosphoglycerate in the presence of an NAD cofactor via a two-step chemical mechanism (24). The first models of substrate binding were proposed on the basis of crystal structures of the holoenzyme from lobster (25) and Bacillus stearothermophilus (26), and Moras and co-workers (25) identified two anion-binding sites postulated to correspond to those binding the C-3 phosphate group of D-Glc-3-P (Ps site) and the inorganic phosphate ion (Pi site).
Structure-based design of small molecules to inhibit GAPDH is not unprecedented. GAPDH from protozoan parasites has been targeted (27-30), as the bloodstream forms rely solely on glycolysis for energy production (31, 32). A number of mammalian GAPDH structures have also been solved, including rabbit muscle (33, 34), human liver (35), and human placenta (36); however, no structures are available for sperm-specific isoforms of this enzyme.
Active heterotetramers of GAPDH between different species have been reported and biochemically characterized previously, both in ratios of 2:2 and 3:1 (37-40). In this study we have successfully obtained crystals of rat recombinant GAPDS as a heterotetramer with Escherichia coli GAPDH in a 1:3 ratio. To understand the basis of inhibition of the sperm isoform by the substrate analogue 3-chlorolactaldehyde, a metabolite of α-chlorohydrin, a structure was also determined in the presence of the substrate glyceraldehyde 3-phosphate. The sperm-specific structure was compared with the human placental GAPDH structure (PDB entry 1U8F; Ref. 36) to identify differences that may provide a target for the design of inhibitors specific to the GAPDS protein. The unique structural features identified offer potential candidates for further investigation as inhibitor targets.

EXPERIMENTAL PROCEDURES
Cloning and Expression-RNA was extracted from rat testis with TRIzol reagent (Invitrogen) and used as a template for single-stranded cDNA synthesis using Expand™ reverse transcriptase (Roche Diagnostics). Gene-specific PCR primers were designed to nucleotides 307-325 with an engineered EcoRI site and the first 9 nucleotides of rat gapdh (accession number NM_017008; encoding amino acids 1-3), and nucleotides 1276-1302, including an engineered XhoI site, of the rat gapdh-2 sequence (accession number AJ297631). The sequence encoding the polyproline region of rat GAPDS (nucleotides 1-306) was omitted as our previous work attempting to express the whole recombinant enzyme was unsuccessful in yielding soluble protein. PCR products (Expand™ High Fidelity PCR system, Roche Diagnostics) were cloned into a pET28a vector (Novagen, Merck), with an N-terminal hexahistidine tag. Transformed E. coli BL21(DE3)pLysS cells (Novagen) were induced with 0.3 mM isopropyl 1-thio-β-D-galactopyranoside for 4 h at 30°C.
Protein Purification-All columns were obtained from GE Healthcare. Soluble lysate was loaded onto a Ni2+-charged HiTrap affinity column, and unbound protein was removed with wash buffer (20 mM sodium phosphate buffer, pH 7.4, 150 mM NaCl, 2 mM 2-mercaptoethanol). Protein was eluted using a linear gradient of 0-0.5 M imidazole (in wash buffer), and fractions containing GAPDS protein, identified using SDS-PAGE and a modified GAPDH activity assay (48), were pooled. The pooled samples were buffer-exchanged into ion exchange buffer (20 mM Tris, pH 7.4, 5 mM 2-mercaptoethanol, 1 mM EDTA) using Vivaspin 20 ultrafiltration concentrating tubes (Vivascience, Germany) and loaded onto a Mono S (FPLC) 5/5 ion exchange column. Bound protein was eluted with a linear gradient of 0-0.5 M NaCl in ion exchange buffer. Fractions were analyzed by SDS-PAGE, and those containing GAPDS were further enriched by gel filtration. The sample was reduced in volume to 0.5 ml using a Centricon-10 spin column (Amicon Bioseparations, Bedford, MA) prior to loading onto a Superdex 75 16/60 gel filtration column in gel filtration buffer (50 mM HEPES, pH 7.4, 150 mM NaCl, 2 mM 2-mercaptoethanol, 1 mM EDTA). Calibration of the column for estimation of complex size was performed using molecular weight protein standards.
Immunoblot-Proteins were fractionated by SDS-PAGE and transferred to polyvinylidene difluoride membranes (GE Healthcare). Membranes were probed with a GAPDS-specific antibody raised against residues 293-306 of the mouse GAPDS sequence (kindly donated by D. A. O'Brian and E. M. Eddy) and shown not to cross-react with mouse (4), rat (1), or E. coli (data not shown) GAPDH, followed by horseradish peroxidase-conjugated secondary antibody (DakoCytomation, Germany) and enhanced chemiluminescence (GE Healthcare). N-terminal Edman sequencing was performed using a Procise CLC protein sequencer (Applied Biosystems, UK), and peptide mass fingerprinting was performed using a high throughput MALDI-MS Voyager DE STR mass spectrometer (Applied Biosystems, UK).
Crystallization and Data Collection-Following gel filtration, GAPDS-E. coli GAPDH complex was concentrated to 4.5 mg·ml−1 using a Vivaspin ultrafiltration tube with a 10-kDa molecular mass cutoff. An initial screen using Hampton Crystal Screen™, Crystal Screen 2™, and polyethylene glycol/Ion™ crystal screens was set up in 96-well Intelli-plates (Hampton Research) using a vapor diffusion sitting drop method and kept at 18°C, and a number of conditions were identified that produced crystals in the presence of various polyethylene glycols and ions.
Data were collected from a single crystal with mother liquor 50 mM HEPES, pH 7.4, 2 mM 2-mercaptoethanol, 1 mM EDTA, 5 mM NAD+, grown over a reservoir containing 0.2 M sodium formate, 20% polyethylene glycol 3350, 0.1 M HEPES, pH 7.5, after a brief soak in reservoir solution augmented with 20% glycerol. X-ray diffraction data were collected at BM14, ESRF (Grenoble, France), using a wavelength of 1 Å. The crystallographic data were indexed and processed using HKL-2000 (HKL Industries, Plc). Data were collected at 100 K, using a 0.5° oscillation (Table 1). For the second crystal, the data were 99.9% complete to 2.9 Å, but the higher resolution data up to 2.4 Å were of good quality and were therefore included in the refinement.
Structure Determination and Refinement-Molecular replacement was performed using the program Phaser (49) using the E. coli GAPDH structure (PDB code 1GAD (41)) as the model. The model was built using Coot (51) and refined using RefMac5 (50). The structures and data were validated using PROCHECK (52) and SFCHECK (53). The refinement statistics (Table 1) are consistent with a well refined, high quality geometric structure with 88.4% of non-glycine residues located in the most favored regions of the Ramachandran plot, 10.8% in the additionally allowed regions, 0.4% in generously allowed regions, and 0.3% in disallowed regions. Those residues in generously allowed and disallowed regions are found within similar regions in a number of GAPDH structures (PDB codes 1U8F, 1ZNQ, and 1JOX).
Accession Numbers-The enzyme collection number for the glyceraldehyde-3-phosphate dehydrogenase is EC 1.2.1.12. The nucleotide sequence for the rat GAPDS sequence has been deposited in the EMBL data base under accession number AJ297631. The amino acid sequence of this protein can be accessed through UNIPROT data base under accession number Q9ESV6_RAT. The atomic coordinates and structure factors for the crystal structure of this protein are available in the Protein Data Bank under accession numbers 2vyn and r2vynsf (without substrate) and 2vyv and r2vyvsf for the substrate complex.

RESULTS
Characterization of the GAPDS Complex-Native GAPDS has proved intractable to isolation from sperm in both detergents and urea. We therefore sought to express recombinant GAPDS in a bacterial expression system. Following lysis of recombinant E. coli, the majority of GAPDS was found in the insoluble fraction with only a small amount of soluble protein, which was not increased when culture conditions were altered or in the presence of solubilizing agents. The small soluble protein fraction obtained was subjected to nickel affinity and ion exchange chromatography to isolate GAPDS; however, an ~36-kDa protein was found to persistently co-purify (Fig. 1), and staining of the mixed protein sample on gels consistently showed a preponderance of the unidentified protein over GAPDS. N-terminal Edman sequencing was performed on both of the protein bands identified by SDS-PAGE. The band with higher molecular weight was confirmed to be GAPDS, and the more heavily stained ~36-kDa protein band was identified as E. coli GAPDH, the results of which were further confirmed by peptide mass fingerprinting. Gel filtration of the sample resulted in a single protein peak with a molecular mass indicative of a tetrameric GAPDS-E. coli GAPDH complex.
Crystals were successfully formed from the GAPDS-E. coli GAPDH protein complex, and their protein content was analyzed by SDS-PAGE after extensive washing in crystallization buffer. One gel (3 crystals) was silver-stained, and the other (1 crystal) was used for Western blot. Following silver staining, two protein species could clearly be seen, and the band with higher molecular weight was confirmed to be GAPDS (Fig. 2) using a specific antibody. Sequence alignment of E. coli GAPDH and GAPDS revealed a sequence identity of 65% across the whole subunit, with five regions with insertions/deletions and a C-terminal extension in GAPDS (Fig. 3).

TABLE 1 Data collection and refinement statistics for His-GAPDS:E. coli GAPDH
The data set was indexed and processed using the HKL suite (54). Values shown in parentheses correspond to the higher resolution shell. Columns: Holo complex; Gly-3-P soak.

Structure Determination-The crystals both belonged to the space group P2₁2₁2₁. The asymmetric unit contained a putative heterotetramer of GAPDS and E. coli GAPDH (Fig. 4a) giving a solvent content of 42%. To determine the identity of the component monomers, weighted electron density maps and difference electron density maps were examined for each monomer (chains A-D). Analysis of chains A-C revealed only a small number of differences between the observed and calculated electron density based on the E. coli GAPDH model (41), which were consistent across all three subunits and could be satisfied by changing the rotamers of individual amino acid side chains. It was therefore concluded that chains A-C were E. coli GAPDH monomers. However, comparison of the E. coli model with electron density observed in chain D conclusively showed differences that could not be explained by adjustments to the conformation of the model, only by changes in sequence (supplemental Fig. 1). Six regions were identified as containing amino acid insertions or deletions between the E. coli GAPDH and GAPDS primary sequences (Fig. 3 and supplemental Table 1), which all altered the course of the peptide backbone in the final refined structure (Fig. 4 and supplemental Fig. 2).
The final model (PDB code 2vyn) therefore consisted of three subunits of E. coli GAPDH, and a single subunit of GAPDS, organized as a tetramer that displays approximate 222 symmetry (Fig. 4a). Subunits A and C form a homodimer of E. coli GAPDH, whereas subunits B (E. coli GAPDH) and D (GAPDS) form a heterodimer. The backbones of the individual subunits were compared using LSQKAB (42). Subunits A and B of E. coli GAPDH are very similar (r.m.s. difference in Cα positions, 0.18 Å), and subunit C has a few more differences (r.m.s. difference in Cα positions, 0.35 Å), whereas the main chain of the D subunit has an r.m.s. difference in Cα positions of 3.1 Å (Fig. 4b).
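As an illustration of the r.m.s. differences in Cα positions quoted above, the following minimal Python sketch computes the root-mean-square difference between two sets of paired coordinates. It assumes the two chains have already been superimposed (the fitting step that a program such as LSQKAB performs is not reproduced here), and the coordinates are hypothetical toy values, not taken from the deposited structure.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square difference between paired, already-superimposed points.

    Each argument is a list of (x, y, z) tuples in Å, one per C-alpha atom,
    given in the same residue order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical toy coordinates (Å) for three paired C-alpha atoms.
chain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
chain_b = [(0.0, 0.3, 0.0), (3.8, 0.0, 0.3), (7.6, 0.0, 0.0)]

print(round(rmsd(chain_a, chain_b), 3))  # 0.245
```

A small value means the paired atoms nearly coincide after superposition; larger values indicate a divergence in the course of the main chain.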
Structural Evaluation of the GAPDS-E. coli GAPDH Tetramer-The secondary and tertiary structures of the GAPDS-E. coli GAPDH tetramer are highly similar to those of other structurally characterized GAPDHs. Each subunit contains an NAD+ binding domain and catalytic domain. The NAD+ binding domain is formed from residues 1-148 and 315-335, including the characteristic Rossmann dinucleotide binding fold (43), and each binding site is fully occupied by an NAD+ molecule bound in an extended conformation. The catalytic domain of each subunit is made up of residues 149-314 and includes the catalytic Cys 149 at the start of α-helix 4 and the corresponding Glc-3-P-binding site. As has been observed in previous GAPDH structures (33, 35), the catalytic cysteine was doubly oxidized in each subunit, along with another cysteine (Cys 267), despite the presence of 2 mM β-mercaptoethanol during purification and crystallization.
Substrate Binding in E. coli GAPDH-GAPDS Heterotetramer-The model described above was used as a molecular replacement model for the data collected after a 5-min soak with 5 mM substrate, glyceraldehyde 3-phosphate. Initial electron density maps revealed extra electron density in the D subunit (GAPDS) in the Ps-binding site. In the D site, the model that best satisfies the electron density is one in which the catalytic cysteine remains oxidized and glyceraldehyde 3-phosphate has bound in the Ps site to form a substrate complex, without reacting (PDB code 2VYV). Fig. 5 shows the electron density in the active site of the D subunit with and without a substrate soak, contoured at 1.0σ in each case to allow comparison. Electron density maps for the remaining subunits also revealed extra electron density that was more consistent with phosphate binding as a result of breakdown or as a contaminant of the substrate glyceraldehyde 3-phosphate.
Analysis of GAPDS-E. coli GAPDH Interface-The observation of a heterotetramer in the ratio 3:1 was initially unexpected. The differences in sequence between GAPDS and E. coli GAPDH are concentrated on the surface of the tetramer, in the NAD binding domain (Fig. 4e). The sequence identity rises from 65% overall to 85% in the dimerization domain (residues 149-314) with just 25 residues not identical. Of the residues that are not identical, just one is found on the dimerization interface for formation of the B-D dimer, residue Cys 244 in GAPDS, which is a Val in E. coli, and rather than interacting across the interface it actually points into the hydrophobic core of the D subunit. Thus despite only 65% overall sequence identity, the interactions at the interface between B (E. coli GAPDH) and D (GAPDS) are identical to those between A and C, which are both E. coli GAPDH. When the whole tetramer is inspected, three further residues are close to interfaces formed when two dimers come together to form the tetramer (Fig. 6). Residue D200 (chain D, residue 200), a histidine, replaces a serine in E. coli GAPDH. Again, this lies at an interface, in this case between A and D, and the side chains are oriented in such a way that an intersubunit hydrogen bond is maintained. Residue D109, a lysine, replaces a histidine in E. coli. The lysine Nζ lies 3.6 Å away from the carbonyl oxygen of Ala 33 across the subunit interface. The remaining residue that is close to an interface is D277, a glutamine that replaces an asparagine, C276, in E. coli. This side chain makes an intramolecular hydrogen bond with Lys D295 analogous to that observed between C276 and C294, also a lysine, in E. coli.
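The identity figure for the dimerization domain can be checked with simple arithmetic. The sketch below assumes that the domain spans residues 149-314 inclusive and that the 25 non-identical residues are counted over that same span; the function name is illustrative only.

```python
def percent_identity(n_aligned, n_different):
    """Percentage of aligned positions that carry identical residues."""
    if n_aligned <= 0 or not 0 <= n_different <= n_aligned:
        raise ValueError("counts must satisfy 0 <= n_different <= n_aligned, n_aligned > 0")
    return 100.0 * (n_aligned - n_different) / n_aligned


# Residues 149-314 inclusive (assumed span of the dimerization domain).
domain_length = 314 - 149 + 1  # 166 positions

print(round(percent_identity(domain_length, 25), 1))  # 84.9
```

Under these assumptions 141 of 166 positions are identical, which rounds to the 85% quoted in the text.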
Comparison of GAPDS with Human Placental GAPDH-To design anti-fertility agents that specifically target GAPDS, comparison with somatic GAPDH is essential for identifying nonconserved regions potentially available for exploitation. Therefore, the GAPDS structure was compared with a human GAPDH structure (PDB code 1U8F (36)). The overall sequence conservation between rat and human GAPDS is extremely high (87% identity and 95% similarity) (Fig. 6). Hence, residues that differ between GAPDS and human GAPDH, and that present as potential targets for drug design, can be matched in the human GAPDS sequence.
Secondary Structure Elements-A subunit of the human placental GAPDH was superimposed on the rat-His-GAPDS using Secondary Structure Matching as implemented in the program COOT (51), with an r.m.s. deviation between corresponding Cα positions of 0.6 Å supporting the observation that the structures are very similar, including the location of bound water molecules. Water molecules within 3.5 Å of the NAD+ molecule were compared with those within the human GAPDH structure, and 12 of 14 of those modeled in the human structure were also observed in a similar position in the GAPDS structure (within 0.6 Å) indicating a conserved hydrogen bonding pattern within the structures.
c, C-terminal extension in D subunit. The final refined model for the sequence 328 YMFS-REK 334 His-GAPDS is shown in blue bonds representation, with final weighted electron density shown at 1σ. The final refined model for 327 HISK 330 E. coli GAPDH (red) is superimposed. On the right-hand side, the final refined coordinates for the C terminus of E. coli GAPDH, 327 HISK 330, are shown in red bonds representation with final weighted electron density shown at 1σ. d, final refined model for the sequence 138 NPGSMTV 146 His-GAPDS is shown in blue bonds representation with final weighted electron density shown at 1σ, and the final refined model for the corresponding sequence in E. coli from subunit A, 139 AGQDI 143 E. coli GAPDH, is superimposed in red bonds representation. On the right-hand side, the same region 139 AGQDI 143 E. coli GAPDH (red) is shown with its own final weighted electron density shown at 1σ. e, trace of tetramer with residue positions that differ in sequence between E. coli GAPDH (A, red; B, yellow; C, green) and GAPDS (blue) shown as blue spheres in the D subunit. Residues that lie at the interfaces between subunits are colored magenta.

Identification of the Position of Sequence Differences between GAPDS and Human GAPDH-GAPDS (excluding N-terminal His tag extension) and human GAPDH sequences contain 105 amino acid differences, the majority of which are located within the NAD+ binding domain (residues 1-148 and 315-335) and located predominantly on the outer surface of the GAPDS protein subunit (Fig. 7). A small number of nonconserved residues were identified in areas of potential interest for therapeutic development and are described below.
NAD+-binding Site-The adenine group of the NAD+ molecule in GAPDS is displaced by up to 0.6 Å, and the nicotinamide group is displaced by ~0.4 Å from the Glc-3-P-binding site and catalytic region toward the adenine region of the molecule compared with the human structure, which is consistent with the r.m.s. deviation between corresponding Cα positions as described above. Five amino acids located within 5 Å of the NAD+ molecule bound to GAPDS are not conserved in human GAPDH: Lys 76, Ala 94, Tyr 98, Leu 99, and Thr 118. Ala 94 and Thr 118 are conservative substitutions that make the same interactions as their counterparts in human GAPDH, and more importantly for human contraceptive design, their counterparts are conserved between human GAPDS and human GAPDH. The loop containing the substitution GAPDS Lys S76-GAPDH Arg H80 showed significant (>0.5 Å) atom displacement between GAPDS and human GAPDH structures; however, it is involved in crystal contacts, and the side chains extend outward away from the NAD+ molecule. Substitution of GAPDS Tyr S98, also tyrosine in human GAPDS, for GAPDH Phe H102 occurs adjacent to the adenine region of the NAD+ molecule. The tyrosine hydroxyl causes a reduction in the space surrounding NAD+ in this region and introduces an additional polar group into the vicinity of the adenine base. The adjacent Leu S99 residue, also Leu in human GAPDS, is substituted for Thr H103 within human GAPDH. Examination of this residue reveals that both side chains point inward toward the NAD+ molecule in the vicinity of the 3′-hydroxyl of the nicotinamide ribose. The pocket around the 3′-hydroxyl is therefore smaller and less polar in GAPDS than in somatic GAPDH.
Sequence alignment of mammalian GAPDH and GAPDS isoforms. The catalytic cysteine residue is indicated by C and is conserved between all mammalian somatic GAPDH and spermatogenic GAPDS and GAPDH2 proteins. *, NAD+-binding residues.
Residues that differ between RatHisGAPDS (solved structure) and human GAPDS are shown with a gray background and written in white text. Residues that are shown on a gray background in black text differ between human GAPDH and human GAPDS, possibly important for specificity of chlorohydrin binding.
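The selection of residues within 5 Å of the bound NAD+, described in the NAD+-binding site paragraph above, amounts to a distance cutoff search. The following Python sketch shows one way such a search could be written; it is not the procedure used in this work, and the residue labels and coordinates are hypothetical placeholders rather than values from the GAPDS model.

```python
import math


def residues_within(ligand_atoms, protein_atoms, cutoff=5.0):
    """Return sorted labels of residues with any atom within `cutoff` Å of the ligand.

    ligand_atoms: list of (x, y, z) tuples.
    protein_atoms: list of (residue_label, (x, y, z)) pairs, one per atom.
    """
    near = set()
    for label, position in protein_atoms:
        if any(math.dist(position, atom) <= cutoff for atom in ligand_atoms):
            near.add(label)
    return sorted(near)


# Hypothetical toy data: two ligand atoms and four protein atoms (Å).
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
protein = [
    ("RES_A", (4.0, 0.0, 0.0)),  # 2.5 Å from the second ligand atom
    ("RES_B", (0.0, 4.5, 0.0)),  # 4.5 Å from the first ligand atom
    ("RES_C", (9.0, 0.0, 0.0)),  # more than 5 Å from both ligand atoms
    ("RES_A", (7.0, 0.0, 0.0)),  # second atom of RES_A, outside the cutoff
]

print(residues_within(ligand, protein))  # ['RES_A', 'RES_B']
```

A residue is reported once if any of its atoms lies inside the cutoff, which matches the usual meaning of "located within 5 Å" of a ligand.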
Substrate-binding Site-The proven activity of the substrate analogue 3-chlorolactaldehyde and its apparent specificity for the sperm isoforms of GAPDH make the active site a region of particular interest. As 3-chlorolactaldehyde cannot be synthesized in vitro, it was modeled into the active site of GAPDS, into the Pi site based on the E. coli ternary structure (PDB code 1ML4) and into the Ps site based on the B. stearothermophilus ternary structure (PDB code 3CMC) and the current structure of Glc-3-P bound at that site. The structures of rabbit muscle GAPDH, human placental GAPDH, and human liver GAPDH were superimposed on the structure of GAPDS and inspected in the vicinity of the 3-chlorolactaldehyde. The low resolution human muscle structure is substantially different in conformation, but this structure is largely unrefined, and the protein sequence given only has 94% identity with the currently accepted human somatic GAPDH sequence, so this structure has not been analyzed further. There are two residues within 7 Å of substrates binding in the Ps site that differ between GAPDS or human GAPDS and somatic GAPDH, Ala 177 (Ser in human GAPDS) and Tyr 178 (Ala in GAPDH) (Fig. 8). However, the side chains of both residues point away from the substrate binding pocket, and the structures are very well conserved between GAPDS and human GAPDH. When chlorolactaldehyde is modeled in the Pi site, a single residue within a 7 Å radius of any atom is different in the rat His-GAPDS structure compared with the other mammalian structures. The side chain of this residue, Ser 206, an alanine in all the GAPDH sequences, lines the pocket containing binding site Pi. Because it is also an alanine in human sperm GAPDS, it cannot be responsible for a difference in specificity of the inhibitor 3-chlorolactaldehyde between the human somatic GAPDH and human sperm GAPDH.
Although there are no significant differences close to the catalytic center of the active site, the Pi site is an extended binding pocket (Fig. 8), which we examined further. The site is bounded at one end by a loop (residues 189-192, GAPDS) that shows conformational variation in the somatic GAPDH structures currently available (rabbit muscle GAPDH, human liver GAPDH, and human placental GAPDH) and whose sequence is not completely conserved in sperm GAPDS (190 Arg in human sperm and Gly in somatic sequences; 192 Ala in human sperm and Leu in somatic sequences). Differences at the end of this binding pocket therefore make it a possible target for design of sperm-specific inhibitors. This loop interacts on its other side with residues 32-37 (GAPDS) across a subunit contact, which also show a greater variation in main chain conformation and sperm-specific changes in sequence (position 36, Pro in sperm and Leu in somatic sequences; position 37, Glu in sperm and Asn in somatic sequences). The potential to exploit this subunit interface, known as the "selectivity cleft," as a target is explored further below.
Analysis of Subunit Interfaces within a GAPDS Model-Structure-based inhibitors designed against trypanosomatid GAPDHs have specifically targeted a narrow intersubunit selectivity cleft near the NAD+-binding site (44, 45). Structural analysis of human GAPDH shows this cleft is 4-5 Å wide (36), and within Leishmania mexicana GAPDH it is 7-8 Å wide (PDB code 1GYQ (46)). This difference was exploited in the design of the disubstituted adenosine derivative NAD+ analogues (such as N6-(1-naphthalenemethyl)-2′-deoxy-2′-(3,5-dimethoxybenzamido)adenosine). These compounds bind as an NAD+ molecule while inserting the modified group into the selectivity cleft of trypanosomal GAPDH, thereby occupying the active site of the enzyme while preventing catalysis from taking place (45). In silico docking studies indicate the reduced cleft size observed in human GAPDH would lead to steric clashes with these substituted groups thereby preventing access of the molecule into the NAD+-binding site and inhibition of human GAPDH activity (36).
FIGURE 7. Sphere diagram of His-GAPDS highlighting sequence differences between His-GAPDS and human placental GAPDH. Chain D of the His-GAPDS-E. coli GAPDH tetramer is shown in stereo with NAD+ represented as a blue ball and stick structure. Residue positions that differ in sequence between human GAPDH and GAPDS (blue) are shown as spheres. Residues that differ in sequence in the selectivity cleft are shown as red spheres, and Tyr 98 and Leu 99, which interact with the NAD+ adenine, are shown as yellow spheres.
To examine this selectivity cleft in GAPDS, a GAPDS tetramer was assembled by sequentially superimposing the GAPDS D chain onto chains A-C of the GAPDS-E. coli GAPDH tetramer. Overlay of GAPDS tetramer model on the GAPDS-E. coli GAPDH tetramer revealed that some small unexplained difference electron density peaks could correspond to GAPDS amino acid side chains suggesting that a small number of GAPDS-E. coli GAPDH tetramers may be oriented differently within the crystal lattice, hence aligning the GAPDS subunit in chains A-C. It is also possible a small number of tetramers within the crystal contained two or more GAPDS subunits. These observations confirmed the validity of the GAPDS model as accurately representing a GAPDS tetramer and supported its use for structural analysis.
Within human GAPDH, the selectivity cleft is lined by residues Phe 37, Ile 38, Asp 39, Leu 40, Asn 41, Pro 191, Ser 192, Gly 193, Lys 194, and Leu 195 of an adjacent subunit and includes a number of solvent molecules that mediate inter-subunit hydrogen bonds (36, 44, 45). Examination of the selectivity cleft showed it to be highly conserved, being 4-5 Å wide in human GAPDH and GAPDS structures. Three of the four water molecules involved in maintaining closure of the cleft in human GAPDH are conserved in this GAPDS-E. coli GAPDH multispecies subunit interface and are therefore highly likely to be present within a GAPDS tetramer. There are, however, differences on both sides of the cleft at the top, with Leu 40 being replaced by Pro and Asn 41 being replaced by Glu in human GAPDS. At the opposite side of the cleft, Gly 191 is replaced by arginine in human GAPDS (Lys in GAPDS), and Leu 195 is replaced by an alanine (Fig. 8). These differences will affect the size of the selectivity cleft within GAPDS and human GAPDS and also the charge distribution. In addition, the loop containing this side chain varies in conformation considerably between known structures of somatic GAPDH in contrast to other parts of the structure, as noted above.
Examination of the GAPDS Surface Potential-The polyproline region of native GAPDS is not present in the protein construct that was expressed and therefore is not represented in our model of GAPDS. However, this must project outwards from the protein subunit at the N terminus of the construct that we have crystallized. The electrostatic surface potential of the modeled GAPDS monomer and human somatic GAPDH was calculated within CCP4MG (47) (Fig. 9). There are a number of small differences between the charge distribution in GAPDS and GAPDH, consistent with the majority of the differences in sequence being located on the surface of the molecule. A prominent feature of the charge distribution in GAPDS is a patch of positive charge density around the attachment site of the polyproline extension. Although this is the location of the His tag, the positive charge observed is not because of this as the His tag is not observed in the crystal and therefore not contained in the model.

DISCUSSION
We have successfully solved the crystal structure of a subunit of rat GAPDS at a 2.2 Å resolution providing the first reported structure of a sperm-specific isoform of GAPDH. The crystals obtained were formed from a heterotetramer of GAPDS complexed with E. coli GAPDH in a ratio of 1:3. To our knowledge this is the first reported structure of such a unique complex.
Analysis of the dimerization interfaces within the tetramer revealed that the sequences of GAPDS and E. coli GAPDH are very highly conserved at the interfaces. All of the interactions in the more intimate dimerization interface are completely conserved between the two proteins. A single interaction is modified at the interface formed when two dimers come together to make a tetramer, where a Ser-Ser hydrogen bond is replaced with a Ser-His hydrogen bond. Kochman et al. (40) report the observation of heterotetramers between GAPDH isolated from parasitic roundworm and rabbit in every possible ratio. They also report that hybrids with a 1:3 and 3:1 ratio were always preceded by formation of dimers. In this work hybrids were observed predominantly in a 1:3 ratio, judged by analysis of the crystallized complex and the reduced level of staining of GAPDS in gels. It seems likely that this is the result of the very low levels of expression of soluble sperm GAPDS, such that any dimers formed from sperm GAPDS would be in rapid equilibrium with the E. coli GAPDH and would therefore tend toward evolution to a 1:3 hybrid. It is also possible that the exposed surface charge on the sperm enzyme tends to make 2:2 hybrids less soluble and therefore lost at an early stage in purification.
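The stoichiometric argument above can be illustrated with a simple numerical sketch. It assumes, purely for illustration, that subunits are drawn independently into tetramers with a fixed fraction of sperm-type subunits (a binomial random-mixing model); this is not the assembly pathway established in this work, which may proceed via dimers, and the function names and the example fractions are hypothetical.

```python
from math import comb


def tetramer_composition(p_sperm: float) -> dict[int, float]:
    """Probability of finding k sperm-type subunits (k = 0..4) in a tetramer.

    Assumption: each of the four positions is filled independently, with
    probability p_sperm of being a sperm-type (GAPDS) subunit.
    """
    if not 0.0 < p_sperm <= 1.0:
        raise ValueError("p_sperm must be in the interval (0, 1]")
    return {
        k: comb(4, k) * p_sperm**k * (1.0 - p_sperm) ** (4 - k)
        for k in range(5)
    }


def share_among_hybrids(p_sperm: float) -> dict[int, float]:
    """Distribution of k, restricted to tetramers containing >= 1 GAPDS subunit."""
    dist = tetramer_composition(p_sperm)
    containing = sum(dist[k] for k in range(1, 5))
    return {k: dist[k] / containing for k in range(1, 5)}


if __name__ == "__main__":
    # Hypothetical fractions of soluble GAPDS subunits in the subunit pool.
    for p in (0.01, 0.05, 0.25):
        shares = share_among_hybrids(p)
        summary = ", ".join(f"{k}:{4 - k} = {shares[k]:.3f}" for k in range(1, 5))
        print(f"p_sperm = {p:.2f} -> {summary}")
```

Under this simplified model, when the sperm-type subunit is scarce, nearly all tetramers that contain any GAPDS contain exactly one copy, which is qualitatively consistent with the predominance of the 1:3 hybrid. The model ignores solubility differences between hybrids, which the text offers as a second possible contribution.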
To gain understanding of the discrimination shown by the inhibitor 3-chlorolactaldehyde for the sperm-specific isoform of GAPDH, the active site was compared with the available structures of somatic GAPDH in the vicinity of substrate both in the Ps site and the Pi site. This analysis reveals that the environment in close vicinity of any adduct between the 3-chlorolactaldehyde and the catalytic cysteine is very highly conserved. Hence the apparent preferential inhibition of the sperm isoform by chloro-analogues of glyceraldehyde 3-phosphate does not have a structural explanation. Instead, it is likely that the difference in metabolism of the chloro-analogues to 3-chlorolactaldehyde, the difference in the relative proportions of NAD+ and NADH, or the difference in the balance between inhibitor and substrate between sperm and somatic cells results in the observed specificity. It is also interesting to note that it has been proposed (15) that the effect of the above compounds and abolition of the gapds gene on sperm motility are not through direct inhibition of GAPDS and glycolysis but are due to the accumulation of glycolytic intermediates that impair oxidative phosphorylation in some way, possibly by sequestering available phosphate.
Nonetheless, by whichever means, binding of inhibitor to GAPDS does inhibit fertility. We therefore performed further detailed comparative analysis of the GAPDS structure to identify alternative regions that could potentially be exploited for the design of small molecule inhibitors. Three potential sites were identified. The NAD+-binding site proved to be highly conserved between the sperm and somatic cell isoforms, but slightly different binding environments were observed in the region of the NAD+ adenine ring and the nicotinamide ribose group, resulting in a smaller and less polar pocket in the sperm isoform. Structure analysis does not reveal any significant differences in the Ps-binding site. Although most of the residues lining the Pi pocket are conserved, there are differences both in sequence and in conformation in a loop that both forms the end of the pocket and also interacts across a subunit interface or selectivity cleft with another loop that shows some variation between sperm and somatic isoforms. The selectivity cleft, successfully targeted in structure-based design against trypanosomes (44, 45), shows differences on either side between the sperm GAPDS and mammalian GAPDH, differing in size and charge distribution between the isoforms. From these data we suggest that both the extended Pi binding pocket and the adjacent selectivity cleft offer the greatest potential for targeted inhibitor design.
An alternative approach to the design of a GAPDS-specific inhibitor would be to target a noncatalytic feature of the GAPDS tetramer. GAPDS is localized to the fibrous sheath of the sperm flagellum via an extended N-terminal polyproline tail. Directing compounds to this or other surface regions may prevent association of GAPDS with the fibrous sheath or other proteins, thus abolishing function. In conclusion, this paper provides sought-after data describing structural differences between sperm and somatic GAPDH that will be of particular interest to researchers in the field of contraceptive design as well as to those studying similar enzyme families.