The Crystal Structure of Progesterone 5β-Reductase from Digitalis lanata Defines a Novel Class of Short Chain Dehydrogenases/Reductases

Progesterone 5β-reductase (5β-POR) catalyzes the stereospecific reduction of progesterone to 5β-pregnane-3,20-dione and is a key enzyme in the biosynthetic pathway of cardenolides in Digitalis (foxglove) plants. Sequence considerations suggested that 5β-POR is a member of the short chain dehydrogenase/reductase (SDR) family of proteins but at the same time revealed that the sequence motifs that in standard SDRs contain the catalytically important residues are missing. Here we present crystal structures of 5β-POR from Digitalis lanata in complex with NADP+ at 2.3 Å and without cofactor bound at 2.4 Å resolution together with a model of a ternary complex consisting of 5β-POR, NADP+, and progesterone. Indeed, 5β-POR displays the fold of an extended SDR. The architecture of the active site is, however, unprecedented because none of the standard catalytic residues are structurally conserved. A tyrosine (Tyr-179) and a lysine residue (Lys-147) are present in the active site, but they are displayed from novel positions and are part of novel sequence motifs. Mutating Tyr-179 to either alanine or phenylalanine completely abolishes the enzymatic activity. We propose that the distinct topology reflects the fact that 5β-POR reduces a conjugated double bond in a steroid substrate via a 1–4 addition mechanism and that this requires a repositioning of the catalytically important residues. Our observation that the sequence motifs that line the active site are conserved in a number of bacterial and plant enzymes of as yet unknown function leads us to the proposition that 5β-POR defines a novel class of SDRs.

The beneficial effects of cardenolides, also known as cardiac glycosides or cardiotonic steroids, are well documented, and they have been applied for the treatment of cardiac insufficiencies for centuries (1-3). On a molecular level, these steroids are potent inhibitors of the sodium/potassium pump (Na+/K+-ATPase) that is present in almost all cells in higher organisms (4). Digitalis plants are still the major source of cardenolides, and as a step in the biosynthetic pathway, the Digitalis enzyme progesterone 5β-reductase (5β-POR) catalyzes the stereospecific NADPH-dependent reduction of the Δ4-double bond in progesterone to 5β-pregnane-3,20-dione. Because all Digitalis cardenolides share the characteristic 5β-configuration, the enzyme 5β-POR catalyzes a central step during their biosynthesis (5-7).
NADH/NADPH-dependent reductases as well as the related dehydrogenases, dehydratases, and epimerases can be classified into two major protein families: the (α/β)8-barrel containing aldo-keto-reductases (AKRs) (8) and the Rossmann fold containing short chain dehydrogenases/reductases (SDRs) (9-11). Additional families such as the long and medium chain dehydrogenases/reductases are related to SDRs because they share with the latter the dinucleotide-binding double Rossmann fold (12-14). SDRs are about 250 residues long and form a large family with over 2000 members (15). Because their central feature consists of an all-parallel β-sheet and their catalytic mechanism revolves around a tyrosine residue, they are also referred to as 7-stranded tyrosine-dependent oxidoreductases (16). The dinucleotide-binding double Rossmann fold motif is contained within the N-terminal six β-strands of the seven-stranded β-sheet (strands βA to βG). Insertions of up to 100 residues in length occur in many SDRs and are predominantly accommodated within the left-handed crossover connection between strands βF and βG as well as after strand βG toward the C terminus of the protein. These two segments are often collectively referred to as the ligand-binding domain of SDRs, and they are considered the prime determinants of substrate specificity (17).
The SDR family members can be identified at the sequence level based on several conserved motifs, and variations in these motifs have been used to define SDR subfamilies (15,18). These motifs are either involved in NADH/NADPH cofactor binding or cluster around the substrate-binding pocket. SDRs contain in their active site a highly conserved amino acid triad consisting of a serine, tyrosine, and lysine residue, with a possible fourth conserved asparagine residue (19). The conservation of the catalytic triad in almost all SDRs indicates that SDRs share a common reaction mechanism. Moreover, this mechanism seems also to extend to AKRs because they have the conserved tyrosine and lysine in common with SDRs (20).
Based on sequence alignments, 5β-POR from Digitalis plants has been predicted to be an SDR family member (5,21). This is in contrast to mammalian steroid 5β-reductase, which is a member of the AKR family (22). Plant 5β-POR shares the typical NADPH/NADH-binding sequence motifs with other SDRs (18), but intriguingly, none of the sequence motifs that cluster around the substrate-binding site and that contain the catalytically important residues are conserved. The sequence motif that displays the conserved active site serine residue in SDRs, namely GXXXXXSS (or SSXXXXG in some SDRs) (10,18), is missing in 5β-POR (5,21). Also, the sequence motif YXXXK (or YXXMXXXK) (10,18) that displays the active site tyrosine and lysine residues is absent, and a conserved NFYYXXED motif can be found instead (5,21). Although it has been suggested that one of the tyrosines in this motif corresponds to the typical SDR active site tyrosine, it is not possible to locate the additionally required lysine residue in any of the adjacent sequence segments. Hence, the question arises whether 5β-POR defines a novel class of SDRs that has a different set of sequence motifs and conserved residues in the catalytic site and that might be characterized by a distinct reaction mechanism.
We previously reported the purification and crystallization of 5β-POR from Digitalis lanata (5,23). Here we present two crystal structures of 5β-POR, one in the presence and one in the absence of the cofactor NADP+. The crystal structures show that although 5β-POR displays the standard SDR fold, the architecture of the active site is unprecedented.
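The motif-based reasoning in the preceding paragraph can be illustrated with a simple pattern scan. The following Python sketch translates the motifs quoted in the text into regular expressions (X, any residue, becomes `.`) and reports where they occur in a sequence. The function name `find_motifs` and both toy sequences are hypothetical and were invented for illustration; they are not real protein sequences, and this is not the alignment procedure used in the cited studies.

```python
import re

# Motifs quoted in the text; X (any residue) is written as "." in the regex.
MOTIFS = {
    "GXXXXXSS": "G.{5}SS",
    "SSXXXXG": "SS.{4}G",
    "YXXXK": "Y.{3}K",
    "YXXMXXXK": "Y.{2}M.{3}K",
    "NFYYXXED": "NFYY.{2}ED",
}

def find_motifs(sequence):
    """Return {motif name: [0-based start positions]} for every motif found."""
    hits = {}
    for name, pattern in MOTIFS.items():
        # A lookahead is used so that overlapping occurrences are all reported.
        starts = [m.start() for m in re.finditer(f"(?={pattern})", sequence)]
        if starts:
            hits[name] = starts
    return hits

# Toy sequences, invented for illustration only (assumption, not real data).
standard_like = "MAGAAAAASSLLYAASKTT"   # carries GXXXXXSS and YXXXK
por_like = "MANFYYDLEDTT"               # carries NFYYXXED only

print(find_motifs(standard_like))
print(find_motifs(por_like))
```

A scan of this kind only flags the presence or absence of a motif; whether the matched residues actually sit in the active site still has to be decided from a structure.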

EXPERIMENTAL PROCEDURES
Protein Production and Purification-To produce recombinant 5β-POR from D. lanata, we slightly modified previously published protocols (5,23). In the pQE-30 expression plasmid (Qiagen, Hilden, Germany), the first 13 residues of the 389-residue-long protein were missing and were replaced by a 6-residue-long N-terminal His tag. The best expression levels were obtained in Escherichia coli strain M15[pREP4] at low temperatures. Therefore, two 1-liter LB medium bacterial cultures that were initially grown at 37°C to an OD of 0.50 were transferred to 12°C. 5β-POR production was induced at an OD of 0.7 upon addition of 0.3 mM isopropyl 1-thio-β-D-galactopyranoside, and the bacterial cultures were incubated for an additional 48 h. The cells were harvested by centrifugation, the pellet was resuspended in 5 ml of lysis buffer (NaH2PO4, 300 mM NaCl, 10 mM imidazole, 1 mM (2-aminoethyl)-benzenesulfonyl fluoride hydrochloride, pH 8.0), and the suspension was sonicated. Following centrifugation, the filtered supernatant was loaded onto a 1-ml nickel-Sepharose HP affinity column (GE Healthcare). The protein was eluted with a 20-500 mM imidazole gradient prepared with the lysis buffer described above.
5β-POR was further purified by an additional gel filtration step using a Superdex 75 HiLoad 16/60 column (GE Healthcare) with a 20 mM Tris/HCl, 150 mM NaCl, pH 8.0 buffer. The protein eluted in two separate peaks. The first peak contained high molecular weight disulfide cross-linked oligomers that could be visualized in a nonreducing electrophoretic gel and that failed to crystallize. The fractions covering the low molecular weight peak were pooled and concentrated to a final concentration of 21 mg/ml for subsequent crystallization. During the last step, the buffer components were diluted to 6 mM Tris/HCl and 45 mM NaCl, pH 8.0. The overall protein yield was about 1.5 mg of pure protein from 2 liters of bacterial cell culture.
To analyze 5β-POR activity, the method described by Stuhlemmer and Kreis (24) was used and slightly modified (5). The assay contained the following in a final volume of 1000 µl: 945 µl of purified protein fraction (0.2 mg/ml), 6.4 mM NADP+, 32.1 mM glucose 6-phosphate, 42 nanokatals of glucose-6-phosphate dehydrogenase, and 0.3 mM progesterone as substrate. Heat-inactivated (10 min, 100°C) samples served as controls. The mixtures were kept in 2-ml Eppendorf tubes and incubated at 30°C and 550 rpm for 2 h prior to extraction with 1000 µl of dichloromethane. Y179F and Y179A mutants were incubated under standard conditions and under prolonged conditions (4 h). The organic phase was evaporated and the pellet dissolved in 50 µl of methanol for subsequent HPLC and TLC analysis.
Enzyme activity was calculated using the HPLC method published previously (5). The detection limit was shown to be 80 ng of pregnane-3,20-dione. In addition, the TLC system described by Herl et al. (5), which is about 10 times more sensitive than the HPLC method, was used to check enzyme activity qualitatively.
Crystallization and Data Collection-Crystals of 5β-POR with no cofactor bound were grown using the hanging drop method as described earlier (23). 1 µl of protein solution was mixed with 1 µl of reservoir solution (15% polyethylene glycol 4000, 0.1 M ammonium acetate, 0.1 M sodium citrate, pH 5.6), and the droplet was suspended over 700 µl of reservoir solution. After 2 days, octahedral crystals with lengths of about 250 µm could be isolated from droplets that were covered by a dense protein skin. The crystals were soaked for 5 min in a cryoprotection solution prepared from 80% (v/v) reservoir solution and 20% (v/v) ethylene glycol prior to being shock-frozen in liquid nitrogen.
The binary complex between 5β-POR and NADP+ was prepared by mixing 60 µl of protein solution with an 8-fold molar excess of NADP+. In an attempt to produce the ternary complex consisting of protein, substrate, and cofactor, we additionally incubated the protein solution with 1.8 mg of solid progesterone for 48 h. However, we later did not find any evidence for the presence of progesterone in the structure of the binary complex. After removal of insoluble progesterone by centrifugation, crystals of 5β-POR in complex with NADP+ were grown using the containerless batch method (25). 300 µl of high density fluorinated silicone oil (FS-1265 Fluid 10000 CST, Dow Corning, Wiesbaden, Germany) were transferred into a well of a cell culture plate and overlaid with 500 µl of regular silicone oil (silicon oil M 5, Roth, Karlsruhe, Germany). At the interface between the two liquids, a droplet was deposited that was obtained by mixing 0.4 µl of H2O, 2.2 µl of the binary 5β-POR complex solution (see above), and 1.8 µl of a crystallization solution consisting of 22.9% 2-methyl-2,4-pentanediol, 3.5% polyethylene glycol 8000, 0.05 M sodium acetate, 0.02 M CaCl2, pH 5.8. Crystals appeared within 48 h and were shock-frozen after being soaked for 5 min in a cryoprotection mixture consisting of the crystallization solution (80%, v/v) supplemented with ethylene glycol (20%, v/v).
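As a worked example of the batch droplet composition described above, the short Python sketch below computes the starting concentrations of the crystallization components after the three volumes are mixed. It assumes that the volumes are simply additive and ignores any later change in droplet volume; the variable and function names are illustrative only.

```python
# Volumes (in µl) of the three components mixed in the batch droplet (from the text).
water = 0.4
complex_solution = 2.2
crystallization_solution = 1.8
total = water + complex_solution + crystallization_solution

def diluted(stock_concentration, volume):
    """Concentration in the droplet after mixing `volume` of a stock (additive volumes assumed)."""
    return stock_concentration * volume / total

# Stock concentrations of the crystallization solution as given in the text.
final = {
    "2-methyl-2,4-pentanediol (%)": diluted(22.9, crystallization_solution),
    "polyethylene glycol 8000 (%)": diluted(3.5, crystallization_solution),
    "sodium acetate (M)": diluted(0.05, crystallization_solution),
    "CaCl2 (M)": diluted(0.02, crystallization_solution),
}

for component, value in final.items():
    print(f"{component}: {value:.4g}")
```

Under these assumptions the crystallization solution is diluted by a factor of 1.8/4.4, that is to roughly 41% of its stock concentrations.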
Highly redundant diffraction data sets of 5β-POR alone and in complex with the cofactor were collected at the BESSY synchrotron in Berlin, covering a total oscillation range of 180°. Data indexing and processing were accomplished with the program XDS (26). Diffraction data statistics are summarized in Table 1. Both crystals belonged to the tetragonal space group P4₃2₁2 with cell axes almost identical to those previously observed for the selenomethionine-derivatized crystals of 5β-POR (23).
Structure Determination and Refinement-The phases derived from a three-wavelength MAD experiment of selenomethionine-derivatized 5β-POR were of such quality that the chain could be traced entirely de novo (for examples of the experimentally phased 2.7 Å electron density see Ref. 23). Because the crystals of 5β-POR alone and in complex with the cofactor NADP+ were isomorphous to the selenomethionine-derivatized 5β-POR, the model built into the MAD-derived density map could be readily transferred to the new diffraction data and completed for missing residues, side chain placements, the bound cofactor, and solvent molecules. An identical set of reflections was used in all structures to validate model building and refinement using Rfree (27). Model building was performed using the program COOT, and σA-weighted electron density maps were used for guidance (28,29). The model was refined with the program REFMAC (30), and during the final stages of the refinement, rigid body anisotropic B-factors were introduced using TLS refinement (31). Two different rigid body groups were outlined, namely the double Rossmann fold that consists of 5β-POR residues 25-208, 250-279, and 353-368, and the substrate-binding helical domain formed by residues 209-249, 280-352, and 369-389. In both structures a number of solvent-exposed side chains, namely six in the complex structure and 10 in the structure of 5β-POR alone, lacked any electron density and were therefore modeled with atom occupancies of zero. Model building was halted after the refinement of 5β-POR alone and of 5β-POR in complex with NADP+ converged to final crystallographic R factors of 20.2 and 17.0%, respectively (Table 1). Residual density at position 298 prompted us to resequence the expression construct, thereby confirming that the amino acid at this position was glutamic acid and not glycine (5). A glutamic acid residue at this position is also present in other 5β-POR orthologs (data not shown).
At two positions we observed strong electron density in the cofactor-bound 5β-POR structure that could not be explained by water molecules. One of these could be satisfactorily modeled as a sodium ion and the other as a chloride ion that interacts with Tyr-179 in the enzyme active site.
Molecular Modeling of the Ternary Protein Cofactor Substrate Complex-Because all attempts to produce crystals of 5β-POR in complex with either progesterone, cortisol, or 4-androstene-3,17-dione remained unsuccessful regardless of whether the cofactor NADP+ was present or not, we computationally docked the substrate progesterone into the binding site. For this purpose, the protein design algorithms of the in-house program MUMBO were supplemented with a flexible ligand-handling routine to identify the energetically most favorable interaction between progesterone and the protein (32). In a first step a backbone-dependent rotamer library was used to build multiple side chain conformations into the model (33). In addition, up to several thousand random ligand orientations and positions were generated, starting from the coordinates of a manually placed ligand. In the next step, the energetically most favorable combination of side chain conformations and ligand position was identified using either the dead end elimination or the Metropolis Monte Carlo search algorithm in combination with an empirical force field. In addition to standard terms, the force field included a solvation free energy estimate and an empirical H-bond energy term (32,34,35). The docking procedure resembles that described by Leach (36). A more detailed description of the method and validation calculations are provided in supplemental Figs. 1 and 2 and supplemental Tables I and II.
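The crystallographic R factor and Rfree quoted in this section follow the standard definition R = Σ||Fobs| − |Fcalc|| / Σ|Fobs|, with Rfree evaluated only on a test set of reflections that is kept out of refinement. The Python sketch below illustrates that definition on invented amplitudes. It assumes that observed and calculated amplitudes are already on a common scale, and it is not the implementation used by REFMAC; the function names and the toy data are hypothetical.

```python
def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|) over the given reflections."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator

def r_work_and_r_free(f_obs, f_calc, free_flags):
    """Split reflections by the free flag and return (R_work, R_free)."""
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if not free]
    test = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if free]
    r_work = r_factor(*zip(*work))
    r_free = r_factor(*zip(*test))
    return r_work, r_free

# Invented structure factor amplitudes (assumption, for illustration only).
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 66.0, 38.0]
free_flags = [False, False, False, True]  # last reflection belongs to the test set

r_work, r_free = r_work_and_r_free(f_obs, f_calc, free_flags)
print(f"R_work = {r_work:.4f}, R_free = {r_free:.4f}")
```

Using the same test set for both structures, as described above, keeps the Rfree values of the two refinements comparable.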
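To make the Metropolis Monte Carlo search mentioned above more concrete, the following Python sketch shows the generic Metropolis acceptance rule applied to a one-dimensional toy energy function. It is only an illustration of the search principle: the energy function, the parameter values (`kt`, `step_size`, `steps`), and all names are assumptions, and the sketch does not reproduce the force field, rotamer handling, or ligand sampling of MUMBO.

```python
import math
import random

def metropolis_accept(delta_e, kt, rng):
    """Accept downhill moves always, uphill moves with probability exp(-dE/kT)."""
    if delta_e <= 0.0:
        return True
    return rng.random() < math.exp(-delta_e / kt)

def toy_energy(x):
    """Hypothetical one-dimensional energy with its minimum at x = 2."""
    return (x - 2.0) ** 2

def monte_carlo_search(energy, start, steps=5000, step_size=0.5, kt=0.1, seed=0):
    """Random-walk Metropolis search; returns the lowest-energy state visited."""
    rng = random.Random(seed)
    current, e_current = start, energy(start)
    best, e_best = current, e_current
    for _ in range(steps):
        trial = current + rng.uniform(-step_size, step_size)
        e_trial = energy(trial)
        if metropolis_accept(e_trial - e_current, kt, rng):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = current, e_current
    return best, e_best

best, e_best = monte_carlo_search(toy_energy, start=-5.0)
print(f"best x = {best:.3f}, energy = {e_best:.5f}")
```

In a docking setting the single coordinate would be replaced by the combined state of side chain rotamers and ligand pose, and the toy energy by the empirical force field.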

RESULTS
The Structure of 5β-POR-The structure of 5β-POR from D. lanata was solved in complex with the cofactor NADP+ at a resolution of 2.3 Å (R factor = 17.0%, Rfree = 21.3%) and with no cofactor bound at 2.4 Å resolution (R factor = 20.2%, Rfree = 24.9%) (Table 1). In both structures, no electron density was visible for the N-terminal residues 14-25. At present it cannot be decided whether the entire 25-residue-long N terminus in 5β-POR is highly flexible or whether these residues lack any ordered structure because of the deletion of wild-type residues 1-13 in the expression plasmid. Although in the cofactor-bound structure the main chain could be traced contiguously from residues 26-389, in the cofactor-free structure, the segments 68-72 and 155-158 could not be built. All residues in both structures lie within allowed regions of the Ramachandran plot, with 93.7 and 92.8% of the residues in the most favored regions (37).
5β-POR exhibits a typical SDR fold (10), and the N terminus contains the standard dinucleotide-binding double Rossmann fold (9,38) (Fig. 1). A short insertion that is present in the loop connecting strands βE and βF as well as two long insertions that occur between the sixth and the seventh β-strand (strands βF and βG) and following strand βG are typical for the extended members of the SDR family (18). The latter two insertions are almost exclusively formed by α-helices and accommodate close to 130 residues of the total 389 residues of 5β-POR. Because of the size of these insertions and because the segments that connect them to the central double Rossmann fold shape the steroid-binding pocket, it seems appropriate to divide 5β-POR into an NADPH/NADP+ cofactor-binding and a substrate-binding domain (Fig. 1B).
The closest structural relatives to 5β-POR are, according to the DALI server (39), GDP-mannose 4,6-dehydratase (GMD, PDB code 1DB3 (40)), UDP-galactose 4-epimerase (PDB code 1XEL), and dTDP-glucose 4,6-dehydratase (PDB code 1BXK), all from E. coli. These three enzymes can be superimposed onto 5β-POR with r.m.s. deviations of 2.7, 3.1, and 3.0 Å, respectively, when considering 283, 289, and 284 structurally equivalent C-α atoms. These proteins share with 5β-POR the topological arrangement of the helices in the substrate-binding domain. Sequence identities are low, however; they only range from 12 to 19%. The overall fold of 5β-POR thus confirms a previous prediction, in which 5β-PORs from Digitalis plants have been classified into the SDR family of proteins based on sequence alignments (21).
Quaternary Structure of 5β-POR-All SDR family members are oligomeric proteins. In the case of 5β-POR from Digitalis purpurea, hexamers have been observed in solution (6). For the D. lanata enzyme, which shares 96% sequence identity with the D. purpurea enzyme, we observe in gel filtration experiments and in the crystal structure the preferred formation of dimers (Fig. 1C). Gel filtration experiments also revealed a minor fraction of higher molecular weight oligomers; however, further analyses showed that they very likely resulted from an erroneous formation of disulfide bridges (see "Experimental Procedures").
The dimer in the crystal structure displays C2 point group symmetry (Fig. 1C), and the monomer surface area buried in the dimer interface is as large as 960 Å². None of the other protein packing contacts is able to explain the dimer formation observed in solution. These crystal packing contacts are much smaller, and in accordance with the space group symmetry none of these contacts gives rise to point group symmetrical dimers. The dimer interface includes the symmetry equivalents of helices αFG1 and αGE1 as well as additional loop regions. The contact surface is surprisingly hydrophilic, and a water cluster is enclosed at the center of the dimerization interface. Although such an analysis is difficult to complete, it appears that the dimerization mode in 5β-POR does not resemble any of the oligomerization modes observed in other SDRs. It should be noted that the recombinant protein used in this study lacks 13 N-terminal residues, and it cannot be ruled out that in the context of the full-length protein, 5β-POR from D. lanata forms higher oligomers, similar to those observed in the homologous D. purpurea enzyme (6).
NADP+ Binding to 5β-POR-NADP+ is bound in a syn-conformation along the double Rossmann fold (Fig. 2). The cofactor is well defined by its electron density, and its occupancy is estimated to be close to 100%. The sequence motif GXXRR at the C terminus of strand βB defines the specificity of 5β-POR for NADPH (Table 2 and Fig. 1B). In 5β-POR, the two arginine residues (Arg-63 and Arg-64) contribute a total of six ionic and polar interactions to the binding of the phosphate attached to ribose atom O-2′ of the adenine nucleotide moiety of NADP+ (Fig. 2). As is common for dinucleotide-binding proteins, the adenine ring is selectively recognized by a conserved DXXD motif, which in 5β-POR contains the sequence Asp-81-Ile-Ser-Asp. The diphosphate group of NADP+ interacts with the glycine-rich sequence GXXGXXG at the N-terminal side of αB (Fig. 1B and Table 2). As almost always observed in dinucleotide-binding proteins, a conserved water molecule bridges the interactions between the glycine-rich sequence and the diphosphate group (41,42). This water molecule is numbered water molecule 1 in 5β-POR in complex with NADP+ because it displays the highest electron density of all water molecules.
Whereas dinucleotide binding in 5β-POR is overall accomplished through well defined motifs (18), 5β-POR deviates from standard SDRs in one important aspect. In all SDRs an invariant lysine is found that binds to both the O-2′ and O-3′ of the nicotinamide ribose (11,19). In 5β-POR, however, this lysine residue is missing and is replaced by a tyrosine (Tyr-179), which interacts with the O-2′ hydroxyl group, whereas the O-3′ hydroxyl group interacts with a water molecule (Fig. 2). Because the missing lysine residue is part of the standard SDR catalytic triad (19), this difference is significant and can be expected to have an impact on the catalytic mechanism of 5β-POR.
Comparison of 5β-POR in Presence and Absence of NADP+-The overall r.m.s. deviation between the structure of 5β-POR with NADP+ bound and without cofactor is 1.4 Å for all nonhydrogen atoms. The lower average B-factor and the higher resolution achieved indicate that NADP+ binding has an overall stabilizing effect on the structure. Conformational differences between the two structures are small except for a few regions (Fig. 3). The loop segment that follows β-strand βB and that contains parts of the GXXRR motif is well defined in the cofactor-bound structure because of its multiple interactions with the phosphate group attached to the O-2′ adenine ribose atom. This segment is disordered in the absence of NADP+. Close to the position of the nicotinamide ring, we observe that the so-called substrate-binding loop, which is formed by residues 210-216, slightly differs in the two structures. The loop is, however, well ordered, even in the absence of the cofactor NADP+ and the substrate (Fig. 3). In many SDRs this region is partially or completely disordered in the absence of cofactors and/or substrate (43). Overall, NADP+ binding to 5β-POR engenders only a few specific rearrangements in the side chains that line the NADP+-binding site. For example, the side chain of Trp-106, which shields the nicotinamide ribose from the solvent in the cofactor complex, rotates into the space freed by the cofactor in the absence of the cofactor. The superposition of the two structures clearly demonstrates the absence of a hinge movement between the substrate-binding and the cofactor-binding domain triggered upon cofactor binding. It remains to be determined whether this also holds true for the formation of the substrate-bound complex.
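The r.m.s. deviation used above to compare the two structures is the root of the mean squared distance between equivalent atoms. The Python sketch below computes it for two coordinate sets under the assumption that the structures have already been superimposed and that atoms are listed in the same order; the optimal superposition step itself is not shown. The function name `rmsd` and the toy coordinates are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation (same units as input) between paired atoms.

    Assumes both coordinate lists are already superimposed and list
    equivalent atoms in the same order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Toy coordinates in Å, invented for illustration: every atom is shifted by 1 Å along z.
with_cofactor = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
without_cofactor = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (0.0, 1.0, 1.0)]

print(f"r.m.s. deviation = {rmsd(with_cofactor, without_cofactor):.2f} Å")
```

Because the squared distances are averaged, a few strongly deviating regions such as flexible loops can dominate the value even when most of the structure is nearly identical.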
The Modeled Ternary Complex, 5β-POR-NADP+-Progesterone-To gain insight into the reaction mechanism and to study the atomic determinants that allow for the specific Δ4-double bond reduction in progesterone by 5β-POR, we made numerous attempts to experimentally elucidate the structure of a ternary protein-cofactor-substrate or product complex. Unfortunately, all these efforts remained unsuccessful, and therefore, we resorted to molecular modeling instead (Fig. 4). We are confident that the model obtained with the program MUMBO reliably reproduces the orientation of the substrate during catalysis (see also Supplemental Material). The substrate-binding site in 5β-POR is elongated and narrow. Hence, the positioning of the steroid is almost exclusively dictated by spatial restrictions, and van der Waals repulsion energies are among the simplest and most reliable energy terms in any atomic force field description. In addition, of the four possible overall steroid orientations, only the one depicted in Fig. 4 is in agreement with the stereospecific addition of the hydride to atom C-5 of the steroid (see below).
Ligand docking was performed using a rigid protein backbone in combination with flexible side chain rotamers. With the exception of the side chain of Phe-153, which swings away from the steroid, we observe that the substrate progesterone can be accommodated in the binding site without any major adjustments in the side chain orientations (Fig. 4). In addition to the hydrogen bond between O-3 of progesterone and O-2′ of the nicotinamide ribose, our model of the complex suggests the formation of a further hydrogen bond at the opposite end of the steroid, which would enhance the binding affinity of the substrate. Following a peptide flip of the peptide bond that connects residues 351 and 352, a hydrogen bond could be formed between atom O-20 of progesterone and the amide nitrogen of Cys-352 (Fig. 4). The model also shows that with the exception of residues Tyr-179 and Lys-147, the nicotinamide ring of NADP+ is buried in a predominantly hydrophobic pocket, with residues Trp-106, Phe-153, Met-215, and Phe-343 shielding the end of the steroid that is reduced from the solvent (Fig. 4).
Mutational Analysis of Tyr-179-To assess whether Tyr-179, located close to O-2′ of the nicotinamide ribose of NADP+ and O-3 of the docked progesterone molecule, exerts a catalytic function, we mutated this residue to alanine and phenylalanine (Table 3). The two mutants were inactive. Taking into account the detection limits of the analytical methods used, we estimate that a nondetectable level of enzyme activity corresponds to an activity of less than one-thousandth of the activity of the wild-type enzyme.

5β-POR, a Prototypical Member of a Novel SDR Class-Upon close inspection of the crystal structures and alignment of the sequence of 5β-POR with other SDRs, it becomes apparent that 5β-POR differs from other SDRs in several important points. As discussed above, this is not true for the overall protein fold, because the closest homolog of 5β-POR in structure and sequence, namely GMD from E. coli, is a typical representative of the SDR family. Furthermore, the binding of the cofactor NADP+ in 5β-POR is similar to that in other SDRs, and all the characteristic NADPH-binding fingerprints are strictly conserved in 5β-POR (motifs I to III, Table 2 and Fig. 5).
A different picture arises when focusing on those sequence motifs that contain the SDR-specific catalytic triad residues (19). In standard SDRs, two of these, namely tyrosine and lysine, are presented by a conserved YXXXK motif (in some instances also YXXMXXXK motif, see below) and a third residue, a serine, by a conserved GXXXXXSS or SSXXXXG motif (10,18). By contrast, all these motifs are missing in 5β-POR (Fig. 5). To study the impact of the missing sequence motifs on 5β-POR, we compared the active site of 5β-POR to that of 17β-hydroxysteroid dehydrogenase type 1 (17β-HSD, PDB code 1FDT) (Fig. 4) (44). Like GMD, 17β-HSD is a representative SDR family member, but at the same time it represents a rare example of an SDR for which experimental structural data of the protein in complex with cofactor and substrate/product are available (44).

FIGURE 3. Stereo C-α representation of the superposition of 5β-POR in complex with NADP+ (in purple) and of cofactor-free 5β-POR (in slate). Both structures are highly similar (r.m.s. deviation = 1.4 Å). The regions that differ and that are discussed in the text are highlighted in orange. Trp-106, which sits on top of the nicotinamide moiety of the cofactor and only differs in its side chain conformation, is depicted together with NADP+ (yellow) as a stick model.
In 17β-HSD the catalytically important tyrosine residue from the YXXXK motif is oriented perpendicular to the nicotinamide ring and points toward the substrate molecule (Fig. 4C). In 5β-POR, a tyrosine residue is also present in the active site. The side chain of Tyr-179 is presented by the NFYYXXED motif (motif V, Table 2) and is oriented coplanar to the nicotinamide ring (Fig. 4A). Its position is topologically identical to that of the catalytically important lysine residue from the YXXXK motif in standard SDRs (Figs. 4 and 5). In typical SDRs, this lysine residue is suggested to provide binding affinity for the cofactor (45) and to assist in the reprotonation of the tyrosine residue during catalysis (19). In 5β-POR, a lysine residue (Lys-147) points from a different position into the active site. It is displayed to the "right" of the tyrosine residue from the TGXKHYXGP sequence motif (motif IV, Table 2). Lys-147 does not interact with the cofactor or substrate (Fig. 4). Interestingly, the sequence motif from which the lysine residue is displayed can be structurally aligned with the segment that in standard SDRs displays the third residue of the catalytic triad, namely Ser-142 in 17β-HSD and Thr-132 in GMD (Figs. 4 and 5) (19). In 5β-POR these residues are replaced by a glycine (Fig. 5). In the mammalian 5β-reductase, which is an AKR family member, a glutamic acid residue has been identified as an important catalytic residue in the active site (46). In 5β-POR, a glutamic acid residue is present in the NFYYXXED motif (motif V, Table 2); however, the residue points away from the active site and takes part in the network of ionic interactions initiated by an additional motif VI (see below). Clearly the active site of 5β-POR differs from that of other SDRs, as the lysine and tyrosine residues are displayed from alternative sequence motifs (motifs IV and V, Table 2) and assume alternative positions.

FIGURE 4. Active site and reaction mechanism of 5β-POR. A, stereo representation of the active site of 5β-POR in complex with NADP+. Although the crystal structure of the binary complex is shown in dark blue and yellow (NADP+), the position of the substrate progesterone (orange) is indicated as calculated with the program MUMBO (32). Adjustments in the orientations of the side chains that resulted from the docking of progesterone into the active site are shown in light gray. The placement of progesterone suggests that following a peptide flip between residues Glu-351 and Cys-352, a hydrogen bond can be formed between atom O-20 of progesterone and the NH group of Cys-352. B, putative reaction mechanism for the 5β-POR-catalyzed reaction. Because in our model O-3 of progesterone is closer to the hydroxyl group of the ribose (2.56 Å) than to that of Tyr-179 (3.56 Å), we propose that the ribose participates in the proton relay system. C, for comparison and as an example of a standard SDR, stereo representation of the active site of 17β-HSD with the experimentally observed ligand estradiol (44) (PDB code 1FDT). D, catalytic mechanism postulated for 17β-HSD (44). B and D, the adenosine ribose pyrophosphate moiety of the cofactor is abbreviated as ARPP. We reason that the distinct positioning of Tyr-179 in 5β-POR in comparison with Tyr-155 in 17β-HSD is the prime molecular determinant for the observed selectivity for the 1-4 hydrogen addition in 5β-POR versus 1-2 addition in 17β-HSD. The distance between the C-4 atom of the nicotinamide ring and the C-5 atom of progesterone is 4.0 Å in the 5β-POR model compared with 3.6 Å between C-4 of NADP+ and C-17 of estradiol in 17β-HSD.
The structure suggests an important contribution from a sixth sequence motif consisting of the sequence WSVHRP (motif VI, Table 2). Pro-203 in this motif packs against the nicotinamide ring of the cofactor, and more importantly, Arg-202 takes a central position in a network of ionic interactions that interlink residues from motif IV (His-148 and Tyr-149) and motif V (Tyr-180, Glu-184, and Asp-185) via the bridging Arg-202. The sequence conservation in this motif (Fig. 5) hints that this network is important for architecture of the active site as it very likely helps to correctly orient the active site residues on the protein framework.
A BLAST search (47) of various protein sequence databases reveals that the 5β-POR-specific motifs IV to VI occur in many different proteins and that the residues for which we were able to propose important roles are highly conserved throughout these proteins (see Fig. 5 for a selection of proteins). The proteins that can be identified are from prokaryotes as well as from eukaryotes and are present in plants and bacteria but strikingly not in animals (21). The exact function of many of these proteins is not yet clear, as they have only tentatively been classified as oxidoreductases or epimerases in databases. The conservation of the arginines in sequence motif II hints that they are all NADP+/NADPH-dependent enzymes. At present it is not clear whether these proteins are mere orthologs of 5β-POR. Nevertheless, the distinctness of the active site in combination with the occurrence of highly conserved sequence motifs for which we were able to propose defined functions opens up the possibility that these proteins define a novel class of SDRs.
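The kind of motif screening described above can be illustrated with a minimal Python sketch. It is not the BLAST procedure used in this work; it only shows how the motif strings quoted in the text (NFYYXXED for motif V, WSVHRP for motif VI, and YXXXK for standard SDRs) could be matched against a sequence, treating X as any residue. The function names and the toy sequence are hypothetical and serve illustration only; the toy sequence is not a real protein.

```python
import re

# Motif strings as quoted in the text; "X" stands for any residue.
MOTIFS = {
    "motif V": "NFYYXXED",
    "motif VI": "WSVHRP",
    "standard SDR": "YXXXK",
}


def motif_to_regex(motif):
    """Translate a motif string into a compiled regex (X -> any uppercase letter)."""
    return re.compile("".join("[A-Z]" if c == "X" else c for c in motif))


def find_motifs(sequence, motifs=MOTIFS):
    """Return 1-based start positions of each (non-overlapping) motif match."""
    return {
        name: [m.start() + 1 for m in motif_to_regex(motif).finditer(sequence)]
        for name, motif in motifs.items()
    }


# Hypothetical toy sequence for illustration only, NOT a real protein.
toy_sequence = "MGGANFYYTLEDAAWSVHRPGG"
hits = find_motifs(toy_sequence)
print(hits)
```

In this toy case the 5β-POR-type motifs V and VI are found, whereas the standard YXXXK motif is absent, which mirrors the pattern reported for 5β-POR and its relatives.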
Catalytic Mechanism of 5β-POR. In standard SDRs the reduction of the double bond in the substrate molecule is considered to occur via a two-step mechanism (19) (Fig. 4). In a first step a formal hydride ion is transferred via a nucleophilic addition from the pro-4(S) position of NADH/NADPH onto the substrate. The emerging negative charge on the substrate is neutralized in a subsequent protonation step (4). In the majority of reductases, the hydrogenation occurs as a 1-2 addition, and the hydride ion is always added to the atom in the polarized double bond that carries a partial positive charge. This is, for example, the case in 17β-HSD, which catalyzes the reduction of estrone to estradiol. Here, the hydride ion is first added to C-17 before protonation occurs at atom O-17 (Fig. 4D) (44).
FIGURE 5. 5β-POR is a representative of a novel class of SDRs. Selected segments of a multiple sequence alignment show that motifs I to III, which are typically present in all NADP+-dependent SDRs, are also conserved in 5β-POR. In contrast, motifs IV to V appear to be unique for 5β-POR and related proteins. The sequences listed beneath 5β-POR from D. lanata are from proteins with yet unknown function and were identified with a BLAST search (47). The sequences are from Oryza sativa (rice) (Uniprot database accession code A2XIC3), Arabidopsis thaliana (Q39171), Zymomonas mobilis (Q5NRC9), Gluconobacter oxydans (Q5FSG3), and Burkholderia multivorans (A0UJA5). For comparison, the corresponding sequence segments of the closest structural relative of 5β-POR, namely E. coli GMD (P14061, Protein Data Bank code 1DB3), are also listed. GMD has been structurally aligned with 5β-POR, and residues in the sequence of GMD that lie within 3.8 Å of the corresponding 5β-POR residues are listed in uppercase letters. This alignment clearly shows that whereas in GMD the SDR-specific triad of residues, namely Thr, Tyr, and Lys, is displayed from two different segments, in 5β-POR and related sequences, three entirely different sequence motifs can be identified, which contain residues that are important for catalysis. Please note the replacement of the standard SDR active site serine by threonine in GMD.
By contrast, in the 5β-POR substrate progesterone, the carbonyl group at C-3 together with the double bond between C-4 and C-5 (Δ4-double bond) constitutes a conjugated system, and therefore the reaction catalyzed by 5β-POR is similar to that of enoyl-reductases (48) (Fig. 4B). In 5β-POR the hydride ion is directed toward atom C-5 as part of a 1-4 addition mechanism (Fig. 4B). Subsequent protonation of O-3 then leads to the formation of an enol intermediate, which in turn undergoes a spontaneous keto-enol tautomerization to yield the product 5β-pregnane-3,20-dione. In analogy to enoyl-reductases, we expect that hydride transfer and protonation constitute the rate-limiting steps in 5β-POR (45). Because the initial hydride transfer step also defines the stereo configuration at the newly formed stereo center (49), the nucleophilic addition in 5β-POR has to occur from the β-face of the steroid (5,21).
In SDRs the conserved active site tyrosine residue is considered the central catalytic residue (19,45). It is generally assumed that the function of this residue is both to polarize the double bond that becomes reduced and to protonate the emerging product. In 5β-POR, however, the tyrosine that is present in the active site is positioned differently. It is hydrogen bonded to atom O-2′ of the nicotinamide ribose, which in turn interacts with atom O-3 of progesterone in our model of the ternary complex (Fig. 4). When Tyr-179 is mutated to either alanine or phenylalanine, the enzymatic activity is completely abolished (Table 3). This shows that the hydroxyl group is of utmost importance for catalysis and suggests that Tyr-179 from the NFYYXXED motif in 5β-POR takes over a role similar to that of the catalytically important tyrosine from the YXXXK motif in standard SDRs. It is not immediately obvious why in 5β-POR the tyrosine residue is positioned differently from that in standard SDRs. We propose that this becomes necessary because 5β-POR catalyzes a 1-4 hydrogen addition and not a 1-2 addition as in standard SDRs such as 17β-HSD (48). For a 1-4 addition to occur, the separation between the tyrosine residue that interacts with atom 1 of the substrate and the nicotinamide ring that donates a hydride ion to atom 4 has to increase. Hence, it is necessary to position the hydroxyl group of the catalytic tyrosine residue in the active site further away from the C-4 atom of the nicotinamide ring.
This proposal is corroborated by the observation that in sugar epimerases, where the reaction is considered to proceed via a 1-2 hydrogen subtraction reaction by means of a temporary oxidation of a hydroxyl group to a keto-intermediate, the standard SDR orientation of the residues lysine and tyrosine is adopted, although all other substrate-protein interactions differ considerably (42). On the other hand, in enoyl-reductases (48), which catalyze a 1-4 hydrogen addition reaction similarly to 5β-POR, the active site tyrosine side chain is also repositioned with respect to the lysine residue and the nicotinamide ring in the active site because the conserved YXXXK motif is replaced by a YXXMXXXK motif. As a result, the tyrosine moves by the distance of one complete helical turn along the helix from which it is displayed. In dienoyl-reductases, which catalyze a 1-6 hydrogen addition, the tyrosine residue is moved to a different framework segment altogether (50). Why in 5β-POR yet another solution for the active site has been adopted when compared with other enoyl-reductases is again not obvious. Possibly, this is linked to the fact that in 5β-POR the conjugated double bonds are part of a steroid ring system, and therefore the orientation of the two double bonds is conformationally fixed.
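The shift of the tyrosine between the YXXXK and YXXMXXXK motifs can be estimated with a short back-of-the-envelope calculation, sketched here in Python. The sketch assumes an ideal α-helix with textbook geometry (3.6 residues per turn, 1.5 Å rise per residue) and assumes that the lysine stays in place; neither the helper names nor these idealized values are taken from the structures discussed here.

```python
# Ideal alpha-helix geometry (textbook values; assumed to apply here).
RESIDUES_PER_TURN = 3.6
RISE_PER_RESIDUE = 1.5  # in Å


def tyr_lys_spacing(motif):
    """Number of sequence positions separating Tyr (Y) and Lys (K) in a motif."""
    return motif.index("K") - motif.index("Y")


def tyrosine_shift(standard="YXXXK", variant="YXXMXXXK"):
    """Shift of the Tyr relative to the Lys when one motif replaces the other."""
    shift = tyr_lys_spacing(variant) - tyr_lys_spacing(standard)
    return {
        "residues": shift,
        "turns": shift / RESIDUES_PER_TURN,
        "rise": shift * RISE_PER_RESIDUE,  # axial displacement in Å
    }


result = tyrosine_shift()
print(result)
```

Under these assumptions the tyrosine moves by three residues relative to the lysine, corresponding to roughly 0.8 turns or about 4.5 Å along the helix axis, which is in line with the description of a shift by approximately one helical turn.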
Taken together, the data presented here in combination with previous observations suggest that differences in the positioning of the tyrosine residue in the active site with respect to the nicotinamide ring of the cofactor are key to understanding the specificity and selectivity of the 1-2, 1-4, and 1-6 hydrogen addition and subtraction reactions catalyzed by SDRs. It is obvious that a detailed understanding of the architecture of the active site of these enzymes will prove extremely valuable if one aims at redesigning the specificity of SDRs.