The X-ray Structure of the PurR-Guanine-purF Operator Complex Reveals the Contributions of Complementary Electrostatic Surfaces and a Water-mediated Hydrogen Bond to Corepressor Specificity and Binding Affinity*

The purine repressor, PurR, is the master regulatory protein of de novo purine nucleotide biosynthesis in Escherichia coli. This dimeric transcription factor is activated to bind to cognate DNA operator sites by initially binding either of its physiologically relevant, high affinity corepressors, hypoxanthine (K d = 9.3 μM) or guanine (K d = 1.5 μM). Here, we report the 2.5-Å crystal structure of the PurR-guanine-purF operator ternary complex and complete the atomic description of 6-oxopurine-induced repression by PurR. As anticipated, the structure of the PurR-guanine-purF operator complex is isomorphous to the PurR-hypoxanthine-purF operator complex, and their protein-DNA and protein-corepressor interactions are nearly identical. The former finding confirms the use of an identical allosteric DNA-binding mechanism whereby corepressor binding 40 Å from the DNA-binding domain juxtaposes the hinge regions of each monomer, thus favoring the formation and insertion of the critical minor groove-binding hinge helices. Strikingly, the higher binding affinity of guanine for PurR and the ability of PurR to discriminate against 2-oxopurines do not result from direct protein-ligand interactions, but rather from a water-mediated contact with the exocyclic N-2 of guanine, which dictates the presence of a donor group on the corepressor, and the better electrostatic complementarity of the guanine base and the corepressor-binding pocket.

The purine repressor, PurR, is a 341-amino acid residue, dimeric DNA-binding protein from Escherichia coli. In E. coli, PurR functions as the master regulatory protein for de novo purine nucleotide biosynthesis and, to a lesser extent, de novo pyrimidine nucleotide biosynthesis and salvage, by repressing transcription of over 18 genes (1). In addition, PurR regulates the transcription of several genes encoding proteins necessary for polyamine and nucleotide biosynthesis (2,3). Finally, PurR is autoregulated (4,5).
PurR belongs to the highly homologous lactose repressor (LacI) family of transcription regulators (6,7). Nearly all LacI members function by binding to similar pseudo-palindromic operator sites typically 16-18 base pairs long. Structurally, LacI members can be divided into two functional domains: an N-terminal DNA-binding domain (~60 residues) and a C-terminal oligomerization effector-binding domain (>250 residues). Effector binding serves as a signal of the metabolic state of the cell, and whereas most LacI proteins bind with high affinity to their operator sites only in the absence of effector binding, the specific interaction of PurR with its DNA operator sites is dependent on corepressor binding. PurR is unique in having two physiologically relevant corepressors, hypoxanthine, to which it binds with a K d of 9.3 μM, and guanine, to which it binds with higher affinity (K d = 1.5 μM) (8).
To describe the interactions between PurR and its corepressors and operator sites, we determined the structure of a PurR-hypoxanthine-purF operator complex (9). The structure revealed that PurR has a bipartite structure: a C-terminal corepressor-binding domain (CBD), which spans residues 61-341 and is responsible for dimerization and corepressor binding and specificity, and an N-terminal DNA-binding domain, which spans residues 1-60. The latter domain is composed of a classical helix-turn-helix motif that is closed off by a loop and another helix. By engaging in a series of direct and water-mediated contacts, this three-helix bundle provides the major groove binding specificity of PurR. It is followed by another loop and a fourth helix, the hinge helix. The hinge helix, with a 2-fold related mate, inserts into the DNA minor groove and makes additional specific contacts. Furthermore, hinge helix insertion kinks the purF operator by permitting the interdigitation of the side chains of Leu 54 (and its 2-fold mate, Leu 54′) between the central CpG base pair step of the operator (9).
The CBD is oblate in shape and consists of two topologically similar α/β-subdomains (the N- and C-terminal subdomains) that are connected by three crossover connections. The CBD is structurally similar to the bacterial periplasmic binding proteins (10-12). As observed in the periplasmic binding proteins, PurR binds hypoxanthine in the cleft between the CBD subdomains through a series of polar, nonpolar, and aromatic interactions (9). Direct and water-mediated contacts are provided by the side chains of Tyr 73, Phe 74, Arg 190, Thr 192, Phe 221, and Asp 275. The ability of PurR to utilize hypoxanthine or guanine, but not adenine and xanthine, indicates the critical importance of the chemical nature of the exocyclic 2- and 6-groups in binding specificity. The key PurR residue that specifies an oxygen at the purine exocyclic 6-position is Arg 190. The guanidinium group of Arg 190 donates two hydrogen bonds to the O-6 acceptor of hypoxanthine (9). Thus, the side chain donors of Arg 190 dictate the requirement for hydrogen bond acceptors at this purine position. However, the mechanism of discrimination at the exocyclic 2-group was unclear, as was the stereochemical basis by which PurR binds guanine more tightly than hypoxanthine (8).

* This work was supported by United States Public Health Service Grants GM49244 (to R. G. B.) and GM24658 (to H. Z.) and a National Defense Science and Engineering Graduate Fellowship in Biosciences (to M. A. S.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

To begin the structural elucidation of the allosteric mechanism of PurR, which transduces the signal of corepressor binding by the CBD to effect DNA binding by the distal DNA-binding domain, the structure of the corepressor-free CBD was determined (13). From this and the hypoxanthine-bound PurR structure, the role of purine corepressor binding was discerned. Upon binding hypoxanthine, there is a dramatic rearrangement in the tertiary and quaternary structures of the CBD N-terminal subdomain that results in the juxtaposition of the two hinge regions of PurR, which are over 40 Å away (13). Corepressor binding is essential for this juxtaposition as it shifts the equilibrium from the open form of PurR, in which the N- and C-terminal subdomains of the CBD are rotated far apart, to the closed conformation. Although likely, it was not known if guanine binding causes the identical structural changes that result in this allosteric shift.
To address the issues of the corepressor specificity and the higher affinity of PurR for guanine and its possible consequences on the intramolecular signaling pathway and allostery of this protein, we have determined the structure of a PurR-guanine-purF operator complex at 2.5-Å resolution. As expected, the PurR-purF operator contacts and DNA bending are nearly identical to those seen in the PurR-hypoxanthine-purF operator complex (9). Strikingly, the structure reveals that the higher affinity of guanine for PurR (over hypoxanthine) and the discrimination against xanthine as a corepressor arise from 1) a corepressor-water-protein hydrogen bonding network, which involves the exocyclic N-2 hydrogen bond donor of guanine and the hydrogen bond acceptors, O-ε of Glu 222 and the backbone carbonyl oxygen of Phe 221, and 2) the intrinsic negative electrostatic environment of the corepressor-binding pocket, which favors the more positive guanine as the corepressor.

EXPERIMENTAL PROCEDURES
Crystallization and Data Collection-PurR was overexpressed in E. coli and purified by DE53 cellulose and heparin-agarose chromatography as described (14). Crystals of the PurR-guanine-purF operator complex were grown by the hanging-drop vapor diffusion method, which was used previously to crystallize the PurR-hypoxanthine-purF operator complex (15). Briefly, protein at a concentration of 0.25-0.50 mM was saturated with guanine and mixed 1:1 (v/v) with a 0.50 mM solution of the 16-base pair purF operator, which has an additional 5′-nucleoside overhang on both strands, the identity of which is irrelevant to crystallization. The protein/corepressor/DNA solution was mixed 1:1 with the reservoir solution, which was 25% polyethylene glycol 4000, 0.40 M ammonium sulfate, 25 mM hexamminecobalt chloride, and 0.10 M ammonium phosphate, pH 7.5. Crystals initially grow as thin two-dimensional plates and, within a period of 2 weeks to several months, melt and are replaced by large rhombohedral crystals (typically 0.6 × 0.2 × 0.3 mm). The PurR-guanine-purF operator crystals are isomorphous to the PurR-hypoxanthine-purF operator crystals and take the space group C222₁, with unit cell dimensions of a = 176.0, b = 95.08, and c = 81.51 Å. There is one PurR monomer, one guanine molecule, and a purF operator half-site in the asymmetric unit.
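The cell dimensions and asymmetric unit contents quoted above are enough for a rough packing estimate. The following is a minimal sketch, not part of the original analysis: it computes the orthorhombic cell volume, the volume per asymmetric unit (the space group has eight general positions), and an approximate Matthews coefficient. The per-residue and per-nucleotide masses are rule-of-thumb averages assumed here for illustration, and the function names are hypothetical.

```python
# Unit cell edges in angstroms, taken from the text (orthorhombic, all angles 90 degrees).
A, B, C = 176.0, 95.08, 81.51

# The C-centered orthorhombic space group reported in the text has 8 general
# positions (4 rotations x 2 from C-centering), i.e. 8 asymmetric units per cell.
ASU_PER_CELL = 8

# ASSUMPTIONS (not from the text): rule-of-thumb average masses in daltons.
DA_PER_RESIDUE = 110.0
DA_PER_NUCLEOTIDE = 330.0


def cell_volume(a, b, c):
    """Volume of an orthorhombic cell in cubic angstroms."""
    return a * b * c


def asu_volume(a, b, c, n_asu=ASU_PER_CELL):
    """Volume available to one asymmetric unit in cubic angstroms."""
    return cell_volume(a, b, c) / n_asu


def matthews_coefficient(v_asu, mass_da):
    """Crystal volume per unit of macromolecular mass (cubic angstroms per dalton)."""
    return v_asu / mass_da


# One 341-residue monomer plus an operator half-site. The 16-base pair operator
# with one overhanging nucleoside per strand has 34 nucleotides, so half is 17.
approx_mass = 341 * DA_PER_RESIDUE + 17 * DA_PER_NUCLEOTIDE

v_cell = cell_volume(A, B, C)
v_asu = asu_volume(A, B, C)
v_m = matthews_coefficient(v_asu, approx_mass)
print(f"cell volume = {v_cell:.0f} A^3, per asymmetric unit = {v_asu:.0f} A^3")
print(f"approximate Matthews coefficient = {v_m:.2f} A^3/Da (mass is an estimate)")
```

Because the mass is only an estimate, the resulting coefficient should be read as an order-of-magnitude check on the stated asymmetric unit contents rather than as a reported value.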
X-ray intensity data were collected at room temperature with an Area Detector Systems Corp. multiwire area detector (16) and a Rigaku RU200-H rotating anode operating at 40 kV and 150 mA. The data from two crystals were processed and merged with the software provided by Area Detector Systems Corp. The R merge of the data to 2.5 Å is 6.29%, and the data are 87% complete from 10.0 to 2.5 Å and 78% complete from 2.7 to 2.5 Å. The redundancy of the data is 3-fold. The I/σ(I) is 7.4 for the entire data set and 1.4 for the data in the resolution shell between 2.7 and 2.5 Å.
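The merging statistic quoted above is not defined in the text. A minimal sketch follows, assuming the conventional definition, R merge = Σ|I_i − ⟨I⟩| / Σ I_i, summed over repeated measurements of each unique reflection. Whether the Area Detector Systems Corp. software uses exactly this form is an assumption, and the function name, input layout, and sample intensities are purely illustrative.

```python
from collections import defaultdict


def r_merge(observations):
    """Conventional R-merge for a list of (hkl, intensity) measurements.

    `hkl` is assumed to be already reduced to the unique (asymmetric-unit)
    index, so that symmetry-equivalent measurements share the same key.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        if len(intensities) < 2:
            # A reflection measured once carries no agreement information.
            continue
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean) for i in intensities)
        denominator += sum(intensities)

    if denominator == 0.0:
        raise ValueError("no multiply measured reflections with nonzero intensity")
    return numerator / denominator


# Made-up intensities, used only to exercise the function.
demo = [
    ((1, 0, 0), 100.0),
    ((1, 0, 0), 110.0),
    ((0, 1, 0), 50.0),
    ((0, 1, 0), 50.0),
    ((0, 0, 2), 75.0),  # measured once, so it is skipped
]
demo_r = r_merge(demo)
print(f"R-merge for the demo data = {demo_r:.4f}")
```

Singly measured reflections are skipped here by design choice; some programs treat them differently, which is one reason quoted values are not always directly comparable between software packages.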
Refinement-The previously determined structure of the PurR-hypoxanthine-purF operator complex minus the hypoxanthine and solvent molecules was used as the starting model for refinement (9). Rigid body refinement was carried out on the model followed by XYZ and B factor refinement as implemented in TNT (17). The current model, refined from 10.0 to 2.5 Å, has an R factor of 18.0% and displays excellent stereochemistry with only two residues, Ser 124 and Asp 275, in the unfavorable region of the Ramachandran plot (18). As observed previously, the unfavorable φ/ψ angles of Asp 275 are necessary for its interaction with the N-9 atom of the corepressor (9). Although not directly in contact with the corepressor, Ser 124 is located directly above the corepressor-binding pocket and takes this conformation to avoid steric clash with the side chain of Arg 190. The root mean square deviations of bond lengths and bond angles are 0.02 Å and 2.23°, respectively. An F o − F c omit map, in which the guanine base, residues 221-222, and Wat 1 were omitted from the model and followed by 20 cycles of positional and thermal parameter refinement, is shown in Fig. 1.

FIG. 1. The map was calculated using structure factors from a structure in which guanine, Wat 1, Phe 221, and Glu 222 were removed, and an additional 20 cycles of refinement were carried out using TNT (17). The contour level of the map is 3.5 σ. The hydrogen bonds formed between the water molecule (Wat 1) and the carbonyl oxygen of Phe 221 and the carboxyl side chain oxygen of Glu 222 are shown. Such a donor-acceptor arrangement leaves only hydrogen bond acceptors and results in a water-specific interaction with the exocyclic N-2 of guanine and discrimination against O-2 of xanthine.

RESULTS AND DISCUSSION
The current model of the PurR-guanine-purF operator complex includes residues 3-340 of the PurR monomer, one guanine molecule, a purF operator half-site, and 20 solvent molecules (Fig. 2). The atomic structures of PurR and the purF operator are essentially identical to those seen in the PurR-hypoxanthine-purF operator complex and display a root mean square deviation of 0.25 Å for all atoms except solvent (9). Therefore, guanine and hypoxanthine induce the same DNA binding conformation of PurR. The N-terminal DNA-binding domain (residues 3-60) consists of a helix-turn-helix-loop-helix motif, which is responsible for DNA major groove binding specificity, and a hinge helix, which binds, with its 2-fold related mate, in the minor groove of the DNA. The insertion of two α-helices into the narrow minor groove is made possible by the partial interdigitation of the hinge helix residue Leu 54 (and Leu 54′) within the central CpG base pair step. DNA bending, which was calculated with CURVES (19) and was exclusive of the first and last base pairs of the 16-base pair operator because of their poor electron density, revealed a 55° kink in the PurR-guanine-purF operator complex resulting from this side chain partial insertion. This bend is essentially identical to the 54° bend observed in the structure of the PurR-hypoxanthine-purF operator complex (9,20). Moreover, no significant changes in any base pair or base pair step parameter were observed.
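The root mean square deviation quoted above compares equivalent atoms in the two complexes. A minimal sketch of that calculation is given below, assuming the two coordinate sets are already in a common frame (plausible for isomorphous crystals, but an assumption here) and listed in the same atom order. No superposition step is performed, the function name is hypothetical, and the coordinates are made up solely to exercise the code.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally ordered coordinate lists.

    Each list holds (x, y, z) tuples in angstroms. The sets are compared as
    given; no least-squares superposition is applied.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")

    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Two made-up two-atom "structures", displaced by 0.3 and 0.4 angstroms.
model_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
model_b = [(0.0, 0.0, 0.3), (1.0, 0.4, 0.0)]
demo_rmsd = rmsd(model_a, model_b)
print(f"RMSD = {demo_rmsd:.3f} A")
```

For structures from non-isomorphous crystals, an optimal superposition would normally precede this step; it is omitted here to keep the sketch short.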
The CBD consists of two subdomains: the CBD N-terminal subdomain, which consists of a core of six β-strands (βA-βE and βJ) surrounded by four α-helices (I-III and IX), and the CBD C-terminal subdomain, which consists of a core of five β-strands (βF-βI and βK) surrounded by five α-helices (IV-VIII) (9). Three crossover peptides connect the two subdomains and serve as a floor to the corepressor, which is bound in the cleft between the two CBD subdomains. As seen in the periplasmic binding proteins, these peptidic linkages act as a hinge to allow movements of subdomains upon ligand association and dissociation (9-12). All contacts observed between PurR and the corepressor hypoxanthine in the PurR-hypoxanthine-purF operator complex are also present in the PurR-guanine-purF operator complex. Stacking interactions occur between guanine and the side chains of Tyr 73 and Phe 221, which sandwich the corepressor, and Phe 74. Hydrogen bonds are provided by Thr 192, Asp 275, and Arg 190. O-γ of Thr 192 hydrogen bonds with N-7 of the guanine ring (2.73 Å, O-γ-N-7) and also makes van der Waals contact with the corepressor C-8 via its side chain methyl group. Asp 275 makes a hydrogen bond with N-9 of the purine (2.92 Å, O-δ2-N-9). The importance of Arg 190 in corepressor binding specificity at the exocyclic 6-position was revealed in the PurR-hypoxanthine-purF operator complex (9). In the PurR-guanine-purF operator complex, Arg 190 also specifies an oxygen at the exocyclic 6-position by providing two hydrogen bonds to this exocyclic oxygen from its side chain N-ε and NH-1 atoms (3.03 Å, N-ε-O-6; and 2.72 Å, NH-1-O-6). Arg 190 contributes further to corepressor binding affinity by hydrogen bonding to a water molecule that is simultaneously hydrogen-bonded to N-1 of the guanine ring (3.15 Å, H2O-N-1; and 2.72 Å, H2O-NH-2).
The essentially identical binding observed between PurR and hypoxanthine and guanine in the PurR-hypoxanthine-purF operator and PurR-guanine-purF operator complexes and the identical corepressor-mediated conformations of both complexes reveal how PurR utilizes two related molecules as corepressors. However, these interactions do not explain the roughly 6-fold higher affinity of guanine for PurR (K d = 1.5 μM versus 9.3 μM for hypoxanthine) or the mechanism by which PurR discriminates at the purine exocyclic 2-position, i.e. against xanthine.
On the basis of the location of the side chain of Glu 222 in the PurR-hypoxanthine-purF operator complex (9), we anticipated that upon guanine binding by PurR, this side chain would swing into the corepressor-binding pocket, eject a tightly bound water molecule (Wat 1), and make a direct hydrogen bond with the exocyclic N-2 atom. However, in the PurR-guanine-purF operator complex, the position of the side chain of Glu 222 has changed little, and the water molecule is still present, although slightly shifted in position (Fig. 1). Rather, the water molecule is now positioned perfectly to hydrogen bond to the exocyclic N-2 atom of guanine. Both of its hydrogens are donated to protein acceptors, the backbone carbonyl oxygen of Phe 221 and a side chain carboxyl oxygen of Glu 222 (Fig. 1). This leaves only the two acceptor lone pairs of this water molecule, one of which interacts with the guanine N-2 atom and presumably the other with a hydrogen bond donor from the bulk solvent. Such a hydrogen bonding network would disfavor the binding of xanthine, which has an exocyclic acceptor atom (O-2) at this position. Consistent with this hypothesis is the in vitro binding of xanthine to PurR, which is ~200-fold weaker than guanine binding (data not shown). Remarkably, this water molecule is found not only in the PurR-guanine-purF operator and the PurR-hypoxanthine-purF operator complexes, but also in the corepressor-free form of PurR, where the identical water-protein hydrogen bonds are made (13). This demonstrates that this tightly bound water molecule acts as an extension of the protein.

FIG. 2. Structure of the PurR-guanine-purF operator complex.
Labeled are the helix-turn-helix-loop-helix (HTHLH) domain, which is responsible for major groove specificity (colored red); the hinge helix motif, which mediates minor groove specificity (colored blue); the N-terminal subdomain of the corepressor-binding domain, residues 61-159 and 294-310 (colored yellow), which is critical in intramolecular signaling; and the C-terminal subdomain of the corepressor-binding domain (colored green), residues 160-298 and 311-341, which imparts dimer stability. The guanine corepressor is shown as balls and sticks. The figure was generated with MOLSCRIPT (39).
The importance of water-mediated contacts in ligand binding specificity is now well documented. Notable examples include members of the periplasmic binding protein family such as the arabinose-binding protein (21,22), lysine/arginine/ornithine-binding protein (23), and oligopeptide-binding protein (OppA) (24); enzymes, e.g. α-chymotrypsin (25); and DNA-binding proteins as exemplified in the TrpR-tryptophan-trp operator complex (26,27). In the arabinose-binding protein, two hydrogen-bonded water molecules, located in the binding pocket, aid in L-arabinose binding while creating an unfavorable interaction with the C-5 methyl group of D-fucose, thus explaining the low affinity of the latter. The near equal affinity of the arabinose-binding protein for D-galactose is explained by the replacement of one of these water molecules with the D-galactose -CH2OH group (21,22). A similar situation is observed in the lysine/arginine/ornithine-binding protein, where water molecules assist in the binding of lysine, ornithine, and arginine. However, one of the water molecules must be displaced to permit binding of the larger arginine ligand (23). In contrast, OppA provides an example of the use of solvent in decreasing ligand binding specificity through the accommodation of multiple substrates with different peptide side chains in voluminous hydrated cavities from which water molecules are readily gained or lost (24). In the structure of α-chymotrypsin complexed with a turkey ovomucoid inhibitor, it was demonstrated that a water molecule, also present in the apo structure, mediated an ionic interaction that was critical for inhibitor binding specificity (25). Thus, as in the case of PurR, this water molecule can be considered as a constitutive element of the protein. Finally, the TrpR-tryptophan-trp operator complex reveals quite dramatically the use of water molecules to nearly completely specify protein-DNA binding specificity (26,27).
Thus, through its small molecular size, which allows it to pack around ligands of different size and shape, and its ability to provide hydrogen bond donors as well as acceptors, water provides a key element in the formation of a wide variety of protein-ligand complexes.
In addition to the direct water-mediated discrimination against xanthine, an examination of the electrostatic potential of the PurR corepressor-binding pocket and its surroundings reveals a second possible source of unfavorable xanthine binding as well as the chemical basis of the more favorable binding of guanine over hypoxanthine. PurR displays a negative electrostatic potential surface near the exocyclic 2-position of the guanine ring (Fig. 3). Glu 220, Glu 222, and Glu 70 and Glu 70′ from the other subunit contribute to the negative potential of the binding pocket of PurR. This surface is complemented by N-2 of guanine, which imparts a significant positive potential to this base. Hypoxanthine, which lacks any exocyclic group at the ring 2-position, is nearly neutral, but is still weakly positive. In contrast to the physiological corepressors, xanthine possesses a negative electrostatic potential at its exocyclic O-2 position, which likely contributes further to its very low affinity (Fig. 4). A similar electrostatic complementarity is observed between the exocyclic N-2 atom of the guanine base of GDP and ARF-1 (28), where the N-2 atom is enclosed in a negative pocket in the protein. This suggests that such electrostatic complementarity may play a significant role in purine-protein binding and recognition. In addition to the patch of negative electrostatic potential near the exocyclic 2-atom of guanine, there is another locus of negative potential within the corepressor-binding pocket, which is contributed by Glu 248 and Asp 275 and complements the positive potential of the N-9 atom of the purine rings of guanine and hypoxanthine (Fig. 4).

FIG. 3. Shown is a cut-away surface representation revealing the electronegative potential of the PurR corepressor-binding pocket, which encompasses the electropositive exocyclic N-2 of the guanine. This figure was generated with GRASP (40).

Beyond its contribution to binding, the charged environment of PurR could serve to steer its guanine and hypoxanthine corepressors into their bound positions within the binding pocket. Indeed, electrostatic focusing by narrow clefts appears to be a general phenomenon, having been perhaps best documented for copper,zinc-superoxide dismutase (29) and trypsin (30). Thus, a tight water-mediated hydrogen bond and the negative electrostatic potential of the corepressor-binding pocket, which complements guanine better than hypoxanthine, provide a thermodynamic rationale for the increased affinity of PurR for guanine over hypoxanthine while allowing PurR to maintain the same activated conformation for both corepressors. Combined, these interactions are consistent with the small favorable free energy change (ΔΔG ≈ −1 kcal/mol, as calculated by the relationship ΔΔG = −RT ln(K d(Hyp)/K d(Gua))) observed for PurR-guanine binding over PurR-hypoxanthine binding.
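The free energy difference quoted above can be checked from the two dissociation constants given earlier (1.5 μM for guanine, 9.3 μM for hypoxanthine). The sketch below uses the convention ΔΔG = RT ln(K d(Gua)/K d(Hyp)), in which a negative value favors guanine. The temperature of 298.15 K is an assumption, since the measurement temperature is not stated in the text, and the function name is illustrative.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def delta_delta_g(kd_ligand, kd_reference, temperature_k=298.15):
    """Binding free energy of `ligand` relative to `reference`, in kcal/mol.

    Uses delta-G(binding) = RT ln(Kd), so a negative result means the ligand
    binds more tightly than the reference. The default temperature is an
    ASSUMPTION (25 degrees C), not a value from the text.
    """
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_ligand / kd_reference)


KD_GUANINE = 1.5e-6       # molar, from the text
KD_HYPOXANTHINE = 9.3e-6  # molar, from the text

fold = KD_HYPOXANTHINE / KD_GUANINE
ddg = delta_delta_g(KD_GUANINE, KD_HYPOXANTHINE)
print(f"affinity ratio = {fold:.1f}-fold, ddG = {ddg:.2f} kcal/mol")
```

Under these assumptions the ratio of the two constants is about 6, and the result is close to −1.1 kcal/mol, in line with the approximate value quoted in the text. A few degrees' difference in the assumed temperature changes the result only in the second decimal place.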
PurR is unique in being the only repressor with two physiologically relevant corepressors for which the three-dimensional structures of both corepressor-bound forms, as well as of the corepressor-free form, have been determined at atomic resolution (9,13). Structures are known for the corepressor-free and -bound forms of only two other repressors: the tryptophan repressor (TrpR) (27, 31-33) and the methionine repressor (MetJ) (34-36). The TrpR and MetJ repressors are similar in that they both bind their corepressors within their DNA-binding domains. In the case of TrpR, the L-tryptophan corepressor reorients the flexible DNA-binding helix-turn-helix reading heads to allow them to fit into successive major grooves. Moreover, the L-tryptophan makes direct contacts with the DNA phosphate backbone. For MetJ, the binding of its corepressor, S-adenosylmethionine, which is located ~12 Å from the DNA backbone, does not appear to result in any conformational change or structural rearrangement of the MetJ protein (34,35). Rather, the electrostatic potential of MetJ becomes decidedly more positive and might effect, albeit indirectly, the high affinity DNA binding (36). Clearly, PurR is different and binds its corepressors within a corepressor-binding pocket, which is at least 40 Å from the DNA-binding domain. Corepressor binding activates DNA binding allosterically by repositioning the hinge regions of each PurR monomer such that they may interact, undergo a coil-to-helix transition (37, 38), and bind the DNA minor groove.
In summary, the structure of the PurR-guanine-purF operator complex has established that binding of either hypoxanthine or guanine induces the same activated conformation of PurR, which can then bind the pur regulon operators with high affinity. The structure also reveals the likely basis for the higher affinity of PurR for its guanine corepressor versus hypoxanthine and the stereochemical mechanism of xanthine discrimination: 1) water-mediated readout of the exocyclic 2-position of the purine and 2) better protein-corepressor electrostatic complementarity, notably about the exocyclic 2-position of the purine ring.