Structural Analysis of the DNA-binding Domain of the Helicobacter pylori Response Regulator ArsR

The Helicobacter pylori ArsS-ArsR two-component signal transduction system, comprised of a sensor histidine kinase (ArsS) and a response regulator (ArsR), allows the bacteria to regulate gene expression in response to acidic pH. We expressed and purified the full-length ArsR protein and the DNA-binding domain of ArsR (ArsR-DBD), and we analyzed the tertiary structure of the ArsR-DBD using solution nuclear magnetic resonance (NMR) methods. Both the full-length ArsR and the ArsR-DBD behaved as monomers in size exclusion chromatography experiments. The structure of ArsR-DBD consists of an N-terminal four-stranded β-sheet, a helical core, and a C-terminal β-hairpin. The overall tertiary fold of the ArsR-DBD is most closely related to DBD structures of the OmpR/PhoB subfamily of bacterial response regulators. However, the orientation of the N-terminal β-sheet with respect to the rest of the DNA-binding domain is substantially different in ArsR compared with the orientation in related response regulators. Molecular modeling of an ArsR-DBD-DNA complex permits identification of protein elements that are predicted to bind target DNA sequences and thereby regulate gene transcription in H. pylori.

transit of H. pylori through the gastric lumen, the bacterium thrives in the gastric mucus layer, where the pH is thought to be 4.5-6.5 (11). Within the mucus layer, the bacterium is still subject to considerable pH fluctuations as a consequence of the changing luminal pH. Therefore, H. pylori has evolved mechanisms to survive severe acid shock and grow under moderately acidic conditions. One of the important mechanisms by which H. pylori responds to acidic pH involves regulation of gene expression through a two-component signal transduction system (12-20).
Two-component systems (TCS) are stimulus-response coupling mechanisms that are used primarily by prokaryotic organisms to regulate cellular functions in response to changing environmental conditions (21). TCSs are composed of a sensor histidine kinase protein and a response regulator (RR) protein.
In a typical TCS, the histidine kinase detects and monitors an environmental condition and transmits information via a phosphotransfer event to the cognate RR. Most RRs consist of two domains, an N-terminal regulatory domain and a C-terminal effector domain. Phosphorylation of the RR induces a conformational change in the regulatory domain, which results in activation of the effector domain. The majority of RRs are transcription factors with DNA binding effector domains (DBDs) (22). In contrast to the Escherichia coli genome, which contains 62 open reading frames (ORFs) encoding TCS component proteins (23), the H. pylori genome contains only 10 ORFs encoding TCS-component proteins (24). This paucity of regulatory genes in the H. pylori genome could reflect the pathogen's tight adaptation to its restricted ecological niche and also the lack of competition from other microorganisms in the acidic gastric environment (12,25).
Several recent studies reported that an H. pylori TCS, ArsS-ArsR (acid-responsive signaling sensor/response regulator), responds to acidic conditions (12-20). This TCS consists of the histidine kinase protein, ArsS, and the RR protein, ArsR. Evidence supporting a cognate relationship between ArsS and ArsR includes immediate proximity of the two genes in the H. pylori chromosome and the demonstration that the two purified proteins can participate in a phosphotransfer reaction in vitro (12). H. pylori ArsS null mutants are viable, but such mutants are impaired in the ability to grow in low pH in vitro, and these mutant strains are unable to colonize mice (14,15). Attempts to generate ArsR null mutants have been unsuccessful, which suggests that ArsR is essential for H. pylori viability (12).
One approach for identifying members of the ArsRS regulon has been to isolate DNA sequences that bind to ArsR. This approach resulted in the identification of two operons (designated HP1408-1412 and HP427-423 in H. pylori strain 26695) and a family of paralogous genes (exemplified by HP0119) that are regulated by the ArsRS system (13). These genes encode proteins of unknown function. As another approach to identify members of the ArsRS regulon, whole-genome transcriptional profiles of wild-type H. pylori and ArsS mutant strains have been compared. Such studies have been done following growth of the bacteria at either neutral pH or acidic pH. Whole-genome transcriptional profiling of H. pylori strains cultured in low pH conditions identified more than 100 genes that were differentially expressed in the wild-type strain compared with an ArsS-deficient mutant (17). Transcriptional profiling of H. pylori cultured in neutral pH conditions identified a smaller number of genes that were differentially expressed in the wild-type and ArsS null mutant strains (26). Acid-responsive H. pylori genes that are differentially expressed in wild-type and arsS mutant strains include amidases (amiE, amiF) and members of the urease gene cluster (ureA, ureB, ureE, ureF, ureG, ureH, and ureI) (9, 10, 16-18). Gel-shift and DNA-footprinting analyses have shown that ArsR binds directly to the promoter regions of these genes (17,18). Thus, the ArsRS TCS has an important role in allowing H. pylori to regulate gene expression in response to changes in pH and also has an important role in allowing H. pylori to colonize the gastric mucosa (15).
It has been proposed that the periplasmic domain of ArsS detects low pH conditions (16,27), triggering an ATP-dependent autophosphorylation reaction at a conserved histidine residue in the protein cytoplasmic domain. This model proposes that the signal from ArsS is then transduced to the cognate RR protein, ArsR, via a phosphotransfer reaction from the phosphohistidine kinase to a conserved aspartate residue (Asp-52) in ArsR, which leads to modification of the ArsR regulatory functions (for review, see Refs. 22 and 27). A derivative of ArsR with a D52N mutation is not phosphorylated in vitro by ArsS. In contrast to an arsR null mutant (which is non-viable) (28), a mutant H. pylori strain expressing the D52N form of ArsR is viable. These findings suggest that this mutant form of ArsR has a function that is sufficient for cell viability. Based on these data, it seems likely that there are two sets of target genes for ArsR. One set of genes in the ArsR regulon is presumed to be essential for cell viability, and regulation of these genes can be accomplished by a non-phosphorylated form of ArsR (exemplified by the D52N mutant protein) (12,28). A second set of genes in the ArsR regulon is not required for cell viability, and the regulation of these genes occurs by a pathway involving a phosphorylated form of ArsR (i.e. requiring the cognate histidine kinase, ArsS) (13,17,18,20,26,29). The products of various genes in the latter group contribute to acid acclimation, and thus host colonization, by H. pylori. It should be noted that genes belonging to the former set (required for cell viability) have not yet been identified.
Although several lines of evidence indicate that the ArsS-ArsR TCS has an important role in H. pylori, many features of this TCS are still not understood. Specifically, very little is known about how the structure and function of ArsR are altered in response to phosphorylation, little is known about how this TCS contributes to acid adaptation, and virtually nothing is known about what functions of ArsR are required for H. pylori viability. Herein, we report the structure of the DNA binding effector domain of ArsR as determined by solution NMR.

EXPERIMENTAL PROCEDURES
Plasmid Construction-The ArsR protein encoded by H. pylori strain J99 comprises 225 amino acids (GenBank™ accession number Q9ZMR6). The DNA-binding domain of this protein, mapped based on comparison to other known response regulators such as PhoB and OmpR, comprises 103 amino acids, beginning at Glu-123 and ending at Tyr-225. The full-length arsR gene and a fragment encoding the DNA-binding domain were amplified by PCR from H. pylori strain J99 (gene JHP0152) using genomic DNA as template and pairs of primers including 5′ BamHI and 3′ KpnI restriction endonuclease sites (5′-CGGGATCCATGATAGAAGTTTTAATGATAGAAGATGATATAG and 5′-CGGGTACCTCAGTATTCTAATTTATAACCAATCCCTCT for full-length ArsR and 5′-GCGGGATCCGAAGAGGTGAGTGAGCCAGGC and 5′-GCGGGTACCTCAGTATTCTAATTTATAACCAATCCCTCT for ArsR-DBD). Purified PCR products were digested with KpnI and BamHI restriction enzymes (Promega, Madison, WI) and ligated into linearized pET-BNK, a modified pET vector (Novagen, Madison, WI) developed specifically for expressing NMR protein targets. The vector contains a 5′-coding sequence for an N-terminal purification tag MRGSHHHHHHGS in-frame with the insert coding for the desired proteins.
Expression and Purification of ArsR and ArsR-DBD-For preparation of NMR samples, transformed E. coli BL21 (DE3) cells were grown in 4.2 liters of LB media containing glucose (2 g/liter) and ampicillin (50 mg/liter). When the culture reached an A600 of 1.0, protein expression was induced for 4 h with 0.8 mM isopropyl-β-D-thiogalactopyranoside. Cells were collected by 10 min of centrifugation at 12,000 × g, suspended in 0.02 M Na2HPO4, 0.5 M NaCl, pH 7.6, and then lysed by sonication 6 times for 30 s on ice. Both 1 mM phenylmethylsulfonyl fluoride and 5 mM Tris(2-carboxyethyl)phosphine hydrochloride were added before sonication. After sonication, the preparation was centrifuged at 31,000 × g for 20 min. The supernatant (~100 ml) was collected and applied to a 30-ml metal affinity chromatography column (His-Bind, Novagen) charged with Ni2+. The column was washed with a 0-80 mM imidazole gradient to remove proteins bound non-specifically to the column, and the protein of interest was eluted with a 0.08-1.0 M imidazole gradient over 45 ml. 5-ml fractions corresponding to peaks of interest were collected, pooled, and concentrated to a volume of 1 ml in Amicon Ultra-15 centrifugal filters (10-kDa molecular weight cut-off (MWCO) membrane for the full-length protein and a 5-kDa MWCO membrane for the DBD). The purity of the proteins was assessed by Tricine SDS-PAGE on 10% gels. Protein samples estimated to be >95% pure were analyzed by NMR. 15N- and 13C,15N-labeled samples were produced by a process similar to that described above, but cells were grown in M9 medium with the addition of 13C-labeled glucose and 15N-labeled ammonium chloride (CIL, Andover, MA).
Analytical Size Exclusion Chromatography-The proteins were analyzed by gel filtration on a Sephacryl S-100 fast protein liquid chromatography column (GE Healthcare) run at 4°C in 0.02 M Na2HPO4, 0.5 M NaCl, 1 mM Tris(2-carboxyethyl)phosphine hydrochloride (pH 7.6) at a flow rate of 0.4 ml/min, with ultraviolet absorption measured at 214 nm. ArsR and ArsR-DBD molecular weights (Mr) were determined by calculating their partition coefficient (Kav) values and using a calibration curve, plotting logarithmic values of Mr against calculated Kav values for a set of standard globular monomeric proteins (bovine serum albumin (66 kDa), ovalbumin (45 kDa), and myoglobin (16.7 kDa)), run under identical conditions. Kav values were calculated using the formula Kav = (Ve - V0)/(Vt - V0), where Ve is the elution volume of the protein, V0 is the column void volume, and Vt is the total column volume.
NMR Structure Calculation-NMR spectra were acquired for ArsR and ArsR-DBD on an Avance 600 Bruker (Billerica, MA) spectrometer with a triple resonance gradient probe and a cryoprobe. Sample protein concentrations were ~0.2 mM (full-length protein) and ~0.5 mM (DBD) in 0.5 M NaCl, 0.02 M Na2HPO4, 1 mM Tris(2-carboxyethyl)phosphine hydrochloride (pH 7.6). Spectra were collected at 25°C for unlabeled (natural abundance), 15N-labeled, and 13C,15N-labeled protein samples. The data were processed using XWINNMR and TOPSPIN software (Bruker) and analyzed with the Sparky suite (Goddard TD and Kneller DG, Sparky 3, University of California, San Francisco). The assignment of backbone resonances was completed using data from two-dimensional 1H,15N HSQC, two-dimensional NOESY, and three-dimensional CBCANH and CBCA(CO)NH experiments. The side chain resonance assignments were completed using data from three-dimensional HCC total correlation spectroscopy (TOCSY), HHC-TOCSY, H(CC)(CO)NH, HCC(CO)NH, HBHA(CO)NH, 15N-edited NOESY, and 13C-edited NOESY experiments (for references, see Ref. 30).
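The size exclusion calibration described under "Analytical Size Exclusion Chromatography" can be sketched in a few lines of Python. This is only an illustration: it assumes the standard definition of the partition coefficient, Kav = (Ve - V0)/(Vt - V0), and a straight-line fit of log10(Mr) against Kav. The masses of the three standards are taken from the text, but the void volume, total column volume, and elution volumes are hypothetical placeholders, and the function names are introduced here for illustration.

```python
import math

def k_av(v_e, v_0, v_t):
    """Partition coefficient K_av = (V_e - V_0) / (V_t - V_0)."""
    return (v_e - v_0) / (v_t - v_0)

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def calibrate(standards, v_0, v_t):
    """Fit log10(M_r) against K_av for (mass_kda, elution_volume_ml) pairs."""
    kavs = [k_av(v_e, v_0, v_t) for _, v_e in standards]
    log_masses = [math.log10(mass) for mass, _ in standards]
    return fit_line(kavs, log_masses)

def apparent_mass(v_e, v_0, v_t, slope, intercept):
    """Apparent molecular mass (kDa) read off the calibration line."""
    return 10 ** (slope * k_av(v_e, v_0, v_t) + intercept)

if __name__ == "__main__":
    # Hypothetical column and elution volumes in ml (NOT from the paper).
    V0, VT = 40.0, 120.0
    # Standard masses (kDa) are from the text; elution volumes are invented.
    STANDARDS = [(66.0, 52.0), (45.0, 58.0), (16.7, 74.0)]
    slope, intercept = calibrate(STANDARDS, V0, VT)
    print(apparent_mass(60.0, V0, VT, slope, intercept))
```

With measured elution volumes substituted for the placeholders, the same two steps (compute Kav, read the mass off the fitted line) would yield apparent masses of the kind reported for ArsR and ArsR-DBD.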
The chemical shifts of Hα, Cα, Cβ, and C′ were analyzed with chemical shift index software (31) to produce a prediction of secondary structure elements. J-coupling constants calculated from HNHA experiments were used to determine angle constraints for structure calculations. The chemical shifts of Hα, Cα, Cβ, C′, and N were also analyzed to calculate angle constraints to be used for structure calculations using TALOS software (32). The structures were calculated using the CYANA Version 2.1 software package (33). Automatic calibration was used to convert the NOE peak intensities into distance constraints. The final calculations were performed for 1000 structures with 40,000 annealing steps for each. The 50 structures with the lowest target functions (≤0.6) were minimized with AMBER (Version 9) (34). The 20 structures with the lowest energy were visualized with InsightII (Accelrys, San Diego, CA), Chimera (35), and MOLMOL (36). Electrostatic potentials calculated using the DelPhi program (37, 38) were used to generate a surface potential map for ArsR-DBD in Chimera. The stereochemistry of the structures was analyzed with PROCHECK-NMR (39).
NMR Analysis of ArsR-DBD in Complex with the Promoter Region of a Target Gene-Two-dimensional 1H,15N HSQC spectra were collected for purified 15N-labeled ArsR-DBD alone (0.1 mM) and ArsR-DBD combined with a 13-bp dsDNA fragment (5′-CGCATCATTAACC) (0.1 mM) from the promoter region of a well characterized ArsR target gene, hp1408 (13). This DNA fragment corresponds to the 5′ half of a DNA binding region identified by footprinting analysis (13).
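The two-stage selection of the final ensemble described above (1000 calculated structures, the 50 with the lowest target functions carried into energy minimization, and the 20 with the lowest energy retained) can be sketched as follows. This is a minimal illustration and not the authors' scripts: the `Conformer` record and both function names are hypothetical, and the code does not call CYANA or AMBER. Treating the value 0.6 as an upper bound to check is also an assumption, since the text reports it only as a property of the selected structures.

```python
import heapq
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Conformer:
    model_id: int
    target_function: float          # target function value from the structure calculation
    energy: Optional[float] = None  # energy after refinement, once available

def select_by_target_function(conformers: List[Conformer],
                              n_keep: int = 50,
                              cutoff: float = 0.6) -> List[Conformer]:
    """Keep the n_keep conformers with the lowest target function,
    dropping any that exceed the (assumed) cutoff."""
    best = heapq.nsmallest(n_keep, conformers, key=lambda c: c.target_function)
    return [c for c in best if c.target_function <= cutoff]

def select_final_ensemble(refined: List[Conformer],
                          n_keep: int = 20) -> List[Conformer]:
    """Keep the n_keep refined conformers with the lowest energy."""
    return heapq.nsmallest(n_keep, refined, key=lambda c: c.energy)

if __name__ == "__main__":
    # Synthetic target functions, for illustration only.
    calculated = [Conformer(i, 0.001 * i) for i in range(1000)]
    stage_one = select_by_target_function(calculated)
    # Placeholder energies stand in for the result of energy minimization.
    for c in stage_one:
        c.energy = float((c.model_id * 37) % 50)
    ensemble = select_final_ensemble(stage_one)
    print(len(stage_one), len(ensemble))
```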

RESULTS
ArsR and ArsR-DBD Biochemical Characterization-The full-length His6-tagged ArsR protein (237 residues) and a His6-tagged ArsR fragment corresponding to the DBD (115 residues) were overexpressed in E. coli, recovered from soluble fractions of the cell lysate, and purified by metal affinity chromatography. Gel filtration experiments conducted on ArsR and ArsR-DBD demonstrated that the proteins migrated with molecular masses of ~32 kDa (expected monomer size, ~27 kDa) and ~19 kDa (expected monomer size, ~13 kDa), respectively (data not shown). The larger than expected apparent molecular sizes were most likely caused by the 17-residue disordered N termini of the proteins. These data suggest that both ArsR and ArsR-DBD are monomeric in solution at concentrations below 0.1 mM. ArsR and ArsR-DBD protein samples were concentrated to ~0.2 and ~0.5 mM, respectively, and analyzed by NMR. Directly detected 31P NMR experiments conducted on purified full-length ArsR (in Tris buffer), which contains a putative phosphate-receiving aspartate residue, did not reveal any detectable phosphate signal, suggesting that the majority of the purified protein is in the unphosphorylated form (data not shown). One-dimensional 1H NMR spectra of ArsR showed well dispersed peaks in regions characteristic of amide protons (6-10 ppm) and aliphatic protons (0-4 ppm), indicative of folded protein. ArsR-DBD was produced in natural abundance and in 15N- and 13C,15N-enriched forms. Fig. 1A shows that the 1H,15N correlation NMR spectrum of ArsR-DBD exhibited good chemical shift dispersion. A comparison of 1H,15N HSQCs of full-length ArsR and ArsR-DBD showed that ArsR-DBD amide proton peaks aligned with ArsR amide peaks, indicating that the structure of the isolated ArsR-DBD was very similar to that domain in the full-length protein (Fig. 1B).
MARCH 6, 2009 • VOLUME 284 • NUMBER 10

JOURNAL OF BIOLOGICAL CHEMISTRY 6539
ArsR-DBD Structural Determination-Backbone NH resonance assignments were obtained (Fig. 1A) for all residues of the ArsR-DBD, except for the first 11 residues of the 12-residue N-terminal His6 tag, 4 proline residues, and 2 isoleucine residues (Ile-176 and Ile-192, numbered based on the sequence of untagged full-length ArsR). The solution structure of ArsR-DBD was determined by NMR based on 1265 distance constraints, including 264 intraresidue, 329 sequential, 360 medium range, and 312 long range distance constraints, and 155 dihedral angle constraints calculated using TALOS (32) and J-coupling constants calculated from HNHA experiments. Table 1 summarizes the structural statistics for the calculations. The first 17 residues of the protein (including the 12 residues from the His6 tag) showed very few NOE interactions and chemical shifts, a finding that is characteristic of disordered peptides. Residues 128-225 form a well structured domain. The 20 structures presented in the final ensemble (Fig. 2A)
As discussed further below, the ArsR-DBD contains a winged helix-turn-helix (wHTH) fold that is predicted to mediate binding of the protein to DNA. This wHTH fold is formed by the α2 and α3 helices, the loop connecting them, and the loop connecting β strands 6 and 7 as the "wing." The electrostatic surface potential map of ArsR-DBD (Fig. 3) reveals two distinct regions with opposite charge distribution. A surface with a largely positive electrostatic potential (including residues Arg-173, Lys-190, Arg-198, Arg-200, Lys-202, Lys-205, and Arg-217) is shown in Fig. 3A. Based on comparisons with other RRs (41-43), this surface is predicted to bind to the negatively charged phosphate backbone of DNA. Fig. 3B displays a surface with a highly negative electrostatic potential (including residues Asp-130, Glu-158, Glu-174, Glu-179, Glu-181, and Glu-186), with the majority of the charge contribution from the residues at the C-terminal end of α2.
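The breakdown of the 1265 distance constraints into intraresidue, sequential, medium range, and long range classes reported above follows from the sequence separation of the two residues in each constraint. The sketch below assumes the convention commonly used in NMR structure reports (sequential: |i - j| = 1; medium range: 1 < |i - j| < 5; long range: |i - j| >= 5); the text does not state the exact thresholds, so they are an assumption, as are the function names. The residue pairs in the demonstration are hypothetical, while the reported counts are copied from the text only to check that they sum to the stated total.

```python
from collections import Counter

def classify_constraint(res_i, res_j):
    """Classify a distance constraint by the sequence separation of its residues."""
    separation = abs(res_i - res_j)
    if separation == 0:
        return "intraresidue"
    if separation == 1:
        return "sequential"
    if separation < 5:          # assumed convention: 1 < |i - j| < 5
        return "medium range"
    return "long range"         # assumed convention: |i - j| >= 5

def tally(residue_pairs):
    """Count constraints per class for an iterable of (res_i, res_j) pairs."""
    return Counter(classify_constraint(i, j) for i, j in residue_pairs)

# Counts reported in the text; they should add up to the stated total of 1265.
REPORTED = {"intraresidue": 264, "sequential": 329,
            "medium range": 360, "long range": 312}

if __name__ == "__main__":
    # Hypothetical residue pairs, for illustration only.
    print(tally([(130, 130), (130, 131), (155, 158), (134, 199)]))
    print(sum(REPORTED.values()))
```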
Structural Comparison of ArsR-DBD with Related Structures-A BLAST search of GenBank™ (44) using the ArsR amino acid sequence indicates that ArsR is most closely related to members of the OmpR/PhoB subfamily of response regulators, but the levels of amino acid sequence identity are fairly low (32% (OmpR), 28% (PhoB), and 32% (DrrD)) (Fig. 4A). Despite these low levels of sequence identity, an analysis of ArsR with DALI software and the Families of Structurally Similar Proteins database (45), which catalogs known protein structures, showed that the ArsR-DBD structure was closely related to the structures of members of the OmpR/PhoB subfamily, including OmpR (PDB code 1opc) (43), PhoB (PDB code 1gxq) (41), and DrrD (PDB code 1kgs) (46), with z scores (depicting strength of structural similarity) of 3.8, 6.1, and 5.3, respectively. The secondary structural elements of ArsR-DBD are positioned by internally packed hydrophobic residues that stabilize the protein fold (Fig. 4B). These core hydrophobic residues of ArsR-DBD (corresponding to residues Phe-134, Val-136, Leu-150, Ile-159, Leu-160, Leu-163, Ile-164, Ile-176, Ile-192, Ile-196, Leu-199, Ile-203, and Ile-213 in ArsR) are conserved across members of the OmpR/PhoB subfamily (Fig. 4, A and B) (47), which suggests that they are responsible for the shared structural features. PhoB, an RR from E. coli, is a well studied prototype of the OmpR/PhoB subfamily. Despite the low 28% amino acid sequence identity of PhoB with ArsR, the DNA-binding domain of PhoB exhibits a high degree of structural similarity to the ArsR-DBD, as was shown by the DALI analysis (see above). Thus, we performed a detailed comparison of the ArsR and PhoB structures (Fig. 5). The arrangement of the three-α-helix bundle is highly similar in the ArsR and PhoB DBDs and was used to superimpose the two structures.
When tracing along the N, Cα, and C′ atoms, residues Ala-155 to Lys-166 (α1), Arg-173 to Glu-179 (α2), and Lys-190 to Lys-205 (α3) of ArsR-DBD superimpose with residues of the PhoB-DBD helical core with an r.m.s.d. value of 2.3 Å. The orientation of the loop between α2 and α3 (which is predicted to interact with subunits of RNA polymerase) and the length of the α3 helix (which is predicted to be involved in DNA recognition) are known to vary among members of the OmpR/PhoB subfamily, but these structural features are similar in ArsR and PhoB. However, a notable difference is that the N-terminal β-sheet in ArsR-DBD is rotated almost 45° with respect to the N-terminal β-sheet in PhoB-DBD. This antiparallel β-sheet functions as a platform for interactions with the N-terminal receiver domain in many OmpR/PhoB RRs (46,48).
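The two geometric quantities used in this comparison, the r.m.s.d. over matched backbone atoms and the rotation angle between two structural elements, can be computed as sketched below. This is an illustrative sketch only: it assumes the two coordinate sets have already been optimally superimposed and paired atom by atom (the fitting step itself is not shown), and that each β-sheet orientation has been reduced to a single direction vector. Both function names and all coordinates in the demonstration are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of (x, y, z)
    coordinates that are already superimposed and paired atom by atom."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
                for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

def angle_deg(u, v):
    """Angle in degrees between two 3-D direction vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cosine))

if __name__ == "__main__":
    # Hypothetical coordinates and direction vectors, for illustration only.
    a = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
    b = [(0.0, 1.0, 0.0), (1.5, 1.0, 0.0), (3.0, 1.0, 0.0)]
    print(rmsd(a, b))
    print(angle_deg((1.0, 0.0, 0.0), (1.0, 1.0, 0.0)))
```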
Comparison of ArsR with Close Orthologs-A BLAST search using the ArsR amino acid sequence identified ArsR orthologs in all H. pylori strains for which genome sequences are available as well as in closely related species. A pair-wise sequence comparison of ArsR from H. pylori strains J99, 26695, and HPAG1 as well as strains from related species, H. acinonychis, H. hepaticus, Wolinella, and Campylobacter species, is shown in Fig. 4A. The ArsR proteins encoded by multiple H. pylori strains exhibited 99% pair-wise amino acid sequence identity. ArsR from another gastric Helicobacter species, H. acinonychis, also exhibited a high level of relatedness (94% amino acid identity) to H. pylori ArsR. ArsR from H. hepaticus, an intestinal Helicobacter species, was 71% identical to H. pylori ArsR. ArsR orthologs from Wolinella and Campylobacter were 54-67% identical to H. pylori ArsR (Fig. 4A). Comparisons of the ArsR-DBD orthologs from these species showed a high degree of conservation in core hydrophobic residues belonging to the major secondary structural elements of the protein. Furthermore, a high degree of sequence identity was maintained in the wHTH motif, most notably in the surface-exposed residues of the α3 recognition helix and the wing of the C-terminal β-hairpin (corresponding to residues Arg-173, , and Arg-217 in ArsR) (Fig. 4, A and C).
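Pair-wise percent identity values of the kind quoted above can be derived from an alignment as sketched below. This is a hedged illustration: the text does not say how identity was normalized, so the choice made here (identical positions divided by the number of alignment columns in which neither sequence has a gap) is an assumption, and other tools may divide by the alignment length or the shorter sequence length instead. The function name and the toy sequences are hypothetical and are not ArsR sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue            # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

if __name__ == "__main__":
    # Toy aligned sequences, for illustration only.
    print(percent_identity("MKV-LRE", "MKIALRD"))
```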

DISCUSSION
In this paper, we present a detailed three-dimensional structure of the ArsR DNA-binding domain, determined using NMR spectroscopy. The ArsR-DBD structure is most closely related to the structures of proteins classified in the OmpR/PhoB subfamily of response regulators. Like other members of this subfamily, the ArsR-DBD comprises two anti-parallel β-sheets flanking a core of three α-helices and contains a winged HTH motif that is predicted to bind DNA.
Despite similar tertiary structural organization, members of the OmpR/PhoB subfamily of RRs differ in their dependence on phosphorylation and oligomerization for transcriptional regulation. For example, upon phosphorylation, some RRs dimerize in solution, whereas others remain monomeric and subsequently dimerize after binding to specific DNA sequences (49-52). At present, relatively little is known about the effects of phosphorylation and oligomerization on ArsR activity.
The orientation of the N-terminal β-sheet in the ArsR-DBD differs markedly from the orientations of N-terminal β-sheets observed in previously solved effector domains of related RRs. Relative to the orientation of the N-terminal β-sheet in PhoB-DBD (41), the N-terminal β-sheet of ArsR-DBD is rotated about 45° to align in the direction of the α1 helix. As a consequence of this difference, we speculate that the interdomain interactions between the N-terminal receiver domain and the C-terminal DBD will differ in ArsR compared with what is observed in other members of the OmpR/PhoB subfamily.
FIGURE 4. Comparison of ArsR-DBD with related proteins. A, sequence alignment of ArsR-DBD with orthologs from closely related species and with closely related proteins from the OmpR/PhoB subfamily (see "Experimental Procedures" for a detailed description of the aligned sequences). Secondary structural elements of ArsR-DBD are displayed above the sequence with horizontal arrows indicating β strands and cylinders indicating α helices. Core hydrophobic residues that are conserved within the DBDs of the OmpR/PhoB subfamily are shaded gray. Surface-exposed residues in the wHTH motif that are identical in ArsR and orthologs from closely related species are boxed. B, backbone Cα tracing of ArsR-DBD depicting side chains of conserved hydrophobic core residues. The hydrophobic residues conserved in ArsR-DBD and the DBDs of OmpR, PhoB, and DrrD are indicated in ball-and-stick style on the line tracing of the ArsR-DBD backbone. C, backbone Cα tracing of ArsR-DBD and side chains of conserved surface-exposed residues in the wHTH motif. Surface-exposed residues in the wHTH motif that are identical in ArsR and orthologs from closely related species are indicated in ball-and-stick style on the line tracing of the ArsR-DBD backbone.
Based on the conservation of structural features in ArsR and members of the OmpR/PhoB subfamily of RRs, ArsR-DBD is predicted to interact with DNA in a manner similar to that of other members of the OmpR/PhoB subfamily. To test this hypothesis, the interaction of ArsR-DBD with DNA was analyzed by mapping spectral changes in 15N-labeled ArsR-DBD in the presence of an equimolar concentration of a 13-bp fragment of the promoter sequence of hp1408 (13), a member of the ArsRS regulon (see "Experimental Procedures" for details). Because OmpR/PhoB dimers bind ~24 bp of DNA, it is predicted that this 13-bp fragment would bind only one molecule of ArsR-DBD. A comparison of the two-dimensional 1H,15N HSQC spectra of ArsR-DBD alone and combined with this DNA revealed changes in the intensities of peaks corresponding to specific amino acid residues (Fig. 6). This indicates moderately strong affinity between protein and DNA molecules, resulting in an intermediate exchange regime in the NMR experiments. A detailed comparison of the peak intensities from the two spectra indicated that several residues exhibiting the most prominent reductions in peak intensities mapped to surface-exposed regions of the wHTH motif (residues Arg-173, . Additionally, several core residues of the wHTH motif and other elements of the protein that make close contact with the helices of the wHTH motif demonstrated diminished peak intensities (residues Ala-155, Ile-159, Ile-183, Ile-195, Ile-196, Gly-197, and Gly-220) (Fig. 6). These data combined with the analyses shown in Fig. 4 support our prediction that ArsR-DBD interactions with DNA involve molecular surfaces homologous to those identified for other members of the OmpR/PhoB subfamily.
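The per-residue comparison of peak intensities described above (the ratio of the intensity with DNA to the intensity without DNA, with ratios well below 1 indicating attenuation) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the 0.5 threshold for calling a residue "strongly attenuated" is an arbitrary placeholder that does not come from the text, and the residue labels and intensities in the demonstration are invented.

```python
def intensity_ratios(free, bound):
    """Return {residue: I_bound / I_free}, or None where a peak is missing.

    `free` and `bound` map residue labels to peak intensities; a value of
    None marks a peak that is absent or overlapped in that spectrum.
    """
    ratios = {}
    for residue, i_free in free.items():
        i_bound = bound.get(residue)
        if i_free is None or i_bound is None or i_free <= 0:
            ratios[residue] = None      # ratio not available
        else:
            ratios[residue] = i_bound / i_free
    return ratios

def most_attenuated(ratios, threshold=0.5):
    """Residues with a ratio below the (assumed) threshold, strongest loss first."""
    hits = [(res, r) for res, r in ratios.items() if r is not None and r < threshold]
    return [res for res, _ in sorted(hits, key=lambda item: item[1])]

if __name__ == "__main__":
    # Invented intensities for generic residue labels, for illustration only.
    free = {"res1": 100.0, "res2": 80.0, "res3": None, "res4": 60.0}
    bound = {"res1": 30.0, "res2": 76.0, "res3": 12.0, "res4": 24.0}
    ratios = intensity_ratios(free, bound)
    print(ratios)
    print(most_attenuated(ratios))
```

Keeping unavailable ratios as None rather than dropping them mirrors the way residues without usable peaks are still shown, in a distinct style, in the Fig. 6 plot.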
To date the only solved structure of an OmpR/PhoB subfamily member in complex with its target DNA binding site corresponds to the DBD of an E. coli RR, PhoB-DBD, on the pho box of the phoA operon promoter (41). Analysis of the electrostatic surface potential map of the PhoB-DBD revealed a positively charged surface in the region that binds DNA (41), and the electrostatic surface potential map of ArsR-DBD reveals a similar basic surface (Fig. 3A). We used the structure of the PhoB-DBD in complex with a target promoter sequence (41) as a template to generate a model of the ArsR-DBD binding to DNA (Fig. 7). Thus far, a consensus sequence for ArsR binding sites has not been identified, and therefore, to generate this model we retained the PhoB-specific binding site (pho box) as the target DNA; we did not attempt to model specific protein-DNA interactions when positioning ArsR-DBD on the DNA. Two ArsR-DBD domains were placed in tandem orientation (Fig. 7A) in the major groove of the target DNA fragment, ~10 bp apart. The two ArsR-DBD molecules superimpose on the two PhoB-DBD molecules (tracing along the backbone N, Cα, and C′ atoms of the corresponding α2 and α3 helices), both with an r.m.s.d. value of 2.1 Å.
(41)). The superimposition was based on the N, Cα, and C′ atoms of the three core helices with an r.m.s.d. value of 2.3 Å.
FIGURE 6. Analysis of HSQC spectra of ArsR-DBD alone and ArsR-DBD combined with a 13-bp dsDNA fragment derived from the promoter region of an ArsR target gene (hp1408). For each residue a ratio of peak intensity in the spectrum of ArsR-DBD with DNA to the peak intensity in the spectrum of ArsR-DBD without DNA is shown. Ratios substantially less than 1 indicate a decrease in peak intensity when the protein is combined with DNA. Residues for which intensity ratios were not available (due to absence of visible peaks or peak overlap) are depicted by gray symbols. Residues Ala-155, Ile-159, Arg-173, Ile-183, Asp-193, Ile-195, Ile-196, Gly-197, Arg-200, and Gly-220 showed the greatest decrease in intensity. ArsR-DBD secondary structure elements are shown above the plot.
The PhoB-DBD binds to the direct repeat-containing pho box as a tandemly arranged ("head to tail") dimer (41). However, several binding sites of ArsR that have been reported thus far (from promoters of genes arsR, ureA, ureI, amiE, amiF, rocF, hp1408, and hp1186 (carbonic anhydrase)) do not contain conserved symmetrical sequences (13,17,18,20). The degeneracy of ArsR binding sites leaves open the possibility that ArsR may bind to DNA in a different manner than that observed with PhoB. To construct an alternative model of ArsR-DBD molecules arranged symmetrically on inverted repeat sequences, one-half of the dimeric PhoB-DBD-DNA complex was inverted and superimposed on the original copy of DNA using only the DNA backbone phosphates for alignment, with an r.m.s.d. value of 0.96 Å. Two ArsR-DBD molecules were superimposed on the PhoB-DBD molecules (tracing along the Cα atoms of the corresponding α2 and α3 helices) with r.m.s.d. values of 2.1 Å (Fig. 7B). In both orientations, residues Arg-173, , and Arg-200 from the ArsR HTH motif and residue Arg-217 from the "wing" between β-strand 6 and β-strand 7 of the C-terminal hairpin make contacts with the phosphate backbone and/or the bases of the target DNA. These interactions are supported by our NMR experimental data (Fig. 6). The functional groups of these ArsR residues are similar to those of the corresponding residues in PhoB, suggesting a conservation of their DNA recognition functions.
In the model shown in Fig. 7, residues , and Arg-198 protrude into the major groove of the DNA to make specific contacts with bases. The side chains of lysine and arginine and the hydroxyl group of Ser-191 provide potential hydrogen-bonding partners. In addition, the methyl groups of valine can make specific van der Waals contacts with the methyl groups of thymines. The reported ArsR binding sites are A-T-rich (13,17,18,20,29), suggesting that the valine-thymine contacts may be important determinants of sequence-specific protein-DNA interactions. A distinctive feature of the "tail to tail" symmetric orientation of the ArsR-DBD-DNA complex model (Fig. 7B) is the interaction of hydrophobic patches on the two ArsR-DBD molecules formed by residues Ile-214 and Val-216 of each C-terminal β-hairpin. The surface-exposed hydrophobic residues on one ArsR-DBD molecule are stabilized by the hydrophobic residues exposed on the opposite ArsR-DBD molecule. This interaction could contribute to the overall stabilization of ArsR dimers on target DNA sequences.
In summary, the results of this study allow the classification of ArsR into a subfamily of DNA-binding proteins that contain a conserved wHTH motif and yet exhibit diversity in their interactions with DNA and diversity in interdomain and protein-protein interactions. The structure of the ArsR-DBD provides a basis for future experimental studies designed to understand these interdomain, protein-protein, and protein-DNA interactions.