Complex Structure of the DNA-binding Domain of AdpA, the Global Transcription Factor in Streptomyces griseus, and a Target Duplex DNA Reveals the Structural Basis of Its Tolerant DNA Sequence Specificity

Background: AdpA regulates over 500 genes in Streptomyces griseus because of its tolerant DNA specificity.
Results: The complex structure of the DNA-binding domain of AdpA and a target DNA was solved.
Conclusion: Only two arginine residues of the DNA-binding domain of AdpA directly recognize the target DNA sequence.
Significance: The complex structure reveals how AdpA acts as a global transcription factor controlling hundreds of genes.
AdpA serves as the global transcription factor in the A-factor regulatory cascade, controlling the secondary metabolism and morphological differentiation of the filamentous bacterium Streptomyces griseus. AdpA binds to over 500 operator regions with the consensus sequence 5′-TGGCSNGWWY-3′ (where S is G or C, W is A or T, Y is T or C, and N is any nucleotide). However, how AdpA can control hundreds of genes has remained unclear. To elucidate the structural basis of this tolerant DNA recognition by AdpA, we focused on the interaction between the DNA-binding domain of AdpA (AdpA-DBD), which consists of two helix-turn-helix motifs, and a target duplex DNA containing the consensus sequence 5′-TGGCGGGTTC-3′. The crystal structure of the AdpA-DBD-DNA complex and the mutant analysis of AdpA-DBD revealed its unique manner of DNA recognition, whereby only two arginine residues directly recognize the consensus sequence, explaining the strict recognition of G and C at positions 2 and 4, respectively, and the tolerant recognition of other positions of the consensus sequence. AdpA-DBD confers tolerant DNA sequence specificity to AdpA, allowing it to control hundreds of genes as a global transcription factor.

The filamentous bacterial genus Streptomyces is characterized by morphological differentiation culminating in sporulation and by its ability to produce various valuable secondary metabolites such as antibiotics, immunosuppressants, and other biologically active substances. In Streptomyces griseus, AdpA is a global transcription factor in the A-factor regulatory cascade and controls both secondary metabolism and morphological differentiation (1). Within the cascade, a chemical signaling molecule called A-factor binds as an effector to the A-factor receptor protein (ArpA) that is bound to the operator region of the adpA gene and causes ArpA to dissociate from the DNA, resulting in transcriptional activation of the adpA gene (2).
Within the A-factor regulatory cascade, ArpA regulates the adpA gene exclusively, whereas the transcription factor AdpA activates a number of genes required for secondary metabolism and morphological differentiation. For instance, strR encodes the pathway-specific transcription activator for streptomycin biosynthesis (3); adsA encodes an extracytoplasmic function σ factor essential for aerial mycelium formation (4); ssgA encodes an acidic, small protein essential for spore septum formation (5); sgmA encodes a metalloendopeptidase that is probably involved in the apoptosis of substrate hyphae during development of aerial hyphae (6); amfR encodes a response regulator-like protein essential for aerial mycelium formation (7); sprT and sprU encode trypsin-type proteases (8); sprA, sprB, and sprD encode chymotrypsin-type proteases (9); and the regulatory tRNA gene bldA is essential for both morphological development and secondary metabolism (10). These genes are collectively termed the "AdpA regulon." Recent ChIP sequencing and chromatin affinity-precipitation assays, as well as transcriptome analysis, have revealed that AdpA directly controls more than 500 genes in cooperation with other regulatory proteins (11).
The AdpA protein, which consists of a ThiJ/PfpI/DJ-1-like dimerization domain at its N-terminal portion and a DNA-binding domain with two helix-turn-helix motifs at its C-terminal portion (12,13), belongs to the AraC/XylS family of transcription factors. AdpA is thought to bind to DNA as a homodimer in two different ways, i.e., via type I and type II binding (14). Type I binding, the bivalent binding, occurs when an operator region contains an inverted repeat of two AdpA consensus sequences with an optimal interval of 13, 14, or 2 bp, and both of the subunits in the AdpA dimer bind to the consensus sequences. There are multiple optimal intervals between the two consensus sequences, probably because of the flexible linker that connects the dimerization domain and the DNA-binding domain of AdpA. Type II binding, the monovalent binding, occurs when an operator region contains only one AdpA consensus sequence and either of the subunits in the AdpA dimer binds to the consensus sequence.
The consensus AdpA-binding sequence is 5′-TGGCSNGWWY-3′, where S is G or C, W is A or T, Y is T or C, and N is any nucleotide (15). In the consensus sequence, the C at position 4 is the most important for AdpA-DNA binding in vitro (15), and the G and C at positions 2 and 4, respectively, are more highly conserved than the other nucleotides.
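The degenerate consensus above can be read as a pattern over the four bases. The following is a minimal sketch, not part of the original study, that expands the degenerate codes defined in the text (S, W, Y, N) into a regular expression and tests a 10-mer against it. The function names are hypothetical, and a strict match is a simplification: the consensus describes a tendency, and confirmed sites can deviate at the tolerant positions.

```python
import re

# Degenerate codes as defined in the text: S = G/C, W = A/T, Y = T/C, N = any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "S": "[GC]", "W": "[AT]", "Y": "[TC]", "N": "[ACGT]"}

CONSENSUS = "TGGCSNGWWY"

def consensus_to_regex(consensus: str) -> re.Pattern:
    """Translate a degenerate consensus into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in consensus))

def matches_consensus(site: str, consensus: str = CONSENSUS) -> bool:
    """Return True if the site fits every position of the consensus."""
    return consensus_to_regex(consensus).fullmatch(site.upper()) is not None

if __name__ == "__main__":
    # adsA-derived site used for crystallization in this study
    print(matches_consensus("TGGCGGGTTC"))
```

A strict matcher of this kind is useful only as a first filter; it says nothing about the graded preferences at the tolerant positions.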
No structural basis for the tolerant DNA sequence specificity of AdpA has been revealed so far. In this study, we have focused on the analysis of the interaction between a single DNA-binding domain of AdpA and its target DNAs, which contain a 10-bp consensus sequence. Here, we report the 2.95 Å crystal structure of AdpA-DBD in complex with a 14-bp duplex DNA with two-nucleotide overhangs at the 5′-ends containing an AdpA consensus sequence derived from the adsA operator region (5′-TGGCGGGTTC-3′). The crystal structure and mutation analyses revealed that the two arginine residues of AdpA form hydrogen bonds with the two highly conserved nucleotide bases (G and C at positions 2 and 4 in the consensus sequence) and that these interactions are essential for the AdpA-DBD-DNA complex formation. In addition to these core recognitions, other features of the tolerant DNA sequence recognition are discussed on the basis of our crystal structure: the interactions with the DNA backbone that stabilize the protein-DNA complex, the recognition of the nucleotide bases at the moderately conserved positions (T, G, and G at positions 1, 3, and 5 in the consensus sequence), and the "shape readout" mechanism for the narrowed minor groove generated by the sequence WW (positions 8 and 9 in the consensus sequence).

MATERIALS AND METHODS
Preparation of the Protein and the Duplex DNAs-The DNA-binding domain of AdpA (residues 215-340; AdpA-DBD) and the duplex DNAs for the crystallization were prepared as described previously (16). The gene encoding AdpA-DBD was cloned into the NdeI/BamHI site of pET-28a (Novagen) to express N-terminally His-tagged AdpA-DBD. AdpA-DBD was expressed in Escherichia coli Rosetta (DE3) at 25°C overnight. AdpA-DBD with a His tag was purified with nickel-nitrilotriacetic acid (Qiagen), Resource S 6-ml columns (GE Healthcare), and Superdex 75 10/30 HR (GE Healthcare) columns at 4°C. The oligonucleotide purification cartridge-purified single-stranded DNAs for crystallization (Table 1) were purchased from Operon Biotechnologies (Tokyo, Japan). A pair of 16-base oligonucleotides was annealed together by heating at 95°C for 5 min in 10 mM MES (pH 6.0), 100 mM NaCl, and 10% (v/v) glycerol followed by a slow cooling to 4°C, resulting in a 14-bp duplex DNA with two-nucleotide overhangs at the 5′-ends.
The plasmids for the expression of the two mutants of AdpA-DBD, R262A and R266A, were obtained by the site-directed mutagenesis method (17) using the plasmid expressing the wild-type AdpA-DBD described above as the PCR template. The wild type and mutants of AdpA-DBD for the interaction assays were expressed and purified with nickel-nitrilotriacetic acid, as described for the crystallization sample. The purified proteins were dialyzed against a buffer containing 50 mM Tris-HCl (pH 8.0), 100 mM NaCl, 10% (v/v) glycerol, and 1 mM DTT and concentrated. The oligonucleotide purification cartridge-purified single-stranded DNAs for interaction assays (Table 1) were purchased as described for the crystallization sample. A 30-base sense strand with a 5′-FITC label and a 30-base antisense strand were mixed and annealed, resulting in a 30-bp duplex DNA with a FITC label at the 5′-end of the sense strand, which was used for the interaction assays.
Crystallization and Structure Determination-The crystallization and preliminary x-ray analysis of the AdpA-DBD-duplex DNA was performed as described previously (16). The complex of AdpA-DBD-duplex DNA was prepared by adding 1.5 mg/ml of AdpA-DBD solution to the duplex DNA solution in a 1:1.2 molar ratio and incubating the mixture at 4°C for 10 min. The complex solution was then concentrated to a protein concentration of 8 mg/ml (corresponding to 0.5 mM) and crystallized by the sitting drop vapor diffusion method at 4°C with a reservoir solution of 0.1 M Na-HEPES (pH 7.1), 10% (v/v) 2-propanol, and 16% (w/v) PEG 4000, as described previously (16). Crystals were cryoprotected with a final concentration of 20% (v/v) ethylene glycol and cooled in a nitrogen stream. Diffraction data were collected on Beamline NE-3A at Photon Factory-Advanced Ring (Tsukuba, Japan) with an x-ray source at a wavelength of 1.0000 Å, an ADSC Quantum 270 charge coupled device, a distance between the crystal and the detector of 300.1 mm, a rotation angle of 1°, and an exposure time of 5 s/image.
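As a quick consistency check on the concentrations quoted above, the pairing of 8 mg/ml with 0.5 mM implies a molar mass for the crystallized protein. The sketch below is only illustrative arithmetic; it assumes that the 0.5 mM figure refers to the His-tagged AdpA-DBD alone, and the function name is hypothetical.

```python
def implied_molar_mass(mass_conc_mg_per_ml: float, molar_conc_mM: float) -> float:
    """Molar mass in g/mol implied by a mass concentration and a molar concentration.

    1 mg/ml equals 1 g/l and 1 mM equals 1e-3 mol/l, so the ratio is scaled by 1000.
    """
    return mass_conc_mg_per_ml * 1000.0 / molar_conc_mM

if __name__ == "__main__":
    # 8 mg/ml quoted as corresponding to 0.5 mM in the text
    print(implied_molar_mass(8.0, 0.5))
```

The result, about 16 kDa, is in the range one would expect for a 126-residue domain carrying an N-terminal His tag.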
The indexing, integration, and scaling of the diffraction data were performed with XDS (18). The space group of the crystal was determined to be C222₁ with one protein-DNA complex in the asymmetric unit. The structure was solved by molecular replacement using MOLREP (19) with the atomic coordinates of the DNA-binding domain of Rob (25% sequence identity; Protein Data Bank code 1D5Y) (20) and those of the duplex DNA of the MarA-DNA complex (Protein Data Bank code 1BL0) (21) as search models. Model building was carried out using Coot (22,23), and the structure was refined with REFMAC5 (23). Data collection and refinement statistics are given in Table 2. Curves+ was used to analyze the duplex DNA structure (24). Molecular graphics were generated using PyMOL (25).
Electrophoretic Mobility Shift Assay for Analyzing the Protein-DNA Interaction-The purified AdpA-DBD (the wild type or a mutant, R262A or R266A) was serially diluted in buffer containing 10 mM Tris-HCl (pH 7.5), 100 mM NaCl, 2.5% (w/v) glycerol, and 0.25 mM DTT and used for EMSA. A series of protein-DNA complex solutions was prepared by mixing 2 µl of buffer containing 200 mM Tris-HCl (pH 7.5), 1 M KCl, 2.5 mg/ml bovine serum albumin, and 1% Nonidet P-40; 2 µl of 1.0 mg/ml poly(dI-dC); 4 µl of 2.0 µM 30-bp duplex DNA; 2 µl of water; and 10 µl of the protein solution, and the mixtures were incubated for 1 h on ice. The electrophoresis was performed in TBE buffer at 120 V on a 12% acrylamide gel for 120 min. The fluorescence of FITC-labeled DNAs was visualized using a LAS4000 Mini System (Fujifilm). The concentration of AdpA-DBD that bound 50% of the DNA was used to estimate the apparent dissociation constant (K_D). Assays were performed in duplicate and had variances of <1 dilution, i.e., 2.5-fold for the assay using the wild-type AdpA-DBD and the wild-type duplex DNA or 1.5-fold for the other assays.

RESULTS
Overall Structure of the AdpA-DBD-DNA Complex-The crystal structure of the AdpA-DBD-DNA complex was determined at 2.95 Å resolution with an R factor of 23.6% and a free R factor of 27.8% (Table 2). The duplex DNA used contained an AdpA consensus sequence, 5′-TGGCGGGTTC-3′. The final model contained all of the 32 nucleotides, which constitute a 14-bp duplex DNA with two-nucleotide overhangs at the 5′-ends, a protein molecule spanning residues 230-340, and 9 water molecules. Fifteen amino acid residues in the N-terminal region (residues 215-229) failed to show electron density and were not modeled, presumably because of the high flexibility of this region, which corresponds to the loop connecting the dimerization domain and the DNA-binding domain. The nucleotides at the 5′- and 3′-ends of each strand in the duplex DNA were assigned the nucleotide numbers −3 and 12, respectively, taking the first nucleotide of the consensus sequence as +1.
The 14-bp duplex DNA bound to AdpA-DBD adopted a distorted B-form structure (Fig. 1, B and C). The helix α3 in the N-terminal HTH motif (HTH1) is deeply inserted into the major groove of the duplex DNA and binds to the bases (Figs. 1A and 2) and DNA backbone, whereas the corresponding helix α6 in the C-terminal HTH motif (HTH2) is distant from the nucleotide bases of the nearest major groove and only binds to the backbone phosphate groups of DNA (Fig. 3, A and B). The linker helix α4 restricts the relative orientation of and the distance between the two HTH motifs, resulting in a distortion of the bound DNA with a distance between the two adjacent major grooves of 28 Å instead of 34 Å as seen in an ideal B-form DNA (Fig. 1C).
Interactions between AdpA-DBD and Duplex DNA Base Pairs-Specific base recognition is achieved only by HTH1 in AdpA-DBD, and hence the N-terminal HTH motif is more important for the nucleotide base recognition than the C-terminal HTH motif. This kind of DNA recognition is observed in many AraC/XylS family members (12). Helix α3 is engaged in the major groove of the DNA, and the two arginine residues from the helix, Arg-262 and Arg-266, anchor the bases of the DNA.
G2 and C4, the most highly conserved nucleotides, are recognized by Arg-266 and Arg-262, respectively, as shown by the electron densities of the side chains of these two arginine residues (Fig. 2, A and B). The guanidinium group of Arg-262 forms hydrogen bonds with both the O6 and N7 atoms of the G7′ nucleotide base, the complementary base of C4 (Fig. 2A). The guanidinium group of Arg-266 forms hydrogen bonds with the O6 and N7 atoms of the G2 nucleotide base (Fig. 2B). In addition, the Cβ atom of Arg-262 forms a van der Waals interaction with the C8 atom of the G2 nucleotide (Fig. 2C).
The nucleotide bases at positions 1 and 5 of the consensus sequence are also recognized by Arg-266 and Arg-262. The methyl group of the T1 nucleotide and the C5 atom of the C6′ nucleotide form van der Waals interactions with the guanidinium groups of Arg-266 and Arg-262, respectively, which would also stabilize the AdpA-DBD-DNA complex (Fig. 2, D and E).
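The primed labels used for the antisense strand (G7′, C6′, and so on) follow from the sense-strand numbering. The sketch below is an illustration under one assumption inferred from the pairs named in the text (C4 with G7′, position 5 with C6′): paired sense and antisense positions sum to 11. The function name and this rule are not stated by the authors.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

# Consensus positions 1..10 of the crystallized duplex (sense strand, 5' to 3').
SENSE = "TGGCGGGTTC"

def antisense_partner(position: int, sense: str = SENSE) -> str:
    """Label of the antisense nucleotide paired with a sense-strand position.

    Assumption: the antisense strand is numbered so that paired positions
    sum to 11, as inferred from the pairs C4/G7' and G5/C6' in the text.
    """
    base = COMPLEMENT[sense[position - 1]]
    return f"{base}{11 - position}'"

if __name__ == "__main__":
    for pos in range(1, 11):
        print(f"{SENSE[pos - 1]}{pos} pairs with {antisense_partner(pos)}")
```

Under this numbering, the partner of C4 is G7′ and the partner of G5 is C6′, matching the contacts described for Arg-262.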
Interactions between AdpA-DBD and DNA Backbone-The backbone of the duplex DNA forms extensive electrostatic and van der Waals interactions with the residues in both of the HTH motifs and the long connecting helix α4 (Fig. 3, A and B). The specific base recognition determines the position of DNA recognized by AdpA-DBD, and the following extensive interactions with the DNA backbone make an important contribution to the overall stability of the complex (Fig. 3, A and B).
Table 2 footnotes: I(hkl) is the intensity of reflection hkl, and ⟨I(hkl)⟩ is the average intensity over all equivalent reflections. |Fo| and |Fc| are the observed and calculated structure factor amplitudes, respectively. d, R free is the same as the R factor but for a 5% subset of all reflections that were not used in the refinement. e, Core/allowed/generously allowed/disallowed.
OCTOBER 25, 2013 • VOLUME 288 • NUMBER 43

DNA Recognition Mechanism by AdpA DNA-binding Domain
In the HTH1 motif, the guanidinium groups of Arg-261, Arg-267, and Arg-269 form salt bridges with the phosphate groups of the C4′, T1, and C6′ nucleotides, respectively. The carboxyl group of Asp-249 and the main chain NH group of Val-250 form hydrogen bonds with the phosphate group of the C4′ nucleotide. In addition, the Cβ atom of Ser-260 forms a van der Waals interaction with the 2′-deoxyribose of the G2 nucleotide. In the long helix α4, the amide nitrogen atom of the side chain of Gln-278, the main chain NH group of Gln-278, and the main chain NH group of Leu-277 interact with the phosphate groups of the C10, C5′, and C5′ nucleotides, respectively. The Cβ atom of Ala-275 and the Cδ atom of Leu-277 form van der Waals interactions with the 2′-deoxyriboses of the C5′ and C10 nucleotides, respectively. In the HTH2 motif, the side chain guanidinium and imidazole groups of Arg-320, His-317, and Arg-309 form electrostatic interactions with the phosphate groups of the C10, A11, and C12 nucleotides, respectively. In addition, the guanidinium group of Arg-315 forms a salt bridge with the phosphate group of A10′ in the adjacent duplex DNA in the crystal lattice (Fig. 3, A and B). The interaction with the adjacent DNA suggests that AdpA-DBD could bind to phosphate groups of the additional nucleotides added to the 3′-end of the consensus sequence in a longer duplex DNA or the genomic DNA. Because of these interactions, the molecular surfaces of HTH1 and α6 in HTH2 are complementary in shape to the surfaces of DNA (Fig. 3C).
Interaction between AdpA-DBD and Duplex DNAs Determined by Electrophoretic Mobility Shift Assay-To validate the intermolecular interactions observed in the crystal structure, we performed EMSA (Fig. 4). The apparent dissociation constant K_D between the wild-type AdpA-DBD and the wild-type duplex DNA was estimated to be 2 µM (Fig. 4A). At an extremely high concentration of AdpA-DBD (350 µM; ~900-fold molar excess over the concentration of duplex DNA), a drastic mobility shift occurred. The supershifted band probably corresponds to a nonpreferred protein-DNA complex composed of one duplex DNA and two or more AdpA-DBD molecules, generated by nonspecific interactions.
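The quoted molar excess can be reproduced from the reaction composition given in Materials and Methods. The sketch below is illustrative arithmetic only. It assumes that the volumes are in microlitres and the DNA stock is 2.0 micromolar (the unit prefixes are garbled in some copies of the text), and that the 350 micromolar protein concentration refers to the final reaction mixture; the dictionary keys and function names are hypothetical.

```python
# Component volumes of one binding reaction, in microlitres (assumed units).
VOLUMES_UL = {
    "concentrated buffer": 2.0,
    "poly(dI-dC)": 2.0,
    "duplex DNA": 4.0,
    "water": 2.0,
    "protein": 10.0,
}

DNA_STOCK_UM = 2.0  # concentration of the 30-bp duplex DNA stock, micromolar (assumed)

def final_dna_conc_um() -> float:
    """Final duplex DNA concentration after dilution into the full reaction volume."""
    total_volume = sum(VOLUMES_UL.values())
    return DNA_STOCK_UM * VOLUMES_UL["duplex DNA"] / total_volume

def molar_excess(protein_um: float) -> float:
    """Protein-to-DNA molar ratio for a given final protein concentration."""
    return protein_um / final_dna_conc_um()

if __name__ == "__main__":
    print(final_dna_conc_um())   # final DNA concentration, micromolar
    print(molar_excess(350.0))   # fold excess at the highest protein concentration
```

With these assumptions the DNA is at about 0.4 micromolar in a 20-microlitre reaction, so 350 micromolar protein is roughly an 875-fold excess, in line with the approximately 900-fold figure in the text.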
The K_D values of the wild-type AdpA-DBD for the mutant DNAs were estimated to be 60, 80, and 80 µM for DNA mutants 1, 2, and 3, respectively, confirming the significant roles of the G2 and C4 nucleotides in the complex formation (Fig. 4, B-D). These results are consistent with the previous results of EMSA using the full-length AdpA and a longer duplex DNA containing an inverted repeat of two consensus sequences (15), indicating that the EMSA results obtained in this study represent the DNA binding property of AdpA-DBD. The high mobility shifts with the excess amount of protein were also observed for these mutant DNAs, suggesting that nonspecific interactions became dominant under such conditions.
The K_D values of the R262A and R266A mutants of AdpA-DBD for the wild-type DNA were estimated as 90 and 60 µM, respectively, confirming the significant roles of the Arg-262 and Arg-266 side chains in the consensus sequence recognition (Fig. 4, E and F). The smeared bands observed with these mutant proteins could result from the formation of heterogeneous protein-DNA complexes because of their decreased DNA sequence preferences. Thus, the EMSA results are consistent with the crystal structure of the AdpA-DBD-DNA complex, showing that Arg-262 and Arg-266 are important for the DNA sequence preference of AdpA-DBD.
(Fig. 5). One of these AdpA-binding sequences, adsA (−) in the regulatory region of the adsA gene, has been used for the quantitative analysis of DNA binding specificity in detail (15). The crystal structure of AdpA-DBD-DNA in this paper suggested a reasonable nucleotide sequence recognition model of AdpA-DBD for positions 1, 2, 3, 4, and 5 of the consensus sequence as described below, which shows good consistency with these previous results, although these results were obtained using a dimeric full-length AdpA. Our recognition model is most applicable when one molecule of AdpA-DBD binds to one consensus sequence of a duplex DNA. When full-length AdpA, which forms and works as a dimer, binds to DNA in the type I binding manner, a stable AdpA-DNA complex can form as long as the total binding affinity of the two protein-DNA interactions is sufficient, even if one of the two interactions is too weak on its own to support a complex between a monomeric AdpA-DBD and a consensus sequence. Thus, the nucleotide sequence preference of full-length AdpA would be less strict than that of a single AdpA-DBD molecule.
The nucleotide preference at position 1 of the consensus AdpA-binding sequence was predicted as T > C > A/G based on our crystal structure (Fig. 6A). If the T1 nucleotide base were replaced with cytosine, the six-membered pyrimidine ring would form a weak van der Waals interaction with the guanidinium group of Arg-266. In contrast, if the T1 nucleotide base were replaced with a purine base, the smaller five-membered ring could not form such an interaction with Arg-266 because the interatomic distance would be too long. These predictions agree well with the previous result that the nucleotide base preference at position 1 is T > C > A > G (15). Of 45 confirmed AdpA-binding sequences, 39 possess T or C at position 1, indicating that our sequence recognition model is reasonable.
Position 2 of the consensus sequence is predicted to be exclusively guanine based on our crystal structure, consistent with previous data (Fig. 6B) (15). If the G2 nucleotide were replaced with adenine or cytosine, the amino group would repulse the guanidinium group of Arg-266. If the G2 nucleotide were replaced with thymine, the C7 atom of the methyl group would cause steric hindrance with the guanidinium group of Arg-266. Thus, G is exclusively preferred at position 2, and A/T/C at this position would destabilize the protein-DNA complex. Of 45 confirmed AdpA-binding sequences, 43 sequences with guanine at this position are consistent with our recognition model, and the other two sequences, ssgA site 1 (+) (5) and sprA site B (−) (15), possess thymine at this position. Each of these two irregular sequences is one of the two AdpA-binding sites of a type I binding site, and the other AdpA-binding site in each case fulfills the requirement for AdpA binding. Therefore, these two irregular sequences appear to be acceptable. Previously, Akanuma et al. (28) suggested an AdpA-binding 10-bp sequence, TTGCGACAAC, in which the second nucleotide was T, not G, in the upstream region of SGR2648. This sequence was selected as the AdpA-binding sequence from a 40-bp sequence that was bound by full-length AdpA with an apparent K_D of 140 nM. Taking our nucleotide recognition model into consideration, however, this binding sequence should be revised. We propose another candidate AdpA-binding sequence, GGGCCGTTTT, in the 40-bp sequence (8 bp upstream of the previous one). This sequence perfectly fits our nucleotide recognition model. This is a good example of the advantage of obtaining structural information.
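The comparison of the two candidate sequences for the SGR2648 upstream region can be made explicit by checking the two strictly recognized positions, G at position 2 and C at position 4. The sketch below is a simplified illustration; it encodes only these two requirements, ignores the graded preferences at the other positions, and uses a hypothetical function name.

```python
from typing import List

# Positions described in the text as strictly recognized (1-based) and the required base.
STRICT = {2: "G", 4: "C"}

def violated_strict_positions(site: str) -> List[int]:
    """Return the 1-based positions at which a 10-mer breaks the strict requirements."""
    return [pos for pos, base in STRICT.items() if site[pos - 1].upper() != base]

if __name__ == "__main__":
    for name, site in [("previously suggested", "TTGCGACAAC"),
                       ("proposed here", "GGGCCGTTTT")]:
        print(name, site, violated_strict_positions(site))
```

The previously suggested sequence fails at position 2 only, whereas the sequence proposed in the text satisfies both strict positions.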

Our crystal structure suggested that the nucleotide preference at position 3 is G > A/T > C, although its structural basis is rather complicated (Fig. 6C). On the sense strand, the base moiety of the G3 nucleotide is located near the Cβ and Cγ atoms of Arg-262, suggesting that replacement of the G3 nucleotide base with a pyrimidine base causes steric hindrance between the carbon atoms of Arg-262 and the atoms of the pyrimidine base exposed to the major groove (N4, O4, and C7 atoms). On this basis, cytosine and thymine are not preferred at position 3. On the antisense strand, the amino group of the C8′ nucleotide, the complementary base of position 3, is located near the guanidinium groups of Arg-262 and Arg-266, suggesting that the amino group repulses these positively charged guanidinium groups into their proper positions. On this basis, guanine and thymine are preferred at position 3. In summary, the nucleotide recognition model at position 3 is consistent with the previous results, G > A > T > C (15). Of 45 confirmed AdpA-binding sequences, 36, 3, and 6 sequences possess G, A, and T at this position, respectively, and no sequence possesses C at this position, suggesting that our recognition model is reasonable.
FIGURE 4. EMSA of AdpA-DBD with the duplex DNAs listed in Table 1. The asterisk in each panel indicates the estimated protein concentration that bound 50% of the duplex DNA. A, EMSA using the wild-type AdpA-DBD and the wild-type DNA. B-D, EMSA using the wild-type AdpA-DBD and the mutant DNA. E and F, EMSA using the mutant AdpA-DBD and the wild-type DNA.
Position 4 is predicted to be exclusively cytosine based on our crystal structure, consistent with previous data (Fig. 6D) (15). If the C4 nucleotide base were replaced with guanine or thymine, the amino group of the complementary base would repulse the guanidinium group of Arg-262. If the C4 nucleotide base were replaced with adenine, the C7 atom of the methyl group of the complementary base would cause steric hindrance with the guanidinium group of Arg-262. Thus, C is exclusively preferred at position 4, and A/T/G at this position would destabilize the protein-DNA complex. All 45 confirmed AdpA-binding sequences conserve cytosine at this position, indicating that our recognition model is reasonable.
The nucleotide preference at position 5 is predicted as G/C/T > A based on our crystal structure (Fig. 6E). The C6′ nucleotide base, the complementary base of position 5, is located near the guanidinium group of Arg-262, suggesting that replacement of the C6′ nucleotide base with thymine causes steric hindrance between the guanidinium group and the methyl group of thymine. Position 5 of each of the 45 confirmed AdpA-binding sequences shows good consistency with our recognition model. The only exception is the adpA site 2B (+), which possesses adenine at this position. This irregular sequence is one of the three AdpA-binding sequences in an unusually long binding site (site 2) in the operator region of adpA, and therefore it appears to be acceptable. In summary, the nucleotide sequence recognition model derived from our crystal structure of the AdpA-DBD-DNA complex is reasonable for all the confirmed AdpA-binding sites with no exception.

Comparisons of AdpA with Other Transcriptional Regulatory Proteins of the AraC/XylS Family with a Highly Degenerate
The MarA-DNA complex (21) is compared with the AdpA-DBD-DNA complex as discussed below. However, the crystal structure of the Rob-DNA complex (20) is not considered in this paper, because it contains, to some extent, artifacts derived from crystal packing, as previously mentioned (31).
MarA is a transcriptional activator of ~40 genes in E. coli, and its consensus binding sequence is a highly degenerate 20-bp sequence, 5′-AYNGCACNNWNNRYYAAACN-3′ (where Y is T or C, W is A or T, R is A or G, and N is any nucleotide) (31,32). AdpA, on the other hand, directly regulates more than 500 genes in S. griseus, and the consensus sequence of AdpA-binding sites is also a highly degenerate 10-bp sequence, 5′-TGGCSNGWWY-3′ (where S is G or C, W is A or T, Y is T or C, and N is any nucleotide). Both AdpA and MarA are regarded as global transcription factors because of the large number of their target genes. This suggests that their low nucleotide sequence specificities are the key to their character as global transcription factors. To establish the relationship between the sequence specificity and the number of genes regulated, we have compared the nucleotide sequence recognition manners of these two global transcription factors.
The nucleotide base recognition manners of AdpA-DBD and MarA are summarized in Fig. 7. Probable polar and van der Waals interactions in the MarA-DNA complex are identified based on the rules previously applied (21): hydrogen bonds and van der Waals interactions are assigned for interatomic distances within 3.5 and 4.0 Å, respectively. AdpA-DBD forms two pairs of hydrogen bonds: between Arg-266 and the second nucleotide base and between Arg-262 and the complementary base of the fourth nucleotide. These four hydrogen bonds strictly recognize the G2 and C4 bases. MarA forms five hydrogen bonds with nucleotide bases, resulting in the strict recognition of five bases of base pairs 4, 5, 13, 14, and 17 in the consensus sequence. As for nonpolar interactions, which are expected to recognize nucleotide bases more ambiguously than hydrogen bonds, AdpA forms van der Waals interactions with base pairs 1, 2, and 5, whereas MarA forms van der Waals interactions with base pairs 5, 6, 12, 16, and 17. In summary, four base pairs are recognized by two pairs of hydrogen bonds and three van der Waals interactions in the AdpA-DBD-DNA complex, whereas eight base pairs are recognized by five hydrogen bonds and eight van der Waals interactions in the MarA-DNA complex (Table 3). Thus, the nucleotide sequence specificity of MarA is higher than that of AdpA-DBD, consistent with the fact that AdpA can regulate many more genes than MarA (Table 3).
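The counts of recognized base pairs in this comparison are set unions of the positions contacted by hydrogen bonds and by van der Waals interactions. The sketch below tallies them; the MarA positions are those listed in the paragraph above, and the AdpA van der Waals positions (1, 2, and 5) are taken from the contacts described in Results for T1, G2, and C6′, the partner of position 5. The variable and function names are hypothetical.

```python
# Base pairs contacted in each complex, by interaction type (1-based consensus positions).
ADPA = {
    "hydrogen_bonded": {2, 4},
    "van_der_waals": {1, 2, 5},  # T1 methyl, G2 C8 atom, and C6' (partner of position 5)
}
MARA = {
    "hydrogen_bonded": {4, 5, 13, 14, 17},
    "van_der_waals": {5, 6, 12, 16, 17},
}

def recognized_base_pairs(contacts: dict) -> set:
    """Base pairs touched by at least one hydrogen bond or van der Waals contact."""
    return contacts["hydrogen_bonded"] | contacts["van_der_waals"]

if __name__ == "__main__":
    for name, contacts in [("AdpA-DBD", ADPA), ("MarA", MARA)]:
        pairs = sorted(recognized_base_pairs(contacts))
        print(name, len(pairs), pairs)
```

The union gives four base pairs for AdpA-DBD and eight for MarA, the totals quoted in the summary sentence.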
The mode of DNA binding is another difference between AdpA and MarA. Although MarA exists as a monomer and binds to DNA as a monomer (21), AdpA forms a dimer and binds to DNA in two different ways (14): a bivalent binding using both of the two DNA-binding domains (type I binding manner) and a monovalent binding using one of the two DNA-binding domains (type II binding manner). Furthermore, with the type I binding manner, the linker connecting the inverted repeat of two consensus sequences varies in length. These characteristics, together with the low sequence specificity of the DNA-binding domain, provide AdpA with a variety of nucleotide sequence recognition manners and probably, as a result, enable AdpA to regulate more than 500 genes directly.

Contribution of WW at the Eighth and Ninth Positions to
Generally, the nucleotide steps AA, TT, and AT are known to show a negative roll angle and a narrow minor groove (33-35). As a result, the helical axis of the duplex DNA is bent, and the narrowed minor groove shows condensed negative charges (35). In the crystal structure of the AdpA-DBD-DNA complex, the minor groove formed by nucleotides 10 and 11 in the sense strand and nucleotides 4 and 5 in the antisense strand of the consensus sequence is narrower (2.9 Å) than that of ideal B-DNA (5.9 Å), probably because of the two thymine nucleotides at positions 8 and 9 of the duplex DNA (Fig. 8, A and B). The helical axis of the duplex DNA bends moderately in the latter half of the consensus sequence (Fig. 1C), presumably because of the TT at positions 8 and 9. In addition, the narrowed minor groove is recognized by the positively charged surface between the two HTH motifs of AdpA-DBD (Fig. 8C). The backbone phosphates are recognized mainly around the narrowed minor groove (Fig. 3B). These structural features suggest that AdpA-DBD performs shape readout for the TT at positions 8 and 9.
The consensus sequence of MarA-binding sites contains a 7-bp A/T-rich spacer between two recognition elements (nucleotides 4-7 and 15-18 in the 20-bp consensus sequence) (31,36). The bending of the duplex DNA in the MarA-DNA complex was possibly induced by the A/T-rich spacer. The A/T-rich spacer of MarA-binding sites and the WW at positions 8 and 9 of AdpA-binding sites may be a common feature of these two binding sites.
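The dinucleotide steps associated with a narrowed minor groove (AA, TT, and AT) can be located in a target sequence with a simple scan. The sketch below is only an illustration of that idea, with a hypothetical function name; it reports where such steps occur and makes no attempt to predict groove width.

```python
# Dinucleotide steps described in the text as favoring a narrow minor groove.
NARROWING_STEPS = {"AA", "TT", "AT"}

def narrowing_steps(seq: str, first_position: int = 1) -> list:
    """List (position, step) for each AA, TT, or AT step; position is that of the first base."""
    seq = seq.upper()
    return [(i + first_position, seq[i:i + 2])
            for i in range(len(seq) - 1)
            if seq[i:i + 2] in NARROWING_STEPS]

if __name__ == "__main__":
    # Consensus-containing sequence of the crystallized duplex
    print(narrowing_steps("TGGCGGGTTC"))
```

For the crystallized sequence, the only such step is the TT starting at position 8, which is the WW element of the consensus.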
In this paper, we have shown that Arg-262 and Arg-266, residues from the HTH1 motif, are essential for the recognition of the consensus sequence, especially for the recognition of the G2 and C4 nucleotides. AdpA would directly recognize nucleotide bases of the consensus sequence only through the HTH1 motif. However, we could not obtain enough structural information on probable interactions between AdpA-DBD and the nucleotides outside the consensus sequence on the 3′ side, in contrast to the MarA-DNA complex, because the duplex DNA used in our crystallization was shorter than that of the MarA-DNA complex. The possibility and biological significance of such interactions, which might regulate the binding affinity of AdpA-DBD, remain to be elucidated.