Domain Structure of Virulence-associated Response Regulator PhoP of Mycobacterium tuberculosis

The PhoP and PhoR proteins from Mycobacterium tuberculosis form a highly specific two-component system that controls expression of genes involved in complex lipid biosynthesis and regulation of unknown virulence determinants. The several functions of PhoP are apportioned between a C-terminal effector domain (PhoPC) and an N-terminal receiver domain (PhoPN), phosphorylation of which regulates activation of the effector domain. Here we show that PhoPN, on its own, demonstrates PhoR-dependent phosphorylation. PhoPC, the truncated variant bearing the DNA binding domain, binds in vitro to the target site with affinity similar to that of the full-length protein. To complement the finding that residues spanning Met1 to Arg138 of PhoP constitute the minimal functional PhoPN, we identified Arg150 as the first residue of the distal PhoPC domain capable of DNA binding on its own, thereby identifying an interdomain linker. However, coupling of two functional domains together in a single polypeptide chain is essential for phosphorylation-coupled DNA binding by PhoP. We discuss consequences of tethering of two domains on DNA binding and demonstrate that linker length and not individual residues of the newly identified linker plays a critical role in regulating interdomain interactions. Together, these results have implications for the molecular mechanism of transmission of conformation change associated with phosphorylation of PhoP that results in the altered DNA recognition by the C-terminal domain.

Bacterial adaptation usually takes the form of transcriptional regulation of genes whose products specifically help the bacterium cope with a given microenvironment. One of the systems bacteria most frequently use to sense chemical or physical changes in the environment is the two-component regulatory system. Such a system typically consists of a histidine protein kinase that functions as an environmental sensor and a response regulator that mediates the transcriptional processes (1,2). The sensor kinase is usually autophosphorylated in response to environmental signals and subsequently transfers the phosphate to the regulator. In general, response regulators function as phosphorylation-activated switches that often mediate a cellular response through transcriptional regulation.
A number of investigations show that inactivation of phoP of the Mycobacterium tuberculosis phoP-phoR system leads to significant growth attenuation (3-5). Also, biochemical results reveal that PhoP regulates biosynthesis of sulfolipids (SL), diacyltrehaloses (DAT), and polyacyltrehaloses (PAT), and the absence of these complex lipid molecules in the phoP mutant is a major reason for its attenuated growth in a mouse model (4,5; for a review, see Ref. 6). While two independent studies show that a point mutation in PhoP contributes to avirulence of M. tuberculosis H37Ra (7) and also accounts for the absence of polyketide-derived acyltrehaloses in M. tuberculosis H37Ra (8), more recently PhoP has been implicated in ESAT-6 secretion and specific T-cell recognition during virulence regulation of the bacilli (9). Thus, growing evidence strongly suggests that PhoP is a key regulator in M. tuberculosis. However, the molecular mechanism by which the regulator functions remains largely unknown.
M. tuberculosis PhoP belongs to the OmpR/PhoB subfamily, the largest subfamily of response regulators, whose members comprise two domains: an N-terminal regulatory domain (also called a receiver domain) and a C-terminal transactivation domain (also called an effector domain) that binds specific DNA sequences within the target promoters and interacts with the cellular transcription machinery. The N-terminal domain shares a conserved, doubly wound (α/β)5 fold, which includes a highly conserved aspartate residue that is the site of phosphorylation (Asp71 in PhoP) (10). The C-terminal domain has been characterized and its crystal structure has been determined (Protein Data Bank code 2PMU) (11). The structural analysis reveals overall folds similar to those of other OmpR family proteins with a winged helix-turn-helix DNA binding motif involved in DNA binding (12-16). Despite strong structural homology within the family members, there exist significant differences in the mechanism of DNA binding functions. For example, although C-terminal structures of OmpR and PhoB are readily superimposed (17), phosphorylation of OmpR enhances DNA binding affinity for specific sites (18), whereas phosphorylation of PhoB results in a relief of inhibition of DNA binding (19).
Moreover, while the C terminus of OmpR binds to DNA only weakly (20), the C terminus of PhoB has a higher affinity for specific DNA than the unphosphorylated full-length protein (21). Interestingly, biochemical studies reveal that one of the major differences between OmpR and PhoB resides in the interdomain linker region that tethers the N-terminal domain to the C-terminal domain (22). Consistent with this view, a previous study had shown that C-terminal DNA binding by OmpR could influence phosphorylation of the N terminus, with the linker region undergoing a conformational change (23), thus suggesting a key role of the linker region in regulation of interdomain interaction(s).
We are interested in the molecular mechanism responsible for transmission of the conformational change associated with phosphorylation of PhoP that is expected to influence DNA binding by the C-terminal domain. To better understand the functions of the N-terminal domain and interdomain interaction(s) in effector domain regulation, we sought to investigate the domain structure of PhoP. Results reported here identify an 11-residue-long interdomain linker that tethers two functionally independent domains of PhoP together and regulates interdomain interactions. While the newly identified linker region is not required for the function of either domain of PhoP, most strikingly, it plays an essential role in phosphorylation-dependent DNA binding to the msl3 promoter, previously suggested to be regulated by PhoP (4). These observations invite speculation on additional level(s) of complexity in the regulation of PhoP-controlled transcription processes. Together, our results suggest that although the DNA binding energy and specificity of regulator-promoter interactions are contributed primarily (but not entirely) by the C-domain, the linker region of the protein likely allows the regulator to adopt a different phosphorylation-dependent conformation, enabling it to discriminate target promoters while it regulates a vast array of genes to either activate or repress transcription.

EXPERIMENTAL PROCEDURES
Recombinant DNA Techniques-All enzymatic manipulations of DNA were performed using standard procedures (24) with reagents purchased from New England Biolabs (restriction endonucleases, T4 DNA ligase, alkaline phosphatase, and Deep Vent DNA polymerase). Plasmid DNA isolation, recovery, and purification of DNA fragments or PCR products from agarose gels were carried out using Qiagen spin columns and procedures (Qiagen). PAGE-purified oligonucleotides were synthesized by Sigma.
Preparation, Amplification, and Cloning of DNA and Site-directed Mutagenesis-Escherichia coli DH5α was used for all cloning procedures. The gene encoding M. tuberculosis H37Rv PhoP was amplified by PCR and cloned into expression vector pET15b (Novagen) as described previously (25). For overproduction and purification of PhoPN and PhoPC domains in E. coli, parts of the phoP gene extending from Met1 to Lys141 of the N-terminal end and from Lys141 to Arg247 of the C-terminal end of PhoP were PCR-amplified using oligonucleotide pairs phoPstart and RPphoPN141 or FPphoPC141 and phoPstop as primers (supplemental Table S1) and the wild-type phoP gene (pET-phoP) as template and introduced between an NdeI and a BamHI site of pET15b (Novagen). The cloning strategy resulted in truncated constructs of PhoP with natural C termini and an N-terminal His tag.
A DNA fragment comprising regions upstream of msl3 was PCR-amplified from M. tuberculosis genomic DNA using ProofStart DNA polymerase (Qiagen) and oligonucleotide pair FPmsl3up/RPmsl3up (supplemental Table S1). Site-directed mutagenesis of individual PhoP residues (described under "Results") was carried out in the phoP gene (pET-phoP) by a PCR-based two-stage overlap extension method (26) using complementary oligonucleotides with the mutated codon (supplemental Table S2) and Deep Vent DNA polymerase. All mutants were verified by DNA sequence analysis to confirm the presence of the desired mutation and absence of any unintentional mutations.
Protein Purification-The mutant proteins or the truncated variants of PhoP were overexpressed in E. coli BL21(DE3) cells and purified by immobilized metal affinity chromatography (nickel-nitrilotriacetic acid, Qiagen) as described previously (10,12,25). The molecular masses determined by mass spectrometry were in agreement with the values predicted from primary sequence analysis. When appropriate, His tag was cleaved using the Thrombin Clean Cleavage kit (Sigma). The purity of all of the proteins used here was greater than 95% as judged by Coomassie Blue staining of overloaded SDS-polyacrylamide gels. Protein concentrations were estimated by Bradford reagent using bovine serum albumin as a calibration standard.
Trypsin Digestion of PhoP and Identification of Tryptic Fragments-Reaction mixtures (20 μl) containing 4 μg of PhoP were incubated in the digestion mixture containing 50 mM Tris-HCl, pH 7.90, 100 mM NaCl, 3 mM magnesium acetate, and 40 ng of trypsin at 37°C for 15 min, keeping the ratio of PhoP to trypsin at 100:1 (w/w). Proteolysis was terminated by the addition of SDS, and aliquots from the digestion mixtures were resolved by 12% SDS-polyacrylamide gel electrophoresis and visualized by Coomassie Blue staining. To determine the N-terminal sequence of the tryptic fragments, resolved polypeptides from an SDS-polyacrylamide gel were transferred electrophoretically onto a polyvinylidene difluoride membrane (Millipore) and visualized by Amido Black (Sigma) staining. Membrane slices containing individual proteolytic products were excised and subjected to automated Edman sequencing using an Applied Biosystems protein sequencer.
PhoP Phosphorylation-Phosphorylation of PhoRC (C-terminal end of M. tuberculosis PhoR comprising residues Thr193-Pro485; see Ref. 10) was carried out in phosphorylation buffer (50 mM HEPES, pH 7.5, 50 mM KCl, 10 mM MnCl2) containing 25 μM [γ-32P]ATP (BRIT, Hyderabad, India) at 25°C for 1 h. For phosphotransfer assays, PhoP or its truncated variants were added in phosphorylation buffer at 18°C for the indicated times. The reactions were terminated by the addition of 10 mM EDTA, and the products were resolved on 12% SDS-polyacrylamide gels. After electrophoresis, the gels were dried and visualized by autoradiography. When required, acetyl phosphate (AcP of ≥85% purity; Sigma) was used to phosphorylate recombinant PhoP in vitro. Previously, we provided a direct demonstration of phosphorylation of PhoP using AcP as the phospho-donor (27). Briefly, purified PhoP was added to phosphorylation buffer supplemented with 50 mM AcP and 10 mM MgCl2, and the mixtures were incubated at 37°C for 60 min.
CD Spectroscopy-Circular dichroism measurements were carried out with a JASCO spectropolarimeter, model J-810 (Jasco, Tokyo, Japan). All measurements were performed using a 0.1-cm cell. Residue molar ellipticity $[\theta]$ was defined as $[\theta] = 100\,\theta_{\mathrm{obs}}\,(lc)^{-1}$, where $\theta_{\mathrm{obs}}$ was the observed ellipticity, $l$ was the length of the light path in cm, and $c$ was the residue molar concentration of each protein. Scan speed was 100 nm/min, and 10 scans were signal-averaged to increase the signal/noise ratio.
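The ellipticity conversion defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the observed ellipticity is assumed to be in degrees and the residue concentration in mol/liter (which would give units of deg cm2 dmol-1), and the example reading is invented; only the 0.1-cm path length comes from the text.

```python
def residue_concentration(protein_molar_conc, n_residues):
    """Residue molar concentration = protein molar concentration x residue count."""
    return protein_molar_conc * n_residues


def residue_molar_ellipticity(theta_obs_deg, path_cm, residue_molar_conc):
    """Return [theta] = 100 * theta_obs / (l * c).

    theta_obs_deg      -- observed ellipticity (assumed to be in degrees)
    path_cm            -- optical path length in cm
    residue_molar_conc -- residue molar concentration (assumed mol/liter)
    """
    if path_cm <= 0 or residue_molar_conc <= 0:
        raise ValueError("path length and concentration must be positive")
    return 100.0 * theta_obs_deg / (path_cm * residue_molar_conc)


# Hypothetical reading: -0.010 degrees in the 0.1-cm cell, 1 mM residues.
example = residue_molar_ellipticity(-0.010, 0.1, 1.0e-3)
```

Averaging the 10 scans before applying the conversion, as described in the text, would only change the value passed as `theta_obs_deg`.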
Oligodeoxyribonucleotide Probes-Concentrations of PAGE-purified oligonucleotides (Sigma) were determined from absorbance at 260 nm, using calculated extinction coefficients. The 60-bp DR1,2 DNA fragment comprising a 9-bp direct repeat motif was used to assess sequence-specific DNA binding by PhoP (or its variants) as described previously (12,25). This sequence of the phoP upstream region is within the PhoP-protected DNase I footprint (10,28). A nonspecific probe was derived from a distal region of the phoP promoter and lacked any PhoP binding site (sequence as described in Ref. 25). The target binding site(s) in each set of oligonucleotides were flanked by matching nonspecific sequences, chosen to avoid fortuitous binding. 5′-32P labeling of oligonucleotide probes was carried out using T4 polynucleotide kinase and [γ-32P]ATP. Unincorporated nucleotides and labeled oligonucleotide were separated using a Sephadex-G50 quick spin column (GE Healthcare). DNA probes for EMSA were generated by annealing the labeled strand to its unlabeled complement in 10 mM Tris, pH 8.0, containing 50 mM NaCl by heating to 95°C for 5 min followed by slow cooling to room temperature. Annealing efficiency was verified by native PAGE.
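Determining oligonucleotide concentration from absorbance at 260 nm amounts to applying the Beer-Lambert law. The sketch below shows that arithmetic only; the function name, the 1-cm default path length, the dilution factor, and the example extinction coefficient are assumptions for illustration and are not values reported here.

```python
def oligo_concentration_molar(a260, extinction_coeff, path_cm=1.0, dilution_factor=1.0):
    """Beer-Lambert estimate of oligonucleotide concentration (mol/liter).

    a260             -- measured absorbance at 260 nm
    extinction_coeff -- calculated molar extinction coefficient (M^-1 cm^-1)
    path_cm          -- cuvette path length in cm (assumed 1 cm by default)
    dilution_factor  -- fold dilution applied before the measurement
    """
    if extinction_coeff <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return a260 * dilution_factor / (extinction_coeff * path_cm)


# Hypothetical 60-mer with a calculated coefficient of 6.0e5 M^-1 cm^-1.
stock = oligo_concentration_molar(a260=0.6, extinction_coeff=6.0e5)
```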

RESULTS
Phosphorylation-coupled DNA Binding by PhoP-Although PhoP regulates 114 genes, until now it has only been reported to bind to its own promoter (10,25,28). Of the several PhoP-regulated genes associated with lipid metabolism, one cluster of genes includes msl3, a polyketide β-ketoacyl synthase that is involved in the synthesis of polyacyltrehaloses (4). The finding that a PhoP knock-out mutant of M. tuberculosis H37Rv lacks diacyltrehaloses and polyacyltrehaloses in the cell envelope also provides direct evidence in support of the role of PhoP in regulating expression of msl3 (4,5). To determine if PhoP plays a regulatory role by direct binding to the target promoter region, the ability of PhoP to bind to the DNA sequence upstream of msl3 (msl3up1 comprising -350 to +60 with respect to the GTG start site) was investigated by EMSA. Purified PhoP was unable to form a slower moving complex stable to gel electrophoresis (Fig. 1A). To investigate if phosphorylation of PhoP was required for in vitro binding to the msl3 upstream region, binding experiments were carried out with PhoP preincubated in phosphorylation mix with AcP as a phospho-donor. Strikingly, PhoP preincubated in phosphorylation mix containing AcP showed efficient DNA binding leading to formation of a stable complex with reduced electrophoretic mobility (compare lanes 3 and 4 with lanes 5 and 6). A quantitative analysis suggested that under identical binding conditions, at least 10-fold stimulation in DNA binding by phosphorylated PhoP was observed with the msl3 promoter DNA compared with the unphosphorylated protein (based on the limits of detection in this assay and based on other gels (data not shown)). As a control experiment, the PhoPD71N mutant (with impaired phosphorylation) preincubated in phosphorylation mix with or without AcP failed to generate a detectable PhoP-DNA complex with msl3 promoter DNA (lanes 7-10).
From these results, we conclude that a specific interaction(s) between PhoP and the msl3 upstream region is dependent on PhoP phosphorylation. It is also noteworthy that phosphorylated PhoP showed effective msl3 promoter DNA binding at ~5-fold lower protein concentration. However, these results are in striking contrast with phosphorylation-independent DNA binding of PhoP to the phoP promoter region, as reported previously (10,12). In agreement with these reports, Fig. 1B shows that wild-type PhoP and PhoPD71N are comparably effective in their relative abilities to bind radiolabeled DR1,2 DNA (phoP promoter-derived oligonucleotide-based DNA substrate) (compare lanes 3-6 and 8-11). Thus, our results suggest for the first time that PhoP retains the ability to utilize its phosphorylation domain in a promoter-specific manner.
Domain Functions of PhoP-To determine the role of the N-domain and/or its phosphorylation in regulating PhoP function, we next investigated the domain structure of the protein. We previously studied histidine-aspartate phosphorelay between M. tuberculosis PhoR and PhoP proteins and showed that Asp71 is the phospho-acceptor group of PhoP (10). A BLAST search of PhoP within the Protein Data Bank shows the highest sequence identity of 45% with M. tuberculosis PrrA (16) (Protein Data Bank code 1YS6) and the second highest identity of 33% with an OmpR/PhoB homologue (14) (Protein Data Bank code 1KGS) from Thermotoga maritima. Comparison of sequence analysis of PhoP with its family members along with the available structural data of PhoPC (11) allowed us to design truncated versions of two domains based on the likely domain boundary (Fig. 2A). To this end, PhoPN141 and PhoPC141 spanning residues Met1-Lys141 and Lys141-Arg247, respectively, were cloned, expressed, and purified to homogeneity as described under "Experimental Procedures" (Fig. 2B). To assess the secondary structural content, we measured far-UV CD spectra of the purified domains. Both proteins displayed a double trough shape with negative maxima at 209 and 221 nm, characteristic of α-helical structure (supplemental Fig. S1). Thus, our results indicate that the recombinant proteins are folded and retain stability even when expressed as individual domains.
To determine the functionality of PhoPN141, we next compared its phosphorylation efficiency with full-length PhoP using radiolabeled PhoRC as the phospho-donor (10). Transphosphorylation experiments under the conditions examined and at comparable protein concentration clearly show that PhoPN141 on its own is just as efficient as the full-length protein in its ability to accept radiolabeled phosphate from labeled PhoRC (Fig. 2C, compare lanes 2-4 and 5-7).
For certain OmpR homologues, the purified winged helix domain is sufficient for binding to the target DNA (13,29). To compare the functionality of PhoPC141 and PhoP, we tested the ability of the homogeneous protein samples to bind radiolabeled DR1,2 DNA (phoP promoter-derived oligonucleotide-based DNA substrate) consisting of a direct repeat motif (10,12,25,28). Purified PhoPC141 was found to form a single slower moving complex stable to gel electrophoresis (Fig. 2D, lane 2). To test the specificity of interaction, we next examined PhoPC141 binding to DR1,2 in the presence of different concentrations of unlabeled DR1,2 as specific competitor (lanes 3-6) or unlabeled nonspecific competitor (lanes 7-10; sequence as described previously (25)), which lacks a consensus PhoP binding site but has comparable base composition. Although a reduction of more than 64 ± 1.8% in PhoPC141 binding was observed with an ~50-fold excess of specific competitor (compare lane 2 and lane 6), an insignificant variation of DNA binding (5 ± 1%) in the presence of a 50-fold molar excess of nonspecific competitor (compare lane 2 with lane 10) suggests that PhoPC141 binding to the DR1,2 site is sequence-specific. However, PhoPC141, as expected, was unable to form a complex of reduced electrophoretic mobility with msl3 promoter DNA (Fig. 2E, lanes 2 and 3). In all cases, protein-DNA complexes were visualized by autoradiography and quantified by scanning the gels on a PhosphorImager. Open and filled arrowheads indicate origins of the polyacrylamide gel and retarded complexes, respectively.
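The competition figures quoted above are percent changes in bound probe relative to the no-competitor lane, reported as mean ± deviation over replicate gels. A minimal sketch of that calculation follows; the function names and the replicate intensities are hypothetical and serve only to illustrate the arithmetic, not to reproduce the PhosphorImager data.

```python
from statistics import mean, stdev


def percent_reduction(bound_no_competitor, bound_with_competitor):
    """Percent decrease in bound-probe signal caused by a competitor."""
    if bound_no_competitor <= 0:
        raise ValueError("reference signal must be positive")
    return 100.0 * (1.0 - bound_with_competitor / bound_no_competitor)


def summarize(replicates):
    """Mean and sample standard deviation over (reference, competed) pairs."""
    values = [percent_reduction(ref, comp) for ref, comp in replicates]
    return mean(values), stdev(values)


# Hypothetical band intensities from three gels (arbitrary units).
specific = [(1000.0, 360.0), (1000.0, 340.0), (1000.0, 380.0)]
avg, sd = summarize(specific)
```

A large reduction with the specific competitor alongside a near-zero change with the nonspecific one is the pattern the text interprets as sequence-specific binding.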
Identification of an Interdomain Linker-Having defined the minimal phosphorylation domain within the PhoP protein, we sought to investigate the proximal boundary of the C-terminal domain. Individual protein domains can often be identified by limited proteolysis under native conditions because the flexible unstructured interdomain linkers are considerably more susceptible to cleavage (30,31). To determine likely domain boundaries within PhoP, purified protein was digested with trypsin and analyzed by SDS-polyacrylamide gel electrophoresis. The results showed that trypsin yielded two specific fragments (I and II) of ~15.5 and 11.2 kDa, respectively (Fig. 4A, arrows). To probe the cleavage sites, both peptide fragments were transferred from the gel by electroblotting the samples onto polyvinylidene difluoride (Millipore) membrane and subjected to N-terminal sequencing. The results showed unambiguously that fragments I and II were generated by trypsin cleavage of the peptide bonds between Arg22 and Val23 and between Arg147 and Asn148 of the PhoP primary sequence, respectively. Although the former cleavage site maps within PhoPN, interestingly, the latter site lies within the region of the PhoPC domain that has been suggested to be a part of the PhoP linker region (11).
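Both cleavage sites reported above follow an arginine, in line with the general specificity of trypsin (cleavage after Lys or Arg, usually not when the next residue is Pro). The sketch below lists candidate sites under that textbook rule; the function names and the toy sequence are hypothetical, and the PhoP sequence itself is not reproduced here. Under limited proteolysis of a native protein, only exposed candidates such as flexible linkers are expected to be cut.

```python
def tryptic_cleavage_sites(sequence):
    """Return 1-based positions of residues after which trypsin may cut.

    Rule used: cut after K or R unless the following residue is P.
    """
    seq = sequence.upper()
    sites = []
    for i, residue in enumerate(seq[:-1]):
        if residue in "KR" and seq[i + 1] != "P":
            sites.append(i + 1)
    return sites


def fragments(sequence, sites):
    """Split a sequence at the given 1-based cleavage positions."""
    bounds = [0] + list(sites) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]


toy = "MKRPAVRG"  # hypothetical peptide, not a PhoP segment
toy_sites = tryptic_cleavage_sites(toy)
toy_fragments = fragments(toy, toy_sites)
```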
To precisely define the linker, we constructed a collection of plasmids encoding a nested series of C-terminal PhoP fragments starting at residues Lys144, Arg147, and Arg150 and extending up to the C-terminal end (Arg247) of full-length PhoP. These proteins, which are referred to as PhoPC144-PhoPC150, were overexpressed and purified (supplemental Fig. S2B). Interestingly, in EMSA experiments with 32P-labeled DR1,2 DNA, all of the purified PhoPC fragments (PhoPC144, PhoPC147, and PhoPC150) were found to be equally effective in forming slower moving complexes stable to gel electrophoresis (Fig. 4B). Over a range of protein concentrations examined, all three recombinant proteins (i) formed a single retarded complex in native polyacrylamide gel like that of PhoPC141 and (ii) displayed comparable DNA binding affinity based on the limits of detection in this assay. However, two additional C-domain constructs, PhoPC152 (residues 152-247) and PhoPC154 (residues 154-247), of PhoP failed to show any detectable expression, presumably due to the lack of in vivo stability (data not shown). Thus, the deletion analyses described above have established that the domain spanning residues Arg150-Arg247 of PhoP is directly involved in and essential for DNA binding. To determine whether this domain is sufficient on its own for specific binding to the target DNA sites, we assessed the specificity of binding by comparing competition by an unlabeled heterologous competitor with that by the unlabeled homologous competitor (Fig. 4C). Quantification of the competition experiments shows that although there was a greater than 60 ± 2% decrease in radioactivity of bound DNA with a 50-fold excess of specific competitor (compare lanes 2 and 6), there was a less than 4 ± 1% variation of radioactivity of bound DNA in the presence of an identical -fold excess of nonspecific competitor (compare lanes 2 and 10). Thus, this domain is both necessary and sufficient for specific DNA binding activity of PhoP.
NOVEMBER 5, 2010 • VOLUME 285 • NUMBER 45 JOURNAL OF BIOLOGICAL CHEMISTRY 34313
Based on the above experiments and in conjunction with previous results, we propose that residues Met1-Arg138 comprise the minimal, functional N-terminal domain of PhoP, and residues Arg150-Arg247 comprise the shortest C-terminal domain. Residues Ala139-Val149 do not have any significant effect on the function of either domain in vitro, suggesting that this segment simply tethers two independent domains of the PhoP protein together. Fig. 5A shows a schematic presentation of the revised domain structure with residues indicating domain boundaries of PhoP.
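The proposed boundaries can be checked for internal consistency with a short calculation: the three segments should tile residues 1-247 without gaps, and the linker should come out at 11 residues. The sketch below encodes the boundaries stated above; the dictionary layout and function names are illustrative choices.

```python
# Inclusive residue ranges taken from the boundaries proposed in the text.
PHOP_SEGMENTS = {
    "PhoPN (receiver domain)": (1, 138),
    "linker": (139, 149),
    "PhoPC (effector domain)": (150, 247),
}


def segment_length(start, end):
    """Number of residues in an inclusive range."""
    return end - start + 1


def is_contiguous(segments, total_length):
    """True if the ranges tile 1..total_length with no gap or overlap."""
    expected_start = 1
    for start, end in sorted(segments.values()):
        if start != expected_start:
            return False
        expected_start = end + 1
    return expected_start == total_length + 1


lengths = {name: segment_length(*rng) for name, rng in PHOP_SEGMENTS.items()}
```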
To examine the importance of the linker residues in effector domain regulation, we next performed alanine-scanning mutagenesis to obtain single alanine substitutions of 10 nonalanine residues of the linker, and their effects on DNA binding were evaluated. We introduced the mutations into the phoP gene residing in plasmid pET-phoP. In order to expedite purification and analysis of mutant PhoPs, these proteins were expressed like wild-type PhoP containing an N-terminal polyhistidine tag and purified by nickel affinity chromatography (supplemental Fig. S3) as described under "Experimental Procedures." His tags were cleaved using the Thrombin Clean Cleavage kit from Sigma. To examine the effect of Ala substitution, DNA binding of purified mutant proteins was investigated by EMSA using an end-labeled DR1,2 probe. All of the point mutants, under the conditions examined and at comparable protein concentrations, displayed wild-type PhoP-like DNA binding affinity by efficiently forming a single retarded complex stable to gel electrophoresis (Fig. 5B). Thus, we surmise that amino acid residues of the linker region do not appear to have any significant role in DR1,2 DNA binding affinity of M. tuberculosis PhoP.
Role of the Linker Region in Phosphorylation-coupled DNA Binding by PhoP-We next examined msl3 promoter DNA binding by linker mutants of PhoP. Strikingly, all of the linker mutants with a single alanine replacing a nonalanine linker residue of PhoP showed efficient phosphorylation-dependent DNA binding similar to the wild-type protein (Fig. 6). These results suggest that a single residue of the linker region does not appear to regulate phosphorylation-coupled DNA binding to the msl3 promoter. To further investigate the importance of the linker, we generated a number of linker deletion mutants with three, five, seven, and nine amino acid residues of the PhoP linker deleted, as shown schematically in Fig. 7A. The deletion mutants were overexpressed, purified to homogeneity (data not shown), and examined for DNA binding activity. Importantly, all of the linker deletion mutants, PhoPLΔ3-PhoPLΔ9, showed wild-type PhoP-like DNA binding ability when examined using end-labeled DR1,2 DNA (Fig. 7B). Although these observations indicate overall structural stability of linker deletion constructs of PhoP, consistent with our previous observations (Figs. 2 and 4), these results suggest that the affinity and specificity of DNA binding are primarily determined by the residues of the PhoP C-terminal domain.
We next investigated phosphorylation-coupled DNA binding of PhoP linker deletion mutants. Interestingly, PhoPLΔ3 showed phosphorylation-dependent msl3 promoter DNA binding similar to single alanine mutants of PhoP (Fig. 7C, lanes 10-13). In striking contrast, with identical labeled probe, PhoPLΔ5 failed to generate a stable DNA-protein complex with reduced electrophoretic mobility (Fig. 7C, lanes 14-17). It should be noted that alanine substitution of two residues, Gly142 and Pro146, which make up the difference between PhoPLΔ3 and PhoPLΔ5, did not influence phosphorylation-coupled DNA binding by PhoP (Fig. 7C, lanes 2-5 and lanes 6-9, respectively). From these results, we conclude that not a single residue of the linker but rather a constellation of linker residues is important for phosphorylation-coupled DNA binding by PhoP. One additional control experiment examined the effect of alanine substitution for all of the five residues spanning Gly142-Pro146 together in a single PhoP mutant (PhoPLAla5) on DNA binding by PhoP. Strikingly, the mutant PhoPLAla5 with unchanged linker length showed phosphorylation-dependent DNA binding to the msl3 promoter region as efficiently as the wild-type PhoP (Fig. 7D, lanes 3-6). Thus, our results suggest that linker length, and not the identity of the residues of the PhoP linker region, is essential for phosphorylation-dependent DNA binding by the regulator. Together, these results are summarized as a scheme showing two distinctly different modes of DNA binding by PhoP (Fig. 7E). According to this scheme, unphosphorylated PhoP binds to DNA independently of its linker; however, phosphorylated PhoP binding to a specific promoter is dependent on its linker length and not on the amino acid residues of the linker.
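The length dependence described above can be restated as simple arithmetic on the 11-residue linker. The sketch below assumes the threshold of at least eight remaining linker residues given in the legend to Fig. 7E; the names are hypothetical, and the predictions for the seven- and nine-residue deletions are extrapolations from that threshold rather than results described in the text.

```python
LINKER_LENGTH = 11        # residues Ala139-Val149
MIN_LINKER_RESIDUES = 8   # threshold taken from the Fig. 7E legend


def remaining_linker(deleted_residues):
    """Linker residues left after deleting the given number."""
    if not 0 <= deleted_residues <= LINKER_LENGTH:
        raise ValueError("deletion must be between 0 and the linker length")
    return LINKER_LENGTH - deleted_residues


def predicts_phospho_dependent_binding(deleted_residues):
    """True if the remaining linker meets the assumed length threshold."""
    return remaining_linker(deleted_residues) >= MIN_LINKER_RESIDUES


# Substitution mutants (single Ala, PhoPLAla5) delete nothing, so use 0.
predictions = {d: predicts_phospho_dependent_binding(d) for d in (0, 3, 5, 7, 9)}
```

Under this reading, the three-residue deletion leaves exactly eight residues and still binds, whereas the five-residue deletion leaves six and does not, which matches the two deletion mutants discussed for msl3 binding.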

DISCUSSION
M. tuberculosis PhoP regulates more than 110 genes, acting both as a transcriptional activator and repressor of the target promoters (4). However, biochemical studies on its mechanism of action have been lacking. Here, we determine the precise domain structure of the protein and show that interdomain interactions between two functionally independent domains tethered together by an 11-residue-long linker play a key role in regulating functionality of this important regulator. Although the newly identified linker does not appear to be critical for phosphorylation-independent DNA binding, we demonstrate the importance of the linker in phosphorylation-dependent DNA binding of the regulator at a specific promoter.
We have recently shown that PhoP appears to bind symmetrically on tandem direct repeats, suggesting that one of the two repeat units is being recognized backwards (i.e. one recognizes 5′ to 3′ and the other recognizes 3′ to 5′) (25). Efficient recognition of the downstream repeat motif by the second PhoP protomer in reverse orientation concomitant with the binding of the first PhoP molecule to the upstream site is likely to involve flexibility sufficient to allow changes in the relative orientation of PhoP domains. In this work, we identify a linker region extending from at least Ala139 to Val149. Because the residues of the linker region do not have any role in either of the domain functions (Fig. 2) and one of the trypsin cleavage sites maps within this region (Fig. 4A), we suppose that the linker region remains largely unstructured. An unstructured (and potentially long) linker between two domains might impart flexibility, allowing changes in the relative orientation of the two domains of PhoP.
FIGURE 7 (continued). C, EMSA of radiolabeled msl3 promoter region for binding of the indicated single alanine mutants as well as linker deletion mutants of PhoP, preincubated in phosphorylation mix with or without AcP. D, EMSA of radiolabeled msl3 promoter region for binding of PhoPLAla5 mutant, preincubated in phosphorylation mix with or without AcP. Each of the gels is representative of at least three independent experiments with at least two different preparations of protein stocks. Sample analysis and detection of protein-DNA complexes were as described in the legend to Fig. 1A. E, dual mode of DNA binding by unphosphorylated and phosphorylated PhoP, respectively, employing mechanisms that are either linker-independent or require the presence of at least eight residues of the PhoP linker (see "Results"). The N-terminal domain is represented as a cylinder, and the C-terminal DNA binding domain is represented as an ellipse for PhoP and a circle for phosphorylated PhoP (to suggest phosphorylation-coupled conformational change), respectively; the protein-protein interaction interface is shaded in gray.

Domain Structure and Interdomain Interactions in PhoP
The full-length forms of DrrD (14) and DrrB (15) from T. maritima, two members of the PhoP family, have been crystallized and shown to be composed of two domains: an N-terminal domain at which phosphorylation occurs and a C-terminal DNA binding domain. However, the role of the N-terminal domain and/or its phosphorylation in effector domain regulation remains unclear. The fact that affinity and specificity of DR1,2 DNA binding by PhoP are determined by the residues of the C-domain alone (Fig. 4; see also Ref. 12) is consistent with the behavior of the DNA binding domain of PhoB, the E. coli orthologue of M. tuberculosis PhoP, which shows higher affinity for DNA than the unphosphorylated full-length PhoB protein (21). However, this is in striking contrast with the DNA binding domain of Bacillus subtilis PhoP, which requires a more than 5-fold higher protein concentration for DNA binding to the target site compared with the unphosphorylated full-length PhoP (32). Thus, despite largely similar structures, clear differences in mode of action suggest that structural homology does not always lead to functional homology, even among closely related members of the protein subfamily, and that linker length and/or sequence presumably contributes, at least in part, to the functional diversity of the response regulators. In agreement with this view, OmpR and PhoB, which have largely superimposable C-terminal structures (17), undergo phosphorylation-coupled activation via two distinctly different mechanisms (18,19). The difference that could account for the two mechanisms of action resides in the interdomain linker region (15 residues versus 6 residues, respectively) (22). What is particularly interesting and, we think, mechanistically important is that M. tuberculosis PhoP, as we show here, comprises a linker whose length is approximately the average of those of OmpR and PhoB and retains the ability to bind the target promoters in both phosphorylation-independent and phosphorylation-dependent forms.
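The linker-length comparison can be checked with simple arithmetic. The following is a minimal sketch, not part of the original analysis. It assumes that the linker consists of the residues lying strictly between Arg138 (the last residue of the minimal functional PhoPN) and Arg150 (the first residue of the DNA binding-competent PhoPC), and it takes the OmpR and PhoB linker lengths as cited from Ref. 22. The function and variable names are illustrative only.

```python
# Residue boundaries reported in this study (PhoP numbering).
PHOPN_LAST_RESIDUE = 138   # Arg138, last residue of the minimal functional PhoPN
PHOPC_FIRST_RESIDUE = 150  # Arg150, first residue of the DNA binding-competent PhoPC

# Linker lengths cited in the text for the two homologues (Ref. 22).
OMPR_LINKER = 15
PHOB_LINKER = 6


def linker_length(n_domain_last: int, c_domain_first: int) -> int:
    """Count the residues strictly between the two domain boundaries.

    Assumption: the linker is exactly the stretch between the boundaries,
    which is an illustrative definition rather than a structural one.
    """
    return c_domain_first - n_domain_last - 1


phop_linker = linker_length(PHOPN_LAST_RESIDUE, PHOPC_FIRST_RESIDUE)
mean_homologue_linker = (OMPR_LINKER + PHOB_LINKER) / 2

print(f"PhoP linker: residues {PHOPN_LAST_RESIDUE + 1}-{PHOPC_FIRST_RESIDUE - 1} "
      f"({phop_linker} residues)")
print(f"Mean of OmpR and PhoB linker lengths: {mean_homologue_linker} residues")
```

Under these assumptions the PhoP linker spans residues 139-149 (11 residues), close to the 10.5-residue mean of the OmpR and PhoB linkers.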
Although phosphorylation is thought to play an important role in the binding of response regulators to target DNA to activate or repress gene expression in vivo, response regulators under in vitro conditions often bind their cognate recognition sequences in the absence of phosphorylation. Consistent with this view, PhoP has been shown to recognize DR1,2 independent of phosphorylation (12,25) (see also Fig. 1B). Also, members like PhoP from Salmonella enterica and B. subtilis form dimers in solution and bind DNA regardless of their phosphorylation status (33,34). However, phosphorylation of PhoB from E. coli induces dimerization and increases its affinity for the target DNA (35). In sharp contrast, structural data from DrrD (14) and DrrB (15) from T. maritima indicate that in the unphosphorylated state, the recognition helix is freely exposed to the solvent, which would make it available for DNA binding. In agreement with these findings, PhoPC bound to the PhoP boxes with an affinity comparable with that of the full-length protein. This suggests that the N-terminal end of PhoP does not influence binding to the target sequences (DR1,2 DNA) regardless of its phosphorylation state, a result further confirmed by efficient DR1,2 binding by PhoPD71N (Fig. 1B).
Because PhoP was shown to be a positive regulator of the synthesis of three classes of polyketide-derived acyltrehaloses known as SL, DAT, and PAT (4,5), we set out to study msl3 promoter recognition by PhoP. Interestingly, striking similarities in growth, attenuation, morphological, and cytochemical properties between M. tuberculosis H37Ra and the M. tuberculosis phoP mutant have been proposed to be a direct consequence of the absence of these three kinds of complex lipids (5, 36-38). It is noteworthy that within the Mycobacterium genus, SL, DAT, and PAT are relatively restricted to the virulent strains of the M. tuberculosis complex, suggesting that these lipids contribute to pathogenicity of the tubercle bacilli, an assumption that has been supported by a number of in vitro and in vivo studies (for a review, see Refs. 39-46). In addition, recent studies establish a link between PhoP function and the secretion of ESAT-6 and CFP-10 (9), the secreted mycobacterial proteins that are immunodominant antigens in a majority of human tuberculosis cases (47). Recent results further show that among the reasons for the attenuation of the M. tuberculosis H37Ra strain is a single nucleotide polymorphism affecting codon 219 (TCG to TTG) in the phoP gene, which changes a Ser to a Leu; the resulting variant (i) is incapable of restoring polyketide-derived acyltrehalose synthesis in a phoP-phoR knock-out mutant of H37Rv (8) and (ii) prevents the secretion of proteins that are important for virulence (7). Together, these reports explain why PhoP is considered a virulence-associated response regulator of M. tuberculosis.
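The codon change described above can be verified against the standard genetic code. The following is a minimal sketch for illustration only. The two codon assignments (TCG for Ser, TTG for Leu) come from the standard genetic code, and the helper name is hypothetical.

```python
# Codon 219 of phoP as stated in the text: TCG changed to TTG in H37Ra.
REFERENCE_CODON = "TCG"
VARIANT_CODON = "TTG"
CODON_NUMBER = 219

# Only the two codons needed here, taken from the standard genetic code.
CODON_TO_RESIDUE = {"TCG": "Ser", "TTG": "Leu"}


def nucleotide_changes(reference: str, variant: str) -> list[tuple[int, str, str]]:
    """Return (position within codon, reference base, variant base) per mismatch."""
    return [
        (position, ref_base, var_base)
        for position, (ref_base, var_base) in enumerate(zip(reference, variant), start=1)
        if ref_base != var_base
    ]


changes = nucleotide_changes(REFERENCE_CODON, VARIANT_CODON)
substitution = (
    f"{CODON_TO_RESIDUE[REFERENCE_CODON]}{CODON_NUMBER}{CODON_TO_RESIDUE[VARIANT_CODON]}"
)

print(changes)       # mismatching bases between the two codons
print(substitution)  # residue-level description of the change
```

The single mismatch falls at the second codon position (C to T), which is sufficient to convert the Ser codon into a Leu codon.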
A detailed study on the role of PhoP in complex lipid biosynthesis in M. tuberculosis suggests that PhoP probably regulates the expression of the acyltransferase, polyketide synthase (pks), or pks-associated genes involved in the synthesis or transfer of the methyl-branched fatty acyl substituents found in SL, DAT, and PAT (5). However, PhoR, the cognate sensor kinase of PhoP, was not required for the synthesis of these lipids because complementation of 1237ΔphoPR::hyg (in which a 774-bp fragment of the M. tuberculosis genome encompassing part of the phoP and phoR coding sequences was replaced with a hygromycin resistance cassette from Streptomyces hygroscopicus; see Ref. 5) with phoP alone was sufficient to restore their synthesis (5). Thus, it was suggested that phosphorylation might only serve to increase the affinity of the response regulator for certain promoters and might not be essential for the binding of PhoP to its DNA target (5), as has been shown by PhoQ-independent activation of target genes by S. enterica PhoP (48). In agreement with this view, we observed a striking difference in the DNA binding affinity of phosphorylated PhoP for the msl3 promoter compared with its binding to DR1,2 DNA (Fig. 1, compare A and B). This view is further supported by our additional results showing that, at relatively higher protein concentrations, wild-type PhoP and PhoPD71N recognize the msl3 promoter comparably well, independent of phosphorylation (supplemental Fig. S4). Furthermore, recent analyses of lipid composition revealed that M. tuberculosis H37Ra is deficient in the production of phthiocerol dimycocerosates (8), a family of polyketide-derived lipids implicated in the virulence of M. tuberculosis (49-52). Studies with lipid-deficient mutants of M. tuberculosis suggest that loss of phthiocerol dimycocerosate in addition to SL, DAT, and PAT explains the inability of M. tuberculosis H37Ra to fix neutral red (53).
It is noteworthy that, in contrast to the production of SL, DAT, and PAT, the synthesis of phthiocerol dimycocerosate is not under the regulatory control of the phoP-phoR system (5). In fact, Chesne-Seck et al. (8) recently showed that synthesis of phthiocerol dimycocerosate in M. tuberculosis H37Ra was not restored upon complementation with phoP from M. tuberculosis H37Rv. Thus, a mutation replacing Asp71 by Asn at the primary phosphorylation site of PhoP is unlikely to influence phenotypic consequences through effects on lipid composition. Nevertheless, the results shown here clearly demonstrate the importance of the linker for transmission of the effect of phosphorylation that influences high affinity DNA binding by the response regulator at the promoter region of the gene involved in lipid biosynthesis.
Taken together, these in vitro data suggest that coupling of two functional domains of PhoP in a single polypeptide chain is essential for interdomain interaction(s), which regulate phosphorylation-dependent DNA binding. We consider different mechanisms to explain these results. First, the fact that the truncated PhoPC domain is as efficient as the full-length protein in binding to DR1,2 DNA rules out the possibility of occlusion of DNA binding surfaces by PhoPN. Thus, a model in which interaction of PhoP with DNA causes the N-domain to move away from the C terminus, a process that might remove a barrier to stable DNA binding, does not fit with our results. However, a potential mechanism involving interdomain interactions between the N-domain and the C-domain facilitates an integrated view of our results. In fact, this is in agreement with our earlier observations suggesting a conformational change within the C-domain of the protein as a result of phosphorylation at the N terminus of PhoP (27). The fact that a linker deletion mutant lacking 5 residues or more fails to recognize the msl3 promoter region but is otherwise active for DR1,2 DNA binding is strongly suggestive of the importance of linker length for phosphorylation-dependent DNA binding. Thus, it is tempting to speculate that a long, flexible linker region is essential for the transmission of the conformational change from the N-domain to the C-domain of the protein. Consistent with this view, in the structure of M. tuberculosis PrrA, which shares the highest (45%) sequence identity with PhoP, the recognition helix is involved in interactions with the regulatory domain (16).
Although architectural variations among response regulator family members are not surprising and are even expected because of weak sequence homologies within the diverse target promoters, it remains unclear how conserved structures achieve promoter-specific recognition to perform diverse functions. Based on these results, results from other studies (for a review, see Ref. 54), and the considerations discussed above, we suggest that much (but not all) of the binding energy and specificity for interaction between target DNA sites and response regulator proteins comes from the C-terminal DNA binding domain and that additional discrimination is provided by the linker region during recognition of the target DNA site. In support of this view, we cite experiments from Kenney and co-workers (22) that attempted to understand the mechanism of effector domain regulation of homologous proteins (E. coli OmpR and PhoB). These experiments, which exploited chimeric constructs of OmpR and PhoB with substituted linkers, measured in vivo transcription activation. Consistent with our suggestion, their transcription assays suggested that the effector domain regulation by either N terminus required its cognate interdomain linker.
In conclusion, our results identify the domain structure of PhoP and provide new insights into the mode of action of the transcription regulator. Although we determine that interdomain interactions play an important role in regulating PhoP function, our results clearly show that coupling of the two functional domains of PhoP together in a single polypeptide chain is essential for phosphorylation-dependent DNA binding. Most strikingly, we also demonstrate that although PhoP recognizes the core DNA site within its own promoter in a linker-independent manner, the linker region of the regulator is essential for phosphorylation-coupled specific recognition of another target promoter (msl3 promoter). Although much work remains to be done in identifying genetic determinants (within the msl3 promoter) regulated by phosphorylated PhoP and mechanism of transcription regulation, the present work clearly shows that the regulator retains the capability of interacting with multiple promoters by at least two distinctly different modes (namely linker-independent and linker-dependent), a novel result of unusual significance with implications for the mechanism of DNA binding and transcription regulation by the key regulator.