Structural Basis for the Regulation of the MmpL Transporters of Mycobacterium tuberculosis*

Background: The expression of MmpLs is controlled by a complex regulatory network, including the TetR family regulators Rv3249c and Rv1816.
Results: Both Rv3249c and Rv1816 form dimeric two-domain molecules with architecture consistent with the TetR family regulators.
Conclusion: These regulators are able to recognize the promoter and intragenic regions of multiple mmpLs.
Significance: These findings suggest that saturated fatty acids may be natural ligands for these regulators.

The mycobacterial cell wall is critical to the virulence of these pathogens. Recent work shows that the MmpL (mycobacterial membrane protein large) family of transporters contributes to cell wall biosynthesis by exporting fatty acids and lipidic elements of the cell wall. The expression of the Mycobacterium tuberculosis MmpL proteins is controlled by a complex regulatory network, including the TetR family transcriptional regulators Rv3249c and Rv1816. Here we report the crystal structures of these two regulators, revealing dimeric, two-domain molecules with architecture consistent with the TetR family of regulators. Buried extensively within the C-terminal regulatory domains of Rv3249c and Rv1816, we found fortuitous bound ligands, which were identified as palmitic acid (a fatty acid) and isopropyl laurate (a fatty acid ester), respectively. Our results suggest that fatty acids may be the natural ligands of these regulatory proteins. Using fluorescence polarization and electrophoretic mobility shift assays, we demonstrate the recognition of promoter and intragenic regions of multiple mmpL genes by these proteins. Binding of palmitic acid renders these regulators incapable of interacting with their respective operator DNAs, which will result in derepression of the corresponding mmpL genes. Taken together, these experiments provide new perspectives on the regulation of the MmpL family of transporters.

Understanding the molecular mechanisms underlying the biogenesis of the mycobacterial cell wall not only elucidates the basic biology of pathogenic mycobacteria but also identifies potential targets for antimicrobials. Mycolic acids are essential to mycobacterial viability and are incorporated as trehalose dimycolate and arabinogalactan mycolates. Biosynthesis of these respective mycobacterial cell wall lipids is targeted by the first-line anti-tuberculosis drugs isoniazid and ethambutol (12-14). Mycobacterial cell wall biosynthesis is facilitated by the MmpL (mycobacterial membrane protein large) transporters (15), which belong to the RND (resistance-nodulation-cell division) superfamily (16). Based on the genomic sequence of M. tuberculosis H37Rv (15), this organism harbors 14 different MmpL proteins. MmpL3 is essential, and MmpL4, MmpL5, MmpL7, MmpL8, MmpL10, and MmpL11 are required for full M. tuberculosis (Mtb) virulence. Similar to the RND efflux pumps of Gram-negative bacteria, several of these MmpL transporters appear to work in conjunction with smaller accessory proteins called MmpS (mycobacterial membrane protein small) (7,17,18). However, unlike other RND family proteins, the MmpL proteins in M. tuberculosis are not believed to export antibiotics (7). Instead, there is strong evidence that these MmpL transporters and their MmpS accessory proteins are responsible for shuttling fatty acid and lipid components of the cell wall, such as trehalose monomycolate, sulfolipids, phthiocerol dimycocerosate, diacyltrehalose, monomeromycolyl diacylglycerol, and mycolate wax ester (7-9, 11, 19, 20, 22-25).
Cell wall biogenesis and subsequent remodeling are crucial to the ability of Mtb to establish and maintain infection. It is therefore likely that Mtb carefully controls expression of cell wall lipid biosynthetic enzymes and MmpL lipid transporters. We capitalized on data made available by the Tuberculosis Systems Biology Consortium to begin an in-depth analysis of how mmpL and mmpS genes are regulated. Currently, ChIP-Seq data for 82 of the 180+ Mtb transcription factors are available on the Tuberculosis Database (26-29). Utilizing these data, we recently showed that the MarR family regulator Rv0678 regulates the mmpS2-mmpL2, mmpS4-mmpL4, and mmpS5-mmpL5 genes (30). In addition, we determined the crystal structure of Rv0678, which provided crucial insight into the induction mechanism of Rv0678 (30). Intriguingly, the Rv0678 crystal structure indicated the existence of a fortuitous bound ligand that was identified as 2-stearoylglycerol (C21H42O4). The induced conformational change leading to substrate-mediated derepression is primarily caused by a rigid body rotational motion of the entire DNA-binding domain of the regulator toward the dimerization domain (30).
In this paper, we report the crystal structures of the TetR family transcriptional regulators Rv3249c and Rv1816, chosen based upon their predicted regulatory interactions with the mmpL3 or mmpL11 loci. Binding of these transcriptional regulators to the promoter and intragenic regions of mmpL genes is summarized in Fig. 1. The crystal structures reported here highlight the unique properties of these regulators.
Typically, the TetR family regulators are α-helical dimeric proteins, consisting of a smaller N-terminal DNA-binding domain and a larger C-terminal regulatory domain (31,32). The N-terminal domains are well conserved in sequence and form a helix-turn-helix motif for DNA binding. However, the C-terminal sequences are poorly conserved, forming ligand-specific binding domains for inducing molecules. Our data indicate that saturated fatty acids are able to bind at the C-terminal ligand-binding domains of Rv3249c and Rv1816. Interestingly, we have recently also found that the M. tuberculosis Rv0302 regulator can recognize fatty acid molecules (33).

FIGURE 1 (caption, partial). ... (29). The colored shapes corresponding to each transcription factor (red circle, Rv3249c; blue triangle, Rv1816) are placed at the putative binding sites.
These findings suggest that fatty acids may be the natural ligands of these regulatory proteins. We used fluorescence polarization and EMSA to demonstrate that these proteins regulate multiple mmpL genes. These results emphasize the complexity of the mmpL regulatory system and suggest a novel mechanism by which the bacterium can sense its metabolic state to modulate cell wall lipid biosynthesis and transport, thereby maintaining homeostasis and promoting virulence.

Experimental Procedures
Cloning of rv3249c and rv1816-The rv3249c ORF from genomic DNA of M. tuberculosis strain H37Rv was amplified by PCR using the primers 5′-CTTTAAGAAGGAGATATACCATGGTGAGCACTCCAAGCGCTAC-3′ and 5′-GATCCTCAGTGATGATGGTGGTGATGGGGGACATTGATCACACCGTG-3′ to generate a product that encodes a Rv3249c recombinant protein with a His6 tag at the C terminus. The corresponding PCR product was digested with NcoI and BamHI, extracted from the agarose gel, and inserted into pET-15b as described by the manufacturer (Merck). The recombinant plasmid (pET15bΩrv3249c) was transformed into DH10b cells, and the transformants were selected on LB agar plates containing 100 μg/ml ampicillin. The presence of the correct rv3249c sequence in the plasmid construct was verified by DNA sequencing.
The procedures for producing the recombinant plasmid pET15bΩrv1816 were the same as those for pET15bΩrv3249c. The rv1816 ORF from genomic DNA of M. tuberculosis strain H37Rv was amplified by PCR using the primers 5′-CTTTAAGAAGGAGATATACCATGTTGTGTCAGACTTGCCGCGTG-3′ and 5′-GATCCTCAGTGATGATGGTGGTGATGCTCGGCCAGCACGGCCAC-3′ to generate a product that encodes the Rv1816 recombinant protein with a His6 tag at the C terminus.
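As an illustrative check of the primer design (a minimal sketch, not part of the published procedure), the reverse primers can be reverse-complemented and translated to confirm that each encodes six histidine codons followed by a stop codon. The helper names `reverse_complement` and `translate` are hypothetical, and the codon table is the standard genetic code.

```python
# Standard genetic code, built in TCAG order (illustrative helper, not from the paper).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from the first base, stopping after the first stop codon."""
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[pos:pos + 3]]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


# Reverse primers as given in the text (5' to 3').
rv3249c_reverse = "GATCCTCAGTGATGATGGTGGTGATGGGGGACATTGATCACACCGTG"
rv1816_reverse = "GATCCTCAGTGATGATGGTGGTGATGCTCGGCCAGCACGGCCAC"

# Coding-strand view of the 3' end of each construct.
rv3249c_tail = translate(reverse_complement(rv3249c_reverse))
rv1816_tail = translate(reverse_complement(rv1816_reverse))
print(rv3249c_tail, rv1816_tail)
```

In both cases the translated tail ends in six histidines and a stop codon, consistent with a C-terminal His6 tag.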
Expression and Purification of Rv3249c and Rv1816-Briefly, the full-length Rv3249c protein containing a His6 tag at the C terminus was overproduced in Escherichia coli BL21(DE3) cells possessing pET15bΩrv3249c. The cells were grown in 6 liters of LB medium with 100 μg/ml ampicillin at 37°C. When the A600 reached 0.5, the culture was treated with 0.2 mM isopropyl-β-D-thiogalactopyranoside to induce Rv3249c expression, and cells were harvested within 3 h. The collected bacterial cells were suspended in 100 ml of ice-cold buffer containing 20 mM Na-HEPES (pH 7.2), 200 mM NaCl, 10 mM MgCl2, and 0.2 mg of DNase I (Sigma-Aldrich). The cells were then lysed with a French pressure cell. Cell debris was removed by centrifugation for 45 min at 4°C and 20,000 rev/min. The crude lysate was filtered through a 0.2-μm membrane and was loaded onto a 5-ml Hi-Trap Ni2+-chelating column (GE Healthcare Biosciences) pre-equilibrated with 20 mM Na-HEPES (pH 7.2) and 200 mM NaCl. To remove unbound proteins and impurities, the column was first washed with six column volumes of buffer containing 50 mM imidazole, 250 mM NaCl, and 20 mM Na-HEPES (pH 7.2). The Rv3249c protein was then eluted with four column volumes of buffer containing 300 mM imidazole, 250 mM NaCl, and 20 mM Na-HEPES (pH 7.2). The purity of the Rv3249c protein (>95%) was judged using 12.5% SDS-PAGE stained with Coomassie Brilliant Blue. The purified protein was then concentrated to 10 mg/ml.
For the His6 SeMet-Rv3249c protein expression, a 10-ml LB broth overnight culture containing E. coli BL21(DE3)/pET15bΩrv3249c cells was transferred into 60 ml of LB broth containing 100 μg/ml ampicillin and grown at 37°C. When the A600 value reached 1.2, cells were harvested by centrifugation at 5,000 × g for 10 min and then washed two times with 10 ml of M9 minimal salt solution. The cells were resuspended in 60 ml of M9 medium and then transferred into a 6-liter prewarmed M9 solution containing 100 μg/ml ampicillin. The cell culture was incubated at 37°C with shaking. When the A600 reached 0.4, 100 mg/liter of lysine, phenylalanine, and threonine, 50 mg/liter of isoleucine, leucine, and valine, and 60 mg/liter of L-selenomethionine were added. The culture was induced with 0.2 mM isopropyl-β-D-thiogalactopyranoside after 15 min. The cells were then harvested within 3 h after induction. The procedures for purifying SeMet-Rv3249c were identical to those of the native protein.

TABLE 2 (partial). rv1094; 6.9 ± 0.7; 5′-ACAGATTTCGTGAAATCGGG-3′ and 5′-F-CCCGATTTCACGAAATCTGT-3′ (a). (a) F denotes the fluorescein that was covalently attached to the 5′ end of the oligodeoxynucleotide (reversed) by a hexamethylene linker.
The full-length Rv1816 protein containing a His6 tag at the C terminus was overproduced in E. coli BL21(DE3) cells possessing pET15bΩrv1816. The procedures for expressing and purifying the Rv1816 protein were the same as those for Rv3249c. The purity of the Rv1816 protein (>95%) was judged using 12.5% SDS-PAGE stained with Coomassie Brilliant Blue. The purified protein was then concentrated to 10 mg/ml.
Crystallization of Rv3249c and Rv1816-The His6 Rv3249c crystals were grown in microcentrifuge tubes (Fisher Scientific). Briefly, 10-μl aliquots of protein solution containing 20 mg/ml Rv3249c protein in 20 mM Na-HEPES (pH 7.5), 250 mM NaCl, and 300 mM imidazole were incubated at 4°C. Crystals of Rv3249c grew to full size in the drops within 2 weeks. Typically, the dimensions of the crystals were 0.05 mm × 0.05 mm × 0.1 mm. Cryoprotection was achieved by raising the glycerol concentration stepwise to 40% with a 5% increment in each step. Crystals of SeMet-Rv3249c were obtained using the same procedures as for the native Rv3249c crystals. Typically, the dimensions of the crystals were 0.05 mm × 0.05 mm × 0.1 mm.
The His6 Rv1816 crystals were grown at room temperature in 24-well plates (Hampton Research, Aliso Viejo, CA) using hanging drop vapor diffusion with the following procedures. A 1-μl protein solution containing 10 mg/ml Rv1816 protein in 20 mM Na-HEPES (pH 7.5), 250 mM NaCl, and 300 mM imidazole was mixed with 1 μl of reservoir solution containing 0.05 M sodium cacodylate (pH 6.5), 2 M ammonium sulfate, and 10 mM magnesium sulfate. The resultant mixture was equilibrated against 500 μl of the reservoir solution. Hexagonal crystals appeared within 2 days and grew to full size in the drop within 1 week. Typically, the dimensions of the crystals were 0.05 mm × 0.05 mm × 0.1 mm. Cryoprotection was achieved by raising the ammonium sulfate concentration stepwise to 3.5 M.
Data Collection, Structural Determination, and Refinement-All diffraction data were collected at 100 K at Beamline 24ID-C located at the Advanced Photon Source, using a Pilatus 6M detector (Dectris Ltd., Switzerland). Diffraction data were processed using DENZO and scaled using SCALEPACK (34).
The Rv3249c crystals belong to the space group C2 (Table 1). Based on the molecular weight of Rv3249c (22.8 kDa), six Rv3249c molecules are expected in the asymmetric unit with a solvent content of 62.96%. Within the asymmetric unit of SeMet-Rv3249c, 12 selenium sites (two per monomer) were identified using SHELXC and SHELXD (35) as implemented in the HKL2MAP package (36). The full-length Rv3249c protein contains two methionine residues; both were identified as SeMet sites in each protomer. Single anomalous dispersion was employed to obtain experimental phases using the program MLPHARE (37,38). The resulting phases were then subjected to density modification and NCS averaging using the program PARROT (39) with the native structure factor amplitudes. The SeMet sites were also used to trace the molecules by anomalous difference Fourier maps, where we could ascertain the proper registry of SeMet residues. The initial model, which consists of three dimeric molecules, was constructed manually using the program Coot (40). Then the model was refined using PHENIX (41), leaving 5% of reflections in the Free-R set. Iterations of refinement using PHENIX (41) and CNS (42) and model building in Coot (40) led to the current model, which consists of three dimeric structures with a total of 1,160 residues (Table 1).
The Rv1816 crystal belongs to the space group P2₁2₁2₁, resulting in a single Rv1816 dimer (25.4 kDa) in the asymmetric unit with a solvent content of 51.78%. The structure of Rv1816 was determined using molecular replacement. The partial structure of the TetR family protein SCO4313 (residues 61-229) (Protein Data Bank code 2OI8) was utilized as a template. The resulting phases were then subjected to density modification using the program RESOLVE (43). The remaining part of the model was manually constructed using the program Coot (40). The model was then refined using PHENIX (41), leaving 5% of reflections in the Free-R set. Iterations of refinement using PHENIX (41) and CNS (42) and model building in Coot (40) led to the current model of the Rv1816 dimer, consisting of a total of 449 residues (Table 1).
Identification of Fortuitous Ligands-To identify the nature of the bound ligand in crystals of Rv3249c, we used GC-MS. The Rv3249c crystals were extensively washed with the crystallization buffer and transferred into deionized water. The mixture was then incubated at 100°C for 5 min, and chloroform was added to a final concentration of 80% (v/v) to denature the protein and allow for the extraction of ligand. GC-MS analysis indicated that the mass of the bound ligand corresponded to n-hexadecanoic acid, also called palmitic acid.
We used the same method to identify the bound ligand in crystals of Rv1816. GC-MS analysis indicated that the mass of the bound ligand corresponded to isopropyl dodecanoate, also called isopropyl laurate.
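For orientation, the average molar masses of the two identified ligands can be computed from their molecular formulas (C16H32O2 for palmitic acid and C15H30O2 for isopropyl laurate, as given in the Results). The following is a minimal sketch; the function name `molar_mass` is hypothetical, and the atomic masses are standard average values rounded to three decimals, not values taken from the GC-MS analysis.

```python
import re

# Standard average atomic masses (g/mol), rounded; an assumption of this sketch.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def molar_mass(formula):
    """Sum atomic masses for a simple formula such as 'C16H32O2' (no brackets)."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_MASS[element] * (int(count) if count else 1)
    return total


palmitic_acid = molar_mass("C16H32O2")      # n-hexadecanoic acid
isopropyl_laurate = molar_mass("C15H30O2")  # isopropyl dodecanoate
print(round(palmitic_acid, 2), round(isopropyl_laurate, 2))
```

The two ligands differ by one CH2 unit, about 14 g/mol.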
Isothermal Titration Calorimetry for Ligand Binding-We used isothermal titration calorimetry (ITC) to determine the binding affinity of palmitic acid to the purified Rv3249c regulator. Measurements were performed on a VP-Microcalorimeter (MicroCal, Northampton, MA) at 25°C. Before titration, the protein was thoroughly dialyzed against buffer containing 10 mM sodium phosphate (pH 7.2), 100 mM NaCl, and 0.001% DDM. Injections occurred at intervals of 240 s, and the duration time of each injection was 20 s. Heat transfer (μcal/s) was measured as a function of elapsed time (s). The mean enthalpies measured from injection of the ligand in the buffer were subtracted from raw titration data before data analysis with ORIGIN software (MicroCal). Titration curves were fitted by a nonlinear least squares method to a function for the binding of a ligand to a macromolecule. Nonlinear regression fitting to the binding isotherm provided us with the equilibrium binding constant (K_A = 1/K_D) and enthalpy of binding (ΔH). Based on the values of K_A, the change in free energy (ΔG) and entropy (ΔS) were calculated with the equation $\Delta G = -RT \ln K_A = \Delta H - T\Delta S$, where T is the absolute temperature (298 K, corresponding to the 25°C measurement temperature) and R is 1.9872 cal/K per mol. Calorimetry trials were also carried out in the absence of Rv3249c in the same experimental conditions. No change in heat was observed in the injections throughout the experiment.
ITC was also used to determine the binding affinities of palmitate and laurate to the purified Rv1816 regulator. The Rv1816 protein was thoroughly dialyzed against buffer containing 10 mM sodium phosphate (pH 7.2), 100 mM NaCl, and 0.001% DDM. The dimeric Rv1816 sample was then adjusted to a final concentration of 10 μM. The ligand solution contained 400 μM palmitic or lauric acid, 10 mM sodium phosphate (pH 7.2), 100 mM NaCl, and 0.001% DDM. The procedures for these binding experiments were identical to those for the Rv3249c protein.
Fluorescence Polarization Assay for DNA Binding-Fluorescence polarization assays were used to determine the affinities for DNA binding by Rv3249c and Rv1816, respectively. All oligodeoxynucleotides and their corresponding fluorescein-labeled oligodeoxynucleotides were purchased from Integrated  Table 2. In short, the fluoresceinated dsDNA was prepared by annealing the oligodeoxynucleotide and its corresponding fluorescein-labeled oligodeoxynucleotide together. Fluorescence polarization experiments were performed using a DNA binding solution containing 10 mM sodium phosphate (pH 7.2), 100 mM NaCl, 2.5 nM fluoresceinated DNA, and 1 μg of poly(dI-dC) as nonspecific DNA. The protein solution containing 500 nM dimeric Rv3249c or Rv1816 and 2.5 nM of the respective fluoresceinated DNA was titrated into the DNA binding solution until the millipolarization became unchanged. All measurements were performed at 25°C using a PerkinElmer LS55 spectrofluorometer equipped with a Hamamatsu R928 photomultiplier. The excitation wavelength was 490 nm, and the fluorescence polarization signal (in ΔP) was measured at 525 nm. Each titration point recorded was an average of 15 measurements. The data were analyzed using the equation $P = \frac{(P_{\mathrm{bound}} - P_{\mathrm{free}})[\mathrm{protein}]}{K_D + [\mathrm{protein}]} + P_{\mathrm{free}}$, where P is the polarization measured at a given total protein concentration, P_free is the initial polarization of free fluorescein-labeled DNA, P_bound is the maximum polarization of specifically bound DNA, and [protein] is the protein concentration. The titration experiments were repeated three times to obtain the average K_D value. Curve fitting was accomplished using the program ORIGIN (OriginLab Corporation, Northampton, MA).
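The single-site binding equation above can be fitted in several ways. The sketch below is one possible approach under stated assumptions, not the ORIGIN procedure used in this work: it scans candidate K_D values and, for each one, solves P_free and P_bound by ordinary linear least squares, because the model is linear in the fraction [protein]/(K_D + [protein]). The function names, the K_D grid, and the synthetic titration data are all hypothetical.

```python
def fp_model(protein_nM, kd_nM, p_free, p_bound):
    """Polarization predicted by the single-site binding isotherm."""
    return (p_bound - p_free) * protein_nM / (kd_nM + protein_nM) + p_free


def fit_kd(protein_nM, polarization, kd_grid):
    """Grid-scan K_D; for each candidate, fit P = a + b*f by linear regression."""
    n = len(protein_nM)
    mean_p = sum(polarization) / n
    best = None
    for kd in kd_grid:
        # Fraction of the signal change at each titration point for this K_D.
        f = [x / (kd + x) for x in protein_nM]
        mean_f = sum(f) / n
        sff = sum((fi - mean_f) ** 2 for fi in f)
        sfp = sum((fi - mean_f) * (pi - mean_p) for fi, pi in zip(f, polarization))
        b = sfp / sff               # b = P_bound - P_free
        a = mean_p - b * mean_f     # a = P_free
        sse = sum((a + b * fi - pi) ** 2 for fi, pi in zip(f, polarization))
        if best is None or sse < best[0]:
            best = (sse, kd, a, a + b)
    _, kd, p_free, p_bound = best
    return kd, p_free, p_bound


# Synthetic, noise-free titration (illustrative values only, not measured data).
protein = [0.5, 1, 2, 4, 8, 16, 32, 64, 128]          # nM
data = [fp_model(x, 5.6, 50.0, 250.0) for x in protein]
grid = [0.1 * i for i in range(1, 1001)]               # 0.1 to 100 nM
kd_fit, p_free_fit, p_bound_fit = fit_kd(protein, data, grid)
print(kd_fit, p_free_fit, p_bound_fit)
```

With real data the grid resolution limits the precision of K_D, so a finer grid or a dedicated nonlinear least squares routine would be preferable.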
Electrophoretic Mobility Shift Assay-Probes were amplified from the H37Rv genome using the primers listed in Table 3. All probes were labeled with digoxigenin using the Roche DIG gel shift kit. For EMSA analysis, 12 nM Dig-labeled probe and the indicated micromolar concentrations of protein were incubated for 45 min at room temperature in the Roche binding buffer modified by the addition of 0.25 mg/ml herring sperm DNA and 0.75 mg/ml poly(d[I-C]). For ligand competition assays, stock solutions of fatty acids were made in DMSO, and a solvent control reaction was included at the highest concentration of DMSO. All reactions were resolved on a 6% native polyacrylamide gel in TBE buffer and transferred to nylon membrane, and Dig-labeled DNA-protein complexes were detected following the manufacturer's recommendations. Chemiluminescent signals were acquired using an ImageQuant LAS 4000 (GE).

Results and Discussion
Overall Structure of Rv3249c-The crystal structure of the 211-amino acid Rv3249c protein was determined to a resolution of 3.59 Å using single anomalous dispersion (Table 1 and Fig. 2). Six molecules of Rv3249c are found in the asymmetric unit, which assemble as three independent dimers. Superimposition of these six Rv3249c molecules gives root mean square deviations between 0.8 and 1.0 Å over 190 Cα atoms, indicating that their conformations are nearly identical to each other.
Like other members of the TetR family regulators (31, 32), the structure of Rv3249c suggests that this regulator is an all α-helical protein. Each subunit of the Rv3249c dimer is composed of 10 α-helices (α1-α10 and α1′-α10′, respectively) (Fig. 3). These helices are designated as α1 Helices α4, α5, α8, and α9 form an antiparallel four-helix bundle, which creates a large ligand-binding cavity to accommodate inducing ligands. Helices α6 and α7, and the loop connecting these two short helices create the floor of this ligand-binding site. It should be noted that a typical TetR regulator normally utilizes a single helix, which runs horizontally to form the bottom of the ligand-binding site. Rv3249c is distinct in that it employs two inclined helices to create a V-shaped bottom, allowing it to enlarge the volume of the binding site. Indeed, each subunit of Rv3249c possesses a very large cavity, which was estimated to be ~450 Å³, within the C-terminal ligand binding domain. Based on this observation, we expect that this regulator can accommodate a large inducing ligand.
The smaller N-terminal domain of Rv3249c shares considerable structural similarity with other TetR family members. However, this domain also presents some noticeable differences. Helix α1, consisting of 21 amino acids, is relatively long compared with other members of the family. This helix tilts upward by 15° in relation to the horizontal surface, which is perpendicular to the vertical plane formed by the dimer interface. This orientation allows the two N-terminal domains within the dimer to shift away from each other, leaving a relatively large gap between the two DNA recognition helices α3 and α3′. Helices α3 and α3′ play a key role in the binding of cognate DNA. Because the distance between two consecutive major grooves of B-form DNA is ~34 Å, the center to center distance between helices α3 and α3′ of the regulator has to be about 34 Å to bind the DNA. In Rv3249c, this center to center distance, according to the separation between Cα atoms of Tyr61 and Tyr61′, was measured to be 47 Å, suggesting that in this conformation, the regulator cannot bind B-DNA, and this conformation should correspond to the induced form.
The relatively large center to center distance also indicated that the crystallized Rv3249c protein has an attached ligand. As expected, a large extra electron density was found within the C-terminal regulatory domain of the ligand-binding site of each subunit of Rv3249c, indicating the existence of a fortuitous bound ligand co-purified and co-crystallized with this regulator. To identify the unknown bound ligand, GC-MS was used to study these Rv3249c crystals (Fig. 4). The data suggest that the fortuitous ligand is palmitic acid, also called hexadecanoic acid, a saturated fatty acid containing 16 carbons with the molecular formula C16H32O2. Because this fatty acid was found to be co-purified and co-crystallized with the Rv3249c regulator, fatty acids may be the natural substrates of this protein.
Overall Structure of Rv1816-The crystal structure of the 234-amino acid protein Rv1816 was determined to a resolution of 2.00 Å (Table 1 and Fig. 6). Because M. tuberculosis Rv1816 shares 31% protein sequence identity with Streptomyces coelicolor SCO4313 (Protein Data Bank code 2OI8), we utilized the existing structure of SCO4313 from the Protein Data Bank as a search model for molecular replacement to obtain the crystal structure of Rv1816. Two molecules of Rv1816, which assemble as a dimer, are found in the asymmetric unit. Similar to Rv3249c, the structure of Rv1816 can be divided into N-terminal DNA-binding and C-terminal ligand-binding domains. This regulator is also an all α-helical protein, composed of 10 α-helices (α1-α10 and α1′-α10′, respectively) ( The structure of Rv1816 is unique among the TetR family regulators. In the C-terminal ligand-binding domain, helices α9 and α10 of one subunit of Rv1816, which run anti-parallel to each other, interact with helices α9′ and α10′ of the next subunit, forming a four-helix bundle at the subunit interface to secure the dimerization state. A tunnel-like cavity is formed at the cleft between helices α7, α8, α9, and α10 and the random loop connecting α9′ and α10′ of the second subunit. Unlike Rv3249c, a single helix, α6, forms the floor of this cavity, which potentially creates a binding tunnel for inducing ligands. Helix α8, which orients horizontally and nearly perpendicular to the 2-fold vertical axis of the dimer, generates a unique architecture of the Rv1816 regulator. This feature has not been found in the existing structures of the TetR family regulators including TetR (44,45), QacR (46,47), CprB (48), EthR (49,50), CmeR (51,52), AcrR (53), SmeT (54), Rv3066 (55), Rv1219c (56), and Rv3249c. Based on its location, helix α8, together with the loop region between helices α9′ and α10′, seems to lengthen and enlarge the binding cavity to accommodate larger ligands.
Indeed, it was found that the loop residues as well as residues located within helix α8 are involved in ligand binding. The volumes of the ligand binding cavities of the C-terminal regulatory domains of the two subunits within the Rv1816 dimer are ~500 and ~450 Å³ (corresponding to the right and left subunits of Rv1816 in Fig. 7B, respectively).
Within the N-terminal DNA-binding domain, helix α1 of Rv1816 is relatively long at 19 amino acids. However, in direct contrast to the structure of Rv3249c, this helix tilts downward by ~10° with respect to the horizontal plane. The center to center distance between helices α3 and α3′ of the regulator is 39 Å, defined by the separation between residues Tyr-54 and Tyr-54′. Such distances in the apo form of TetR (44), QacR (46), and AcrR (53) are 35, 39, and 42 Å, respectively. This spacing indicates that the crystal structure of Rv1816 may represent its ligand-free conformational state. Surprisingly, a large extra electron density appears in the elongated binding pocket of one subunit of the Rv1816 protein, suggesting that Rv1816 is bound by a ligand. However, the other binding pocket in the next subunit of the regulator is still empty. As in the case of Rv3249c, GC-MS was employed to identify this fortuitous ligand. The data indicate that Rv1816 was co-purified and co-crystallized with the saturated fatty acid ester isopropyl laurate, also called propan-2-yl dodecanoate, with the molecular formula C15H30O2 (Fig. 8). Again, these data support the role of fatty acids as the natural ligands of these proteins. The bound fatty acid ester is partially exposed to solvent at its isopropyl headgroup, whereas the carbon chain tail of the molecule is buried in the region of the binding cavity created by α6, α8, and α9 and the loop region between the adjacent helices α9′ and α10′. There are extensive hydrophobic interactions between the bound isopropyl laurate molecule and Rv1816. Within 5 Å of this bound ligand, at least 16 residues of Rv1816, including Pro-128, Thr-131, Ala-132, Ala-135, Thr-136, Val-139, Phe-142, Phe-143, Phe-177, Cys-189, Phe-190, Leu-192, Trp-193, Tyr-208′, Ala-210′, and Met-212′, are involved in the interaction (Fig. 9). Again, many of these residues are hydrophobic in nature.
Isothermal Titration Calorimetry-ITC was used to quantify regulator-ligand interactions. The titration of Rv3249c with the palmitate ligand showed a typical hyperbolic binding curve with a negative enthalpic contribution (Fig. 10A). The enthalpic (ΔH) and entropic (ΔS) parameters of Rv3249c binding to palmitic acid are -629.6 ± 38.1 cal·mol⁻¹ and 21.3 cal·mol⁻¹·deg⁻¹, giving rise to a dissociation constant, K_D, of 7.5 ± 1.2 μM.

FIGURE 9 (caption, partial). ... Fig. 7B). The bound isopropyl laurate is shown as a stick model (green, carbon; red, oxygen). The simulated annealing 2Fo - Fc electron density map is contoured at 1.0 σ (blue mesh). The right subunit of Rv1816 is shown as orange ribbons. B, the isopropyl laurate binding site. Amino acid residues within 5.0 Å from the bound isopropyl laurate (green, carbon; red, oxygen) are included. The side chains of selected residues from the right subunit of Rv1816 in Fig. 7B are shown as orange sticks (orange, carbon; blue, nitrogen; red, oxygen). C, schematic representation of the Rv1816 and isopropyl laurate interactions. Amino acid residues within 5.0 Å from the bound isopropyl laurate are included.
ITC was then employed to determine the binding affinity of the Rv1816 regulator for the laurate ligand (Fig. 10B). Because isopropyl laurate has a low solubility, we used lauric acid as a ligand for these experiments. Again, the titration depicts a typical hyperbolic binding curve, which is characterized by a negative enthalpic contribution. The thermodynamic parameters of binding of lauric acid to Rv1816 are -872.1 ± 41.4 cal·mol⁻¹ (ΔH) and 19.2 cal·mol⁻¹·deg⁻¹ (ΔS), which give rise to a K_D of 14.4 ± 1.8 μM.
As the predicted binding sites of Rv3249c and Rv1816 within the promoter and intragenic regions of the mmpL genes partially overlap, we thought that these two regulators may share a similar set of ligands. We thus determined whether Rv1816 is able to bind palmitic acid (Fig. 10C). Surprisingly, ITC data show that Rv1816 binds palmitic acid with a K_D of 23.2 ± 2.9 μM. The thermodynamic parameters of binding of palmitic acid to Rv1816 are -491.6 ± 29.2 cal·mol⁻¹ (ΔH) and 19.6 cal·mol⁻¹·deg⁻¹ (ΔS).
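The reported thermodynamic parameters can be cross-checked against the relation ΔG = -RT ln K_A = ΔH - TΔS. The sketch below is illustrative only: it assumes T = 298.15 K (the 25°C measurement temperature), uses the R value quoted in the Experimental Procedures, and takes the K_D, ΔH, and ΔS values from the three titrations described above. The function names are hypothetical.

```python
import math

R = 1.9872   # cal/(K*mol), as quoted in the Experimental Procedures
T = 298.15   # K; assumes the 25 degC measurement temperature


def delta_g_from_kd(kd_molar):
    """Free energy from the dissociation constant: -RT ln K_A = RT ln K_D."""
    return R * T * math.log(kd_molar)


def delta_g_from_h_s(delta_h, delta_s):
    """Free energy from enthalpy (cal/mol) and entropy (cal/mol/K)."""
    return delta_h - T * delta_s


# (K_D in M, delta H in cal/mol, delta S in cal/mol/K) as reported in the text.
titrations = {
    "Rv3249c + palmitic acid": (7.5e-6, -629.6, 21.3),
    "Rv1816 + lauric acid": (14.4e-6, -872.1, 19.2),
    "Rv1816 + palmitic acid": (23.2e-6, -491.6, 19.6),
}

differences = {}
for name, (kd, dh, ds) in titrations.items():
    dg_kd = delta_g_from_kd(kd)
    dg_hs = delta_g_from_h_s(dh, ds)
    differences[name] = abs(dg_kd - dg_hs)
    print(f"{name}: {dg_kd:.0f} vs {dg_hs:.0f} cal/mol")
```

Under these assumptions the two routes to ΔG agree to within about 15 cal/mol for each titration, which is comparable to the rounding of the reported ΔS values. In all three cases the entropic term TΔS dominates the small favorable enthalpy.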
Fluorescence Polarization Assay-Fluorescence polarization was used to quantify the strength of regulator-DNA interactions. To identify regulatory targets of these proteins, we utilized chromatin immunoprecipitation sequencing (ChIP-Seq) data from Galagan and co-workers (26-29) and the Tuberculosis Database. Regions of the M. tuberculosis H37Rv genome that were found by these experiments to interact with Rv3249c or Rv1816 were first examined to identify a potential binding sequence for each individual protein. Typically, TetR family proteins interact with DNAs via symmetric palindromic stretches called inverted repeats, ~15-30 nucleotides long. Thus, the search was narrowed to include sequences that contain these patterns. For each protein we were able to identify a putative inverted repeat sequence located in one or more of the M. tuberculosis H37Rv genes encoding MmpL transporter proteins. These DNA sequences are in good agreement with both the consensus binding sequences and protein-DNA interactions determined by others (21). We found that the Rv3249c protein might act as a regulator for mmpS1/L1, mmpL11, and rv1067c, and that Rv1816 might act as a regulator for mmpL13b and rv1094.
Fluorescence polarization assays were performed using purified regulator proteins and duplex DNA. For example, we quantified the interaction between Rv3249c and the 19-bp DNA sequence (ACATCGAAACGGTCGATGT), which is located at the mmpL11 operon. Fig. 11A illustrates the binding isotherm of Rv3249c in the presence of 2.5 nM fluoresceinated DNA, indicating that Rv3249c binds this 19-bp promoter DNA with a dissociation constant, K_D, of 5.6 ± 1.0 nM. Similarly, the interaction between Rv3249c and the 19-bp putative promoter DNA sequence (ACCTCGCCGTAAACGATGT) within the mmpS1/L1 operon was determined. The data suggest that the K_D value for this binding is 18.7 ± 6.5 nM (Fig. 11B). The binding constants of Rv3249c and Rv1816 with their corresponding DNAs are summarized in Table 2.
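Because the labeled DNA concentration (2.5 nM) is comparable to the measured dissociation constants, the free protein concentration cannot be taken as equal to the total protein concentration when interpreting such isotherms. The sketch below shows one common way to account for this, assuming simple 1:1 binding of one regulator unit to one DNA duplex and solving the resulting quadratic for the complex concentration. The binding model and the function name `fraction_dna_bound` are assumptions made for illustration; the equation actually used to fit the data is not given in this section.

```python
import math


def fraction_dna_bound(protein_nM, dna_nM, kd_nM):
    """Fraction of DNA in complex for 1:1 binding, with ligand depletion.

    Solves [PD]^2 - (P + D + Kd)[PD] + P*D = 0 for the physically
    meaningful (smaller) root, where P and D are total concentrations.
    """
    total = protein_nM + dna_nM + kd_nM
    complex_nM = (total - math.sqrt(total ** 2 - 4.0 * protein_nM * dna_nM)) / 2.0
    return complex_nM / dna_nM


# Values taken from the text: 2.5 nM fluoresceinated DNA, K_D = 5.6 nM
DNA_NM = 2.5
KD_NM = 5.6

for protein in (0.0, 1.0, 5.6, 20.0, 100.0):
    bound = fraction_dna_bound(protein, DNA_NM, KD_NM)
    print(f"{protein:6.1f} nM protein -> {bound:.2f} of DNA bound")
```

Under these assumptions, a total protein concentration equal to K_D leaves somewhat less than half of the DNA bound, because part of the protein is sequestered in the complex. This is why a depletion-corrected model is often preferred when the probe concentration approaches K_D.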
Electrophoretic Mobility Shift Assay-Rv3249c is predicted to regulate expression of mmpS1/L1, mmpS5/L5, mmpL10, and mmpL11 (Fig. 12A). We performed EMSAs using purified Rv3249c to demonstrate direct transcriptional regulation by Rv3249c. We observed a concentration-dependent shift of the mmpL3, mmpL11, and mmpS1 probes (Fig. 12B). A second shift of the mmpL3 probe with 0.5 μM Rv3249c suggests multiple binding sites for the regulator in this promoter region. As a control, EMSAs were performed in the presence of nonlabeled "cold" probe. Release of Dig-labeled probe was observed, consistent with specific binding of Rv3249c to the mmpS1 probe (Fig. 12C). The probe corresponding to a significant ChIP-Seq peak in the coding sequence of rv3249c itself did not shift. Our EMSA conditions may not be ideal for Rv3249c binding to the rv3249c probe, or autoregulation may depend on a ligand or environmental condition that we have not adequately mimicked. Rv3249c crystallized with a palmitic acid molecule, and ligand binding appears to be incompatible with DNA binding activity. To test this experimentally, we performed an EMSA in the presence and absence of palmitic acid. The addition of ligand reduced binding of Rv3249c to the rv1067c probe (Fig. 12D).

ChIP-Seq data suggest that Rv1816 potentially regulates itself, mmpL2, mmpL3, mmpS4/L4, mmpL7, mmpL8, and mmpL11 expression. Interestingly, ChIP-Seq data also indicated that Rv1816 regulates kasA, which is co-regulated with mmpL3 and encodes a β-ketoacyl-ACP synthase involved in meromycolate synthesis, as well as the mmpS3 gene, which encodes an accessory MmpS protein. We performed EMSAs using purified Rv1816 and probes corresponding to rv1816, the mmpL3/mmpL11 region, mmpL7, kasA, and mmpS3 (Fig. 13A). We observed a shift of the mmpL3/mmpL11 locus probes and a weaker shift of the mmpL7 probe (Fig. 13B). We also obtained a robust concentration-dependent shift of the kasA, mmpS3, and rv1094 probes. These data indicate that Rv1816 binds several times within the mmpL3/mmpL11 genomic locus and suggest that Rv1816 contributes to the coordinated regulation of cell wall lipid biosynthesis and transport. As a control, EMSAs were performed in the presence of nonlabeled "cold" probe. Release of Dig-labeled probe was observed, consistent with specific binding of Rv1816 to the rv1094 probe (Fig. 13C). Rv1816 was cocrystallized with isopropyl laurate. In addition, the ITC experiments suggested that this regulator binds lauric acid with an affinity in the micromolar range. We thus performed an EMSA in the presence of lauric acid. Surprisingly, the presence of this fatty acid did not reduce binding of Rv1816 to the rv1094 probe. It appears that lauric acid binding does not prevent the interaction of this regulator with cognate DNAs. We then performed an EMSA both in the presence and absence of palmitic acid, because Rv1816 is also able to bind this compound. Indeed, the addition of this ligand reduced binding of Rv1816 to the rv1094 probe (Fig. 13D).
Conclusion-In this paper, we describe the crystal structures of the Rv3249c and Rv1816 transcriptional regulators, which participate in the regulatory network that controls the expression of essential and virulence-associated MmpL transporters. Specifically, existing ChIP-Seq data and the analyses presented herein suggest that Rv3249c regulates the genes mmpS1/L1, mmpL3, mmpS5/L5, mmpL7, mmpL10, mmpL11, and mmpL12, and Rv1816 regulates mmpS2/L2, mmpL3, mmpS4/L4, mmpL7, mmpL8, mmpL11, and mmpL13b. MmpL transporters contribute significantly to the export of important lipid components of the mycobacterial cell wall and are necessary for the virulence of this pathogen. Our experimental data demonstrate direct binding of these transcriptional regulators to intragenic and promoter DNAs, providing evidence for the transcriptional control of mmpL gene expression. Multiple transcription factor binding sites exist within the promoter and intragenic regions of the mmpL genes, and each transcriptional regulator recognizes several mmpL regulatory regions. These findings highlight that mmpL gene expression relies on a complex interplay of multiple transcriptional regulators.
Fortuitously, the crystal structures of Rv3249c and Rv1816 were determined in complexes with palmitic acid and isopropyl laurate, respectively. These structures suggest that saturated fatty acids are the natural ligands of these regulators. There are extensive interactions of these fatty acids with the transcriptional regulators, where the C-terminal regulatory domain of each regulator provides a hydrophobic environment for ligand binding. Within the fatty acid binding site of Rv3249c, there are at least 22 residues that participate in binding the palmitate ligand through hydrophobic interactions. Similarly, 16 residues are involved in binding the fatty acid ester isopropyl laurate within the ligand-binding pocket of Rv1816. Many of these residues are hydrophobic in nature, suggesting that fatty acid recognition is mainly governed by hydrophobic interactions.
The binding of palmitate to Rv3249c lengthens the center-to-center distance between the two DNA recognition helices of the dimer, making the regulator incompatible with its cognate DNAs. However, the structure of Rv1816 indicates that binding of laurate does not change this center-to-center distance. Thus, Rv1816 may still be able to hold onto the promoter DNA and repress gene expression even after interacting with this ligand. Indeed, our gel shift experiments showed that the addition of laurate has no effect on the Rv1816-DNA complex. Rather, it appears that palmitic acid reduces the DNA binding activity of Rv1816. Future work will define the native fatty acid ligands for this regulator and dissect their effects on Rv1816 activity.
The TetR family regulators use several distinct mechanisms for modulating transcriptional regulation. However, the net consequence of binding of inducing ligands to these regulators is essentially the same. Ligand binding at the C-terminal regulatory domain triggers a long distance conformational change at the N-terminal DNA binding domain, resulting in the release of the regulator from its operator DNA. The TetR family regulators utilize the N-terminal recognition helix α3 to bind the major groove of B-DNA. The distance between two consecutive major grooves of B-DNA is ~34 Å. One major mechanism found in the TetR family is that ligand binding increases the center-to-center distance between the two recognition helices α3 and α3′ within the dimer, making this distance no longer compatible with the 34 Å separation between two successive major grooves of B-DNA. This results in the release of the dimeric regulator from the promoter region, allowing the expression of the respective regulated gene. For example, the center-to-center distance for QacR is 39 Å in the absence of inducing ligands (46). This distance becomes 48 Å upon ligand binding (47). The conformational change is augmented in the cases of CmeR (51) and EthR (50), where the center-to-center distances were observed to be 54 and 52 Å in the respective ligand-bound structures. Thus, derepression upon ligand binding by the Rv3249c and Rv1816 regulators is most likely triggered primarily by an increase in the center-to-center distance, which makes these two regulators no longer compatible with their corresponding cognate DNAs.
FIGURE 12. Rv3249c binds to promoter regions of mmpS1 and mmpL3 and intragenic region of mmpL11. A, a schematic depicting the DNA probes used in EMSAs. B, EMSAs were performed using 6 nM Dig-labeled probe and the indicated micromolar concentrations of protein. C, to demonstrate specificity, the mmpS1 EMSA was performed in the presence of nonlabeled ("cold") probe. Reactions were performed with 6 nM Dig-labeled probe, the indicated micromolar concentrations of protein, and 360 nM cold probe. D, ligand-bound Rv3249c does not bind target probes. EMSA was performed using 12 nM Dig-labeled probe and 0.1 μM Rv3249c in the absence or presence of the indicated concentration of palmitic acid. An arrow denotes the shifted probes, and the asterisk notes the accumulation of free Dig-labeled probe.
FIGURE 13. Rv1816 binds to intragenic and/or promoter regions of rv1816, mmpL3, rv0204, mmpL11, mmpL7, kasA, and mmpS3. A, a schematic depicting the DNA probes used in EMSAs. B, EMSAs were performed using 12 nM Dig-labeled probe and the indicated micromolar concentrations of protein. An arrow denotes the shifted probes. EMSAs were also performed using probes spanning regions of rv0550 and rv2245. These were predicted binding sites by ChIP-Seq but did not shift at the indicated concentrations. C, to demonstrate specificity, EMSAs were performed in the presence of nonlabeled ("cold") probe. Reactions were performed with 6 nM Dig-labeled probe, the indicated micromolar concentrations of protein, and 0.24 μM cold probe. The asterisk notes the accumulation of free Dig-labeled probe. D, ligand-bound Rv1816 does not bind target probes. EMSA was performed using 12 nM Dig-labeled probe and 0.5 μM Rv1816 in the absence or presence of 5 μM palmitic acid. An arrow denotes the shifted probes, and the asterisk notes the accumulation of free Dig-labeled probe.