Mycobacterium tuberculosis Rv1395 Is a Class III Transcriptional Regulator of the AraC Family Involved in Cytochrome P450 Regulation

Rv1395 is annotated as a potential transcriptional regulator of the AraC family. The Rv1395 insertional mutant was identified in a signature-tagged mutagenesis study in Mycobacterium tuberculosis and was shown to be attenuated in the lungs of mice. Here, we used comparative genomics and biochemical methods to show that Rv1395 is unique to the M. tuberculosis complex and that it encodes a protein that binds the region between two divergent genes, a member of the cytochrome P450 family (Rv1394c or cyp132) and Rv1395 itself. Rv1395 binds to this DNA region through its helix-turn-helix-containing C-terminal domain, and it recognizes two sites with different affinities. We identified the transcriptional start points (TSPs) of Rv1394c and Rv1395: both genes have two TSPs, three of which are located in the intergenic region. We constructed and compared various transcriptional fusions consisting of the promoter regions and a reporter gene in Mycobacterium smegmatis: this showed that Rv1395 induces the expression of the cytochrome P450 gene (Rv1394c) and represses its own transcription. This was confirmed in M. tuberculosis when the wild type and an Rv1395-overexpressing strain were used as hosts for the fusions. Site-directed mutagenesis showed that Rv1395 binds to the two sites in a cooperative manner and that binding to both sites is required for optimal Rv1395 activity. A model describing the potential mode of action of Rv1395 is discussed.

Regulation of gene expression allows bacteria to respond to external stimuli and thus to adapt continuously to their environment. Studies with LacI, MalT, CRP, and lambda cI in Escherichia coli (1-4) have elucidated the molecular mechanisms of gene regulation and paved the way for further investigations in other bacterial species. Gene regulation is considered to play a central role in host-microbe interactions, and many symbiotic and virulence genes are regulated in response to the host (5-8). Mycobacterium tuberculosis is an intracellular pathogen able to replicate in phagocytic cells and to resist many host defenses (9,10). Thus, the dialogue between M. tuberculosis and the host is likely to be very complex and to involve a variety of adaptations and regulatory networks (11). In this regard, analysis of the genome sequence of M. tuberculosis has revealed the existence of over a hundred potential regulatory proteins, 13 sigma factors, and 11 two-component systems (12). In addition, 11 eukaryotic-like serine-threonine kinases, whose function has not yet been characterized, have been identified and may be part of regulatory signal-transduction pathways (13). All these genes may play an important role in the ability of M. tuberculosis to respond to the different environments encountered during infection of the host. Compared with other bacterial genera, few functional data about gene regulation in M. tuberculosis are currently available, owing to the late development of mycobacterial genetics (14,15) and the slow-growing nature of M. tuberculosis. However, many studies have recently appeared, focused especially on sigma factors (16-20) and two-component systems such as mtrA-mtrB (21,22), trcS-trcR (23-25), mprA (26), prrA-prrB (25), devS-devR (27-30), phoP-phoQ (31,32), senX3-regX3 (25,33), and more recently kdpE-kdpD (34).
Some regulators such as IdeR, an iron-responsive DNA-binding protein from the DtxR family (35-37), LexA, involved in DNA repair (38,39), FurA (40,41), and WhiB3 (42) have also been partially characterized.
The genome of M. tuberculosis contains six predicted transcriptional regulators belonging to the AraC family (43): Rv1317c (alkA (44)), Rv1395, Rv1931c, Rv3082c (virS (45,46)), Rv3736, and Rv3833. Members of the AraC family mainly act as activators and only rarely as repressors (47-49). The proteins can be divided into two domains. The N-terminal region, which varies widely among members, is recognized by the effector molecules and is responsible for dimerization. The C-terminal region recognizes and binds the target DNA through two helix-turn-helix motifs that are conserved among organisms and represent the "signature" of the family. Rv1395 is a potential transcriptional regulator of the AraC family. It was identified by signature-tagged mutagenesis in M. tuberculosis, and the Rv1395 insertional mutant was shown to be attenuated in the lungs in the mouse model of tuberculosis (50). Rv1395 can be divided into two domains, and its C-terminal domain has two predicted helix-turn-helix motifs that are recognized as the signature of the AraC family (43). The aim of this study was to identify Rv1395 target genes and to characterize its regulatory activity.
Bacterial Strains-M. smegmatis mc²155 and M. tuberculosis MT103 and H37Rv were grown in Middlebrook 7H9 medium (BD Biosciences) supplemented with albumin/dextrose/catalase (from BD Biosciences) and 0.05% Tween 80, at 37°C. The M. tuberculosis Rv1395⁻ strain is the 2B26 strain isolated by Camacho et al. (50). E. coli DH5α and BL21 were grown at 37°C in L-broth. When required, antibiotics were added at the following concentrations: ampicillin, 100 μg ml⁻¹; kanamycin, 20 μg ml⁻¹; hygromycin, 200 μg ml⁻¹ (for E. coli) or 50 μg ml⁻¹ (for mycobacteria). Competent E. coli cells were prepared by the CaCl₂ method and transformed according to Sambrook et al. (55). Competent M. smegmatis and M. tuberculosis cells were prepared and transformed by electroporation as described by Guilhot et al. (56).
In Vitro Binding Assay-Crude extracts were prepared by resuspending pellets from 2- to 3-week-old M. tuberculosis cultures in 500 μl of lysis buffer (10 mM Tris-HCl buffer, pH 8.0, 1 mM EDTA, 10% glycerol, 1 mM dithiothreitol, 1 mM phenylmethylsulfonyl fluoride) and breaking the bacteria with 500 μl of 0.1-mm glass beads. Soluble proteins were separated from cell debris by centrifugation and quantified with an ESL kit (Roche Applied Science). Radioactive probes were prepared by phosphorylating 10 pmol of one of the primers with [γ-³²P]ATP (3000 Ci/mmol) and T4 polynucleotide kinase (BioLabs) and adding the labeled primer to PCR reactions. Binding assays were carried out in 10 μl of binding buffer (43 mM Tris acetate, pH 8.0, 30 mM potassium acetate, 8 mM magnesium acetate, 27 mM ammonium acetate, 1 mM dithiothreitol, 80 mM KCl, 10% glycerol, 4% polyethylene glycol 8000, 100 μg/ml bovine serum albumin, 100 ng/ml poly(dI-dC)) at 30°C for 30 min and run in 10% polyacrylamide gels. The intensity of the bands was quantified with ImageQuant (Amersham Biosciences).
Purification of Rv1395Cter-All PCRs were carried out with the Pfu Turbo polymerase from Stratagene and with genomic DNA from M. tuberculosis as a template. The C-terminal region of Rv1395 was amplified using primers TR9c (5'-CATGCCATGGCACAATGCGACGTGCTGATG-3') and Tr12bis (5'-GGAATCCATATGCTAGTGGTGATGGTGGTGGTGCCCGCGGCGCGAATATTCGCT-3'), cloned into the NcoI/NdeI sites of pET14b (Novagen), and sequenced. The resulting plasmid was named pET 9c-12b. E. coli BL21 was transformed with pET 9c-12b and induced with 0.4 mM isopropyl-1-thio-β-D-galactopyranoside at A600 = 0.5 for 5 h. Bacteria were harvested by centrifugation, washed, resuspended in lysis buffer (10% glycerol, 0.5 M NaCl, 20 mM Tris-HCl buffer, pH 8.0), and broken by sonication. The resulting pellet (insoluble fraction) was resuspended overnight at 4°C in urea buffer (8 M urea, 0.5 M NaCl, 20 mM Tris-HCl buffer, pH 8.0) at pH 8.5. After centrifugation at 8000 × g, the supernatant was loaded onto an equilibrated nickel-nitrilotriacetic acid-agarose (Qiagen) column. The column was washed with urea buffer (8 M urea, 0.5 M NaCl, 20 mM Tris-HCl buffer, pH 8.0) at pH 6.6, and the protein was eluted with urea buffer (8 M urea, 0.5 M NaCl, 20 mM Tris-HCl buffer, pH 8.0) at pH 4.5. Fractions were first analyzed on gels, then pooled and dialyzed in the presence of 0.05% salmon sperm DNA against phosphate-buffered saline (57). The protein was then centrifuged at 12,000 × g, and the soluble fraction was analyzed: after dialysis, the fractions that had been eluted at pH ≤ 5.5 appeared as a single band in a Coomassie Blue-stained polyacrylamide gel. These fractions were divided into aliquots and stored at −80°C.
Preparation of Anti-Rv1395 Antibodies-Polyclonal anti-Rv1395 antibodies were raised by immunizing rabbits with recombinant Rv1395. The full-length protein was obtained by cloning the PCR product obtained with primers TR9 (5'-CATGCCATGGGACATCTACCGCCTCCGGCC-3') and TR12bis into pET14b (Novagen). The protein was purified as described for the C terminus. The purified protein was highly degraded and showed poor binding capacity. New Zealand rabbits were immunized with 200 μg of recombinant Rv1395; two boosters (200 μg) were given at 2-week intervals. Blood was collected and incubated for 1 h at 37°C and then overnight at 4°C. Sera were recovered by centrifuging the blood at 3000 × g for 20 min.
DNase I Footprinting-The binding assay was performed as described above. DNase I (40 ng) was then added, and the tubes were incubated at 30°C for 2 min. The reaction was stopped by adding 5 μl of formamide loading buffer. Sequencing was performed using the T7 DNA Polymerase Sequencing Kit from USB. Samples were run in a urea-denaturing 7.5% polyacrylamide gel.
Primer Extension-RNA was extracted from late logarithmic-phase cultures of recombinant M. smegmatis and M. tuberculosis H37Rv as follows: culture pellets were resuspended in 1 ml of TRIzol (Invitrogen) and broken with 500 μl of 0.1-mm glass beads for 1 min. Then, 200 μl of chloroform:isoamyl alcohol (49:1) (Ready Red from Appligene) was added, and the mixture was incubated for 3 min at 23°C. Tubes were centrifuged at 15,000 × g for 15 min, and the supernatant was precipitated with 500 μl of isopropanol at 23°C. Pellets were washed with 75% ethanol and resuspended in 50 μl of diethyl pyrocarbonate-treated water (Ambion). The RNA preparations were treated with DNase using the DNA-free kit from Ambion. For primer extension analysis, 2-5 μg of RNA was mixed with 2 pmol of radioactive primer (TR+1 (5'-AGCACCCGGGTCGCATACACC-3') for Rv1395, Cyt+1 (5'-TCATCGTCCAGGTGCTCATCC-3') for Rv1394c, or Gal+1 (5'-GAAGCTTCCGATTCGTAGAGC-3') for β-galactosidase fusions) and reverse-transcribed with Expand Reverse Transcriptase (Roche Applied Science) following the manufacturer's instructions. Samples were run in a urea-denaturing 7.5% polyacrylamide gel.
β-Galactosidase Assay-Recombinant M. smegmatis and M. tuberculosis were grown to late-logarithmic phase, harvested by centrifugation, resuspended in 500 μl of 0.1 M sodium phosphate buffer, pH 7.0, and broken with 0.1-mm glass beads. One hundred microliters of the supernatant fraction was mixed with 600 μl of 4 mM o-nitrophenyl β-D-galactopyranoside, 0.1 M sodium phosphate buffer, pH 7.0, and optical densities were read at 420 nm. β-Galactosidase activity was measured as the increase in A420 × 1000 per mg of protein per minute. Results are the average of at least three independent experiments.
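The unit definition above can be written as a short calculation. The following is a minimal sketch, not the authors' analysis script; the function names and the example numbers are hypothetical and serve only to illustrate the arithmetic.

```python
def beta_gal_units(delta_a420: float, minutes: float, protein_mg: float) -> float:
    """Activity as (increase in A420 x 1000) per mg of protein per minute.

    The argument names are illustrative; the definition of the unit is the
    one given in the text.
    """
    if minutes <= 0 or protein_mg <= 0:
        raise ValueError("minutes and protein_mg must be positive")
    return delta_a420 * 1000.0 / (protein_mg * minutes)


def mean_activity(replicates):
    """Average over independent experiments (the text uses at least three).

    Each replicate is a (delta_a420, minutes, protein_mg) tuple.
    """
    values = [beta_gal_units(*r) for r in replicates]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical readings, not data from the study.
    runs = [(0.30, 10.0, 0.05), (0.28, 10.0, 0.05), (0.33, 10.0, 0.05)]
    print(round(mean_activity(runs), 1))
```

Normalizing by both protein mass and time is what allows fusions assayed in different extracts to be compared on the same scale.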

Rv1395 Is Unique to the Members of the TB Complex
Rv1395 is a potential transcriptional regulator of 344 amino acids of the AraC family. Rv1395 similarity to the AraC family is limited to the last 103 amino acids of the protein, which share about 40% identity and over 60% similarity with the other members of the family. However, Rv1395 is similar (43-50%) over the full-length to some putative transcriptional regulators in Pseudomonas aeruginosa, namely PA2096, PA5324, and PA5032 (60), and to OruR, a P. aeruginosa transcriptional regulator involved in ornithine metabolism (61).
We compared the region encompassing Rv1395 in M. tuberculosis and in the other mycobacteria for which the genome sequence is available. The genomic region of M. tuberculosis H37Rv contained, on each side of Rv1395, two genes widely conserved among prokaryotes (54,62). These were metK (COG 0192), an S-adenosyl methionine synthetase, and lipH (COG 0657), a member of the lipase-esterase family (Fig. 1). In M. tuberculosis, the region between metK (Rv1392) and lipH (Rv1399c) contains six genes possibly encoding a monooxygenase (Rv1393c), a member of the cytochrome P450 family (Rv1394c), Rv1395, a member of the PGRS family (Rv1396c), an unknown product (Rv1397c), and a putative regulator of the CopG family (Rv1398c) (Fig. 1). The two genes encoding the cytochrome P450 and the monooxygenase are probably organized in an operon, because the stop codon of the former overlaps the initiation codon of the latter. We first explored the Rv1395 region of the other members of the M. tuberculosis complex: in M. tuberculosis CDC1551 the region was identical except for a frameshift at the beginning of the PGRS gene; the locus was also conserved in Mycobacterium microti and in M. bovis, although there was a 374-bp deletion in the middle of the PGRS gene of the latter. The metK-lipH region was also present in the vaccine strain M. bovis BCG, but lipH, Rv1395, and the PGRS genes had frameshift mutations at amino acids 84, 329, and 87, respectively. Thus, the Rv1395 region appears to have been conserved throughout evolution in the mycobacteria of the M. tuberculosis complex. We then examined the locus in other mycobacterial species. In M. leprae, described as an extreme case of reductive genomic evolution (63), the region between metK and lipH is completely devoid of any ORF and limited to 663 bp (Fig. 1). In M. avium, the region encodes four potential products: two monooxygenases (one of which is similar to Rv1393c), one protein of unknown function similar to a Streptomyces coelicolor putative protein annotated as SCL24.07, and a potential regulator of the AsnC family (Fig. 1). However, the potential regulator of the AsnC family and Rv1395 do not share any similarity. The region of M. marinum is similar to that of M. avium, but the central region composed of the AsnC regulator and the first monooxygenase is absent (Fig. 1). In M. smegmatis, the region encodes three potential genes (Fig. 1) similar to the E. coli genes encoding the D-methionine ABC transporter (64), which have the same tandem organization in E. coli. No Rv1395 homologues were found elsewhere in the genomes of M. leprae, M. avium, M. marinum, and M. smegmatis. This analysis illustrates the extreme plasticity of the metK-lipH locus in mycobacteria and indicates that Rv1395 is a potential regulator unique to the M. tuberculosis complex.

Rv1395 Binds to the Intergenic Region of Rv1395-Rv1394c by Its C-terminal Moiety
To investigate the DNA-binding properties of Rv1395, a gel mobility assay was developed using crude extracts of M. tuberculosis. Because regulators often act on genes located in the vicinity of their own gene, a binding assay was performed on each of the three intergenic regions unique to the M. tuberculosis complex in the metK-lipH locus (Fig. 1). This experiment was done using protein extracts of M. tuberculosis wild type and Rv1395⁻. No band shift was observed with the putative promoter regions of the PGRS or the CopG regulator (data not shown). In contrast, a clear band shift was observed with the region between Rv1395 and Rv1394c with the extract of wild type M. tuberculosis but not with the Rv1395⁻ extract (Fig. 2A). Two shifted complexes could be seen on the gel. Moreover, these complexes were also observed with extracts of the Rv1395⁻ strain complemented with an intact copy of Rv1395 (Fig. 2A), although the low mobility complex was more abundant than the high mobility complex. The amount of shifted complexes correlated well with the amount of Rv1395 present in the extracts (Fig. 2B). The overexpression of Rv1395 observed in the complemented strain is probably due to a cryptic promoter present in the integrative plasmid used for complementation (58). The complexes were specific, i.e. they were lost following the addition of an excess of cold probe and conserved in the presence of an excess of nonspecific DNA (Fig. 2C). The two complexes were lost at different concentrations of the cold probe, indicating that the bound proteins had different affinities for the DNA. This result indicates that Rv1395 itself or another molecule regulated by Rv1395 is responsible for the observed complexes. To test whether Rv1395 interacted directly with the region between Rv1395 and Rv1394c, we attempted to purify Rv1395 from an overproducing E. coli strain. However, proteins of the AraC family are often insoluble when overproduced in E. coli.
To circumvent this problem, we reasoned that a single domain of Rv1395 might remain active in solution more easily. The helix-turn-helix-containing C-terminal domain was thus chosen for purification. This domain (Rv1395Cter, amino acids 229-344) was His-tagged, overproduced in E. coli, and purified from inclusion bodies in denaturing conditions. Rv1395Cter was then resolubilized and kept in solution in the presence of nonspecific DNA using the method of Egan et al. (57). The soluble protein was tested in a binding assay. Rv1395Cter efficiently bound to the Rv1395-Rv1394c intergenic region, and a shifted product was obtained with as little as 1 nM protein (Fig. 2D). As with the crude extract, two shifted products were observed with the purified Rv1395Cter. Quantification of these data (Fig. 2E) resulted in a Kd = 5 nM for the high mobility, high affinity complex and a Kd = 125 nM for the low mobility, low affinity complex. These complexes were specific, i.e. lost following the addition of an excess of cold probe and conserved in the presence of an excess of nonspecific DNA (data not shown). The second, less mobile complex suggests that a second Rv1395Cter molecule binds either to another site in the DNA or to the bound protein. The complexes observed with Rv1395Cter were more mobile than those obtained with the crude extracts. This may be explained by the difference in size between the full-length protein present in the crude extracts and the purified C-terminal domain. In addition, another molecule present in the extracts may bind to the DNA. These data demonstrate that Rv1395 binds directly and specifically to the region between Rv1395 and Rv1394c through its C-terminal domain.
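The text does not state how the Kd values were derived from the band intensities. As a hedged illustration only, the sketch below assumes a simple single-site isotherm, fraction bound = [P] / (Kd + [P]), with protein in large excess over the labeled probe, and estimates Kd by a least-squares scan over a log-spaced grid. The function names, the grid bounds, and the synthetic data are assumptions made for this example and are not taken from the study.

```python
def fraction_bound(protein_nM: float, kd_nM: float) -> float:
    """Single-site binding isotherm (assumes protein in excess over probe)."""
    return protein_nM / (kd_nM + protein_nM)


def estimate_kd(protein_nM, bound, lo=0.01, hi=10000.0, steps=4000):
    """Return the Kd (nM) on a log-spaced grid that minimizes squared error.

    lo, hi, and steps are arbitrary illustrative choices.
    """
    best_kd, best_sse = lo, float("inf")
    for i in range(steps + 1):
        kd = lo * (hi / lo) ** (i / steps)
        sse = sum((fraction_bound(p, kd) - f) ** 2
                  for p, f in zip(protein_nM, bound))
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


if __name__ == "__main__":
    # Synthetic, noise-free titration generated with Kd = 5 nM (not real data).
    concentrations = [0.5, 1, 2, 5, 10, 25, 50, 100]
    observed = [fraction_bound(c, 5.0) for c in concentrations]
    print(round(estimate_kd(concentrations, observed), 2))
```

Under this model the Kd is the protein concentration at which half of the probe is shifted. The second, low affinity complex would need a sequential two-site model, which this sketch does not attempt.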

Identification of the Binding Sites of Rv1395
The very high affinity of Rv1395 for its target DNA enabled us to characterize its binding site. We carried out a DNase I protection assay using the purified Rv1395Cter or a crude extract of M. tuberculosis. When the intergenic region was incubated with various concentrations of purified Rv1395Cter and then treated with DNase I, two regions were protected (Fig. 3A). A first protected region of 26 bp (5'-caaagtgttgtgagtttaggacagcc-3') appeared with the same affinity as the first, high affinity complex in the gel-shift assay. A second region of 16 bp (5'-aaatgagactatggga-3') was protected at higher protein concentrations and corresponded to the second, low affinity shifted complex in the binding assay. This result shows that Rv1395 binds to two sites in the Rv1395-Rv1394c intergenic region with different affinities. The first region has two hypersensitive sites, whereas the second region has only one (Fig. 3A). Hypersensitive sites indicate increased exposure of DNA to DNase, and they are usually due to the bending of the DNA caused by the bound molecule (65).
To test whether the protected regions defined using the purified C-terminal domain were also protected by the entire protein, a DNase I-footprinting experiment was carried out using crude extracts of M. tuberculosis. We used the complemented strain that overexpresses Rv1395, because the signal with the wild type strain was too weak. The protected region obtained with the crude extract was similar to that obtained with the C-terminal domain of Rv1395 but longer (Fig. 3B). Indeed, a further 8-bp region (5'-ctggcccg-3') was protected upstream from the second region. This could be because the entire protein covers a wider region or, alternatively, because another molecule present in the extract binds to this region. However, this region was not protected with extracts from the Rv1395 mutant strain (data not shown). Furthermore, the first protected region has only one of the two hypersensitive sites observed with the purified protein (Fig. 3B). Again, this could be due to the different length of Rv1395 in the two assays or to the interaction with another molecule that modifies the DNA-bending properties of Rv1395. This result confirms that Rv1395 binds essentially to the same regions both when it is purified in denaturing conditions from an overproducing E. coli strain and when it is found in extracts from mycobacteria.

Identification of the Transcriptional Start Points
As shown above, Rv1395 efficiently binds to the region between Rv1395 and Rv1394c. Because the binding sites are close to the predicted start codons of both the cytochrome P450 and Rv1395 (Fig. 4C), Rv1395 probably binds within both promoter regions and affects their transcription level. To define the promoter regions, we used primer extension analysis to locate the transcriptional start points of Rv1395 and Rv1394c in M. tuberculosis (Fig. 4, A and B, respectively). Both genes were transcribed from two promoters; P1 denotes the promoter that is closer to the predicted ATG. Curiously, P1 coincides with the predicted start codon in both genes (Fig. 4C), a phenomenon that has already been observed in M. tuberculosis (22,40,59), in M. leprae (66), and in other Actinomycetales (67). However, it is possible that the real start codon is located further downstream. Indeed, other codons could act as start codons: two methionine codons are located at positions 17 and 22 relative to the predicted ATG in Rv1394c, and six valine codons are found among the first 34 predicted amino acids of Rv1395. It should be noted that the region of similarity between Rv1395 and P. aeruginosa proteins PA5324 and PA5032 starts at amino acid 15 and that the 14th amino acid is a valine, suggesting that this valine codon could be the real start codon.

Characterization of the Transcriptional Properties of Rv1395
Because M. smegmatis is phylogenetically close to M. tuberculosis and devoid of the Rv1395 region, this species was used as a surrogate host to study the transcriptional properties of Rv1395. The gene encoding Rv1395 was integrated into the M. smegmatis chromosome using the same integrative plasmid used to complement the M. tuberculosis mutant strain. Crude extracts were prepared from wild type M. smegmatis and M. smegmatis:Rv1395 and tested in a binding assay with the intergenic region as a probe: two shifted complexes were observed with extracts of the recombinant M. smegmatis but not with extracts of the wild type (data not shown). The complexes were the same sizes as those observed with M. tuberculosis extracts. This result shows that Rv1395 is expressed in M. smegmatis:Rv1395 and that it retains its DNA-binding activity. M. smegmatis:Rv1395 and wild type M. smegmatis were then used to study the impact of Rv1395 binding on the expression level of both Rv1394c and Rv1395.
Rv1395 Is an Activator of the Cytochrome P450 cyp132 Gene-We studied the effect of Rv1395 on the expression of the cytochrome P450 gene by constructing transcriptional fusions with the β-galactosidase reporter gene. Two fusions were constructed (Fig. 5I, panel a): a short one including only the P1Cyt region (pCyt) and a long one including both P1Cyt and P2Cyt (pCyt-L). The long fusion was used to investigate whether Rv1395 could activate P2Cyt from a downstream position, as has been described for Rns (68) and other regulators of the AraC family (69). In this regard, Rv1395 bound neither upstream from P2Cyt nor downstream from P1Cyt (data not shown).
Wild type M. smegmatis and M. smegmatis:Rv1395 were transformed with pCyt and pCyt-L, and the amount of β-galactosidase produced was measured. As seen in Fig. 5I (panel b), the two promoters had very different basal activities in wild type M. smegmatis, with P2Cyt being 16 times stronger than P1Cyt. However, the activity of P1Cyt (73 ± 17 units) was still significant, because it was well above that of the empty vector (1 ± 1 units) (data not shown). Thus, both M. tuberculosis promoters are recognized by the M. smegmatis transcriptional machinery. When Rv1395 was added in trans, the activity of P1Cyt increased 17-fold in M. smegmatis:Rv1395 (pCyt) (Fig. 5I, panel b). An increase in activity was also observed with the pCyt-L construct in the presence of Rv1395, but this increase was mainly due to the induction of P1Cyt. This indicates that Rv1395 induces the expression of the cytochrome gene by activating P1Cyt and that it has no major effect on P2Cyt expression. To confirm this observation, we used primer extension to measure the amount of RNA transcripts in M. smegmatis (pCyt-L) and M. smegmatis:Rv1395 (pCyt-L) (Fig. 5I, panel c): P1Cyt was clearly induced in the presence of Rv1395.
Rv1395 Represses Its Own Expression-To study the effect of Rv1395 on its own expression, two fusions were made (Fig. 5II, panel a): a short one containing only P1Rv1395 (pTR, where TR stands for transcriptional regulator) and a long one comprising both P1Rv1395 and P2Rv1395 (pTR-L). Wild type M. smegmatis and M. smegmatis:Rv1395 were transformed with the two constructs, and the β-galactosidase activity was measured. In wild type M. smegmatis, the promoter activity of P2Rv1395 was 15 times stronger than that of P1Rv1395 (Fig. 5II, panel b). In the recombinant M. smegmatis:Rv1395, both promoters were repressed by Rv1395. P1Rv1395 activity was repressed by only 20%, whereas P2Rv1395 activity was repressed by 74% (Fig. 5II, panel b). Thus, Rv1395 is a repressor of its own expression, a behavior that is common among transcriptional regulators.
Rv1395 Activity in M. tuberculosis-To confirm that the situation is the same in M. tuberculosis as in M. smegmatis, the same fusions were electroporated into wild type M. tuberculosis and into an Rv1395-overexpressing strain to compare the effects of basal and high Rv1395 expression. Rv1395 was overexpressed in M. tuberculosis by means of the same integrative plasmid previously used in M. smegmatis:Rv1395. The results were very similar to those observed in M. smegmatis (Fig. 5III). Both P2 promoters had higher activity than their respective P1 promoters, and P1Cyt activity increased following Rv1395 overexpression (Fig. 5III, panel a), whereas P1Rv1395 and P2Rv1395 activities were reduced by 43 and 74%, respectively (Fig. 5III, panel b). Thus, the regulatory properties of Rv1395 in M. tuberculosis are the same as those observed in M. smegmatis.

The Binding to the High and Low Affinity Sites Is Cooperative and Necessary for the Full Activity of Rv1395
To define the binding properties of the two binding sites recognized by Rv1395 and their role in the induction/repression activity of Rv1395, each site was mutated separately. As the specific bases within the protected regions recognized by Rv1395 were not known, we decided to mutate 19 bp in the high affinity site (giving rise to mutH) and 15 bp in the low affinity site (named mutL) (Fig. 6A). We simply replaced purines by purines and pyrimidines by pyrimidines. The two mutated regions, mutL and mutH, were first analyzed by binding assay with crude extracts: mutL showed a single shifted band similar to the lower complex observed with the wild type region, whereas mutH did not show any shifted complexes at all, even with the extracts from the Rv1395-overexpressing strain (Fig. 6B). This suggests that Rv1395 binds to the high affinity site independently of the presence of the low affinity site, whereas binding to the low affinity site is cooperative and requires the presence of the other molecule bound to the high affinity site. We also analyzed the two regions by binding assay with Rv1395Cter. In both cases only one shifted complex was obtained (data not shown). The binding affinity for mutL was the same as for the "wild type" fragment, with a Kd of 5 nM (data not shown). In contrast, in mutH, the Kd for the low affinity site increased by one order of magnitude (data not shown). This confirms that, in the absence of the high affinity site, Rv1395 binds very poorly to the low affinity site. This cooperation could be explained either by a direct interaction between the proteins bound to the two sites or by the molecule bound to the first site bending the DNA so as to increase the affinity of the second site. Indeed, the cooperativity between the two sites may be higher in the presence of the N-terminal domain, because this region is often responsible for the dimerization of transcription factors.
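Replacing purines by purines and pyrimidines by pyrimidines amounts to introducing transitions (A to G, G to A, C to T, T to C). The sketch below illustrates that substitution rule only. The function name is hypothetical, and the positions actually mutated in mutH and mutL are those of Fig. 6A, which are not reproduced here; by default the sketch simply mutates every position of the sequence it is given.

```python
# Transition table: each base is swapped for the other base of its class.
TRANSITION = {"a": "g", "g": "a", "c": "t", "t": "c"}


def transition_mutate(site: str, positions=None) -> str:
    """Return `site` with a transition at each chosen 0-based position.

    positions=None mutates every base; this default is an illustrative
    choice, not the scheme used for mutH or mutL.
    """
    site = site.lower()
    chosen = set(range(len(site)) if positions is None else positions)
    return "".join(TRANSITION[base] if i in chosen else base
                   for i, base in enumerate(site))


if __name__ == "__main__":
    low_affinity_site = "aaatgagactatggga"  # 16-bp protected region from the text
    print(transition_mutate(low_affinity_site))
```

Because every base stays within its chemical class, this kind of substitution changes the sequence read by the protein while keeping the purine/pyrimidine pattern and the overall base composition class of the fragment.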
These mutations were also inserted into the vectors pCyt and pTR-L to study their effect on the regulatory activity of Rv1395. The resulting plasmids pCyt-MutL, pCyt-MutH, pTRL-MutL, and pTRL-MutH were electroporated into M. smegmatis and M. smegmatis:Rv1395. In the case of the cytochrome P450, the mutations did not alter the basal promoter activity of P1Cyt, but they significantly modified the induction by Rv1395 (Fig. 6C). Interestingly, in the Cyt-MutL fusion, where Rv1395 binds only to the high affinity site, the cytochrome was induced only 2-fold (Fig. 6C), whereas in the "wild type" construct the cytochrome was induced 17-fold in the presence of the regulator (see Fig. 5I, panel b). This decrease in the induction level indicates that both sites are required for cytochrome activation and that, if Rv1395 binds solely to the high affinity site, it cannot produce optimal induction. On the other hand, no induction was observed in the Cyt-MutH fusion (Fig. 6C). This can easily be explained by the fact that, as shown previously by the binding assay with crude extracts, Rv1395 does not bind to the low affinity site when the high affinity site is mutated, so that no Rv1395 molecule is bound to the mutH region. The same reasoning may also explain the results in M. smegmatis:Rv1395 pTRL-MutH (Fig. 6D), where the fusion was not repressed by the presence of Rv1395, probably because Rv1395 did not bind to the mutated DNA. It should be noted that the mutH mutation did not modify P2Rv1395 basal activity. In contrast, the mutL mutation reduced the basal activity of P2Rv1395 by more than 10-fold (Fig. 6D). This result was quite unexpected, because this last mutation is located several base pairs downstream from the transcriptional start point of P2Rv1395. However, in the mutL fusion the binding of Rv1395 to the high affinity site did not repress this weaker P2Rv1395 activity. This observation confirms that both sites are required for optimal Rv1395 activity.

Study of the Recognition Motif for Rv1395
The experiments with the mutated fragments mutL and mutH showed that the nucleotides contained in the two binding sites are necessary for recognition by Rv1395. To identify the recognition motif of Rv1395, the sequences of the two sites (shown in Fig. 4C) were searched for conserved patterns. Relative to the hypersensitive sites, a conserved TGTGA sequence was observed repeated symmetrically in the first region (agtgt/Tgtga) and with one mismatch in the second region (Tgaga), where the capital T is the hypersensitive site. However, regulators usually recognize direct or inverted repeats and not symmetrical elements. Another motif common to the two sites was TGAG-n-GGACA, where n = 4 in the high affinity region and n = 6 in the low affinity region. To determine whether this motif was sufficient for Rv1395 binding and could allow the direct identification of other putative targets, we searched for the TGAG-n-GGACA motif and its reverse complement TGTCC-n-CTCA in the 200-bp region upstream from all the annotated ORFs of the M. tuberculosis genome. Interestingly, seven genes presented the TGAG-n-GGACA motif and five genes presented its complementary sequence between 27 and 179 bp upstream from their predicted ATG (Table I). However, no genes contained multiple copies of the repeat in the promoter region. Three genes were chosen for further analysis: Rv2728c, Rv2781c, and Rv2322c, with n = 4, 5, and 6, respectively (Table I). Different values of n were chosen to investigate whether the number of bases between the two parts of the motif could explain the different affinities of Rv1395 for the two sites. Furthermore, Rv2322c encodes an ornithine aminotransferase, and Rv1395 is similar to OruR, a transcriptional regulator involved in ornithine metabolism in P. aeruginosa. The promoter regions of the three genes were amplified and tested in binding assays with recombinant Rv1395Cter or with crude extracts.
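The motif search described above can be sketched with a regular expression. This is an illustration under stated assumptions, not the procedure used in the study: the spacer range of 4 to 6 bases is assumed from the values of n quoted in the text, the function names are hypothetical, and the upstream sequences would have to be supplied by the reader.

```python
import re


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (case is preserved)."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]


def motif_pattern(min_gap: int = 4, max_gap: int = 6):
    """Regex for TGAG-n-GGACA or its reverse complement TGTCC-n-CTCA.

    The spacer range is an assumption; a lookahead is used so that
    overlapping occurrences are also reported.
    """
    fwd = rf"TGAG[ACGT]{{{min_gap},{max_gap}}}GGACA"
    rev = rf"TGTCC[ACGT]{{{min_gap},{max_gap}}}CTCA"
    return re.compile(f"(?=({fwd}|{rev}))", re.IGNORECASE)


def find_motifs(upstream: str):
    """Return (0-based start, matched sequence) for each motif occurrence."""
    return [(m.start(), m.group(1)) for m in motif_pattern().finditer(upstream)]


if __name__ == "__main__":
    high_affinity_site = "caaagtgttgtgagtttaggacagcc"  # 26-bp site from the text
    print(find_motifs(high_affinity_site))
```

Run on the 26-bp high affinity site, the search reports a single occurrence with a 4-base spacer. Scanning the 200 bp upstream of each annotated ORF would apply `find_motifs` to each upstream sequence in turn.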
However, no binding was observed in any case (data not shown). This suggests that the TGAG-n-GGACA motif is not sufficient per se for recognition by Rv1395 and that other adjacent bases are also required for the high specificity of Rv1395 for the Rv1394c-Rv1395 region.

DISCUSSION
Sequencing of the M. tuberculosis genome revealed that this pathogen possesses more than 100 putative transcriptional regulators. Nevertheless, only a few regulators have been characterized to date. Here we report the initial characterization of Rv1395, a transcriptional regulator unique to the mycobacteria of the M. tuberculosis complex. Indeed, our comparative genomic approach showed that the locus containing Rv1395 has been conserved throughout evolution in the M. tuberculosis complex, whereas it is extremely heterogeneous in other mycobacteria such as M. leprae, M. avium, M. marinum, and M. smegmatis. Rv1395 belongs to the AraC family of transcriptional activators, a growing family that currently contains over eight hundred members (48). Members of this group have been found in both Gram-positive and Gram-negative bacteria as well as in cyanobacteria, but not in archaeobacteria or in eukaryotes. Proteins belonging to this family are involved in three main regulatory functions: carbon metabolism, stress responses, and pathogenesis (47). Most members are transcriptional activators and consist of two domains: the conserved domain (PROSITE number: PS01124, www.expasy.org/prosite/) is typical of the family and is characterized by two helix-turn-helix motifs responsible for the recognition of and binding to the target DNA. In most cases, this conserved domain is located in the C-terminal region and is connected to a non-conserved domain via a linker. Similarly, Rv1395 consists of two domains, and the C terminus contains two predicted helix-turn-helix motifs (amino acids 262-282 and 310-332).
We used both biochemical and genetic approaches to show that Rv1395 is a transcriptional regulator and binds to DNA by its C-terminal domain. Rv1395 binds with different affinity to two adjacent sites located between two divergent genes, a cytochrome P450 gene (Rv1394c or cyp132) and the Rv1395 gene itself. We studied the promoter regions of both genes, and we identified two transcriptional start points for each gene, a situation already described for other mycobacterial genes (66,70,71). Their putative −10 and −35 regions were located (Table II) and compared with the consensus sequences described previously for mycobacteria (72). Mycobacterial consensus sequences are theoretically identical to the E. coli consensus sequences TATAAT and TTGACA, but they have a tendency to deviate toward a higher G/C content due to the high G/C percentage of mycobacterial genomes (72). The −10 and −35 hexamers identified in Rv1395 promoters (Table II) shared three out of six bases with the consensus TATAAT and TTGACA, with two or three A/Ts replaced by G/Cs per box as expected. Strikingly, the P1 promoter of the cytochrome gene had a −10 box (TACAGT) (Table II) that differed by only one base from the −10 box (TACACT) of the blaF* (59) and the pAN (73) promoters. These two promoters are particularly strong in M. smegmatis, whereas this is not the case for P1 Cyt. This could be due either to the difference in the fifth base of the −10 box or to the effect of the −35 boxes. On the other hand, the −10 element of P2 Cyt (Table II) was particularly rich in G/C (CGCCGT) and shared four bases with the −10 box described for the Antigen 85A (CGCCTG) (74). This suggests that the −10 box can be richer in G/C than expected and still be perfectly functional in mycobacteria. This is not surprising because M. tuberculosis has 13 sigma factors that are thought to recognize a variety of sequences that could also be very different from the classic E. coli consensus.
Indeed, the −10 and −35 boxes identified to date vary largely, and the −35 region in particular tolerates a high level of sequence diversity (75). Recently, the consensus sequences for SigH (17,18) and SigE (16) have been described. However, Rv1394c and Rv1395 promoters do not fit with these elements, suggesting that they are probably recognized by other sigma factors. The −10 regions of Rv1394c and Rv1395 promoters were also investigated for the "extended −10" or TGN motif (76), but none were found.
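The hexamer comparisons quoted above (number of shared positions and G/C content) are simple position-by-position counts. The following sketch, with the hypothetical helper names `shared_positions` and `gc_count`, shows one way such counts could be reproduced; it is an illustration and not the method used to compile Table II.

```python
def shared_positions(box_a, box_b):
    """Count positions at which two equal-length promoter boxes carry the same base."""
    if len(box_a) != len(box_b):
        raise ValueError("boxes must have the same length")
    return sum(a == b for a, b in zip(box_a.upper(), box_b.upper()))


def gc_count(box):
    """Count the G and C bases in a promoter box."""
    return sum(base in "GC" for base in box.upper())


# -10 box of P1 Cyt against the -10 box of the blaF* and pAN promoters
print(shared_positions("TACAGT", "TACACT"))  # 5 shared bases, i.e. one difference
# -10 box of P2 Cyt against the -10 box described for Antigen 85A
print(shared_positions("CGCCGT", "CGCCTG"))  # 4 shared bases
print(gc_count("CGCCGT"))                    # 5 of 6 bases are G or C
```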
Here we show that Rv1395 is an activator of the cytochrome P450 gene Rv1394c and that it acts on its P1 promoter (Fig. 7). Rv1395 binds to two sites relative to P1 Cyt: the high affinity site is located upstream from the −35 box, around position −62.5, whereas the low affinity site is centered around position −41.5 and partially overlaps the −35 region (Fig. 7). These are the positions typically occupied by regulators of class I and class II promoters, respectively (77)(78)(79). In class I promoters, the activator molecule binds upstream from the −35 region, typically at −62, where it makes contacts between its "activating region" (AR1) and the αCTD domain of the RNA polymerase (79). This interaction increases the binding constant of the RNA polymerase. In class II promoters, the activator molecule binds around −42 and overlaps the −35 region, where it can make multiple interactions with the αCTD domain, the αNTD domain, and the region 4 of the sigma factor via its AR1, AR2, and AR3 (49). Combinations of class I and class II promoters give rise to class III promoters, which require multiple activators for their full induction (79). As binding of Rv1395 to both sites is required for the optimal activation of the cytochrome gene, the P1 Cyt promoter appears to be a class III promoter and, consequently, Rv1395 a class III activator.
TABLE I. The TGAG-n-GGACA motif and its complement, TGTCC-n-CTCA, were sought in the M. tuberculosis genome. The TGAG-n-GGACA motif was found in the promoter regions of the first five genes listed in the table and the TGTCC-n-CTCA motif in the following seven genes. The positions of the motifs are indicated relative to the predicted start codon of the respective genes. "n" indicates the number of bases comprised between the two parts of the motif. The predicted function is reported as annotated at genolist.pasteur.fr/tuberculist/.
Given these observations and the positions of the two sites relative to the P1 Cyt promoter, it is tempting to propose the following model: Rv1395 binds first to the high affinity site, around −62.5, from where it either directly recruits another Rv1395 molecule or bends the DNA to increase the affinity of the second site (the binding is cooperative). As a consequence, a second molecule of Rv1395 binds to the low affinity site at −41.5. The two bound molecules can then interact with the RNA polymerase holoenzyme and induce the optimal transcription of the cytochrome gene. In support of this model, we showed that, if Rv1395 binds only to the −62.5 site, it induces transcription poorly: in this case the RNA polymerase would interact with only one Rv1395 molecule, whereas class III promoters require all activators at the same time.
Rv1395 induces the transcription of the cytochrome P450 Cyp132. Cytochrome P450s constitute a very large family of heme-thiolate monooxygenases, which are found in all living organisms (80). These enzymes oxidize a variety of compounds, including steroids, fatty acids, and xenobiotics (81). Twenty cytochrome P450 genes are present in the M. tuberculosis genome (12), and they are thought to play an important role in the rich lipid metabolism typical of mycobacteria. However, it is generally impossible to predict the nature of the substrates oxidized by cytochrome P450s on the basis of the amino acid sequence only. Therefore, the role of Cyp132 in M. tuberculosis metabolism remains to be defined.
We have also shown that, like most transcriptional regulators, Rv1395 represses its own transcription. Indeed, when Rv1395 binds to the two sites and recruits the RNA polymerase for transcription at P1 Cyt, it prevents any further RNA polymerase binding at P1 Rv1395 and at P2 Rv1395 due to the steric hindrance of the RNA polymerase holoenzyme bound at P1 Cyt (Fig. 7). Furthermore, because both sites are necessary for repression and one site has a low affinity, Rv1395 has to reach a certain concentration before it can occupy both sites at the same time and thus act as a repressor. This system makes it possible to regulate the amount of Rv1395 present in the cell.
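The concentration dependence described above can be made concrete with a generic two-site equilibrium binding model. This is a textbook-style sketch and not a model fitted to our data: the function name `both_sites_occupied`, the dissociation constants `k_high` and `k_low`, and the cooperativity factor `omega` are hypothetical parameters introduced only for illustration.

```python
def both_sites_occupied(conc, k_high, k_low, omega=1.0):
    """Fraction of DNA molecules with both sites bound at free regulator
    concentration `conc`.

    k_high and k_low are hypothetical dissociation constants of the high and
    low affinity sites (same units as conc); omega > 1 represents cooperative
    binding of the second molecule, omega = 1 independent binding.
    """
    a = conc / k_high          # statistical weight: only the high affinity site bound
    b = conc / k_low           # statistical weight: only the low affinity site bound
    both = omega * a * b       # statistical weight: both sites bound
    return both / (1.0 + a + b + both)


# Arbitrary illustrative values: the low affinity site binds 10 times more weakly.
for conc in (0.1, 1.0, 10.0, 100.0):
    print(conc, both_sites_occupied(conc, k_high=1.0, k_low=10.0, omega=5.0))
```

Under these assumptions, double occupancy stays low until the regulator concentration approaches the dissociation constant of the weaker site, and cooperativity shifts that threshold toward lower concentrations, which is the qualitative behavior the autoregulation argument relies on.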
On the other hand, the signals that might induce Rv1395 transcription are not known. Indeed, regulators are often regulated at the transcriptional level by other regulators. The fact that the basal transcription level of P1 Rv1395 was reduced more than 10-fold in the mutL mutant (Fig. 6D) suggests that there could be an activator site just downstream from the promoter (alternatively, the mutL mutation could simply prevent the normal activity of the RNA polymerase).
The non-conserved domain is critical for signal recognition in members of the AraC family that are activated by the binding of an effector molecule. This is the case for AraC and arabinose (82) or UreR and urea (83), for example. In addition to the interaction with small molecules, the non-conserved domain may establish productive contacts with the RNA polymerase holoenzyme and increase the affinity constant or the isomerization of the closed complex into an open complex. Furthermore, this region is often responsible for dimerization and increases the cooperativity of the binding to adjacent sites. However, the similarity of the Rv1395 N terminus is restricted to some Pseudomonas putative proteins that have not yet been characterized and to the N-terminal domain of OruR, whose role and function are still unknown. This makes it difficult to predict the nature of the signal to which Rv1395 may respond and the role of this domain in modulating Rv1395 activity.
FIG. 7. Schematic representation of the mode of action of Rv1395. The upper part shows the transcription levels of the cytochrome P450 gene and of Rv1395 at low concentrations of Rv1395: both distal promoters are stronger than the proximal ones. The lower part shows in gray the activation of P1 Cyt and the repression of P1 Rv1395 and P2 Rv1395 when Rv1395 is bound to the high and the low affinity sites. The numbers indicate the positions of the binding sites relative to P1 Cyt. The width of the arrows indicating the transcripts is proportional to the transcription level at the relative promoters.
Thus the next step will be to investigate the regulation of Rv1395 transcription and activity and to determine the role of the cytochrome P450 Rv1394c in M. tuberculosis physiology and virulence.