A Universal Stress Protein (USP) in Mycobacteria Binds cAMP

Background: Mycobacteria utilize cAMP for regulating transcription and protein acetylation. Results: We identify and structurally characterize a universal stress protein as an abundant and specific cAMP-binding protein. Conclusion: The USP domain is a novel cAMP-binding module. Significance: Certain members of the universal stress protein family serve to sequester cAMP, acting as sinks to regulate the availability of cAMP for downstream effectors.
Mycobacteria are endowed with rich and diverse machinery for the synthesis, utilization, and degradation of cAMP. The actions of cyclic nucleotides are generally mediated by binding of cAMP to conserved and well characterized cyclic nucleotide binding domains or structurally distinct cGMP-specific and -regulated cyclic nucleotide phosphodiesterase, adenylyl cyclase, and E. coli transcription factor FhlA (GAF) domain-containing proteins. Proteins with cyclic nucleotide binding and GAF domains can be identified in the genome of mycobacterial species, and some of them have been characterized. Here, we show that a significant fraction of intracellular cAMP is bound to protein in mycobacterial species, and by using affinity chromatography techniques, we identify specific universal stress proteins (USP) as abundantly expressed cAMP-binding proteins in slow growing as well as fast growing mycobacteria. We have characterized the biochemical and thermodynamic parameters for binding of cAMP, and we show that these USPs bind cAMP with a higher affinity than ATP, an established ligand for other USPs. We determined the structure of the USP MSMEG_3811 bound to cAMP, and we confirmed the residues important for cAMP binding through structure-guided mutagenesis. This family of USPs is conserved in all mycobacteria, and we suggest that they serve as "sinks" for cAMP, making this second messenger available for downstream effectors as and when ATP levels are altered in the cell.

individual organisms (2), but their presence across the different kingdoms of life has allowed pathogens to exploit these pathways during their infective as well as disease-causing life cycles (2-7). The actions of cAMP and cGMP are mediated by downstream effector proteins (8). These proteins usually harbor an evolutionarily conserved cyclic nucleotide binding (CNB) domain (9) fused to an effector domain that can either have catalytic activity (10), regulate the transport of ions and small molecules (11), or bind DNA (12). Structural studies have provided details of the mode of cyclic nucleotide binding to the CNB domain, as well as the mechanism of allosteric regulation of the fused effector domain following binding of the cyclic nucleotide to the CNB domain (8). Apart from the CNB domain, GAF (cGMP-specific and -regulated cyclic nucleotide phosphodiesterase, adenylyl cyclase, and E. coli transcription factor FhlA) domains have also been shown to bind cAMP or cGMP (13,14). GAF domains are structurally distinct from the CNB domain, and the mechanisms of allosteric regulation of the downstream catalytic domains following cyclic nucleotide binding are also understood (14,15).
Mycobacterium tuberculosis is the causative agent of the severe pulmonary disease, tuberculosis. With the availability of the genome sequences of various species of mycobacteria, it was recognized that these bacteria harbor a number of genes that encode adenylyl cyclases (16-18). Indeed, intracellular levels of cAMP can reach hundreds of micromolar (19,20). Cyclic AMP is also effectively secreted by mycobacteria (19,21,22), perhaps modulating host macrophage function during disease establishment and the formation of the caseous granuloma containing dormant mycobacteria. In concert with the high levels of cAMP, the genome of M. tuberculosis H37Rv encodes 10 proteins that harbor the CNB domain, and two proteins containing the GAF domain (5). Both M. tuberculosis and the fast growing Mycobacterium smegmatis express proteins that are similar to the bacterial transcription factor, CRP, or cAMP receptor protein (4,23,24). Mycobacterial CRPs contain a CNB domain, but cAMP binding is not essential for binding DNA, in contrast to the well studied Escherichia coli CRP (23-25). This perhaps is an adaptation to the normally high levels of cAMP in these bacteria. In a previous study, we also characterized unique proteins where a cAMP binding domain is fused to a Gcn5-related N-acetyltransferase-like domain, and we showed that mycobacteria regulate protein acylation in a cAMP-dependent manner (26,27). Although bioinformatic and therefore predictive approaches to identify putative cyclic nucleotide-binding proteins have proved valuable so far, in this study we chose to use an unbiased biochemical approach to identify novel cAMP-binding proteins, beyond those homologous to established ones. Using cAMP affinity chromatography, we have isolated and characterized a universal stress protein (USP), Rv1636, which binds cAMP specifically and with high affinity and ATP with lower affinity. We show that orthologs of this protein are conserved in both fast and slow growing mycobacteria, and we have structurally characterized the cAMP-bound form of this USP. The absence of an additional domain associated with the USP domain suggests that these proteins may function as sinks for cAMP, thereby regulating downstream actions of the cyclic nucleotide in the cell.

EXPERIMENTAL PROCEDURES
Mycobacterial Strains and Culture Conditions-M. smegmatis mc²155 cells were grown in Middlebrook 7H9 medium (BD Biosciences) supplemented with 0.2% glycerol and 0.05% Tween 80 at 37°C with shaking at 200 rpm. Mycobacterium bovis BCG cultures were grown in the same medium containing oleic acid/albumin/dextrose/catalase (OADC, Difco) supplement at a final concentration of 10% (v/v) in static cultures.
Measurement of Free and Bound cAMP-Both M. smegmatis and M. bovis BCG cultures were harvested at log (A600 ~1) and early stationary (A600 ~3) phases. For M. smegmatis, 9 ml of culture was used in the log phase and 3 ml in the stationary phase. For M. bovis BCG, double the volumes were taken at each stage of growth. The cells were pelleted, and the supernatants were collected for measurement of extracellular cAMP. The cell pellets were washed with cold phosphate-buffered saline and resuspended in 400 µl of lysis buffer containing 50 mM Tris-Cl (pH 8.2), 100 mM NaCl, 10 mM 2-mercaptoethanol (2-ME), 10% glycerol, and 1 mM phenylmethylsulfonyl fluoride (PMSF). Cells were lysed by bead beating and centrifuged at 1000 × g, followed by harvesting of the supernatant and further centrifugation at 17,000 × g at 4°C. Protein (400 µg) in the supernatant (cytosolic fraction) was subjected to centrifugation through a 3-kDa cutoff membrane filter (Amicon Ultra-0.5 3-kDa Ultracell, Millipore) at 4°C. Cyclic AMP was measured in the eluate (which would represent "free" cAMP) and the original cytosolic fraction (representing "total" cAMP). Subtraction of the free cAMP from the total would provide an estimate of the fraction of cAMP that was "bound" to protein.
Culture supernatant was also subjected to centrifugation through the 3-kDa cutoff membrane filter, and the eluate was used to measure the free cAMP. Neat culture supernatant was used to measure the total extracellular cAMP concentrations. Cyclic AMP was measured by radioimmunoassay following acidification of the samples, as described previously (19).
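The subtraction described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `bound_camp` and the example readings are hypothetical and are not measurements from this study.

```python
def bound_camp(total, free):
    """Estimate protein-bound cAMP as total minus free (same units for both).

    Returns (bound amount, bound fraction of total).
    """
    if total < 0 or free < 0:
        raise ValueError("concentrations must be non-negative")
    # Assay noise can make the filtrate read slightly above the total;
    # clamp at zero rather than report a negative bound pool.
    bound = max(total - free, 0.0)
    fraction = bound / total if total > 0 else 0.0
    return bound, fraction


# Hypothetical readings in arbitrary units, chosen only to show the calculation.
bound, fraction = bound_camp(total=10.0, free=5.0)
print(f"bound = {bound:.1f}, fraction bound = {fraction:.0%}")
```

Clamping at zero is a design choice for this sketch; a real analysis might instead propagate the assay error and report the uncertainty of the difference.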
Mass Spectrometry-Protein bands were excised from the gel, and subjected to tryptic digestion as described in detail earlier (26). The peptides were analyzed by MALDI-TOF (Ultraflex TOF/TOF, Bruker Daltonics, Germany). Data were obtained using Flex Control software (25 KvA Reflector mode, N2 LASER, 337 nm and 50 Hz). The sum of ion intensities from 300 LASER shots was used for each spectrum. The spectra were analyzed using Flex Analysis version 2.0, and proteins were identified using MASCOT on the M. tuberculosis and M. smegmatis proteomics database.
Sequence and Phylogenetic Analysis-Distant and close homologs of Rv1636 were identified using BLAST analysis. The boundaries of the USP domains were identified using the Pfam database, and multiple sequence alignment was performed using ClustalW (28). Phylogenetic relationships were analyzed by the neighbor-joining method using Mega 6 (29).
Cloning, Expression, and Purification of Rv1636 and MSMEG_3811-PCR was carried out on genomic DNA of M. tuberculosis H37Rv and M. smegmatis mc²155 using specific primers, sequences of which are available on request. The PCR amplicons were digested with EcoRI and XhoI and cloned into similarly digested pPROEX-HTa to generate pPRO-Rv1636 and pPRO-MSMEG_3811, where both proteins would be expressed with an N-terminal His tag. The clones were confirmed by sequencing (Macrogen, South Korea).
For generating point mutants of Rv1636, a single mutagenic primer-based method was used (30), using pPRO-Rv1636 as template for mutagenesis. All mutants were confirmed by sequencing.
Surface Plasmon Resonance-Surface plasmon resonance (SPR) studies were performed on a BIAcore 3000 platform (GE Healthcare). 8-(6-Aminohexylamino) cAMP (8-AHA-cAMP; BioLog) was dissolved in 100 mM HEPES (pH 8) by careful warming (32). A CM5 (carboxymethylated dextran) sensor chip was activated with a mixture of 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)carbodiimide and 0.05 M N-hydroxysuccinimide. 8-AHA-cAMP (3 mM) was injected (2 µl/min; 7 min) for immobilization in running buffer containing 100 mM HEPES (pH 8). The surface was deactivated using 1 M ethanolamine-HCl (pH 8.5). A reference channel was activated and deactivated without any 8-AHA-cAMP conjugation. All interactions were studied in buffer containing 10 mM HEPES (pH 7.5), 100 mM NaCl, 10 mM MgCl2, 10 mM 2-ME, and 0.005% P20. Rv1636 was used at a concentration of 1 µM at a flow rate of 5 µl per min. The association and dissociation were both monitored for 10 min at 25°C. MSMEG_3811 (3 µM) was injected at a flow rate of 30 µl per min, and the association and dissociation were monitored for 100 s at 25°C. Binding of the proteins to cAMP on the chip during the association phase was also monitored in the presence of different concentrations of free 3′,5′-cAMP, 3′,5′-cGMP, and Mg-ATP. The proportion of Mg-ATP present in the solution was calculated using Mg-ATP calculator version 1.3. Binding data were analyzed using BIAevaluation 3.0 software and plotted using GraphPad Prism 5.
Isothermal Titration Calorimetry-Isothermal titration calorimetry (ITC) experiments were carried out on a MicroCal iTC200 system (GE Healthcare) in buffer containing 10 mM HEPES (pH 7.5), 100 mM NaCl, 5 mM 2-ME, and 10% glycerol. Binding studies for monitoring interaction with ATP had a 10-fold excess of MgCl2 in the buffer over the ATP used for the injection. Proteins were dialyzed in the buffer mentioned above, and cAMP and ATP were also prepared in the same buffer. All samples and buffers were degassed thoroughly under vacuum. The initial injection volume was 0.4 µl over a duration of 0.8 s. All subsequent injection volumes were 2 µl over 4 s with a spacing of 150 s between two injections. Data from control experiments where only buffer was injected into the protein solution were subtracted from each injection to compensate for the heat of dilution. Data for the initial injection were not considered. The data were analyzed using the built-in single site fitting model of Origin 7.0 software (OriginLab Corp.).
Crystallization-MSMEG_3811 was expressed in E. coli BL21 (DE3). Harvested cells were resuspended in buffer A (50 mM Tris-Cl (pH 8.0), 100 mM NaCl, 10% glycerol), and 2 mM PMSF and protease inhibitor mixture (Roche Applied Science) were added. Cells were lysed by sonication, and debris was removed by centrifugation at 30,000 × g for 30 min. The supernatant was incubated with Ni-NTA beads (Qiagen) for 1-2 h. Ni-NTA beads were washed with 15 column volumes of buffer A and 20 column volumes of 50 mM Tris-Cl (pH 8.0), 300 mM NaCl, 10% glycerol, 20 mM imidazole. His-tagged MSMEG_3811 was eluted in buffer A supplemented with 300 mM imidazole. The His tag was cleaved off with tobacco etch virus protease during dialysis in buffer A overnight. The digested protein was incubated with Ni-NTA beads to remove the His tag and His-tagged tobacco etch virus protease. MSMEG_3811 was concentrated and subjected to gel filtration on an S200 16/60 column (GE Healthcare) in buffer A.
For MSMEG_3811 crystallization, initial screening was performed with 15 mg/ml protein and 5 mM cAMP against the Qiagen JCSG+ and JCSG Core I-IV factorial screens in drops consisting of 0.2 µl of protein stock solution and 0.2 µl of reservoir solution. The sitting-drop experiments were set up at 20°C using a Phoenix nanodispenser (Art Robbins Instruments) and a total of 35 µl of reservoir solution. The initial condition contained 0.1 M MES (pH 6.0) and 1.6 M ammonium sulfate and was refined to 0.1 M MES (pH 6.0), 1.9 M ammonium sulfate, and 6% (v/v) polypropylene glycol 400 (PPG 400) using the Hampton Research additive screen. Refined crystals were grown at 20°C in a 24-well format by using the hanging-drop vapor diffusion method with equal volumes of reservoir and protein stock solution (1:1 µl) in the drop and a 1-ml reservoir. The protein stock solution consisted of 25 mg/ml MSMEG_3811 with 5 mM cAMP. Diffraction quality crystals in various shapes appeared after 3 days. Crystals were cryoprotected through soaking in 0.1 M MES (pH 6.0), 2 M ammonium sulfate, 6% (v/v) PPG 400, 20% (v/v) glycerol, and 5 mM cAMP.
Data Collection, Structure Determination, and Refinement-A complete diffraction dataset was collected at 100 K and a wavelength of 0.91841 Å at the Berlin Electron Storage Ring Society for Synchrotron Radiation at synchrotron beamline 14.1 (BESSY BL14.1) operated by Helmholtz-Zentrum Berlin (33). The crystal belonged to space group C222₁ and diffracted close to 2 Å resolution. Diffraction data were indexed, integrated, and merged using the X-ray Detector Software (XDS) (34), resulting in a 98.8% complete dataset cut at 2.15 Å resolution based on CC1/2 (35). All further programs for structure solution were from the CCP4 suite (36). The program Balbes (37) was used for selecting a suitable search model (PDB code 2Z08) and to solve the structure of MSMEG_3811-cAMP through molecular replacement. The Balbes solution contained four monomers in the asymmetric unit even though cell content analysis (39) and electron density analysis in Coot (40) suggested six monomers. The model was therefore adjusted in Coot to contain only residues and side chains that already fitted to the electron density in the partial solution, and molecular replacement was repeated with Molrep (41) searching for three physiological dimers. After successful molecular replacement, refinement was done in Refmac5 (42) resulting in initial crystallographic R and Rfree values of 39.3 and 43.0%, respectively. Coot was used for cycles of manual model building and analysis of model quality followed by TLS (43)

Significant Fraction of Cytosolic cAMP Is Bound to Protein-Cytosolic as well as extracellular cAMP levels in mycobacteria can reach 100 µM, as measured by radioimmunoassay and ELISA procedures (19,22). Such measurements incorporate procedures that dissociate cAMP bound to proteins, therefore estimating concentrations of cAMP that reflect the total amount present in the cellular fraction. Cyclic AMP-binding proteins have been identified and characterized from mycobacteria (23,24,26), suggesting to us that a significant fraction of the cyclic nucleotide may be bound to proteins within the cell. We therefore attempted to distinguish this fraction from that which was free or not associated with protein. We did this by subjecting the cytosolic fraction to filtration through a membrane that retained molecules greater than 3 kDa in size. We reasoned that free cAMP would readily pass through this filter, whereas cAMP that was bound to protein would be retained. Measuring cAMP in both the filtrate and the cytosolic fraction prior to centrifugation and subtracting the concentration of cAMP in the filtrate from the total cAMP measured in the cytosol could provide an estimate of bound cAMP.
In both exponential and stationary phases of growth of M. smegmatis and M. bovis BCG cultures, we determined that the majority of the extracellular cAMP pool was present as free cAMP (Fig. 1A). However, when we monitored the distribution of intracellular cAMP between bound and free fractions, we observed that ~50% of the total cytosolic cAMP was bound to protein in M. smegmatis cells (Fig. 1B). Interestingly, almost the entire fraction of cAMP was bound to protein in M. bovis BCG in the log phase, and more than 80% remained bound in the stationary phase of growth. Thus, a significant fraction of cAMP in mycobacteria is protein-associated in the cell.
Identification of a Highly Expressed cAMP-binding Protein in Mycobacteria-Mycobacterial proteins that can bind cAMP include CRPs and a cAMP-regulated protein acyltransferase, both found in slow growing and fast growing mycobacteria (23,24,26). To identify additional proteins that may bind cAMP, we took advantage of affinity chromatography on 6-AH-cAMP agarose beads. We allowed the cytosolic fraction from M. tuberculosis CDC1551 cells to interact with the beads, and following extensive washing of the beads to remove unbound proteins, we eluted bound proteins with 1 mM cAMP. Similar experiments were performed with the cytosol prepared from M. smegmatis. In both cases, a major protein of ~14 kDa was associated with the beads and specifically eluted with cAMP, but not 5′-AMP (Fig. 1C). Tryptic digestion of the proteins following SDS-gel electrophoresis and subsequent mass spectrometry analysis identified the two proteins as MT_1672 (identical in amino acid sequence to Rv1636 from M. tuberculosis H37Rv) and MSMEG_3811, with more than 65% sequence coverage. Interestingly, Rv1636 and MSMEG_3811 show more than 80% sequence identity, and their orthologs can be identified in all mycobacteria, including M. leprae, which has undergone a high degree of pseudogenization (Fig. 1D) (45). Surprisingly, neither MSMEG_3811 nor Rv1636 contains a canonical CNB domain, and instead they are annotated as members of the USP superfamily (46,47).
Rv1636 and MSMEG_3811 Bind cAMP with High Specificity-We cloned, expressed, and purified Rv1636 and MSMEG_3811 with an N-terminal His tag (Fig. 2A), and gel filtration analysis showed that both proteins form a dimer in solution (data not shown). To measure the affinity of cAMP binding to these proteins, we performed SPR experiments, using 8-AHA-cAMP immobilized to a CM5 chip (Fig. 2B). The association (ka) and dissociation (kd) rates for both proteins are shown in Table 1. We incorporated varying concentrations of cAMP in the buffer during the binding reaction and, by monitoring the reduction in response units (RU), observed a similar low micromolar IC50 for both Rv1636 and MSMEG_3811 (Table 1). This result is consistent with the similar Kd values calculated from ka and kd, and we can thus conclude that both Rv1636 and MSMEG_3811 bind cAMP with comparable affinities, reflective of their similarity at the sequence level.
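For readers less familiar with SPR kinetics, the relation used to obtain an equilibrium constant from the two rate constants can be sketched as follows. The sketch assumes a simple 1:1 interaction model, which is an assumption made here for illustration; the function name and the example rate constants are hypothetical and are not the values reported in Table 1.

```python
def equilibrium_kd(k_a, k_d):
    """Equilibrium dissociation constant (M) for a 1:1 interaction.

    k_a: association rate constant in M^-1 s^-1
    k_d: dissociation rate constant in s^-1
    """
    if k_a <= 0 or k_d < 0:
        raise ValueError("k_a must be positive and k_d non-negative")
    return k_d / k_a


# Hypothetical rate constants, chosen only to illustrate the calculation.
kd_molar = equilibrium_kd(k_a=1.0e4, k_d=2.0e-2)
print(f"K_d = {kd_molar * 1e6:.1f} µM")
```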
CNB and GAF domains show specificity in their interactions with cyclic nucleotides, binding either cAMP or cGMP (12,13). Binding of Rv1636 to the sensor chip was inhibited by cAMP but not by cGMP (Fig. 2C), indicating that Rv1636, and presumably its orthologs, are specific cAMP-binding proteins. This is the first report of the identification of USPs that can bind cAMP. USPs show no sequence similarity to the canonical CNB or GAF domains that are well characterized cyclic nucleotide-binding proteins. We therefore decided to study the nature of the interaction of these proteins with cAMP, using isothermal titration calorimetry that could also provide binding stoichiometry. For both Rv1636 and MSMEG_3811, the tight binding was confirmed (Kd ~3 µM) and found to be enthalpy-driven (Table 2 and Fig. 2D). One molecule of cAMP bound per monomer of both Rv1636 and MSMEG_3811 (Table 2).
Rv1636 and MSMEG_3811 Bind ATP-USPs are classified into two groups that differ in their ability to bind ATP (48). A conserved Walker A-like motif, GX2GX9G(S/T), forms an essential region that interacts with the triphosphate of ATP (46). Sequence analysis revealed that this ATP-binding motif is present in Rv1636 and its orthologs (Fig. 1D). We therefore monitored ATP binding to Rv1636 in the presence of MgCl2 by surface plasmon resonance. ATP was able to inhibit binding of Rv1636 to the cAMP-coated sensor chip (Fig. 3A), with an IC50 of 65.6 ± 5.6 µM, 10-fold higher than that of cAMP. This result indicates that ATP can compete with cAMP for its binding site, but it interacts with lower affinity. ITC indeed revealed that the Kd for binding of ATP to Rv1636 was ~31 µM, representing a 10-fold lower affinity as compared with cAMP (Fig. 3B). The binding of ATP was also enthalpy-driven, and a single binding site for ATP was seen per monomer of protein. Similar binding parameters were observed for MSMEG_3811 (data not shown). Thus, Rv1636 and its orthologs are nucleotide-binding proteins, which show higher affinity for cAMP over ATP.
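To put the 10-fold difference in affinity on an energy scale, the approximate Kd values quoted above (about 3 µM for cAMP and about 31 µM for ATP) can be converted to standard binding free energies with ΔG = RT ln(Kd). The following is a minimal sketch; the temperature of 298.15 K (25°C) is an assumption, because the ITC temperature is not stated here, and the function name is hypothetical.

```python
import math

R = 8.314462618  # molar gas constant, J mol^-1 K^-1


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy in kJ/mol: dG = R * T * ln(Kd / 1 M).

    temperature_k defaults to 25 degrees C, an assumption for this sketch.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R * temperature_k * math.log(kd_molar) / 1000.0


dg_camp = binding_free_energy(3e-6)   # Kd ~3 µM for cAMP
dg_atp = binding_free_energy(31e-6)   # Kd ~31 µM for ATP
print(f"dG(cAMP) = {dg_camp:.1f} kJ/mol")
print(f"dG(ATP)  = {dg_atp:.1f} kJ/mol")
print(f"ddG      = {dg_atp - dg_camp:.1f} kJ/mol")
```

Under these assumptions, the preference for cAMP corresponds to a difference of roughly 6 kJ/mol in binding free energy.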
Structural Basis of cAMP Binding-USPs form a subfamily of the adenine nucleotide α-hydrolase-like superfamily according to the database for the Structural Classification of Proteins. Their generic structure consists of a core of three α/β/α layers and a 5-stranded, parallel β-sheet (47). Crystal structures of several USP family members show that ATP binds between β1 and β4, in a pocket composed of residues from α1, β1, β2, and β4 (47). The triphosphate is bound by a loop featuring a conserved Walker A-like motif (see above) and often coordinated by additional magnesium or manganese ions (47,49). To understand the unique binding behavior of Rv1636/MSMEG_3811, i.e. their preference for cAMP over ATP, we attempted to crystallize these proteins. For Rv1636, we could only obtain a low resolution structure for its apo-state (data not shown), which was not pursued further due to the availability of a higher resolution structure (PDB 1TQ8). For MSMEG_3811, we could not obtain crystals of the apo-form but obtained well diffracting crystals for its cAMP complex. Initial attempts to solve the structure of the MSMEG_3811-cAMP complex by molecular replacement phasing with the Rv1636 apo-structure failed. Instead, we identified the universal stress protein TTHA0895 from Thermus thermophilus (PDB ID 2Z08) as a search model that enabled successful molecular replacement phasing of our MSMEG_3811-cAMP complex. The MSMEG_3811/cAMP structure was refined at 2.15 Å resolution to R/Rfree values of 17.9 and 23.5% (Fig. 4A; Table 3) and good geometry (98.8% residues in favored regions of the Ramachandran plot, 1.2% in allowed regions, and none in disallowed regions). One cAMP ligand per monomer was well defined by electron density (Fig. 4, B-D). Because of the high similarity of Rv1636 and MSMEG_3811 in their sequence (Fig. 1D) as well as their ligand binding properties (Tables 1 and 2), we used the available Rv1636 apo-structure and our MSMEG_3811-cAMP complex for comparing these two states of this protein family.
The monomer structure of MSMEG_3811/cAMP has the typical open, twisted, 5-stranded parallel β-sheet with topology 32145, sandwiched by six α-helices of different lengths (Fig. 4A). Like other USPs of this family, e.g. MJ0577 from Methanococcus jannaschii (PDB code 1MJH), MSMEG_3811 forms a type 1 homodimer (Fig. 4A) as defined by Tkaczuk et al. (47), consistent with its behavior in size exclusion chromatography. The dimer interface is formed mainly by the C terminus of each monomer through a conserved, strongly hydrophobic VL(J/V)V motif in β5 (Fig. 1D) (47). In contrast to Rv1636 apo, the loop with the Walker A-like ATP-binding motif in MSMEG_3811 is closer to the nucleotide-binding pocket due to the parallel orientation of β4 and β5, whereas β5 of Rv1636 apo is swapped between monomers and inserted between β4 and β5 of the partner monomer (Fig. 4, E and F). Variations in the topology of this region between USP members have been reported before (see "Discussion"). Cyclic AMP is bound in the same nucleotide binding pocket as ATP in other USPs, with its adenine base making conserved interactions with the amino and carbonyl groups of Ala-40 (47), and the hydroxyl group of the ribose forming hydrogen bonds to Gly-10 and Gly-114 (Fig. 4, B and D).
Based on the details provided by the crystal structure, the Gly-10 residue in Rv1636 interacts with the 2′-OH group of the ribose moiety (Fig. 4, B and D). Because the interaction is through the main chain of the peptide backbone, we introduced a bulkier side chain at this position by mutating the Gly residue to a Thr. In addition, Rv1636-Gly-113 (Gly-114 in MSMEG_3811) is part of the ATP-binding loop and interacts with the oxygen that is part of the ester bond between the phosphate and the 3′-C of the ribose. We mutated this Gly to Ala to reduce the flexibility of the loop. The mutant proteins were expressed and purified, and their ability to bind cAMP was monitored using SPR. Both Rv1636 G10T and G113A were significantly compromised in their binding ability, resulting in 24.4 ± 8.5 and 40.0 ± 12.2%, respectively, of the ligand binding signal observed for wild type Rv1636. Moreover, no binding to ATP was detected by isothermal titration calorimetry for either of the mutant proteins (data not shown). Therefore, our mutational analysis supports the conclusion from our competition experiment that cAMP and ATP bind to the same pocket in the protein, albeit with different affinities.
To identify structural features hindering ATP binding to MSMEG_3811 and thus causing its cAMP preference, we overlaid MSMEG_3811/cAMP with the TTHA0895-ATP complex (Fig. 4G). Although the binding mode of the adenine and ribose moieties is comparable (Fig. 4, B and G), the cyclic phosphodiester moiety of cAMP forms interactions with Ser-14 and Ser-16 in α1, and with Val-116 due to the close proximity of the binding loop comprising the Walker A-like motif (Fig. 4, B and D). The additional residues of this loop, however, are oriented away from the nucleotide-binding site because of the short but bulky α5-helix in the middle of the loop in MSMEG_3811 (Fig. 4G). In contrast, the corresponding loop of TTHA0895, which does not comprise this helical insertion, is conformationally less restricted and forms a binding pocket for the β- and γ-phosphate groups (Fig. 4G). In the MSMEG_3811-binding site, the terminal ATP phosphates would cause only small clashes, and the protein might in fact adjust and avoid them, but the lack of a significant interaction surface for these phosphates likely contributes to the weak interaction with ATP. Also, the triphosphate of the ATP-binding USP is complexed through two magnesium ions and further interactions with Ser-120, Gln-121, and Ser-122 in the TTHA0895 α4-helix (Fig. 4G). Ser-120 is part of the conserved Walker A-like motif and also present in MSMEG_3811/Rv1636. The Gln is replaced by Val in MSMEG_3811/Rv1636 and also in the ATP-binding USP MJ0577. However, Ser-122 is replaced by a Pro in MSMEG_3811/Rv1636, causing a tight and rigid bend of the protein main chain into α6. This rigid arrangement likely prevents the small conformational adjustment needed to allow the slightly rotated ribose position of ATP in TTHA0895 (Fig. 4G). Interestingly, the Pro in this position in MSMEG_3811/Rv1636 is absent in all the other USPs with ligands other than cAMP and thus might be part of a signature module for cAMP-binding USPs.
Potential Binding of Other Cyclic Nucleotides-To understand how the nucleotide-binding site of MSMEG_3811 discriminates between cAMP and cGMP, we inserted a hypothetical cGMP ligand by overlaying it with cAMP (Fig. 5A). Although the binding mode of the phosphodiester and the ribose moiety would be comparable, the amino group of the cGMP guanine moiety would have a distance of 2.45 Å to Ala-38 leading to a slight clash, and the carbonyl groups of the guanine moiety and of Ala-40 would be only 3.02 Å apart (Fig. 5A). As a result of the latter, two potential proton acceptors would be in close contact instead of a favorable hydrogen bond as observed for adenine, likely contributing to the observed specificity of Rv1636/MSMEG_3811 for cAMP.
FIGURE 4 legend (continued). ... (red). cAMP is shown as sticks colored according to atom type. D, scheme of the interactions between cAMP and MSMEG_3811. All atoms are colored according to atom type. Hydrogen bonds are shown as dashed lines labeled with the distance between relevant atoms. MSMEG_3811 residues involved in hydrophobic contacts are shown as half-circles. The figure was generated using LigPlot+ (68). E, topology diagrams of Rv1636 apo (left) and MSMEG_3811/cAMP (right), generated with TopDraw and Pro-Origami (67,69). Cylinders represent α-helices, and arrows correspond to β-strands. Green-shaded secondary structure elements belong to the corresponding second monomer (Fig. 4A). The gray-shaded α-helix (α2) in Rv1636 apo is not visible in the structure, but it will likely form a helix according to secondary structure prediction and sequence alignment with MSMEG_3811 (Fig. 1D). F, overlay of MSMEG_3811/cAMP (blue) with Rv1636 apo (PDB code 1TQ8; green). 84 Cα atoms overlaid with a root mean square deviation of 0.95 Å. cAMP is shown in stick presentation and colored according to atom type. Hydrogen bonds are shown as dashed lines. Residues of the conserved Walker A-like motif are underlined. G, overlay of MSMEG_3811/cAMP (blue) with the ATP-binding universal stress protein TTHA0895 (yellow) used for molecular replacement. 79 Cα atoms overlaid with a root mean square deviation of 1.022 Å. Ligands are shown as sticks and colored according to atom type. Relevant residues for ATP binding of TTHA0895 are shown in stick presentation and colored according to atom type. The ATP-binding region of TTHA0895 with Gly-109, Gly-111, Ser-120, Gln-121, and Ser-122, as well as two magnesium ions (green), coordinate the β- and γ-phosphate groups of ATP. Residues of the conserved Walker A-like motif are underlined.
In mycobacteria, bis-(3′,5′) cyclic dimeric nucleotides also play an important role as second messengers (50,51), and we therefore analyzed the potential binding of c-di-AMP or c-di-GMP to Rv1636/MSMEG_3811. Because of the observed adenine nucleotide preference, we focused on c-di-AMP. Placing a hypothetical c-di-AMP ligand into the MSMEG_3811-cAMP complex by overlaying it with cAMP would lead to a strong clash of one phosphate group with Val-116 (Fig. 5B). This Val is one of the first residues of the loop featuring the conserved Walker A-like binding motif of USPs (Fig. 1D). The observed clash is probably due to the close proximity of this loop to the nucleotide binding pocket because of the parallel arrangement of β4 and β5 in MSMEG_3811/cAMP (Fig. 5B). Based on this structural analysis, we therefore assume that c-di-AMP does not bind to Rv1636/MSMEG_3811.

DISCUSSION
In this study, we have attempted to empirically detect cAMP-binding proteins in mycobacteria, and we identified Rv1636/MSMEG_3811 as low molecular weight and highly abundant cAMP/ATP-binding proteins. The existence of a noncanonical cAMP-binding module identified in these USPs adds a further dimension to our understanding of the diverse functions of this family of proteins. Of note, cAMP-binding USPs as exemplified by Rv1636 and MSMEG_3811 do not contain additional domains in the full-length protein, as is normally observed in CNB and GAF domain-containing proteins.
The overall architecture of Rv1636/MSMEG_3811 is a homodimer, assembled via a conserved, hydrophobic VL(J/V)V dimerization motif in β5. This arrangement groups them with the type 1 USP proteins as defined by Tkaczuk et al. (47). Although the dimer interface is formed via this C-terminal motif in both proteins, MSMEG_3811/cAMP retains two independently folded monomers, whereas in apo-Rv1636 the β5-strand is swapped, leading to additional intermolecular interaction between β5 and the partner monomer's β4 (Fig. 4F). This interaction is mediated by an LLVVG motif in β4, but the motif is also present in MSMEG_3811, where it mediates the corresponding intramolecular interaction. Slight variations in the USP β-sheet organization in the interaction region have been observed before (47). However, the extreme case of secondary structure exchanges between monomers ("domain swapping") appears to be of physiological relevance only in few proteins and is more often attributed to nonphysiological crystallization conditions such as low pH or the high salt concentration in our condition (52). We therefore assume that the observed difference between Rv1636 and MSMEG_3811 is not related to ligand binding.
The affinity of Rv1636/MSMEG_3811 for cAMP is higher than that of the CRPs and the cAMP-regulated protein acyltransferase from M. tuberculosis, KATmt, or Rv0998 (23,26). A recent study has shown Rv1636 to be the 20th most abundant protein in the M. tuberculosis proteome (53), in agreement with our ability to readily purify it through affinity chromatography. Because a major fraction of the intracellular cAMP is bound to proteins in mycobacteria (Fig. 1B), we can speculate that a significant fraction of this cAMP is bound to Rv1636/MSMEG_3811. Thus, the ability of these proteins to act as "sinks" for cAMP will limit the fraction of free cAMP available for binding to other cAMP-binding proteins, which in turn could be an effective way of regulating signaling events mediated by cAMP, and perhaps secretion of cAMP from the bacterial cell.
USPs are classified into two groups based on their ability to bind ATP. The UspFG-type proteins bind ATP, whereas UspAs and UspA-like proteins do not (48). All known ATP-binding USPs possess a conserved ATP-binding motif (46), but the ability to bind ATP has to date been characterized only from co-crystal structures. The conserved ATP-binding motif is present in Rv1636/MSMEG_3811, and we report the first biochemical and thermodynamic analysis of ATP binding by a USP. A few USPs lack a fully conserved ATP-binding motif but continue to bind ATP. These include Klebsiella pneumoniae USP KPN01444, Halomonas elongata USP HELO4277, and the C-terminal USP domain of T. thermophilus TTHA0350 (47). None of the ATP-binding USPs have been shown to have ATPase activity, and we were also unable to detect ATPase activity in Rv1636 or MSMEG_3811 (data not shown). Therefore, the functional significance of ATP binding by USPs remains unknown.
Aravind et al. (54) suggested that UspA-like proteins evolved from the UspFG-type ATP-binding proteins. It is probable that cAMP-binding USPs, such as Rv1636/MSMEG_3811, evolved independently from UspFG-type proteins, retaining their ATP binding properties while achieving a higher affinity for cAMP. It is not immediately obvious why Rv1636/MSMEG_3811 assume a different conformation in the Walker A-type motif region, with a short helix α5, but this structural difference certainly contributes to the unique specificity for cAMP by providing a binding site that fits the cyclic phosphodiester and does not provide favorable interactions for ATP phosphates. Formation of α5 might be the structural reason for this difference, because accommodation of this helix leaves little flexibility to the remainder of the loop. Rv1636/MSMEG_3811 have identical sequences in this region, in particular in the IAGRLL motif that forms the helix, and its helix formation propensity, possibly together with the Pro described above, appears to be a key feature discriminating this cAMP-specific USP subfamily from other ATP-binding members.
The genome of M. tuberculosis encodes 10 USPs, and Rv1636 is the only protein that contains a single USP domain (46). All other USPs have either two tandem USP domains or are associated with other domains of unknown function. The single USP present in Mycobacterium leprae, ML1390, is the ortholog of Rv1636, sharing 89% identity (55). Shown in Fig. 6 is a phylogenetic analysis of all USP domains from M. tuberculosis, M. bovis BCG, M. smegmatis, and M. leprae. One USP from M. smegmatis, MSMEG_3308, clusters closely with Rv1636/BCG_1674 and MSMEG_3811/ML1390, but an insertion of 14 amino acids is seen in MSMEG_3308 after Rv1636-Glu-45/MSMEG_3811-Glu-46. This region is not structured in the Rv1636 apo- or MSMEG_3811/cAMP crystal structures. The conserved IAGRLL motif in Rv1636/MSMEG_3811 that forms the α5 helix has the slightly altered sequence VAGRLL in MSMEG_3308. In the mass spectrometric analysis of proteins purified by cAMP affinity chromatography using M. smegmatis cytosol, we did not detect peptide fragments corresponding to MSMEG_3308, although its size (17 kDa) is very similar to MSMEG_3811. Ser-14 and Ser-16 of MSMEG_3811 (conserved through this subfamily) coordinate the cAMP phosphate moiety (Fig. 4D). However, the Val of MSMEG_3308 between those Ser residues might create a different environment from that generated by Asp in MSMEG_3811/Rv1636/ML1390/BCG_1674, thus preventing cAMP binding to MSMEG_3308.
Rv1636 and MSMEG_3811 bind cAMP and also ATP, albeit with ~10-fold lower affinity. Because cellular ATP concentrations normally exceed cAMP levels, the intracellular concentrations of these two adenine nucleotides will strongly influence which nucleotide is bound to these USPs. Interestingly, the formation of cAMP can be dependent on the concentration of ATP in the cell (56), and Rv1636/MSMEG_3811 may therefore act as protein regulators of downstream effectors of cAMP-binding proteins, coupling this action with the energy status of the cell. Expression levels of these USPs may also determine the extent of free cAMP present in the cell. MSMEG_3811 is induced upon heat stress, along with a number of other heat-shock proteins (57). Levels of MSMEG_3811 appear to be higher during the stationary phase, and the authors (57) suggest that expression of this protein may serve as a generalized stress response. It would be interesting to monitor, for example, levels of free cAMP and the consequent acetylation of substrates of the cAMP-regulated protein acetyltransferases (58) under these conditions. Several proteomic studies have suggested Rv1636 to be present in the cell membrane and cell wall fractions (59-62), perhaps again regulating the activities of cAMP-binding proteins in various subcellular locations in the mycobacterial cell. Interestingly, Rv1636 has been found in the culture supernatant as a secreted protein (63,64) and is also expressed during infection in mice (65), again raising the possibility of depleting free cAMP levels either secreted from the bacteria or present in the bacterial cells within the host.
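The dependence of USP occupancy on the relative concentrations of the two nucleotides can be illustrated with a simple equilibrium calculation. The sketch below assumes a single site per monomer for which cAMP and ATP compete, and it takes the free nucleotide concentrations as given. The dissociation constants and concentrations are hypothetical placeholders in arbitrary units, not values measured in this study; only the roughly 10-fold difference in affinity is taken from the text.

```python
def site_occupancy(camp, atp, kd_camp, kd_atp):
    """Fractions of a single site bound to cAMP, bound to ATP, and empty.

    Standard competitive binding at equilibrium; `camp` and `atp` are
    FREE concentrations in the same units as the dissociation constants.
    """
    a = camp / kd_camp
    b = atp / kd_atp
    denom = 1.0 + a + b
    return a / denom, b / denom, 1.0 / denom


# Hypothetical constants (arbitrary units): ATP binds ~10-fold more weakly.
KD_CAMP = 1.0
KD_ATP = 10.0 * KD_CAMP

if __name__ == "__main__":
    free_camp = 1.0  # hypothetical free cAMP concentration
    for free_atp in (10.0, 100.0, 1000.0):  # hypothetical ATP excess
        f_camp, f_atp, f_empty = site_occupancy(free_camp, free_atp, KD_CAMP, KD_ATP)
        print(f"ATP={free_atp:7.1f}  cAMP-bound={f_camp:.3f}  "
              f"ATP-bound={f_atp:.3f}  empty={f_empty:.3f}")
```

Under these assumptions, a 10-fold excess of free ATP over free cAMP exactly offsets the 10-fold affinity difference, so any larger ATP excess shifts the site toward the ATP-bound state.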
To the best of our knowledge, the evidence for cellular sinks for second messengers has not been alluded to until now. We have earlier suggested that GAF domains can act as sinks for cGMP, in the absence of an associated effector domain (66). The USPs that we have characterized, being devoid of an associated effector domain, perhaps represent the first example of a "bank" for cAMP, where the currency is released depending on the needs of the organism. Thus, the free cAMP in the cell available for regulating cAMP-regulated effectors (such as CRP or the protein acetyltransferase, KATmt) will be determined by the levels of Rv1636 in the cell. The USPs characterized here possess an affinity for cAMP that is higher than that of mycobacterial CRPs (23,24). Thus, cAMP-dependent, CRP-regulated transcription may operate only when levels of Rv1636/MSMEG_3811 fall sufficiently in the cell, thereby increasing free cellular cAMP to levels that will saturate CRP. In a similar manner, cAMP-regulated protein acetylation may also depend on levels of Rv1636 in the cell.
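The "sink" idea can be expressed as a mass-balance calculation: for a fixed total amount of cAMP, the free concentration falls as the concentration of a high-affinity binding protein rises. The following is a minimal sketch that assumes one independent cAMP site per USP monomer, no competing ligand, and simple 1:1 equilibrium binding. The function name and all numerical values are hypothetical and chosen only for illustration; they are not taken from the measurements reported here.

```python
import math


def free_ligand(total_ligand, total_sites, kd):
    """Free ligand concentration for 1:1 binding at equilibrium.

    Solves L^2 + L*(kd + total_sites - total_ligand) - kd*total_ligand = 0
    for the non-negative root L (all quantities in the same units).
    """
    b = total_ligand - total_sites - kd
    return 0.5 * (b + math.sqrt(b * b + 4.0 * kd * total_ligand))


if __name__ == "__main__":
    total_camp = 5.0  # hypothetical total cAMP, arbitrary units
    kd = 1.0          # hypothetical dissociation constant, same units
    for usp in (0.0, 2.0, 5.0, 10.0, 20.0):  # hypothetical USP levels
        free = free_ligand(total_camp, usp, kd)
        print(f"USP={usp:5.1f}  free cAMP={free:.3f}  bound={total_camp - free:.3f}")
```

In this simplified picture, lowering the USP level at constant total cAMP raises free cAMP, which is the condition under which lower-affinity effectors such as CRP would become occupied.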
Because Rv1636/MSMEG_3811 can bind both cAMP and ATP, we also suggest an additional role for these sinks. Because ATP is the substrate for adenylyl cyclases, the concentration of free ATP may indirectly regulate cAMP levels in the cell. As cAMP levels increase in the cell, Rv1636/MSMEG_3811 may preferentially be bound to cAMP, thereby releasing ATP into the cell. Thus, cAMP could be an indirect regulator of energy (i.e. ATP)-dependent pathways in mycobacteria. Perhaps bacteria have evolved these USPs to serve this function, adding another twist to the unique ways that mycobacteria utilize cAMP as a second messenger (5).
In conclusion, we report that a group of enigmatic universal stress proteins bind cAMP with high specificity and affinity, and we suggest that although this finding has important implications in thinking about cAMP signaling in mycobacteria, it is conceivable that similar USPs in a variety of bacteria are in fact utilized for regulating the actions of this ancient second messenger within and between cells.