The Rv2633c protein of Mycobacterium tuberculosis is a non-heme di-iron catalase with a possible role in defenses against oxidative stress

The Rv2633c gene in Mycobacterium tuberculosis is rapidly up-regulated after macrophage infection, suggesting that Rv2633c is involved in M. tuberculosis pathogenesis. However, the activity and role of the Rv2633c protein in host colonization are unknown. Here, we analyzed the Rv2633c protein sequence, which revealed the presence of an HHE cation-binding domain common in hemerythrin-like proteins. Phylogenetic analysis indicated that Rv2633c is a member of a distinct subset of hemerythrin-like proteins exclusive to mycobacteria. The Rv2633c sequence was significantly similar to protein sequences from other pathogenic strains within that subset, suggesting that these proteins are involved in mycobacterial virulence. We expressed and purified the Rv2633c protein in Escherichia coli and found that it contains two iron atoms, but does not behave like a hemerythrin. It migrated as a dimeric protein during size-exclusion chromatography. It was not possible to reduce the protein or observe any evidence for its interaction with O2. However, Rv2633c did exhibit catalase activity with a kcat of 1475 s−1 and Km of 10.1 ± 1.7 mM. Cyanide and azide inhibited the catalase activity with Ki values of 3.8 μM and 37.7 μM, respectively. Rv2633c's activity was consistent with a role in defenses against oxidative stress generated during host immune responses after M. tuberculosis infection of macrophages. We note that Rv2633c is the first example of a non-heme di-iron catalase, and conclude that it is a member of a subset of hemerythrin-like proteins exclusive to mycobacteria, with likely roles in protection against host defenses.

The gene designated as Rv2633c in Mycobacterium tuberculosis H37Rv (Mtb) has been shown to be rapidly up-regulated during infection following phagocytosis of the bacterium by macrophages (1,2). It was subsequently shown to be up-regulated during in vitro acidification during macrophage infection (3). Further evidence for a critical role for the Rv2633c protein during in vivo infection stems from a transposon mutation screen that revealed that Mtb with Tn insertions inactivating Rv2633c was significantly attenuated (4). Despite the relevance of this protein to the pathogenicity of Mtb, this protein has not been previously isolated and its physiological function is unknown. The results described above suggest the possibility that Rv2633c may play an important role in the adaptation to host-derived antimicrobial mechanisms such as phagosomal acidification and reactive oxygen species. Mtb has multiple strategies to combat the damaging effects of reactive oxygen species that the host uses as a defense against this pathogen. These include protein defenses, a catalase-peroxidase (KatG), superoxide dismutase, and peroxiredoxins (5,6). Mycobacteria also use mycothiol, which is a thiol present within the cytoplasm that creates a reducing environment for a defense against oxidative stress (7). The results described above, combined with our findings in this study, strongly suggest that Rv2633c is also an important component of the defense strategy against oxidative stress.
Analysis of the sequence of the protein encoded by the Rv2633c gene, which is presented in this paper, reveals the presence of an HHE cation-binding domain that is common in hemerythrins and hemerythrin-like proteins. Contrary to their name, hemerythrins do not contain heme but instead have a di-iron center which is used to bind oxygen (8). These HHE domains are 4-α-helical bundles that provide a pocket in which O2 binds to an oxygen-bridged di-iron site. The irons are typically coordinated within the HHE domain via the carboxylate side chains of a Glu and an Asp, and five His residues (Fig. 1).
The hemerythrin domain is found in a wide range of organisms and has been shown to have functions including oxygen binding, iron sequestration, and chemotaxis. Hemerythrins were first found in certain species of marine invertebrates: Sipuncula (peanut worm), Priapulida, Brachiopoda, and several different annelids (9,10). Similar proteins have since been identified in bacterial species, as well as in some archaea (11). It is apparent that hemerythrin-like domains have been adapted for a broad range of functions. The first bacterial hemerythrin-like protein to be studied was McHr from Methylococcus capsulatus (12). It was predicted to be a transporter that delivers O2 to the particulate methane monooxygenase for methane oxidation (12). A hemerythrin-like protein has also been reported to serve as an iron storage protein during the development of Theromyzon tessulatum, a species of leech (14). A hemerythrin-like protein found in Mycobacterium smegmatis, MsmHr, was shown to regulate the expression of the sigma factor SigF in response to H2O2-rich environments. Studies in which the gene was knocked out, overexpressed, or mutated demonstrated a correlation between the levels of the protein and susceptibility to oxidative stress (15). That protein has not been isolated, and it should be noted that there is no significant similarity in overall protein sequence between Rv2633c and MsmHr.
In this study, the Rv2633c gene was cloned from Mtb, and the recombinant Rv2633c protein was expressed in Escherichia coli and purified. Physical properties of the protein were determined and an enzymatic activity was identified. The results indicate that Rv2633c is a non-heme di-iron protein that functions as a catalase. Furthermore, the sequence and phylogenetic analyses presented herein reveal that Rv2633c is a member of a subset of hemerythrin-like proteins exclusive to mycobacteria, including known pathogens.

Sequence and phylogenetic analyses
Inspection of the primary sequence of Rv2633c revealed the presence of an HHE cation-binding domain that is common in hemerythrins (Fig. 1). A basic local alignment search tool (BLAST) search was used to compare the hemerythrin-like domain in Rv2633c to conserved sequences, and the constraint-based multiple alignment tool (COBALT) was used to create a multiple sequence alignment of proteins with sequences most related to Rv2633c. Protein alignments of Rv2633c, excluding Mtb, returned highly similar sequences of putative proteins in other closely related Mycobacterium species. A BLAST protein search, excluding all Mycobacterium species, yielded no sequences with similarity comparable to those of the mycobacteria. Thus, Rv2633c and the similar genes from mycobacteria represent a distinct subclass of hemerythrin-like proteins. These genes are each annotated as encoding hypothetical proteins with hemerythrin and hemerythrin-like domains, with no known structure or function. Alignment of the sequence of the protein encoded by Rv2633c with its homologs in other mycobacteria (Fig. S1) showed that the close homologs are present in pathogenic and opportunistic species. In contrast, M. smegmatis does not contain a gene with significant similarity to Rv2633c. Moreover, no other species of bacteria have a gene product with statistically significant homology across the entire protein sequence. This analysis strongly suggests that the gene product of Rv2633c contributes to capabilities unique to the opportunistic and authentic pathogens of this clade. A phylogenetic tree of these proteins based on alignment is shown in Fig. 2.

Physical properties of the Rv2633c protein
The yield of the purified protein was ~1 mg from each liter of cell culture. The predicted molecular mass from the sequence of Rv2633c, including the hexahistidine tag, is 19,064 Da. The purified protein migrated on SDS-PAGE as a single band of approximately that mass. When subjected to analytical size-exclusion chromatography (Fig. 3), it migrated with an apparent mass of ~36.5 ± 3.9 kDa (average of results from three different protein preparations). This suggests that the native form of the protein is a homodimer. The observation that Rv2633c elutes as a dimer is noteworthy, as hemerythrins which act as O2 carriers typically exist as octamers (8). The metal content of the purified protein was determined using sector-field ICP-MS for the presence of copper, iron, manganese, nickel, and zinc. The sulfur content of each sample was also determined so that the molar metal/protein ratio of each metal could be accurately determined by comparison with the known sulfur content of the protein, which was determined from the number of Met and Cys residues in the protein. The purified Rv2633c contained 1.7 ± 0.2 iron per protein molecule and no significant amounts of the other metals. The irons were tightly bound and were not removed by exposure to 25 mM EDTA. These results are consistent with a tightly bound di-iron site.
Figure 1. Within the HHE cation-binding domain of hemerythrin, one iron is coordinated by nitrogens from three histidine residues and oxygens from aspartate and glutamate residues. The other iron is coordinated by nitrogens from two other histidine residues and oxygens from the same aspartate and glutamate residues. There is also an oxygen bridging the two irons (8). The amino acid sequence of Rv2633c derived from the gene sequence is presented with the residues characteristic of the HHE domain underlined.
Figure 2. The evolutionary history was inferred by using the Maximum Likelihood method based on the JTT matrix-based model (26). After the sequences were identified by BLAST, evolutionary analyses were conducted in MEGA7 (27) to generate data for both this phylogenetic analysis and the protein sequence alignment (Fig. S1). The tree is drawn to scale, with branch lengths measured in the number of substitutions per site. The analysis involved 17 amino acid sequences. All positions containing gaps and missing data were eliminated. There were 158 positions in the final dataset.
The purified protein exhibited an absorbance spectrum with a broad absorbance between 300 and 360 nm (Fig. 4). This feature is characteristic of oxygen-bridged dinuclear iron systems in the ferric state and is believed to result from an oxo-to-Fe(III) charge transfer transition (16). No spectral change was observed after addition of a large excess of a variety of strong reductants: sodium dithionite, dithiothreitol, and tris(2-carboxyethyl)phosphine. Furthermore, Rv2633c showed no reactivity toward O2, as judged by the lack of change of the absorbance spectrum after removal of the excess reductant and exposure to air. These results indicate that the protein environment of the metal site strongly stabilizes the diferric state of the protein. These properties are not consistent with the typical physiological role of hemerythrins, which is to bind and transport O2. They are also not consistent with a possible function as an O2 sensor.

Enzymatic properties of Rv2633c
Catalase activity of Rv2633c was suspected because bubbles and foam formed on addition of H2O2 to solutions of the protein (Fig. 5). Catalase activity is typically measured by monitoring H2O2 consumption by one of two methods. One is a discontinuous assay in which aliquots are removed from the mixture at time points and then used in a reaction that gives rise to a color change proportional to [H2O2]. The other, used in this study, is a continuous assay that directly measures [H2O2] by monitoring its absorbance at 240 nm, which provides a more accurate measurement and far more time points. However, bubbles can interfere with the measurement. To avoid this interference during the spectrophotometric catalase assay, a very low concentration of enzyme (1 nM) was used. A steady-state kinetic analysis of the catalase activity of Rv2633c was performed and yielded values of kcat = 1475 ± 96 s−1 and Km = 10.1 ± 1.7 mM (Fig. 5). At concentrations of H2O2 higher than 30 mM, an initial rate could not be accurately determined because the absorbance of the initial concentration of H2O2 was >1.3 and outside the linear range. However, the data that were obtained fit very well to the Michaelis-Menten equation, with an R² value of 0.993.
Two compounds that are known to inhibit heme-dependent catalases, cyanide and azide, were tested as inhibitors of Rv2633c activity. In each case, inhibition was observed (Fig. 6). IC50 values were determined empirically from the decrease in initial rate at increasing concentrations of inhibitor and used to calculate Ki with eq 2. For NaCN, an estimated IC50 value of 7.2 μM was used to determine a Ki value of 3.6 μM. Alternatively, direct analysis of the data by eq 3 yielded a Ki value of 3.8 ± 0.4 μM. For NaN3, an estimated IC50 value of 75.1 μM was used to determine a Ki value of 37.5 μM. Direct analysis of the data by eq 3 yielded a Ki value of 37.7 ± 3.9 μM. The close agreement between the Ki values obtained for each inhibitor using eqs 2 and 3 supports the validity of the values, and is consistent with cyanide and azide each acting as a tight-binding inhibitor. In contrast to the pronounced changes in the Soret region of the absorbance spectrum that these inhibitors induce in heme-dependent catalases, addition of up to 1 mM cyanide or azide had little effect on the relatively nondescript visible absorbance spectrum of Rv2633c.
The dependence of the catalase reaction rate on pH was also examined. At pH 8.5, the reaction rate was ~4-fold less than at pH 7.5. At pH 7.0 there was an increase in the initial rate, but the protein was very unstable and it was not possible to get an accurate measurement. At pH 6.5, the protein rapidly precipitated. The internal pH of Mtb was previously determined using pH-sensitive probes, and the basal pH was found to be 7.65 (17). In that study, it was also shown that after an 8 h incubation in buffer at pH 4.5, the internal pH only decreased to 7.2. Thus, the conditions used to determine the kinetic parameters for the catalase activity of Rv2633c approximate the physiological conditions in which the enzyme functions in vivo during normal and stressful conditions.
Rv2633c was also tested for peroxidase activity using a variety of potential co-substrates with H2O2. No peroxidase activity was observed using ascorbate, pyrogallol, or o-dianisidine as an electron donor at concentrations up to 100 mM.
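As a rough check on the inhibition constants reported above, the following sketch converts IC50 values to Ki. It assumes that eq 2 has the Cheng-Prusoff form for a competitive inhibitor, Ki = IC50/(1 + [S]/Km); that form is an assumption, as are the function and variable names. The H2O2 concentration (10 mM), the Km (10.1 mM), and the IC50 values (taken as micromolar, in line with the Ki values in the abstract) come from the text.

```python
# Sketch: IC50 -> Ki, ASSUMING eq 2 is the Cheng-Prusoff relation for a
# competitive inhibitor (an assumption; the form of eq 2 is not confirmed).

def ki_from_ic50(ic50, substrate, km):
    """Return Ki in the units of ic50; substrate and km must share units."""
    return ic50 / (1.0 + substrate / km)

S_MM = 10.0    # H2O2 used to start the inhibition assays (mM), from the text
KM_MM = 10.1   # Km for H2O2 (mM), from the text

ki_cyanide_um = ki_from_ic50(7.2, S_MM, KM_MM)   # IC50 for NaCN in uM
ki_azide_um = ki_from_ic50(75.1, S_MM, KM_MM)    # IC50 for NaN3 in uM

print(f"Ki (NaCN) ~ {ki_cyanide_um:.1f} uM")
print(f"Ki (NaN3) ~ {ki_azide_um:.1f} uM")
```

Under this assumption the cyanide value comes out close to the 3.6 μM reported from eq 2, and the azide value lands between the two reported azide estimates. Any small difference may reflect rounding or a different form of eq 2.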

Discussion
Catalases are widespread throughout nature. The vast majority of catalases are iron-containing proteins with the iron present in a heme cofactor (18). Some catalases also exhibit peroxidase activity, such as KatG of Mtb (19). Those bifunctional catalase-peroxidase enzymes are also heme-dependent. There is also a comparatively small family of other catalases, which do not contain heme but instead use two manganese ions in the active site (20). Interestingly, these metals are carboxylate-bridged in a structure very similar to the di-iron metal site of hemerythrins, but with manganese rather than iron. However, despite the similarity of the di-metal binding sites, Rv2633c behaves more like a heme-dependent catalase than a manganese catalase with respect to inhibitors. The manganese catalases are relatively insensitive to cyanide and azide, yet Rv2633c is inhibited by these compounds in the micromolar range, as are heme-dependent catalases. This is a consequence of the presence of iron rather than manganese in the hemerythrin-like metal-binding active site. Two other well-characterized enzymes, ribonucleotide reductase and methane monooxygenase, have carboxylate-bridged di-iron sites similar to that of hemerythrin (21). However, in these enzymes the sites are primarily used to activate O2, rather than for catalase activity. Thus, Rv2633c may be considered a highly unusual and perhaps unprecedented example of a catalase with a non-heme iron active site.
Rv2633c expression is up-regulated in Mtb during macrophage infection. It is common in nature to have an inducible enzyme that is produced in response to high levels of its substrate to supplement the activity of a constitutive enzyme that reacts with the same substrate. Catalases from different organisms exhibit a wide range of kcat and Km values (22). The other catalase present in Mtb is the bifunctional catalase-peroxidase, KatG. That enzyme was reported to exhibit a kcat/Km at pH 7.0 of 1.0 × 10⁶ M−1 s−1 (23), compared with the value determined here at pH 7.5 of 1.5 × 10⁵ M−1 s−1. Considering that KatG has additional physiological roles as a peroxidase, it may be desirable for the inducible Rv2633c not to have a significantly higher catalytic efficiency that would deprive KatG of its substrate for other reactions. The importance of this inducible catalase activity is highlighted by the fact that Tn insertions inactivating Rv2633c significantly attenuated the growth of Mtb (4). Furthermore, the finding that proteins with significant sequence similarity to Rv2633c are found only in pathogenic and opportunistic strains of Mycobacteria strongly suggests that this protein evolved to confer resistance to the host immune response to infection by Mycobacterial species.
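The catalytic efficiency quoted above follows directly from the steady-state constants. The short sketch below reproduces the arithmetic, assuming that both efficiencies are expressed in M−1 s−1; the variable names are illustrative.

```python
# Sketch: catalytic efficiency (kcat/Km) from the reported steady-state values.

KCAT_PER_S = 1475.0      # kcat of Rv2633c (s^-1), from the text
KM_MOLAR = 10.1e-3       # Km for H2O2: 10.1 mM expressed in M
KATG_EFFICIENCY = 1.0e6  # kcat/Km cited in the text for KatG at pH 7.0

rv2633c_efficiency = KCAT_PER_S / KM_MOLAR            # M^-1 s^-1
fold_difference = KATG_EFFICIENCY / rv2633c_efficiency

print(f"Rv2633c kcat/Km ~ {rv2633c_efficiency:.2e} M^-1 s^-1")
print(f"KatG / Rv2633c ~ {fold_difference:.1f}-fold")
```

On these numbers the efficiency of Rv2633c is a little under one-seventh of the value cited for KatG, keeping in mind that the two values were measured at different pH.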
This study of Rv2633c describes a unique catalase activity for a hemerythrin-like protein and its apparent role during infection by Mtb and other related pathogenic Mycobacterial species. In addition to the potential protective role against endogenous and exogenous oxidants, the generation of O2 as a product of its catalase activity could conceivably aid survival within the hypoxic granuloma environment. Future studies should elucidate what particular features of protein structure allow this di-iron center to bind H2O2 rather than O2 and perform a reaction not previously reported for a di-iron hemerythrin-like protein. The finding that the only proteins that have significant overall sequence similarity to Rv2633c are those of other Mycobacterium species also suggests that this protein could serve as a target for novel therapeutics targeting its enzymatic function either directly, or indirectly by inhibition of upstream regulatory pathways.
Figure 6. Inhibition by cyanide (A) and azide (B). The line is the fit of the data by eq 3. Each data point is the average of a minimum of two replicates, with error bars shown. In some cases the error bars are not evident because the values were so similar that the bars are obscured by the data point. The goodness-of-fit values were R² = 0.990 for A and R² = 0.988 for B.

Expression and purification of Rv2633c
For heterologous expression of the Rv2633c protein, the Rv2633c gene was cloned from Mtb and inserted into the pET23a vector, which adds a hexahistidine tag at the C terminus to facilitate purification. Briefly, the pET23a-2633c recombinant expression plasmid was created by FastCloning (24) the Rv2633c gene into the pET-23a(+) vector (Novagen). The Rv2633c gene was PCR amplified from chromosomal DNA purified from Mtb H37Rv using the primers 2633c_pet23_FC-F (5′-gaaataattttgtttaactttaagaaggagatatacatATGAATGCCTACGACGTATTAAAGC-3′) and 2633c_pet23_FC-R (5′-tcagtggtggtggtggtggtgGATGGCCTTCAGGAGGTCTG-3′). The pET vector portion was PCR amplified using the primers pet23_2633c_FC-F (5′-CACCACCACCACCACCACTGA-3′) and pet23_2633c_FC-R (5′-ATGTATATCTCCTTCTTAAAGTTAAACAAAATTATTTCT-3′). Lowercase letters in primer sequences indicate 5′ extensions that are complementary to the vector primers, which are required for the restriction enzyme- and ligase-free FastCloning method. The DNA fragments were combined, digested with DpnI to eliminate parental plasmid DNA, and transformed into E. coli 10-beta (New England Biolabs). Positive clones were identified by PCR screening and confirmed by sequencing. The pET23a-2633c plasmid was then transformed into Rosetta 2 (DE3) E. coli for expression.
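The complementarity between the lowercase 5′ extensions and the vector primers can be checked with a few lines of code. The sketch below is illustrative only: the helper names are hypothetical, and the primer sequences are copied from the text on the assumption that any hyphens printed inside them are line-break artifacts rather than part of the sequence.

```python
# Sketch: confirm that the lowercase 5' extensions of the insert primers are
# reverse complements of the vector primers. Sequences are copied from the
# text, ASSUMING hyphens printed inside them are line-break artifacts.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (case preserved)."""
    return seq.translate(COMPLEMENT)[::-1]

def five_prime_extension(primer):
    """Return the leading lowercase run of a primer, upper-cased."""
    ext = []
    for base in primer:
        if not base.islower():
            break
        ext.append(base)
    return "".join(ext).upper()

insert_fwd = "gaaataattttgtttaactttaagaaggagatatacatATGAATGCCTACGACGTATTAAAGC"
insert_rev = "tcagtggtggtggtggtggtgGATGGCCTTCAGGAGGTCTG"
vector_fwd = "CACCACCACCACCACCACTGA"
vector_rev = "ATGTATATCTCCTTCTTAAAGTTAAACAAAATTATTTCT"

# Each insert-primer extension should pair with the opposite vector primer.
fwd_ok = five_prime_extension(insert_fwd) in reverse_complement(vector_rev)
rev_ok = five_prime_extension(insert_rev) in reverse_complement(vector_fwd)

print("forward extension pairs with vector reverse primer:", fwd_ok)
print("reverse extension pairs with vector forward primer:", rev_ok)
```

A containment test is used rather than strict equality because the forward extension is one nucleotide shorter than the vector reverse primer.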
Cells were cultured in LB broth with 100 μg/ml ampicillin and 34 μg/ml chloramphenicol, which was supplemented with 440 mg/L FeSO4·7H2O. Cells were induced with 1.0 mM isopropyl β-D-1-thiogalactopyranoside for 3 h at 30°C. After growth and harvesting, cell extracts were prepared by either sonication or French press (multiple preparations were studied in this work). The extract was subjected to affinity chromatography using a cobalt affinity resin (HisPur, Thermo Scientific) in 50 mM Tris-HCl, pH 7.5. The protein was eluted from the column with ~25 mM imidazole. The purity of the protein was assessed by SDS-PAGE. When necessary, the protein was further purified by size-exclusion chromatography using a HiPrep 16/60 Sephacryl S-300 HR column on an AKTA PURE FPLC system (GE Healthcare). This technique was also used to determine its native mass. Chromatography was performed in 50 mM potassium phosphate buffer plus 150 mM NaCl at pH 7.5. The flow rate was 0.6 ml/min. The void volume was calculated using blue dextran. Proteins used as molecular weight markers were glutamate dehydrogenase (332 kDa), methylamine dehydrogenase (124 kDa), MauG (42.3 kDa), and amicyanin (11.5 kDa). A plot of the elution volume/void volume versus log molecular weight was used to estimate the mass of Rv2633c.
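The calibration described above can be sketched as a straight-line fit of elution volume/void volume against log molecular weight. In the example below, the marker masses and the 19.064 kDa subunit mass come from the text, but every Ve/Vo ratio is a hypothetical placeholder chosen only to illustrate the calculation, and the function names are illustrative.

```python
import math

# Sketch: size-exclusion calibration. Marker masses are from the text; the
# Ve/Vo ratios below are HYPOTHETICAL placeholders, not measured values.
markers_kda = {"glutamate dehydrogenase": 332.0,
               "methylamine dehydrogenase": 124.0,
               "MauG": 42.3,
               "amicyanin": 11.5}
ve_over_vo = {"glutamate dehydrogenase": 1.99,
              "methylamine dehydrogenase": 2.16,
              "MauG": 2.35,
              "amicyanin": 2.58}

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    count = len(xs)
    mean_x, mean_y = sum(xs) / count, sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

names = list(markers_kda)
log_mass = [math.log10(markers_kda[name]) for name in names]
ratios = [ve_over_vo[name] for name in names]
slope, intercept = fit_line(log_mass, ratios)

def estimate_mass_kda(ratio):
    """Invert the calibration line to get an apparent mass in kDa."""
    return 10 ** ((ratio - intercept) / slope)

sample_ratio = 2.37  # hypothetical Ve/Vo for the sample
apparent = estimate_mass_kda(sample_ratio)
print(f"apparent mass ~ {apparent:.1f} kDa")
print(f"subunits per native protein ~ {apparent / 19.064:.1f}")
```

Dividing the apparent native mass by the predicted subunit mass gives the oligomeric state; a value near 2 corresponds to a homodimer.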

Sequence analysis
The NCBI Basic Local Alignment Search Tool (BLAST) was used to compare the hemerythrin-like domain in Rv2633c to conserved sequences in order to predict its structure and function (25). NCBI COBALT was used to create a multiple sequence alignment of proteins with sequences most related to Rv2633c. For molecular phylogenetic analysis of Rv2633c from Mtb and its nearest orthologs, the evolutionary history was inferred using the Maximum Likelihood method based on the JTT matrix-based model (26). Evolutionary analyses were conducted in MEGA7 (27).

Enzymology
Steady-state kinetic studies of catalase activity were performed at 37°C in 50 mM potassium phosphate buffer (pH 7.5). The Rv2633c protein was present at a fixed concentration of 1.0 nM. Reactions were performed in the presence of varied concentrations of H2O2. The reaction rate was determined by monitoring the decrease in absorbance at 240 nm, which corresponds to the concentration of H2O2 (ε240 = 43.6 M−1 cm−1) (28). Data were fit to the Michaelis-Menten equation (eq 1), where kcat is the turnover number for the protein, Km is the Michaelis constant, [S] is the H2O2 concentration, [E] is the enzyme concentration, and v is the initial reaction rate.
$$\frac{v}{[E]} = \frac{k_{cat}[S]}{K_m + [S]} \qquad \text{(eq 1)}$$
For inhibition studies, the catalase reaction of Rv2633c was performed as described above in the presence of varying concentrations of either NaCN or NaN3. The reactions were initiated by the addition of 10 mM H2O2. IC50 values were determined empirically from the decrease in initial rate at increasing concentrations of inhibitor. The Ki for each inhibitor was then determined using eq 2. Alternatively, the data were fit by eq 3 for analysis of tight-binding inhibitors (29,30). In these equations [E] is the total enzyme present, S is H2O2, I is the inhibitor, Vo is the rate in the absence of inhibitor, and vi is the rate in the presence of each concentration of inhibitor.
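The steady-state analysis can be illustrated with a small, dependency-free sketch. It converts H2O2 concentration to absorbance at 240 nm with the extinction coefficient given above and recovers kcat and Km from initial rates. Two points are assumptions rather than the method used in this study: the rates are synthetic, generated from the reported constants, and a linearized Hanes-Woolf fit stands in for a nonlinear fit to the Michaelis-Menten equation. A 1 cm path length and all function names are likewise assumed.

```python
# Sketch: Beer-Lambert conversion and a Hanes-Woolf (linearized) estimate of
# kcat and Km. The rates are SYNTHETIC, generated from the reported constants;
# the linearization is a dependency-free stand-in for a nonlinear fit.

EPSILON_240 = 43.6   # M^-1 cm^-1 for H2O2 at 240 nm, from the text
ENZYME_M = 1.0e-9    # 1.0 nM Rv2633c, from the text

def absorbance_240(h2o2_molar, path_cm=1.0):
    """Beer-Lambert absorbance of H2O2 (a 1 cm path is assumed)."""
    return EPSILON_240 * h2o2_molar * path_cm

def mm_rate(s, kcat, km, enzyme=ENZYME_M):
    """Michaelis-Menten initial rate v = kcat*[E]*[S]/(Km + [S]) in M/s."""
    return kcat * enzyme * s / (km + s)

def hanes_woolf(substrates, rates, enzyme=ENZYME_M):
    """Fit [S]/v = [S]/Vmax + Km/Vmax by least squares; return (kcat, Km)."""
    ys = [s / v for s, v in zip(substrates, rates)]
    count = len(substrates)
    mean_x, mean_y = sum(substrates) / count, sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in substrates)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(substrates, ys))
    slope = sxy / sxx                      # 1 / Vmax
    intercept = mean_y - slope * mean_x    # Km / Vmax
    vmax = 1.0 / slope
    return vmax / enzyme, intercept * vmax

substrates_m = [0.001, 0.0025, 0.005, 0.010, 0.020, 0.030]  # illustrative [S]
synthetic_rates = [mm_rate(s, 1475.0, 10.1e-3) for s in substrates_m]
kcat_fit, km_fit = hanes_woolf(substrates_m, synthetic_rates)

print(f"A240 at 30 mM H2O2 = {absorbance_240(0.030):.2f}")
print(f"kcat ~ {kcat_fit:.0f} s^-1, Km ~ {km_fit * 1e3:.1f} mM")
```

With a 1 cm path, 30 mM H2O2 corresponds to an absorbance of about 1.31, which is in line with the upper limit on usable substrate concentration noted for this assay. Because the synthetic rates contain no noise, the fit simply returns the constants used to generate them; real data would call for a weighted or nonlinear fit.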
To test for peroxidase activity, the same buffer and Rv2633c concentration were used. The reaction was initiated by the addition of 10 mM H2O2 in the presence of potential peroxidase substrates. Possible peroxidase activity was monitored spectrophotometrically by an increase in absorbance at 460 nm for o-dianisidine, an increase at 318 nm for pyrogallol, or a decrease at 290 nm for ascorbate.