Crystal Structure of Reduced MsAcg, a Putative Nitroreductase from Mycobacterium smegmatis and a Close Homologue of Mycobacterium tuberculosis Acg

Background: Acg proteins are up-regulated during dormancy in tuberculosis.
Results: Acg proteins bind flavin mononucleotide like nitroreductases but with the active site closed by a lid. They are not reduced by NADPH or NADH.
Conclusion: Acg proteins may have evolved from active nitroreductases to sequester FMN instead.
Significance: Turning off a flavin-dependent pathway may be important in tuberculosis dormancy.

This paper presents the structure of MsAcg (MSMEG_5246), a Mycobacterium smegmatis homologue of Mycobacterium tuberculosis Acg (Rv2032), in its reduced form at 1.6 Å resolution using x-ray crystallography. Rv2032 is one of the most induced genes under the hypoxic model of tuberculosis dormancy. Members of the Acg family turn out to be unusual flavin mononucleotide (FMN)-binding proteins that have probably arisen by gene duplication and fusion from a classical homodimeric nitroreductase, such that the monomeric protein resembles a classical nitroreductase dimer but with one active site deleted and the other active site covered by a unique lid. The FMN cofactor is not reduced by either NADH or NADPH, but the chemically reduced enzyme is capable of reduction of nitro substrates, albeit at no kinetic advantage over free FMN. The reduced enzyme is rapidly oxidized by oxygen but without any evidence for a radical state commonly seen in oxygen-sensitive nitroreductases. The presence of the unique lid domain, the lack of reduction by NAD(P)H, and the slow rate of reaction of the chemically reduced protein raise a possible alternative function of Acg proteins in FMN storage or sequestration from other biochemical pathways as part of the bacteria's adaptation to a dormancy state.

In the 21st century, tuberculosis remains a major burden to the world, causing around 1.4 million deaths and about 8.8 million new cases worldwide in 2010. The disease is deeply associated with poverty, and thus the developing world bears the bulk (95%) of the victims (1). It is estimated that a third of the world population is infected by Mycobacterium tuberculosis, the causative agent of tuberculosis. Among the carriers of M. tuberculosis, 5-10% will develop active tuberculosis during their lifetime (2), whereas the rest will remain in the latent phase of the disease. Thus, understanding the physiological state of the bacteria during this latency is of critical interest for the fight against tuberculosis. A particular problem is that bacteria in the dormant phase are far less sensitive to antibiotics, which means a long course of antibiotic treatment is required (3).
The exact nature of the dormant state within the lungs is hard to probe. A generally accepted laboratory model of dormancy, the Wayne model, is induced by hypoxia (4). Expression studies show that the cells adapt to hypoxia through a two-component regulatory system (2CR) called dosRS (5,6). The dosRS 2CR controls the transcription of a regulon formed of around 50 genes induced under hypoxia (5) but also under nitric oxide exposure (7). Other conditions that trigger the dosRS regulon in vitro include S-nitrosoglutathione (8) and carbon monoxide (9). These signals are thought to be present in macrophages after their infection by M. tuberculosis. Inactivation of the whole dos operon unexpectedly produced a hypervirulent mutant in mice and in macrophage models of tuberculosis (10).
Among the 50 genes under the control of dosRS 2CR, few are of known function, making it difficult to understand the physiological outcome of the induction of the regulon. The protein most highly induced by the dosRS 2CR is Acr1 (Rv2031), which was previously shown to be the most abundant protein in dormant cells and to be essential for growth of the bacteria in macrophages (11,12) while being induced by hypoxia or nitric oxide exposure. Acr1 (α-crystallin 1) is a small heat shock protein that forms a tetrahedral dodecamer (13), one of two small heat shock proteins in M. tuberculosis. Small heat shock proteins bind to partially unfolded proteins in vitro and prevent aggregation (14). The best studied example of a small heat shock protein is α-crystallin, which is proposed to prevent formation of cataract in the eye lens, where the cells do not undergo further protein synthesis or division after birth (15). Deletion of Acr1 led to an attenuated phenotype in a macrophage infection model (12), whereas another study reported that the deletion of Acr1 induces faster growth of M. tuberculosis in macrophages and in mice (16). The discrepancy is due to differences in the knock-out strategy.
A study of the promoter region of acr1 by Purkayastha et al. (17) showed that the divergently transcribed genes acr1 and Rv2032 are co-regulated under hypoxic conditions and also inside macrophages, which led to the renaming of Rv2032 as acg (acr co-regulated gene). The authors tentatively annotated Acg (the product of acg) as an unusual member of the classical nitroreductase family (Pfam ID PF00881). Acg, which is 331 amino acids long, is somewhat larger than the classical nitroreductases, which are around 220 amino acids long. There are two additional paralogues of Acg in the M. tuberculosis genome, Rv3127 and Rv3131, and all three genes are controlled by the dosRS 2CR (8,18).
Nitroreductases are broad substrate range enzymes that utilize FMN as co-factor, are homodimers, and use NAD(P)H as the source of reducing power. They catalyze the reduction of nitro compounds to hydroxylamino or amino forms. They are found in all kingdoms and are present in higher eukaryotes. The physiological role of nitroreductases is unclear (19); in some cases they are involved in catabolism of xenobiotics such as trinitrotoluene or in activation or deactivation of drugs, but these compounds are of recent origin, and therefore, this is unlikely to be the role for which these proteins originally evolved. Two classes of enzyme can be identified depending on the reduction mechanism: type I, oxygen-insensitive nitroreductases, and type II, oxygen-sensitive nitroreductases. The oxygen-insensitive enzymes catalyze the reduction of the nitro group by a direct two-electron transfer, whereas the oxygen-sensitive enzymes catalyze the reduction through a one-electron transfer and the formation of a nitro anion radical before a second one-electron transfer to yield the product. However, in the presence of oxygen the nitro anion radical can be reoxidized to the nitro compound in a futile cycle that utilizes reducing power, generates superoxide anions, and does not yield reduced substrate. The most studied class, type I, is further divided into two groups, A and B, depending on their usage of only NADPH or either NADPH or NADH, respectively, but all follow a ping-pong bi-bi mechanism (19).
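The ping-pong bi-bi mechanism mentioned above has a standard steady-state rate law, v = Vmax[A][B]/(KmB[A] + KmA[B] + [A][B]), which can be sketched as follows. This is a textbook illustration only; the kinetic constants in the example are hypothetical and are not taken from this work or from reference 19.

```python
# Steady-state rate law for a ping-pong bi-bi mechanism (two substrates, A and B).
# All kinetic constants used below are hypothetical illustration values.


def ping_pong_rate(a, b, vmax, km_a, km_b):
    """Initial rate v = Vmax*A*B / (KmB*A + KmA*B + A*B)."""
    if a < 0 or b < 0:
        raise ValueError("concentrations must be non-negative")
    if a == 0 or b == 0:
        return 0.0
    return vmax * a * b / (km_b * a + km_a * b + a * b)


def apparent_constants(b, vmax, km_a, km_b):
    """Apparent Vmax and Km for A when B is held fixed at concentration b."""
    factor = b / (km_b + b)
    return vmax * factor, km_a * factor


if __name__ == "__main__":
    # Hypothetical values: rate units, uM (electron donor), uM (nitro substrate).
    vmax, km_donor, km_nitro = 100.0, 50.0, 200.0
    for nitro in (50.0, 200.0, 800.0):
        v_app, k_app = apparent_constants(nitro, vmax, km_donor, km_nitro)
        # v_app / k_app is independent of [nitro]; this constant ratio gives
        # the parallel double-reciprocal lines typical of ping-pong kinetics.
        print(f"[nitro]={nitro:6.1f}  Vmax_app={v_app:6.2f}  "
              f"Km_app={k_app:6.2f}  ratio={v_app / k_app:.3f}")
```

Because the apparent Vmax and Km scale by the same factor, their ratio is unchanged when the second substrate is varied, which is the usual experimental signature of this mechanism.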
Homologues of Acg are largely restricted to Actinomyces but are also found in some Burkholderia species. Three of the six putative nitroreductases in M. tuberculosis belong to the Acg family. The other three are a classical nitroreductase (Rv3368), Rv0306, a bluB gene involved in dimethylbenzimidazole biosynthesis, and Rv3262, a γ-glutamyl ligase involved in F420 biosynthesis. In this paper we present the structure of the first member of the Acg family of proteins in its reduced state, which gives insight into the role of these proteins in dormant tuberculosis.

EXPERIMENTAL PROCEDURES
Cloning-Genomic DNA of M. tuberculosis and M. smegmatis was kindly provided by Prof. Neil Stoker at the Royal Veterinary College, London, UK and by Dr. Sanjib Bhakta at Birkbeck, University of London, London, UK, respectively. The genes were cloned in pNIC28-Bsa4 using the ligation-independent cloning strategy developed at the Structural Genomics Consortium, Oxford, UK by Dr. Opher Gileadi (20). The MSMEG_5246 gene was amplified by PCR using the following primers designed using the gene sequence from the M. smegmatis mc²155 genome (GenBank™ accession number ABK74735.1) flanked with the ligation-independent cloning forward and reverse sequences, respectively, 5′-tacttccaatccATGTCCGACACCAGGCTCGACG-3′ and 5′-tatccacctttactgTCAACTCCTCGGGCGAACTTCG-3′. The Rv2032 (Acg) gene was amplified by PCR using the following primers designed using the gene sequence from the M. tuberculosis H37Rv genome (GenBank™ CAA17246.1) flanked with the ligation-independent cloning forward and reverse sequences, respectively, 5′-tacttccaatccATGCCGGACACCATGGTG-3′ and 5′-tatccacctttactgCTACCGGTGATCCTTAGCCCGA-3′. The cloned sequences were verified by sequencing.
Protein Expression and Purification-The pNIC28-Bsa4-MSMEG_5246 plasmid was transformed into Escherichia coli Rosetta2(DE3) strain (Novagen), and the cells were grown in ZYM5052 auto-inducing medium (21) at 37°C for 3 h followed by 24 h at 25°C. The selenomethionine-labeled protein was produced in glucose-free selenomethionine medium (AthenaES) supplemented with 5 g/liter glycerol, 0.5 g/liter glucose, and 2 g/liter lactose to obtain an auto-inducing medium and 80 mg/liter selenomethionine for labeling. 6.4 mg/liter methionine and 100 nM vitamin B12 were also added to the medium to help repress the biosynthesis of methionine (21).
The cells were harvested by centrifugation and were resuspended in 100 mM Tris-HCl, pH 8.0, 500 mM sodium chloride, 0.02% (w/v) sodium azide before 0.5 mg of DNase I (Roche Applied Science) and 1.5 mg of lysozyme (Sigma) per 50 ml of cell suspension were added along with 1 mM PMSF and 1 tablet of cOmplete EDTA-free protease inhibitor mix (Roche Applied Science). The cells were broken by sonication on ice. The cell lysate was cleared by centrifugation at 48,000 × g at 4°C for 1 h. The cleared lysate was then applied to an XK 16/20 column (GE Healthcare) filled with 25 ml of nickel immobilized metal ion affinity chromatography-Sepharose HP (GE Healthcare) equilibrated with 100 mM Tris-HCl, pH 8.0, 500 mM sodium chloride, 25 mM imidazole, 0.02% (w/v) sodium azide. After extensive washing the bound protein was eluted in 100 mM Tris-HCl, pH 8.0, 500 mM sodium chloride, 300 mM imidazole, 0.02% (w/v) sodium azide. 1 mM β-mercaptoethanol and 1 mg of tobacco etch virus protease per 5 ml of eluted fraction were added, and the sample was dialyzed overnight at 4°C against 25 mM Tris-HCl, pH 8.0, 100 mM sodium chloride, 0.02% (w/v) sodium azide. The untagged protein was then applied to an anion exchange HiLoad 26/10 Q Sepharose HP column (GE Healthcare) equilibrated with 25 mM Tris-HCl, pH 8.0, 100 mM sodium chloride, 0.02% (w/v) sodium azide. The protein was eluted with a linear gradient from 100 to 500 mM NaCl in 25 mM Tris-HCl, pH 8.0, 0.02% (w/v) sodium azide. The fractions were pooled and concentrated using a Vivaspin centrifugal concentrator with a cutoff of 10 kDa (Sartorius) and loaded on a gel filtration HiPrep 16/60 Sephacryl S-200 HR column (GE Healthcare) equilibrated with 25 mM Tris-HCl, pH 8.0, 100 mM sodium chloride, 0.02% (w/v) sodium azide. Finally, the purified protein was concentrated to 10 mg/ml based on the extinction coefficient at 450 nm of 12,700 M⁻¹ cm⁻¹ before being aliquoted, flash-frozen in liquid nitrogen, and stored at −80°C.
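The concentration estimate from the flavin absorbance follows the Beer-Lambert law, and a minimal sketch of the arithmetic is given here. The extinction coefficient is the one quoted above; the molar mass of about 36,000 g/mol, the 1-cm path length, the assumption of one FMN per protein molecule, and the example absorbance reading are illustrative assumptions.

```python
# Beer-Lambert estimate of flavoprotein concentration from A450.
# EPSILON_450 is the value quoted in the text; MOLAR_MASS and the example
# absorbance reading are assumptions made only for this illustration.

EPSILON_450 = 12_700.0  # M^-1 cm^-1, protein-bound FMN (from the text)
MOLAR_MASS = 36_000.0   # g/mol, approximate (assumed from the ~36 kDa mass)


def molar_concentration(absorbance, path_cm=1.0, dilution=1.0,
                        epsilon=EPSILON_450):
    """Return the concentration (mol/liter) of the undiluted sample."""
    if path_cm <= 0 or epsilon <= 0:
        raise ValueError("path length and extinction coefficient must be positive")
    return absorbance * dilution / (epsilon * path_cm)


def mass_concentration(absorbance, path_cm=1.0, dilution=1.0):
    """Return mg/ml, assuming one FMN per protein molecule."""
    # mol/liter x g/mol = g/liter, which is numerically equal to mg/ml.
    return molar_concentration(absorbance, path_cm, dilution) * MOLAR_MASS


if __name__ == "__main__":
    a450 = 0.353  # hypothetical reading of a 10-fold diluted sample
    print(f"{molar_concentration(a450, dilution=10) * 1e6:.0f} uM")
    print(f"{mass_concentration(a450, dilution=10):.1f} mg/ml")
```

Under these assumptions, 10 mg/ml corresponds to roughly 280 µM protein, so a 10-fold dilution would be needed to bring the reading into a convenient absorbance range.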
The quality and oligomeric state of the purified protein were assessed by a combination of analytical gel filtration, SDS-PAGE stained with a Coomassie Blue-based stain, and analytical ultracentrifugation sedimentation velocity experiments; ¹H NMR was used to assess the proper folding of the sample.
Analytical Ultracentrifugation-Sedimentation velocity experiments were carried out on a ProteomeLab™ XLI analytical ultracentrifuge (Beckman Coulter) with an An-60 Ti rotor using 2-channel centerpieces and sapphire windows at 60,000 r.p.m. Scans were recorded at 20°C in absorbance mode at 280 and 450 nm as well as in interference mode. The protein samples were in 25 mM Tris, pH 8.0, 100 mM NaCl at 45, 23, and 4.3 μM. Protein partial specific volume and buffer density and viscosity were calculated using SEDNTERP. The experimental data for MsAcg were analyzed using Sedfit by fitting to the c(s) model with one discrete component.
Mass Spectrometry-MsAcg was subjected to limited proteolysis by chymotrypsin for 4 h at 22°C at 1.2 mg/ml protein and a protease ratio of 1:2000 (w/w). The reaction was stopped by the addition of PMSF, and the sample was applied to an analytical gel filtration column for purification and analysis. Mass spectrometry experiments were carried out on a Synapt HDMS (Waters Ltd, Manchester, UK) mass spectrometer (22). The instrument was mass-calibrated using a 33 μM solution of caesium iodide in 250 mM ammonium acetate. A 20 μM MsAcg sample was buffer-exchanged into 250 mM ammonium acetate, pH 7.5, using Biospin-6 columns (Bio-Rad) before being delivered to the mass spectrometer by means of nanoelectrospray ionization using gold-coated capillaries prepared in house. Typical instrumental parameters for native MS were as follows: source pressure 5 mbar, capillary voltage 1.0 kV, cone voltage 50 V, trap energy 8 V, transfer energy 6 V, bias voltage 5 V, and trap pressure 3.6 × 10⁻² mbar. For tandem MS, trap energy was increased to 80 V, with all other parameters unchanged. Data acquisition and processing were carried out using MassLynx (Version 4.1) software (Waters Corp., Milford, MA). Mass assignment was achieved by the method described in Tito et al. (23).
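Mass assignment from an electrospray charge-state series rests on the relation m/z = (M + z·mH)/z for consecutive charge states. The sketch below illustrates that general relation only and is not necessarily the exact procedure of Tito et al. (23); the function names and the 36,500-Da example mass are hypothetical.

```python
# Neutral mass from a series of consecutive charge-state peaks (positive mode).
# Generic illustration; the example mass of 36,500 Da is hypothetical.

PROTON = 1.007276  # Da


def charge_from_adjacent(mz_high, mz_low):
    """Charge of the peak at mz_high, given the next peak (charge + 1) at mz_low."""
    if mz_high <= mz_low:
        raise ValueError("mz_high must be larger than mz_low")
    return round((mz_low - PROTON) / (mz_high - mz_low))


def neutral_mass(mz, z):
    """Neutral mass of a species observed at m/z with z attached protons."""
    return z * (mz - PROTON)


def assign_mass(peaks):
    """Return (mean mass, [(m/z, z), ...]) for consecutive charge states."""
    ordered = sorted(peaks, reverse=True)  # highest m/z = lowest charge
    if len(ordered) < 2:
        raise ValueError("at least two adjacent peaks are needed")
    z0 = charge_from_adjacent(ordered[0], ordered[1])
    charges = [z0 + i for i in range(len(ordered))]
    masses = [neutral_mass(mz, z) for mz, z in zip(ordered, charges)]
    return sum(masses) / len(masses), list(zip(ordered, charges))


if __name__ == "__main__":
    true_mass = 36_500.0  # hypothetical
    peaks = [true_mass / z + PROTON for z in (11, 12, 13)]
    mass, assignment = assign_mass(peaks)
    print(f"mass = {mass:.1f} Da")
    for mz, z in assignment:
        print(f"  m/z {mz:8.2f}  z = {z}+")
```

Averaging over all peaks in the series reduces the effect of error in any single peak position; real spectra would also need peak picking and an estimate of the uncertainty, which are omitted here.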
Protein Crystallization-The crystallization trials were carried out with an Oryx Nano crystallization robot (Douglas Instruments) contained within an anaerobic chamber (Coy) with a 95% nitrogen, 5% hydrogen atmosphere in the laboratory of Prof. Holger Dobbek at Humboldt-Universität zu Berlin. Sitting drop vapor diffusion trials were set up with drops of 200 nl of protein + 200 nl of reservoir solution in MRC 2 subwell plates using 80 μl of reservoir solution. The protein was kept reduced by the addition of 5 mM dithionite to the protein and the reservoir solution before setting up the crystallization trays.
The first crystals were obtained with the Morpheus screen (Molecular Dimensions) (24) after 3-4 weeks in 0.1 M MES/imidazole, pH 6.5, 10% (w/v) PEG 20,000, 20% (v/v) PEG MME 550 with the halide, alcohol, ethylene glycol, amino acid, and carboxylic acid mixes. Final selenomethionine-labeled protein crystals were obtained after optimization in 0.1 M MES/imidazole, pH 6.5, 30% (v/v) PEG MME 550, 0.02 M sodium formate, 0.02 M ammonium acetate, 0.02 M trisodium citrate, 0.02 M sodium potassium DL-tartrate, and 0.02 M sodium oxamate reduced by 50 mM dithionite. No oxamate was present in the higher resolution dataset. The crystals were cryopreserved for data collection in the same buffer as the crystallization conditions with 20% (v/v) ethylene glycol as cryoprotectant.
Structure Determination and Refinement-A native protein data set at 2.2 Å and selenomethionine-labeled protein data sets at 2.4 and 1.6 Å resolution were collected on BL 14.1 or 14.2 at the BESSY II synchrotron operated by Helmholtz-Zentrum Berlin für Materialien und Energie (25). The datasets were indexed and integrated using XDS (26) before being converted into CCP4 format by COMBAT (27). The datasets were scaled using POINTLESS/SCALA. The structure was solved by Se-SAD with the 2.4 Å dataset using the SAS protocol of Auto-Rickshaw, the EMBL-Hamburg automated crystal structure determination platform (28,29). Anomalous scatterer structure factor amplitude (FA) values were calculated using the program SHELXC (30). All four of the heavy atoms requested were found using the program SHELXD (31). The correct hand for the substructure was determined using the programs ABS (32) and SHELXE (33). Initial phases were calculated after density modification using the program SHELXE (33). 97.27% of the model was built using the program ARP/wARP (34,35). The model was refined using REFMAC (36), and the model building was done using COOT (37). The native dataset to 2.2 Å was refined from this starting structure using REFMAC (36). A further 1.6 Å dataset was initially refined in REFMAC and then further refined with phenix.refine (38). There were no significant differences between the structures (r.m.s.d. 0.14 Å over 325 residues). Figures were drawn using CCP4MG (39). The butterfly angle was determined using the Chimera (40) axes/planes/centroids function using 10 atoms per plane as defined in Lennon et al. (41), and other stereochemistry was analyzed with MolProbity (42). The structure factors and the structure coordinates of the final high resolution selenomethionine structure of MsAcg have been deposited in the Protein Data Bank at PDBe under accession number 2ymv.
Anaerobic Titration-All titration experiments were carried out using a sealable quartz cuvette (Hellma, Germany) with a new septum for each experiment on a Cary 300 UV-visible spectrophotometer (Agilent). The cuvettes were filled and sealed in an anaerobic chamber (Coy) filled with nitrogen and around 3% hydrogen atmosphere. All buffers and solutions were made anaerobic by repeated cycles of evacuation and flushing with nitrogen in the airlock of the anaerobic chamber and kept on ice in the anaerobic chamber throughout the experiment. The stoichiometric titration solution consisted of 40-45 μM MsAcg in 1 ml of 20 mM Tris-HCl, pH 8.0, 100 mM NaCl, 0.02% (w/v) NaN₃ for the dithionite titration and of 13-15 μM MsAcg in 1 ml of 20 mM Tris-HCl, pH 8.0, 100 mM NaCl, 0.02% (w/v) NaN₃ for the reoxidation experiment by nitrofuran. The titrants were dithionite (calibrated against oxidized FMN), NADPH, or NADH (both in 0.01 M NaOH), calibrated spectrophotometrically (FMNox: λmax = 445 nm, ε445 = 12,600 M⁻¹ cm⁻¹; NADH: λmax = 340 nm, ε340 = 6,220 M⁻¹ cm⁻¹; NADPH: λmax = 340 nm, ε340 = 6,220 M⁻¹ cm⁻¹). Small amounts of titrant (5 μl) were injected through the septum with a gas-tight syringe (Hamilton), and a UV-visible spectrum was recorded between 300 and 700 nm. The nitrofuran reoxidation experiment was done by sequential addition of 5 μl to a total of 30 μl of dithionite at 1 mM in the same buffer as the protein to achieve almost total reduction. 5-μl additions of 1 mM nitrofuran in N,N-dimethylformamide were injected, and a spectrum between 600 and 300 nm was acquired after each addition. For the kinetic experiment, the protein was reduced following the same protocol as the other experiments followed by a single injection of 10 μl of 1 mM nitrofuran in N,N-dimethylformamide, and the absorbance was measured at 450 nm for 90 min.
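The bookkeeping behind such a stepwise titration (cumulative titrant added, nominal molar equivalents, and the dilution factor for correcting absorbance readings) can be sketched as follows. The volumes and the 1 mM dithionite concentration are those quoted above for the reoxidation experiment; the 14 µM protein concentration is an assumed midpoint of the stated 13-15 µM range, and the function names are hypothetical.

```python
# Bookkeeping for a stepwise titration in a sealed cuvette: cumulative titrant,
# nominal molar equivalents, and the dilution factor for correcting absorbance.
# The 14 uM protein concentration is an assumed midpoint of the 13-15 uM range.


def nmol(conc_uM, volume_ul):
    """Amount in nmol (uM x ul gives pmol, hence the division by 1000)."""
    return conc_uM * volume_ul / 1000.0


def titration_table(protein_uM, start_ul, titrant_uM, step_ul, n_steps):
    """Return one dict per addition with added volume, equivalents and dilution."""
    protein_nmol = nmol(protein_uM, start_ul)
    rows = []
    for step in range(1, n_steps + 1):
        added_ul = step * step_ul
        titrant_nmol = nmol(titrant_uM, added_ul)
        rows.append({
            "added_ul": added_ul,
            "titrant_nmol": titrant_nmol,
            "equivalents": titrant_nmol / protein_nmol,
            # Multiply an observed absorbance by this factor to refer it
            # back to the starting volume.
            "dilution": (start_ul + added_ul) / start_ul,
        })
    return rows


if __name__ == "__main__":
    # 1 ml of protein, six 5-ul additions of 1 mM (1000 uM) dithionite.
    for row in titration_table(14.0, 1000.0, 1000.0, 5.0, 6):
        print(f"{row['added_ul']:5.1f} ul  {row['titrant_nmol']:5.1f} nmol  "
              f"{row['equivalents']:.2f} eq  dilution x{row['dilution']:.3f}")
```

The equivalents computed this way are nominal values based on the stated stock concentration; the effective reducing capacity of a dithionite stock is usually lower, which is why the titrant was calibrated against oxidized FMN.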
Determination of the Mid-point Potential of the FMN-Reduction potentials were determined according to published methods (43-46). The spectrophotometer was blanked with pH 7 buffer (50 mM potassium phosphate, 50 mM potassium chloride) containing 300 μM xanthine, 5 mM glucose, 1 μg/ml catalase, and 10 μg/ml glucose oxidase in a sealed cuvette. The dye (Nile blue, reduction potential −116 mV (47)) and protein were then titrated up to an absorbance of 0.1 at their respective maxima. Xanthine oxidase was added to a concentration of 40 μg/ml (43,48). All enzymes and dyes used were sourced from Sigma. Spectroscopic readings were recorded systematically as the reduction potential of the solution in the cuvette was lowered. Readings were taken every 15-30 s at the start, and the frequency was gradually reduced to once every 5 min as the reaction slowed. Once the spectra remained constant for at least 10 min, sodium dithionite was added, after which readings were taken until the spectra remained constant for at least a further 5 min. Reduction potentials are reported versus the normal hydrogen electrode (NHE). The results are the average of three separate determinations.
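In dye-equilibration methods of this kind, the protein and the reference dye are assumed to be at the same solution potential at every time point, so the Nernst equation links their oxidized/reduced ratios. The sketch below shows one common way of extracting the protein mid-point potential from such data; it assumes that both couples transfer two electrons and that the temperature is 25°C, neither of which is stated in the text, and it uses synthetic data rather than measurements from this work.

```python
# Mid-point potential from dye equilibration. At equilibrium
#   E = Em_dye + (RT/nF) ln(D_ox/D_red) = Em_prot + (RT/nF) ln(P_ox/P_red),
# so ln(P_ox/P_red) plotted against ln(D_ox/D_red) is a line of slope 1 whose
# intercept equals (Em_dye - Em_prot) / (RT/nF).
# Assumptions: n = 2 for both couples and T = 298.15 K; data are synthetic.
import math

R = 8.314462618  # J mol^-1 K^-1
F = 96485.33212  # C mol^-1


def nernst_factor_mV(n, temp_K=298.15):
    """RT/nF in millivolts."""
    return 1000.0 * R * temp_K / (n * F)


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def protein_midpoint_mV(dye_ratios, protein_ratios, dye_em_mV, n=2, temp_K=298.15):
    """Return (Em of the protein in mV, fitted slope) from ox/red ratios."""
    xs = [math.log(r) for r in dye_ratios]
    ys = [math.log(r) for r in protein_ratios]
    slope, intercept = fit_line(xs, ys)
    return dye_em_mV - nernst_factor_mV(n, temp_K) * intercept, slope


if __name__ == "__main__":
    s = nernst_factor_mV(2)
    potentials = [-180.0, -160.0, -140.0, -120.0, -100.0]       # mV, synthetic
    dye = [math.exp((e + 116.0) / s) for e in potentials]       # Em = -116 mV
    protein = [math.exp((e + 164.0) / s) for e in potentials]   # Em = -164 mV
    em, slope = protein_midpoint_mV(dye, protein, -116.0)
    print(f"Em(protein) = {em:.1f} mV, slope = {slope:.3f}")
```

A fitted slope close to 1 is a useful check that the two couples transfer the same number of electrons and that the system was at equilibrium during the measurement.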

RESULTS
Protein Characterization-MsAcg, the gene product of MSMEG_5246, is one of four Acg family members in the fast growing mycobacterium M. smegmatis. Purified MsAcg has a deep yellow color as expected for flavoproteins. The cofactor extracted by thermal denaturation co-migrates with FMN and not FAD or riboflavin on thin layer chromatography. The overall purity of the sample is >95% as estimated from SDS-PAGE, and the protein migrates just below the 40-kDa marker, corresponding to the expected 36-kDa molecular mass. The oligomeric state of the protein was characterized in solution by analytical gel filtration (38.4 kDa) and analytical ultracentrifugation, and the protein was shown to be monomeric. Chymotrypsin-proteolysed protein retains FMN on gel filtration. The limited proteolysis of MsAcg resulted in cleavage at two sites: after residues 187 and 222. This was shown by native and tandem mass spectrometry (MS). Analysis of the chymotrypsin-proteolysed MsAcg by native nanoelectrospray MS showed predominantly two species corresponding to the cleaved (residues 1-187 plus 223-330) and full-length (residues 1-330) MsAcg, both with FMN bound (Fig. 1A). Further investigation of these two species was made by tandem MS whereby each species was isolated using the quadrupole and subjected to collision-induced dissociation to disrupt non-covalent interactions. Tandem MS of the cleaved MsAcg resulted in three species, one containing residues 223-330 and two containing residues 1-187 with and without FMN bound (Fig. 1B). Tandem MS of the full-length MsAcg (Fig. 1C) resulted in the same species and two further species that contain residues 1-222, again with and without FMN bound. MS of the proteolysed MsAcg protein under denaturing conditions, however, showed no evidence for the presence of full-length protein, indicating that the full-length (residues 1-330) MsAcg observed in the native spectrum comprises the cleaved products held together by non-covalent interactions.
Structure of MsAcg-The three-dimensional structure of full-length MsAcg in the reduced form was obtained by selenium-SAD phasing in space group P321 with a 2.4 Å dataset and refined against a second 1.6 Å selenomethionine-containing dataset (Table 1). The cloned protein is 330 residues long, as an extra serine is left after the cleavage of the His tag (residue 1); all residues could be seen in the electron density map with the exception of residues 1, 2, and 330. Electron density in the region of FMN is shown in Fig. 2C.
The structure of MsAcg consists of a monomeric protein of two domains centered on β sheets. The N-terminal domain is formed of four antiparallel β strands, β1, β2, β4, β3, plus a complementary fifth β strand, β9, parallel to the first strand. The sheet is surrounded by α helices α1, α2, and α3. The second domain consists of a four-stranded antiparallel β sheet formed by β strands β5, β6, β8, and β7 and surrounded by helices α4, α5, α6, α7, α8, and α9 (Fig. 2, A and B). A single FMN binds to the second domain. The fold of the monomer resembles a dimer of a normal nitroreductase. Below we compare MsAcg with NfnB from M. smegmatis (49). Structure comparison against the PDB was done using the Fold server at the European Bioinformatics Institute (50), and this proved to be sensitive to the protocol used. There is no one clear top hit, so we have selected NfnB as being from the same species and a strong hit based on a balance of the various match criteria. Of 174 matches of nitroreductase chains in the PDB to the full-length MsAcg at normal accuracy (40% match limit), NfnB is first in terms of sequence identity (21%), 4th in alignment length (245; top hit 246), and top for secondary structure element matches (83%). On the statistical measures Q (53rd, 0.25; maximum 0.32), P (12th, 14.3; maximum 15.4), and Z (12th, 12.4; maximum 13.3), NfnB is less prominent but not much lower in score than the higher hits.
Secondary structure matching superposition, carried out using CCP4MG (39), shows that one molecule of MsAcg superimposes on a homodimer of NfnB, 234 residues per chain (PDB ID 2wzw), with an overall r.m.s.d. of 2.4 Å for a length of superposition of 233 residues. The C-terminal region of MsAcg, i.e. between residues 109 and 328, aligns with the B chain of NfnB (r.m.s.d. of 2.5 Å over 147 residues), whereas the N-terminal region of MsAcg (residues 4-103) shows a superposition of 86 residues with an r.m.s.d. of 2.2 Å with chain A of NfnB. This shows that MsAcg could be the product of a gene duplication and fusion. The match is, however, not perfect as a portion of chain B of NfnB, i.e. from residues 84 to 137, does not align with MsAcg, and a fragment of MsAcg between residues 170 and 231 does not align with NfnB despite being of similar length (Fig. 3). The N-terminal region of MsAcg aligns with chain A of NfnB, preserving the core β sheet and the loop that contributes to FMN binding to chain B but omitting most of the helical regions that contribute to the FMN binding site in chain A and the final β strand that extends the β sheet of chain B (Fig. 3). This shows that the N-terminal domain is reminiscent of a nitroreductase fold but has undergone drastic evolution leading to a reduction to almost half the size of the C-terminal region and loss of the second FMN binding site. Although FMN binds to the C-terminal domain of MsAcg, residues identified as interacting with FMN (see below) are located in a region of the nitroreductase fold that is deleted in the N-terminal domain, thus making the binding of FMN to the N-terminal β sheet impossible and excluding the presence of a second active site.

DECEMBER 28, 2012 • VOLUME 287 • NUMBER 53
JOURNAL OF BIOLOGICAL CHEMISTRY 44375
There is one other monomeric nitroreductase structure known, that of Tm1586 from Thermotoga maritima (PDB ID 1vkw), which is only 205 residues long (Joint Center for Structural Genomics, unpublished information). Again this contains two β sheets in a single chain, similar to MsAcg. Tm1586 has no FMN bound but does have a sulfate from the crystallization condition at the expected position of the FMN phosphate in the N-terminal β sheet, so it probably binds FMN but weakly enough for it to have been displaced at some point in the crystallization or purification. The C-terminal β sheet of Tm1586 does not conserve the FMN binding residues.
Upon examination of the surface representation of M. smegmatis NfnB (Fig. 4A) and indeed almost all other nitroreductase structures, the FMN is easily accessible to both solvent and substrates, whereas in the full-length MsAcg it is almost impossible to access the FMN except via a very narrow cavity, much smaller than most substrates or reducing agents (Fig. 4B). The region of the C terminus 170-231 that does not overlap with the B chain of NfnB blocks access, and we term this the lid (shown in green in the figure). However, when the lid is removed from the model, the FMN is nicely exposed (Fig. 4C). The only other structure of a PF00881 protein with a concealed FMN is BluB, which catalyzes the cleavage of FMNH2 to dimethylbenzimidazole (51). Although the analogous region of BluB contributes part of its lid, there is no conservation of sequence or backbone position. Mycobacteria contain more direct homologues of BluB than the Acg family, so Acg is unlikely to be carrying out dimethylbenzimidazole biosynthesis.
Oxygen can penetrate the lid in the crystal and reoxidize FMN to the yellow state, with concomitant loss of diffraction of the crystals. Therefore, very small molecules can access FMN in the closed form but larger substrates will require conformational change. The proteolytic sensitivity of the protein at the beginning and end of the lid region indicates that there are likely to be conformationally flexible hinges for the lid region allowing substrate access.
The structure of the co-factor was obtained in its reduced form (Fig. 2C). The isoalloxazine ring was refined without planarity constraints to allow the FMN to adopt a bent conformation. Indeed the isoalloxazine ring is bent along the N5-N10 axis toward the si face, with a butterfly angle of 23.3°. The range of this angle is quite wide in flavins in all oxidation states (41); however, photoelectron reduction of the flavin by X-rays (52) means that the reported oxidation state may not be that in the crystal. Nevertheless the butterfly angle is larger than that of M. smegmatis NfnB (49) (PDB ID 2wzw, 13.4°) or Enterobacter cloacae NfsB in its oxidized state (PDB ID 1kqc, 15.5°) and similar to E. cloacae NfsB in the reduced state (53) (PDB ID 1kqd, 22.3°), consistent with the reduced form. This, along with the crystal color, shows that we have a reduced FMN in our crystals.
The interactions between the reduced FMN co-factor and the polypeptide chain of MsAcg were plotted using LIGPLOT at the PDBsum server and are summarized in Fig. 4D. The interactions are of two types, hydrogen bonds and hydrophobic interactions. A major cluster of hydrogen bonds is found around the phosphate group of the FMN. The phosphate group contacts two arginines (Arg-121 and Arg-317) that will form salt bridges as well as hydrogen bonds. Additionally two threonines interact with the phosphate (Thr-123 and Thr-315). Another cluster is found around the isoalloxazine ring, with O4 contacting Arg-215 and also the amino group of Thr-274. Similarly, nitrogen N5 from FMN contacts Thr-274 through its side chain hydroxyl. Finally, O2 interacts with the guanidinium group of Arg-125, whereas the oxygen O2* of the ribitol also interacts with the same residue. The hydrophobic interactions are spread over the whole FMN molecule (Fig. 4D). Arg-215 is part of the lid domain and is the least conserved interaction compared with FMN binding to E. cloacae NfsB or M. smegmatis NfnB.

Table 1 footnotes:
a …, where $\langle I(hkl) \rangle$ is the average intensity of the reflection hkl.
b $R_{\mathrm{pim}} = \sum_{hkl} \sqrt{1/(n-1)} \sum_{i}^{n} |I_i(hkl) - \langle I(hkl) \rangle| \,/\, \sum_{hkl} \sum_{i} I_i(hkl)$, where $\langle I(hkl) \rangle$ is the average intensity of the reflection hkl.
c $R_{\mathrm{work}} = \sum_{hkl} \big| |F_{\mathrm{obs}}| - |F_{\mathrm{calc}}| \big| \,/\, \sum_{hkl} |F_{\mathrm{obs}}|$, where $F_{\mathrm{obs}}$ and $F_{\mathrm{calc}}$ are the observed and calculated structure factors, respectively. $R_{\mathrm{free}}$ was calculated analogously for the test reflections, which were randomly selected and excluded from the refinement.
d Compounds found in the crystallization condition, i.e. imidazole, formate, acetate, PEG

Structure of Mycobacterial Acg
Enzymological Studies-Neither NADPH nor NADH reduced the FMN of MsAcg, MTbAcg, or proteolytically treated MsAcg (in which the lid is partially removed) under aerobic or anaerobic conditions, whether the reaction was monitored via the nicotinamide at 340 nm or via the flavin at 445 nm. A control of E. coli NfnB behaved as expected (data not shown). Both MsAcg and MTbAcg were reduced under anaerobic conditions by dithionite and rapidly reoxidized if oxygen was introduced into the cuvette. A single isosbestic point at 332 nm and a lack of any absorbance at 600 nm indicate that the semiquinone state is not formed and the reduction is a direct 2-electron reduction (Fig. 5A).
FIGURE 2. Structure of MsAcg. A, stereo view of the structure of MsAcg in ribbon representation, with two views 90° apart around the horizontal. B, schematic of the secondary structure. The protein is formed of two domains, one around each β sheet. The two β sheets are formed of β strands β1, -2, -4, -3, and -9 and β5, -6, -8, and -7, respectively, and are surrounded, respectively, by α helices α1, -2, and -3 and α4, -5, -6, -7, -8, and -9. There is a single FMN co-factor associated with the second β sheet. C, final mFo − dFc electron density map in the region of the FMN. A 10 Å extent of the map, clipped to an 8 Å slice in z at 1.0, is shown in pale brown. The same map clipped to the FMN is shown in blue. The FMN is shown as green ball-and-stick carbons, and the adjacent protein residues are in coral carbons.
The mid-point reduction potentials were determined to be −165 ± 3 and −164 ± 3 mV (data not shown) for MTbAcg and MsAcg, respectively. The reduction potentials were calculated from experiments using Nile blue dye, which has a mid-point potential of −116 mV (47). These mid-point potentials of the Acgs are somewhat higher and, therefore, more weakly reducing than that of FMN alone, which has a 2-electron reduction potential of −207 mV (54). They are also higher than the midpoint reduction potential of −196 mV (55) of E. cloacae nitroreductase, an archetype of the FMN-dependent dimeric nitroreductases that is 88% sequence identical to E. coli NfsB.
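The comparison of mid-point potentials can be made concrete with a short calculation. The following Python sketch uses the standard relations ΔG° = −nFΔE and the Nernst equation for a 2-electron couple. The temperature of 298.15 K is an assumption, as the measurement temperature is not stated in this section, and the function names are illustrative only. The potentials are the values quoted in the text.

```python
import math

FARADAY = 96485.332   # C/mol
GAS_CONSTANT = 8.314462  # J/(mol K)
TEMPERATURE = 298.15  # K; assumed, not taken from the paper
N_ELECTRONS = 2       # direct 2-electron reduction of the flavin

# Mid-point potentials in mV, as quoted in the text.
MIDPOINTS_MV = {
    "MsAcg": -164.0,
    "MTbAcg": -165.0,
    "free FMN": -207.0,
    "E. cloacae NR": -196.0,
}


def delta_g_kj_per_mol(acceptor_mv, donor_mv, n=N_ELECTRONS):
    """Standard free energy (kJ/mol) for moving n electrons from the donor
    couple to the acceptor couple: dG = -n F (Em_acceptor - Em_donor)."""
    delta_e_volts = (acceptor_mv - donor_mv) / 1000.0
    return -n * FARADAY * delta_e_volts / 1000.0


def fraction_reduced(e_mv, em_mv, n=N_ELECTRONS, temperature=TEMPERATURE):
    """Fraction of a couple in the reduced state at solution potential e_mv,
    from E = Em + (RT/nF) ln([ox]/[red])."""
    exponent = n * FARADAY * (e_mv - em_mv) / 1000.0 / (GAS_CONSTANT * temperature)
    ox_over_red = math.exp(exponent)
    return 1.0 / (1.0 + ox_over_red)


# Electron transfer from reduced free FMN to oxidized MsAcg-bound FMN.
dg_fmn_to_msacg = delta_g_kj_per_mol(MIDPOINTS_MV["MsAcg"], MIDPOINTS_MV["free FMN"])

# Reduction state of MsAcg-bound FMN at the mid-point potential of free FMN.
msacg_reduced_at_fmn_midpoint = fraction_reduced(
    MIDPOINTS_MV["free FMN"], MIDPOINTS_MV["MsAcg"]
)
```

Under these assumptions the 43 mV difference between MsAcg and free FMN corresponds to about 8.3 kJ/mol in favor of the protein-bound flavin holding the electrons, which is the sense in which the bound cofactor is the weaker reductant.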
The addition of the nitroreductase substrates nitrofuran, nitrofurantoin, and menadione to the anaerobic dithionite-reduced enzyme resulted in reoxidation of the FMN, implying turnover of the enzyme. The amount of dithionite was titrated to just reduce the FMN, leaving no excess dithionite. However, reduced FMN free of protein showed the same reaction, and indeed the rate of reoxidation by nitrofurantoin was significantly faster with free FMN than with the enzyme-bound FMN (Fig. 5, B and C).

DISCUSSION
The Acg family of proteins has evolved from the classic nitroreductase homodimer by gene duplication and fusion and then loss of one of the two active sites. This has resulted in a monomeric protein with a single active site but with an overall fold resembling the nitroreductase homodimer. The proteins have also evolved to have a lid over the FMN rather than an open binding pocket as seen in the other nitroreductase proteins. The regions between the main protein and the lid are sensitive to protease, indicating that they might be flexible hinges allowing the lid to open. However, some of the cleaved lid remains bound even after gel filtration, indicating tight noncovalent binding of the lid to the rest of the protein, so the rate of lid opening is probably slow.
The lid does seem to be able to open, as we were able to show reoxidation of chemically reduced enzyme by larger nitroreductase substrates that would not have direct access to the active site in the conformation trapped in the crystal. The rate is actually slower than that for the equivalent reaction carried out by reduced FMN in solution, so arguably we are merely measuring the rate of dissociation of the cofactor. However, the fact that stoichiometric amounts of FMN are retained on the protein throughout purification would argue that cofactor binding is extremely tight and that we are seeing substrate interacting with protein-bound cofactor. The lid domain interacts directly with the FMN through Arg-215, which interacts both with the ribitol and the O2 of the isoalloxazine ring. Whereas the other interactions with the FMN are homologous to those seen in E. cloacae nitroreductase, which has a much lower midpoint potential, this interaction is unique to MsAcg and, therefore, may modify the midpoint potential as well as contribute to the fairly tight binding of the lid.
It is unclear what the physiological reducing agent for Acg proteins is within mycobacteria. NAD(P)H is the normal reducing agent for nitroreductases; some proteins accept either NADH or NADPH as a substrate, but others are specific to NADPH. The only structure of a nitroreductase with NADPH bound is M. smegmatis NfnB (PDB ID 2wzw). There, NADPH is bound specifically through its adenosine region, but by a region of the fold that is not common to all nitroreductases. This region, which forms the rim of the pocket in M. smegmatis NfnB, comes from the same chain as the β sheet. However, the analogous region in the E. coli and E. cloacae structures is provided by the other chain, and the channel to the FMN is formed across the dimer interface, so it is not clear how NADPH binds in this case. As the NADPH binding mode is not conserved among other nitroreductases, it is not possible to say definitively that NAD(P)H binding has been lost.
A Δacg strain of M. tuberculosis was not able to grow or survive in either resting or activated macrophages in vitro (56). The knock-out strain was not able to kill immunosuppressed (SCID) mice, indicating that Acg has roles in growth within macrophages and in virulence. The Δacg mutant is more sensitive to the nitroreductase-activated antibiotics nitrofuran and, to a lesser extent, nitrofurantoin than the wild type or the complemented strain. These antibiotics are activated by nitroreductases to what is thought to be the hydroxylamine form of the drug (57), which is a DNA-damaging agent. As mutations of type 1 oxygen-insensitive nitroreductases can confer resistance to nitrofurantoin in E. coli (58), this suggests that the futile cycling of an oxygen-sensitive nitroreductase is not the cause of toxicity. This implies that nitroreductase activity increases in the Δacg mutant of M. tuberculosis. Therefore, Acg acts as an inhibitor of another nitroreductase, presumably the single classical nitroreductase Rv3368, rather than being a nitroreductase itself. This then produces an alternative model for the function of the Acg family as sequestering flavin. Acg would then bind FMN under its lid, depleting the classical nitroreductase of flavin and thus reducing the susceptibility to classical nitroreductase-activated drugs. The lack of a physiological reducing agent would not then be an issue, as the Acg proteins are not designed to function as enzymes. However, flavins are normally tightly bound in enzymes (Kd of 10⁻⁷ to 10⁻¹⁰ M) (59), so the inhibition is more likely to come from competition among newly synthesized apoenzymes than from active extraction from holoenzymes. The inhibition of nitroreductase activity by Acg may well be a side effect rather than a cause, with another pathway being the primary target.
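The argument that tightly bound flavin is unlikely to be stripped from existing holoenzymes can be illustrated with a simple equilibrium occupancy estimate. The sketch below assumes one-site binding at equilibrium, so that the fraction of protein carrying FMN is [FMN]/([FMN] + Kd). The free FMN concentrations are hypothetical values chosen only for illustration; the Kd range is the one quoted above for flavoenzymes in general, not a measured value for Acg or Rv3368.

```python
def fraction_bound(free_fmn_molar, kd_molar):
    """Equilibrium fraction of a single-site protein occupied by FMN.

    Assumes simple 1:1 binding: theta = [FMN] / ([FMN] + Kd).
    """
    if free_fmn_molar < 0 or kd_molar <= 0:
        raise ValueError("concentrations must be non-negative and Kd positive")
    return free_fmn_molar / (free_fmn_molar + kd_molar)


# Kd limits quoted in the text for typical flavoenzymes (in M).
KD_WEAK = 1e-7
KD_TIGHT = 1e-10

# Hypothetical free FMN concentrations (in M), for illustration only.
for free_fmn in (1e-6, 1e-8, 1e-9, 1e-10):
    weak = fraction_bound(free_fmn, KD_WEAK)
    tight = fraction_bound(free_fmn, KD_TIGHT)
    print(f"free FMN {free_fmn:.0e} M: occupancy {weak:.3f} (Kd 1e-7 M), "
          f"{tight:.3f} (Kd 1e-10 M)")
```

In this simplified picture, an enzyme at the tight end of the range stays largely loaded until free FMN falls to around its Kd, which is consistent with the suggestion that sequestration would act mainly on the pool available to newly synthesized apoenzymes. The sketch ignores binding kinetics and the relative amounts of the competing proteins.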

Flavin biosynthesis is essential for M. tuberculosis, as it has no flavin uptake pathway, and transposon-mediated disruption of several of the genes is lethal. Riboflavin biosynthesis enzymes are conserved in Mycobacterium leprae, regarded as a minimal mycobacterial genome. However, regulation of flavin synthesis in M. tuberculosis is not understood, as there does not appear to be a riboswitch as found in Bacillus subtilis (60). More than 3% of M. tuberculosis genes encode flavoproteins, the highest proportion in a survey of 22 genomes (61), and the genome is particularly rich in FAD-dependent acyl-CoA dehydrogenases required for lipid degradation. Inhibition of these enzymes by lack of free FMN to be converted to FAD may aid the switch to lipid body formation seen in persister-like bacteria (62). It may be relevant that the gene primarily implicated in this switch, tgs1 (Rv3130), is close to two of the Acg family members (Rv3127 and Rv3131) in the M. tuberculosis genome and is also under control of the DosR regulon.
The Acg family of proteins has almost certainly evolved from the classical homodimeric nitroreductase by gene duplication, fusion, and reductive evolution to give proteins with a single FMN binding site protected by a lid domain. Lack of reduction by the normal physiological reducing agent NADPH raises the possibility that nitroreductase activity is no longer the function, although the protein will carry out these reactions when chemically reduced. It may be that the lid over the active site has evolved to store flavin or inhibit certain flavin-dependent met-