Crystal Structure of Full-length Mycobacterium tuberculosis H37Rv Glycogen Branching Enzyme

The open reading frame Rv1326c of Mycobacterium tuberculosis (Mtb) H37Rv encodes an α-1,4-glucan branching enzyme (MtbGlgB, EC 2.4.1.18, Uniprot entry Q10625). This enzyme belongs to glycoside hydrolase (GH) family 13 and catalyzes the branching of a linear glucose chain during glycogenesis by cleaving a 1→4 bond and making a new 1→6 bond. Here, we report the crystal structure of full-length MtbGlgB (MtbGlgBWT) at 2.33-Å resolution. MtbGlgBWT contains four domains: N1 β-sandwich, N2 β-sandwich, a central (β/α)8 domain that houses the catalytic site, and a C-terminal β-sandwich. We have assayed the amylase activity with amylose and starch as substrates and the glycogen branching activity using amylose as a substrate for MtbGlgBWT and the N1 domain-deleted (the first 108 residues deleted) MtbΔ108GlgB protein. The N1 β-sandwich, which is formed by the first 105 amino acids and superimposes well with the N2 β-sandwich, is shown to influence substrate binding in the amylase assay. We also show that several GH13 family inhibitors are ineffective against MtbGlgBWT and MtbΔ108GlgB. We propose a two-step reaction mechanism, for the amylase activity (1→4 bond breakage) and isomerization (1→6 bond formation), which occurs in the same catalytic pocket. The structural and functional properties of MtbGlgB and MtbΔ108GlgB are compared with those of the N-terminal 112-amino acid-deleted Escherichia coli GlgB (ECΔ112GlgB).

Tuberculosis is still a major killer among infectious diseases, at least in developing countries. Mycobacterium tuberculosis (Mtb), the causative bacterial agent of tuberculosis, survives for a long period of time intracellularly and causes latent tuberculosis. If the organism is physiologically inactive for a long period of time, its storage sugars become very important for its survival. Therefore, understanding the nature of the enzymes that are involved in the metabolism of these storage sugars is very important. Glycogen is one of the most important storage sugars in the living world and provides nutrition to the host. Furthermore, the cell envelope of Mtb has a very important role during host-pathogen interactions. The outermost layer of the cell envelope of Mtb consists of a loosely bound structure, known as the capsule (1-3). It has been demonstrated that the major components of the capsular material are carbohydrates and proteins with a very small amount of lipids (4-6). The major carbohydrate constituent (~80%) of the Mtb capsule is a high molecular mass (>100,000 Da) α-glucan, which is composed of a (→4-α-D-Glc-1→) core and branched at position 6, every 5 or 6 residues, by (→4-α-D-Glc-1→) oligoglucosides (4,7,8). α-Glucan mediates non-opsonic binding of Mtb to CR3 (complement receptor 3) (9) and is instrumental in blocking CD1 expression in Mtb (10). Stokes et al. (11) have shown that the capsular material of Mtb also displays antiphagocytic properties with certain types of macrophages.
Glycogen synthesis is an endergonic process. Glycogen, composed of branched polymer chains of glucose, is synthesized from monomers of UDP-glucose. The glycogen branching enzyme (GlgB), also known as amylo(α1→4-6)transglycosylase, catalyzes the transfer of a fragment of 6-7 glucose units from a non-reducing end to the hydroxyl group of C6 of a glucose unit, either on the same glucose chain or adjacent chains (Scheme 1). This enzyme belongs to carbohydrate binding module family 48 and glycoside hydrolase family 13 (GH13) (12). The details of a plausible reaction mechanism for GlgB are discussed later. Although GlgB is essential in several organisms, Sassetti et al. (13) and Sambou et al. (14) have shown, in particular, that a functional GlgB, encoded by the ORF Rv1326c, is essential for normal growth of Mtb.
GlgB is classified under the α-amylase family of enzymes, which includes α-amylase, isoamylase, pullulanase, and cyclodextrin glucanotransferase. The x-ray crystal structures of several α-amylase family enzymes show that they all have a common catalytic (β/α)8 barrel (also known as the TIM barrel) core, an N-terminal domain, and a C-terminal domain. Devillers et al. (15) have demonstrated that the N-terminal domain of Escherichia coli GlgB provides support for the glucan substrate during cleavage and transfer of α(1-4)glucan chains. The M. tuberculosis H37Rv strain harbors open reading frame Rv1326c, which has been annotated (16) to encode an α(1-4)glucan branching enzyme. Our previous study (17) has verified that the recombinant α(1-4)glucan branching enzyme of M. tuberculosis H37Rv (MtbGlgB) utilizes amylose as substrate.
Any defect in or deletion of the GlgB enzyme leads to glycogen storage disease; GlgB deficiency is a genetic disorder seen mainly in the American quarter horse. The lack of GlgB is directly linked to type IV glycogenosis (eponym: Andersen disease) (18), which is a rare but fatal genetic disorder in infants (19) and horses (20).
Sequence alignment and structural prediction results have shown that GlgB and isoamylase are structurally very close members of the GH13 family. These enzymes bind to sugars at the 1-6 position. The crystal structure of N-terminal 112-amino acid-deleted E. coli GlgB (ECΔ112GlgB) has been reported (21). Even though sequence alignment shows good homology between Mtb and E. coli GlgB proteins, there are marked insertions and deletions in both sequences (supplemental Fig. S1). Furthermore, the E. coli crystal structure lacks the N-terminal N1 domain (residues 1-105, MtbGlgB numbering). This domain, a β-sandwich, is observed in several GH13 members and is implicated in substrate specificity, recognition, and binding (22).
Against this background, several questions remain unanswered. What are the structural features of the full-length GlgB protein that contribute to substrate specificity and influence its amylase and branching enzymatic activities? What are the structural similarities and differences between the GlgB enzymes from E. coli and Mtb? What are the structural features of the GlgB enzyme that could potentially be exploited toward successful therapeutic applications, especially against tuberculosis? To address these questions, we have determined the crystal structure of the full-length protein (MtbGlgBWT) at 2.33-Å resolution. Our present study sheds more light on this metabolically important enzyme.

EXPERIMENTAL PROCEDURES
Cloning of the GlgB 108-Amino Acid N1 Domain Deletion Mutant-The cloning details of the full-length Mtb glgB gene have already been published (17). The gene encoding the N-terminal 108-amino acid deletion mutant of Mtb glgB was PCR-amplified using the following primers: forward primer, 5′-AAT TAA TTA GGA TCC ATG ACC CTG GGC GAG GTC GAC CTG-3′ and reverse primer, 5′-ATA TAT CTC GAG CTA GGC GGG CGT CAG CCA CAG C-3′ (the BamHI site, GGATCC, in the forward primer and the XhoI site, CTCGAG, in the reverse primer were underlined in the original). The product was cloned between the BamHI and XhoI sites of the pET29a vector (Novagen).
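The primer design described above can be checked with a few lines of code. The following is a minimal sketch, not part of the original protocol: it takes the two primer sequences exactly as printed, locates the BamHI and XhoI recognition sequences, and reads the codon that immediately follows each site. The function names (`site_position`, `codon_after_site`, `reverse_complement`) are hypothetical helpers introduced only for this illustration.

```python
# Recognition sequences (5'->3') for the two enzymes named in the text.
SITES = {"BamHI": "GGATCC", "XhoI": "CTCGAG"}

# Primer sequences as printed in the text, with the spaces removed.
FORWARD = "AATTAATTAGGATCCATGACCCTGGGCGAGGTCGACCTG"
REVERSE = "ATATATCTCGAGCTAGGCGGGCGTCAGCCACAGC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def site_position(primer, enzyme):
    """Return the 0-based index of the enzyme's site in the primer, or -1."""
    return primer.find(SITES[enzyme])


def codon_after_site(primer, enzyme):
    """Return the three bases that immediately follow the enzyme's site."""
    start = site_position(primer, enzyme) + len(SITES[enzyme])
    return primer[start:start + 3]


if __name__ == "__main__":
    print("BamHI site index in forward primer:", site_position(FORWARD, "BamHI"))
    print("Codon after BamHI site:", codon_after_site(FORWARD, "BamHI"))
    print("XhoI site index in reverse primer:", site_position(REVERSE, "XhoI"))
    # The reverse primer anneals to the opposite strand, so the codon is
    # read on the coding strand by taking its reverse complement.
    print("Coding-strand codon before XhoI site:",
          reverse_complement(codon_after_site(REVERSE, "XhoI")))
```

Under this reading, the forward primer places an ATG start codon directly after the BamHI site, and the reverse primer places the complement of a TAG stop codon directly after the XhoI site.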
Protein Expression and Purification-The full-length MtbGlgBWT protein was expressed and purified as described previously (17) but with some modifications. In brief, the protein was overexpressed in E. coli BL21(DE3). Protein solubility was enhanced by the coexpression of the GroEL/GroES chaperones (in the pKY206 vector) under a constitutive promoter. The pKY206-transformed cells with MtbglgB were grown in LB medium, containing tetracycline (12.5 μg/ml) and kanamycin (30 μg/ml), to an A600 of 0.6 at 37°C. Protein expression was induced by 0.3 mM isopropyl β-D-1-thiogalactopyranoside. After induction, the cells were grown initially at 30°C for 2 h to allow sufficient GroEL/GroES expression, and the culture was then shifted to 16°C overnight. The harvested cell pellet was resuspended in lysis buffer containing 50 mM Tris-HCl (pH 8.0), 300 mM NaCl, 25 mM imidazole, 5% glycerol, and 1 mM phenylmethylsulfonyl fluoride and subjected to French press. The crude lysate was centrifuged at 38,000 × g using the JA 25.15 rotor (Beckman) for 30 min at 4°C, and the supernatant was incubated with Co2+-containing Talon resin (Clontech) for affinity purification. The column was washed with wash buffer
SCHEME 1. Formation of a glycogen branch. GlgB catalyzes the transfer of a fragment of 6-7 glucose units from a non-reducing end to the hydroxyl group at C6 of another glucose unit, either in the same chain or an adjacent chain.
X-ray Crystallography-The MtbGlgBWT protein was concentrated to 12 mg/ml and set up for crystallization using the hanging drop, vapor-diffusion method at room temperature with various commercially available crystallization kits. Crystals first appeared after 4 weeks in JB Screen Classic 2 (Jena Bioscience), condition D1 (100 mM MES (pH 6.5), 30% (w/v) polyethylene glycol 4000). Further optimization of this condition (using a matrix of pH 6, 6.5, and 7 and precipitant concentrations of 25, 30, and 35%) produced crystals of approximate dimensions 0.3 × 0.1 × 0.05 mm in 2 weeks.
A crystal of suitable size was transferred to a cryoprotectant solution containing 100 mM MES (pH 6.5), 25% (v/v) glycerol, 33% (w/v) polyethylene glycol 4000. X-ray diffraction data were collected from a single crystal at 100 K at beamline LS-CAT 21-ID-G of the Advanced Photon Source (Chicago, IL) on an MAR Mosaic225 charge-coupled device detector. The diffraction images were processed using HKL2000 (23). The asymmetric unit contains one full MtbGlgBWT molecule (731 amino acids, Matthews coefficient of 2.56 Å³ Da⁻¹, solvent content 51.9%). The structure was determined by the molecular replacement method using the Molrep program (24) with the ECΔ112GlgB structure (PDB ID: 1M7X) (21) as the input search model. Model building was carried out with the help of ARP/wARP (25) and O (26). A model for the N1 domain was predicted using the Geno3D server (27) and was subsequently adjusted in electron density. The structure was refined with the CNS program suite (28), and the geometry of the molecule was checked with PROCHECK (29). The correctness of the model, including the N1 domain structure, was verified with the simulated annealing omit maps option of the CNS program. Electron density for the first nine N-terminal amino acids was not observed in the map. The electron density along residues 367-376 and 422-433 is very poorly defined, and some of these residues occupy the disallowed region of the Ramachandran plot. All drawings were prepared with the MOLSCRIPT (30) and RASTER3D (31) programs. Table 1 shows the data and refinement statistics.
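The relationship between the reported Matthews coefficient and the solvent content can be illustrated with a short calculation. The sketch below assumes the conventional approximation, solvent fraction = 1 − 1.23/V_M, which presumes a typical protein partial specific volume of about 0.74 cm³/g; the text does not state which constant was used, so the constant is an assumption, and `solvent_fraction` is a hypothetical helper name.

```python
def solvent_fraction(matthews_coefficient, protein_constant=1.23):
    """Estimate the crystal solvent fraction from the Matthews coefficient.

    matthews_coefficient: V_M in cubic angstroms per dalton.
    protein_constant: 1.23 is the conventional value (an assumption here),
    corresponding to a protein partial specific volume of ~0.74 cm^3/g.
    """
    return 1.0 - protein_constant / matthews_coefficient


if __name__ == "__main__":
    v_m = 2.56  # value reported in the text, in A^3/Da
    print(f"Estimated solvent content: {100 * solvent_fraction(v_m):.2f}%")
```

With V_M = 2.56 Å³ Da⁻¹ this approximation gives a value within a tenth of a percentage point of the 51.9% quoted above.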
Enzyme Activity and Kinetics-The amylase activity of the purified MtbGlgBWT and MtbΔ108GlgB proteins on amylose and starch (as substrates) was assayed as described earlier (17). In brief, a 100-μl reaction mixture was prepared containing the required amount of substrate in 50 mM citrate buffer (pH 7.0) and 1.5 μM purified MtbGlgB protein. After 30 min of incubation at 25°C, the reaction was stopped using KI/I2 solution, and the change in optical density was recorded at 660 nm.
We have also assayed the branching activity of MtbGlgBWT and MtbΔ108GlgB by measuring the amount of glycogen formed. The principle of the assay is as follows: the glycogen formed is converted to glucose (by glucoamylase), which is then specifically oxidized to produce a product that reacts with the OxiRed probe to generate a specific color (570 nm). The glycogen assay kit was purchased from BioVision. To make sure that both proteins were active, a parallel iodine assay was also performed. 2 μg of amylose was treated with 1.5 μM MtbGlgBWT or MtbΔ108GlgB for 30 min at 30°C, in duplicate. A parallel reaction was subjected to the iodine test as a control to confirm the utilization of amylose by the proteins. After 30 min, 2 μl of the glucoamylase mixture provided in the kit was added, and the sample was incubated again for 30 min at 25°C. After that, 50 μl of the developing reagent was added to the reaction mixture, and the mixture was incubated at 25°C for another 30 min before measuring the optical density at 570 nm. To estimate the amount of glycogen formed during this reaction, a glycogen standard curve was prepared by using the glycogen standard provided in the kit. Suitable controls were used throughout. The experiment was repeated with a fresh batch of proteins, and the average value of two different sets of experimental results is provided.
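Reading a glycogen amount off a standard curve, as described above, amounts to fitting a line through the standards and inverting it for the sample. The following is a minimal sketch under stated assumptions: a linear response is assumed, and every number in it (the standard amounts, their absorbances, and the sample reading) is a hypothetical placeholder, not data from this study or from the kit.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_absorbance(a570, slope, intercept):
    """Invert the standard curve: absorbance at 570 nm -> glycogen amount."""
    return (a570 - intercept) / slope


# Hypothetical standards (micrograms of glycogen) and their A570 readings.
STANDARD_UG = [0.0, 0.4, 0.8, 1.2, 1.6, 2.0]
STANDARD_A570 = [0.05, 0.17, 0.29, 0.41, 0.53, 0.65]

if __name__ == "__main__":
    slope, intercept = fit_line(STANDARD_UG, STANDARD_A570)
    sample_a570 = 0.35  # hypothetical, background-corrected sample reading
    amount = amount_from_absorbance(sample_a570, slope, intercept)
    print(f"slope={slope:.3f}, intercept={intercept:.3f}, amount={amount:.2f} ug")
```

In practice the sample reading should fall within the range spanned by the standards; extrapolating beyond the highest standard is not supported by the fit.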
The effects of different known inhibitors were also studied. Stock solutions of ADP, ADP-glucose, castanospermine, nojirimycin, tunicamycin, and acarbose (some of the known inhibitors of the GH13 family of enzymes) were prepared in 50 mM citrate buffer (pH 7.0). To study the effect of these inhibitors, a 100-μl reaction was set up, which contained 0.2 mg/ml amylose in 50 mM citrate buffer (pH 7.0) and the required amount of the inhibitor. The reaction was started by adding 0.3 μM purified MtbGlgB proteins. After 30 min of incubation, the reaction was stopped using KI/I2 solution, and the change in

RESULTS
Structural Overview-The crystal structure of the full-length GlgB of M. tuberculosis H37Rv (MtbGlgBWT) has been determined at 2.33-Å resolution. As shown in Fig. 1A, the GlgB molecule consists of four domains: N-terminal β-sandwich domain N1 (residues 1-105, colored brown), N-terminal β-sandwich N2 (residues 106-226, magenta), the central (β/α)8 domain (residues 227-631, green), and the C-terminal β-sandwich domain (residues 632-731, colored red). The (β/α)8 domain, as the main body of the structure, is sandwiched between the N2 and C-terminal β-sandwiches. It consists of the well characterized (β/α)8
Abad et al. (21) have given a clear description of the active site of ECΔ112GlgB. In short, the catalytic active site (Fig. 1B) superimposes well with that of other GH13 family members. It contains the seven residues (Asp-341, His-346, Arg-409, Asp-411, Glu-464, His-531, and Asp-532, MtbGlgB numbering) that are highly conserved in all members of α-amylase family 13. The overall surface around the active site pocket in ECΔ112GlgB, similar to that of isomaltulose synthase (also known as PalI, Uniprot entry: Q8KR84, another GH13 family member) (33), is highly negatively charged. This highly negative character of the (β/α)8 barrel domain is important for sugar-protein interactions.
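The domain boundaries and conserved residues listed above can be captured in a small lookup, which makes it easy to confirm that all seven conserved residues fall within the central (β/α)8 domain. This is an illustrative sketch only; the C-terminal domain is taken here to start at residue 632, following the Fig. 1 legend, so that the ranges do not overlap, and `domain_of` is a hypothetical helper name.

```python
# Domain boundaries in MtbGlgB numbering (inclusive ranges from the text;
# the C-terminal start of 632 follows the Fig. 1 legend).
DOMAINS = [
    ("N1 beta-sandwich", 1, 105),
    ("N2 beta-sandwich", 106, 226),
    ("central (beta/alpha)8 domain", 227, 631),
    ("C-terminal beta-sandwich", 632, 731),
]

# The seven conserved GH13 residues named in the text.
CONSERVED_RESIDUES = [
    ("Asp", 341), ("His", 346), ("Arg", 409), ("Asp", 411),
    ("Glu", 464), ("His", 531), ("Asp", 532),
]


def domain_of(residue_number):
    """Return the name of the domain that contains the given residue."""
    for name, first, last in DOMAINS:
        if first <= residue_number <= last:
            return name
    raise ValueError(f"residue {residue_number} is outside 1-731")


if __name__ == "__main__":
    for amino_acid, number in CONSERVED_RESIDUES:
        print(f"{amino_acid}-{number}: {domain_of(number)}")
```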
The N-terminal β-sandwich N1 domain (Fig. 1C) is very similar to the N2 domain. These two domains superimpose with an r.m.s.d. of 1.5 Å for 95 Cα atoms. The topology of this domain is a typical immunoglobulin fold. The two β sheets of the sandwich are formed by strands β1, β2, β5, and β4 and strands β3, β6, and β7, respectively. The top two hits in a DALI search, using the MtbGlgB N1 and N2 domains, are pullulanase (PDB: 2FH8, Z = 11.6 and r.m.s.d. = 2.7 for 217 Cα) and the pullulanase type I protein (2E8Z/11.0/3.7 for 205 residues).
Even though the ECGlgB (residues 117-728) and MtbGlgB (residues 106-731) structures superimpose in their corresponding domains with an overall r.m.s.d. of 1.12 Å for 553 Cα atoms (Fig. 1D), there are some local misalignments, especially in the coil region between residues Thr-230 and Pro-250. Also, because attempts to crystallize the full-length protein were unsuccessful, the E. coli GlgB structure represents only the N1 domain-truncated form. Based on the significant level of sequence homology between the N1 domains (29% identity and 49% similarity) of the two proteins, we expect that the N1 domain of E. coli GlgB also has the same β-sandwich topology.
Enzyme Activity/Kinetics-The GlgB enzyme catalyzes two reactions, concurrently, in the same active site. A long glucose chain is cleaved at the 1→4 glycosidic bond (amylase activity), and a branching glucose chain is formed by making a new 1→6 glycosidic bond (glycogen formation branching reaction). We measured the amylase as well as glycogen-forming efficiencies of MtbGlgBWT and its N1 domain deletion mutant (MtbΔ108GlgB). In the amylase activity study, when amylose was used as the substrate, the specific activity of MtbGlgBWT was 63.75 units/mg, whereas that for MtbΔ108GlgB was 42 units/mg of protein. Similar to the specific activity, the rate of reaction for MtbGlgBWT protein was also faster: Vmax = 1.35 mg ml⁻¹ min⁻¹, where Vmax is expressed in milligrams of product formed by 1.5 μM enzyme; Km = 0.56 mg/ml; kcat = 9 min⁻¹. Also, MtbGlgB has lower amylase activity when compared with that reported for E. coli GlgB. The corresponding values for MtbΔ108GlgB were Vmax = 0.62 mg ml⁻¹ min⁻¹, Km = 0.33 mg/ml, and kcat = 4.13 min⁻¹ (Fig. 2, A and B). However, the enzymatic assay using ECGlgB (21) reports equal Km values for the WT and Δ112 proteins. Even though the amylase activity of MtbΔ108GlgB was lower when amylose was used as substrate, both proteins showed almost equal amylase activity when soluble starch was used as substrate.
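The kinetic constants above can be compared directly with a short calculation. The sketch below assumes simple Michaelis-Menten behavior, v = Vmax·[S]/(Km + [S]), which the text implies by reporting Vmax and Km but does not state explicitly. It also computes the ratio Vmax/kcat for each protein; interpreting that ratio as the enzyme mass concentration used to derive kcat is an assumption of this illustration. The dictionary keys and function names are hypothetical.

```python
# Values reported in the text (amylose as substrate).
# Units: vmax in mg ml^-1 min^-1, km in mg/ml, kcat in min^-1,
# specific_activity in units/mg.
KINETICS = {
    "WT": {"vmax": 1.35, "km": 0.56, "kcat": 9.0, "specific_activity": 63.75},
    "D108": {"vmax": 0.62, "km": 0.33, "kcat": 4.13, "specific_activity": 42.0},
}


def mm_rate(substrate, vmax, km):
    """Michaelis-Menten rate at a given substrate concentration (mg/ml)."""
    return vmax * substrate / (km + substrate)


def percent_reduction(wild_type, mutant):
    """Percentage by which the mutant value is lower than the wild type."""
    return 100.0 * (wild_type - mutant) / wild_type


def implied_enzyme_concentration(vmax, kcat):
    """Vmax / kcat; read here (as an assumption) as enzyme in mg/ml."""
    return vmax / kcat


if __name__ == "__main__":
    wt, d108 = KINETICS["WT"], KINETICS["D108"]
    for name, k in KINETICS.items():
        rate = mm_rate(0.2, k["vmax"], k["km"])  # 0.2 mg/ml amylose
        conc = implied_enzyme_concentration(k["vmax"], k["kcat"])
        print(f"{name}: v(0.2 mg/ml)={rate:.3f}, Vmax/kcat={conc:.3f}")
    print(f"Vmax reduction: {percent_reduction(wt['vmax'], d108['vmax']):.0f}%")
    print("Specific activity reduction: "
          f"{percent_reduction(wt['specific_activity'], d108['specific_activity']):.0f}%")
```

On these numbers, Vmax and kcat of the deletion mutant are both roughly half of the wild-type values, whereas the specific activity is lower by about a third, and the ratio Vmax/kcat is the same for both proteins.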
Next we tested the branching activity of MtbGlgB using a glycogen assay kit. The amount of glycogen formed by MtbGlgBWT was estimated from a glycogen standard curve to be 1.46 μg, whereas the value for MtbΔ108GlgB was 1.44 μg. 2 μg of amylose and 1.5 μM protein were used in the reaction.
BAY e4609 has been shown to be an effective inhibitor of the E. coli GlgB enzyme (21,34,35). Nonetheless, we wanted to test the effect of some of the known inhibitors of the GH13 family proteins on MtbGlgB. Inhibitors such as ADP, ADP-glucose, tunicamycin, castanospermine, nojirimycin, or acarbose had no effect on the enzymatic activity of either MtbGlgBWT or MtbΔ108GlgB when amylose was used as substrate (supplemental Fig. S3). This clearly suggests the need for more inhibitor screening experiments to identify additional inhibitors, apart from BAY e4609.

DISCUSSION
Influence of the N1 Domain-Recently, Palomo et al. (36) have studied the influence of the N-terminal domains of the Deinococcus glycogen branching enzymes on the unique glycogen-branching patterns. The present study shows that the N1 β-sandwich has a differential preference in substrate recognition and binding during amylase activity, when amylose is used as a substrate. The N1-truncated MtbΔ108GlgB protein is ~50% less active. However, to our surprise, there was no difference in substrate binding or amylase activity when starch was used as a substrate or in the amount of glycogen formed during branching activity. Thus it is difficult to draw the same conclusion as reported in the ECΔ112GlgB study about substrate preference. The existence of an additional N-terminal β-barrel domain in an α-amylase has been reported for AmyB from Halothermothrix orenii (37). However, there are marked structural differences between the N-terminal domains of AmyB and MtbGlgB.
FIGURE 1 (legend fragment). 632-731, red). B, electron density around the catalytic pocket formed by the residues that are conserved in the GH13 family (MtbGlgB numbering). The 2Fo − Fc map is contoured at the 1.5 σ level. C: i, the N-terminal N1 domain has the immunoglobulin sandwich fold. ii, the topological arrangement of the seven β-strands. The two sheets of the sandwich are made by strands β1, β2, β5, and β4 and strands β3, β6, and β7, respectively. Each of the flanking helices comes from the N1 and N2 domains, respectively. iii, superimposition of the N1 (brown) and N2 (magenta) domains of MtbGlgB. The r.m.s.d. is 1.5 Å for 95 Cα atoms. D, superimposition of MtbGlgB (residues 106-731 and domain colors as in A) and ECΔ112GlgB (113-528, blue). The r.m.s.d. is 1.12 Å for 553 Cα atoms. Note that the region between residues 230 and 250 (MtbGlgB numbering) is markedly misaligned.
Catalytic Mechanism-The substrate recognition scheme and binding sites for several amylase family enzymes, like PalI (33), AmyA from Halothermothrix orenii (38), α-amylase (39), CGTase (40), TAKA-amylase with substrate analogs (41), amylosucrase with D-glucose and mutated amylosucrase with sucrose (42,43), have been identified. A substrate binding model has been proposed for ECΔ112GlgB (21). GlgB enzymes from different organisms transfer chains of different lengths (the number of glucose units). We propose a two-step reaction mechanism, for the amylase reaction (breaking a 1→4 glycosidic bond) and isomerization (the branching reaction, by making a new 1→6 glycosidic bond), which happens in the same catalytic pocket (Scheme 2). The potential catalytic triad (Asp-411, Glu-464, and Asp-532) and two histidine residues (His-346 and His-531) of GlgB are highly conserved in almost all α-amylase and glycosyltransferase enzymes. These residues form a catalytic pocket (Fig. 1B) that binds amylose and breaks the 1→4 glycosidic bond. The similarity of the active site architecture strongly suggests that the amylase reaction occurs via general acid catalysis, as in all glycoside hydrolases (43). Glu-464 acts as the general acid catalyst to protonate the oxygen of the 1→4 glycosidic linkage for the amylase reaction. Asp-411, the attacking nucleophile, forms a bond with C1 to form a β-glucosyl-enzyme intermediate, which is presented to the OH group at C6 of another glucose unit, either on the same or another chain, to form a 1→6 glycosidic bond.
Truncation Mutants-Three MtbGlgB N-terminal deletion mutants (at residues 112, 121, and 171) were overexpressed, purified, and tested for enzymatic activity. As expected (data not shown), these proteins were less active in proportion to the number of amino acids truncated. Interestingly, the deletion also affects the amount of protein expression.
In PalI (33), we had experimentally shown that selected truncations at the C-terminal domain reduced activity (again, in proportion to the length deleted). However, our attempts to prepare three C-terminal truncation mutants of MtbGlgB did not yield any expressed protein. Based on our PalI results, we predict that in MtbGlgB also the central catalytic domain might undergo conformational changes, and the active site gradually becomes inactive when the N- and C-terminal domains are truncated and the presumable structural pressure is relieved.
Clinical Significance-The evolution of drug-resistant strains of Mtb has created a significant concern. Mutational studies using the H37RvΔglgB/pVVglgBtb strain have shown the inability of the bacterium to survive, and this promising result should be further exploited to make use of GlgB as an attractive therapeutic target against Mtb. Also, comparison of the sequences of GlgB from several pathogenic and non-pathogenic bacterial genera clearly presents a case of domain closeness among them (supplemental Fig. S1). This sequence similarity indicates a good possibility of potential drugs, based on the MtbGlgB structure, for treatment against a wide spectrum of bacterial infections. A recent screening study for inhibitors against human pancreatic α-amylase (34) has shown that the two potential lead compounds are very ineffective against other closely related GH13 members. Binderup et al. (35) have used BAY e4609, a pseudo-oligosaccharide, in a branching enzyme inhibition study. However, many known GH13 family inhibitors have no effect on either MtbGlgBWT or MtbΔ108GlgB. This warrants a more thorough drug screening against this protein, which is very likely a potential and important drug target in the fight against Mtb.