Crystal structure of the "cab"-type beta class carbonic anhydrase from the archaeon Methanobacterium thermoautotrophicum.

The structure of the "cab"-type beta class carbonic anhydrase from the archaeon Methanobacterium thermoautotrophicum (Cab) has been determined to 2.1-Å resolution using the multiwavelength anomalous diffraction phasing technique. Cab exists as a dimer with a subunit fold similar to that observed in "plant"-type beta class carbonic anhydrases. The active site zinc is coordinated by protein ligands Cys(32), His(87), and Cys(90), with the tetrahedral coordination completed by a water molecule. The major difference between plant- and cab-type beta class carbonic anhydrases is in the organization of the hydrophobic pocket. The structure reveals a Hepes buffer molecule bound 8 Å away from the active site zinc, which suggests a possible proton transfer pathway from the active site to the solvent.

γ-CA has thus far been isolated and characterized only from the methanoarchaeon Methanosarcina thermophila (14-16), where it is proposed to facilitate the transport of CH₃COO⁻ and to convert CO₂ to HCO₃⁻ outside the cell to assist the removal of excess CO₂ generated during the growth of this organism on acetate. The M. thermophila γ-CA exists as a trimer, with the active site located at the interface between two subunits. Each subunit is organized around a left-handed β-helix that is completely distinct from the α-CA fold, although the active site is also coordinated by three histidines, along with two water molecules (17,18).
The β class includes CAs from plants, algae, bacteria, and archaea (2,3). In higher plants, β-CAs play an important role in photosynthesis by concentrating CO₂ in the proximity of ribulose bisphosphate carboxylase/oxygenase for CO₂ fixation (19). The purification and characterization of carbonic anhydrase (Cab) from the thermophilic Methanobacterium thermoautotrophicum extends this class into the archaea (20). Cab is at the phylogenetic extreme of the β class carbonic anhydrases and forms an exclusively prokaryotic clade consisting primarily of sequences from Gram-positive bacteria (2). In the obligate chemolithoautotroph M. thermoautotrophicum, Cab converts CO₂ to HCO₃⁻, suggesting that the physiological role of this enzyme may be to provide HCO₃⁻ to enzymes important in CO₂ fixation pathways of the microbe (21,22).
β class carbonic anhydrases can be further divided into "plant"- and "cab"-type, based on the active site sequence conservation (20,23) (Fig. 1). Two crystal structures of plant-type β-CA were recently reported from Porphyridium purpureum (P. purpureum β-CA) (24) and Pisum sativum (P. sativum β-CA) (23). The basic fold of β-CA consists of a four-stranded, parallel β-sheet core with α-helices forming right-handed crossover connections (23,24). The oligomerization state of β-CA is variable, however: P. purpureum β-CA and P. sativum β-CA exist as a dimer and an octamer, respectively, although the dimer of P. purpureum β-CA resembles a tetramer in which pairs of monomers are fused together. In contrast to the protein ligation by three histidines observed in α- and γ-CAs, the active site zinc in β-CAs is coordinated by two cysteines and one histidine, as anticipated from extended X-ray absorption fine structure spectroscopy studies (21,25,26). The fourth ligand is different in the two β-CA structures. In the P. sativum β-CA structure, an acetate molecule is bound to the zinc, whereas in the P. purpureum β-CA structure, the side chain of aspartic acid (Asp151) acts as the fourth ligand. In the P. sativum β-CA structure, this conserved Asp interacts with a conserved Arg (Fig. 1).
* This work was supported by National Institutes of Health Grants GM44661 (to J. G. F.) and GM45162 (to D. C. R.); a National Science Foundation predoctoral fellowship (to P. S.); and NASA-AMES Cooperative Agreement NCC2-1057 (to the Pennsylvania State University Astrobiology Research Center). This work is based upon research conducted at the Stanford Synchrotron Radiation Laboratory, which is funded by the Department of Energy (Office of Basic Energy Sciences and Office of Biological and Environmental Research) and the National Institutes of Health (National Center for Research Resources, NIGMS). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Apart from the conserved zinc ligands and the Asp/Arg pair, the active site of Cab differs significantly from the plant-type β-CAs. Cab active site residues His23, Met33, Lys53, Ala58, and Val72 are replaced in the plant-type β-CA by Gln, Ala, Phe, Val, and Tyr, respectively. Residues that are equivalent to Cab residues Met33, Lys53, Ala58, and Val72 make up the hydrophobic pocket in the P. sativum and P. purpureum β-CA structures. Substitutions of the two aromatic side chains by Lys and Val in Cab suggest a significant redesign of the hydrophobic pocket in cab-type β-carbonic anhydrase. Cab has a CO₂ hydration activity with a kcat of 1.7 × 10⁴ s⁻¹ and Km for CO₂ of 2.9 mM at pH 8.5 and 25°C (20). Cab is inhibited by iodide, nitrate, and azide; however, in contrast to plant-type β-CAs, chloride and sulfate have no effect on Cab activity. These active site substitutions, together with the different effects of inhibitors, imply that there might be mechanistically relevant differences in the organization of the active sites between cab-type and plant-type enzymes. Here we present the first structure of the cab-type β-carbonic anhydrase from the thermophilic methanoarchaeon M. thermoautotrophicum, determined at 2.1-Å resolution.
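For orientation, the kinetic constants quoted above (kcat and Km for CO₂ at pH 8.5 and 25°C) can be combined into a catalytic efficiency. The following is a minimal sketch that assumes simple Michaelis-Menten behavior; the function names are illustrative and do not come from the original work.

```python
# Kinetic constants quoted in the text (pH 8.5, 25 °C).
K_CAT = 1.7e4   # turnover number, s^-1
K_M = 2.9e-3    # Michaelis constant for CO2, M (2.9 mM)


def catalytic_efficiency(k_cat, k_m):
    """Return kcat/Km in M^-1 s^-1."""
    return k_cat / k_m


def turnover_rate(substrate_molar, k_cat=K_CAT, k_m=K_M):
    """Michaelis-Menten rate per active site (s^-1) at a given CO2 concentration (M).

    Assumes simple Michaelis-Menten kinetics, which is an illustrative
    simplification rather than a statement from the paper.
    """
    return k_cat * substrate_molar / (k_m + substrate_molar)


efficiency = catalytic_efficiency(K_CAT, K_M)
print(f"kcat/Km = {efficiency:.2e} M^-1 s^-1")
print(f"rate at [CO2] = Km: {turnover_rate(K_M):.0f} s^-1")
```

At a substrate concentration equal to Km the rate is half of kcat by construction, which is a convenient sanity check on the function.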

EXPERIMENTAL PROCEDURES
Crystallization-Cab was overexpressed in Escherichia coli and purified by a heat denaturation step followed by ion exchange chromatography as previously described (20). Crystals were grown by the hanging drop method at 22°C using a 5 mg/ml protein solution and a precipitant solution containing 100 mM Hepes, pH 7.5, 35% ethanol, 12% 2-methyl-2,4-pentanediol, and 50 mM calcium acetate. The crystals belong to the orthorhombic space group P2₁2₁2₁ with unit cell dimensions a = 54.9 Å, b = 113.2 Å, c = 156.2 Å and three dimers per asymmetric unit. Crystals were transferred stepwise to mother liquor solution containing 30% 2-methyl-2,4-pentanediol as a cryoprotectant and flash frozen.
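The unit cell parameters above allow a rough estimate of crystal packing density. The sketch below computes the orthorhombic cell volume and a Matthews coefficient for six monomers per asymmetric unit. The monomer mass is not given in this section, so the value used here is a placeholder assumption, and the solvent estimate uses the standard approximation 1 - 1.23/V_M.

```python
# Unit cell dimensions from the text (orthorhombic, all angles 90 degrees).
A, B, C = 54.9, 113.2, 156.2           # Å
ASU_PER_CELL = 4                       # general positions in P2(1)2(1)2(1)
MONOMERS_PER_ASU = 6                   # three dimers, from the text

# ASSUMPTION: placeholder monomer mass in Da; not stated in this section.
ASSUMED_MONOMER_MASS_DA = 19000.0

cell_volume = A * B * C                            # Å^3
volume_per_asu = cell_volume / ASU_PER_CELL        # Å^3
matthews = volume_per_asu / (MONOMERS_PER_ASU * ASSUMED_MONOMER_MASS_DA)  # Å^3/Da
solvent_fraction = 1.0 - 1.23 / matthews           # standard approximation

print(f"cell volume          = {cell_volume:.0f} Å^3")
print(f"Matthews coefficient = {matthews:.2f} Å^3/Da (assumed mass)")
print(f"solvent fraction     = {solvent_fraction:.2f} (assumed mass)")
```

Changing `ASSUMED_MONOMER_MASS_DA` to the true subunit mass rescales the last two numbers; the cell volume depends only on the reported cell edges.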
Data Collection and Processing-A three-wavelength multiwavelength anomalous dispersion data set was collected at −160°C on beamline 9-2 of the Stanford Synchrotron Radiation Laboratory with an ADSC charge-coupled device detector. The fluorescence spectrum measured around the zinc edge of a single crystal was used to select the inflection point (λ = 1.2832 Å), the absorption edge (λ = 1.282 Å), and a high energy remote wavelength (λ = 1.033 Å) for optimization of the anomalous signal. All data were reduced using DENZO and scaled using SCALEPACK (27) (Table I).
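The three wavelengths can be converted to photon energies to see how closely they bracket the zinc K absorption edge, which lies near 9.66 keV. This is a small illustrative conversion using E = hc/λ; the dictionary keys are labels chosen here, not names from the original work.

```python
# hc expressed in eV·Å, so that E [eV] = HC_EV_ANGSTROM / wavelength [Å].
HC_EV_ANGSTROM = 12398.4198

# Wavelengths quoted in the text, in Å.
wavelengths = {
    "inflection": 1.2832,
    "edge": 1.282,
    "remote": 1.033,
}


def photon_energy_ev(wavelength_angstrom):
    """Convert an X-ray wavelength in Å to photon energy in eV."""
    return HC_EV_ANGSTROM / wavelength_angstrom


energies = {name: photon_energy_ev(lam) for name, lam in wavelengths.items()}
for name, energy in energies.items():
    print(f"{name:>10}: {wavelengths[name]:.4f} Å = {energy:.0f} eV")
```

The inflection and edge wavelengths differ by only about 9 eV, whereas the remote wavelength sits more than 2 keV above the edge.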
Phase Determination-The structure was determined by multiwavelength anomalous dispersion using the signal from only the intrinsic zinc atoms. The program SOLVE (28) was used to find the positions of the heavy atoms from the three-wavelength multiwavelength anomalous dispersion data set, using data from 20- to 2.4-Å resolution. Two zinc ions were identified and, when used for phasing, yielded a figure of merit of 0.44. Four additional zinc sites were located in an anomalous difference Fourier map, yielding a figure of merit of 0.54. Each of the six zinc sites corresponded to one Cab monomer, and the six monomers are organized into three tight dimers in the asymmetric unit. The initial noncrystallographic symmetry (NCS) transformations were established by the relationships between the dimers, and the initial mask was calculated with the program NCSMSK (29). The program DM (30) was used for NCS averaging of the electron density maps, solvent flattening, and phase extension from 2.4- to 2.1-Å resolution. The resulting map was of good quality and allowed building of most of the protein.
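The figure of merit quoted above is, for a single reflection, the length of the probability-weighted mean phase vector; the reported value is the average over reflections. The sketch below illustrates that definition on a discretized phase probability distribution. It is not the SOLVE implementation, and the sample distributions are hypothetical.

```python
import cmath
import math


def figure_of_merit(phase_angles_deg, probabilities):
    """Length of the probability-weighted centroid of unit phase vectors.

    Returns a value near 1 for a sharply determined phase and near 0 for a
    flat (uninformative) phase probability distribution.
    """
    total = sum(probabilities)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    centroid = sum(
        p * cmath.exp(1j * math.radians(angle))
        for angle, p in zip(phase_angles_deg, probabilities)
    ) / total
    return abs(centroid)


# Hypothetical phase distributions sampled every 10 degrees.
angles = list(range(0, 360, 10))
flat = [1.0] * len(angles)                             # no phase information
sharp = [1.0 if a == 90 else 0.0 for a in angles]      # phase known exactly

print(f"flat distribution:  m = {figure_of_merit(angles, flat):.3f}")
print(f"sharp distribution: m = {figure_of_merit(angles, sharp):.3f}")
```

On this scale, the increase from 0.44 to 0.54 after adding the four additional zinc sites corresponds to phase distributions that are, on average, noticeably sharper.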
Refinement-Alternate cycles of manual model building using the program O (31) and positional and individual B-factor refinement with the program CNS (32) reduced R and Rfree to 21.1 and 24.6%, respectively, where Rfree is calculated for 5% (2875) of the reflections in the resolution range 18-2.1 Å. The model was initially refined with strict NCS restraints, which were released later in the refinement. The r.m.s. deviations of bond lengths and angles are 0.013 Å and 1.7°, respectively, with 87.9% of residues in the most allowed region and 11.1% in the additionally allowed region of the Ramachandran plot. The average B-factors are 28.6 Å² (main chain), 32.4 Å² (side chains), 18.6 Å² (zinc atoms), and 36.3 Å² (solvent). An average B-factor of 30.5 Å² is calculated for all protein atoms. The final model contains 7543 protein atoms, 3 Hepes molecules, 6 zinc atoms, 6 calcium atoms, and 409 water molecules, for a total of 8015 atoms (Table II).
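The R and Rfree values above follow the conventional definition R = Σ||Fobs| − |Fcalc|| / Σ|Fobs|, evaluated on the working reflections and on the excluded test set, respectively. The sketch below illustrates the calculation; the amplitudes and test-set flags are invented for illustration, and this is not the CNS implementation.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R: sum(||Fo| - |Fc||) / sum(|Fo|)."""
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def work_and_free_r(f_obs, f_calc, is_free):
    """Return (R_work, R_free) given per-reflection test-set flags."""
    work = [(o, c) for o, c, free in zip(f_obs, f_calc, is_free) if not free]
    free = [(o, c) for o, c, free in zip(f_obs, f_calc, is_free) if free]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in free], [c for _, c in free])
    return r_work, r_free


# Hypothetical structure factor amplitudes; the last reflection is in the test set.
f_obs = [100.0, 200.0, 150.0, 80.0]
f_calc = [90.0, 210.0, 160.0, 70.0]
is_free = [False, False, False, True]

r_work, r_free = work_and_free_r(f_obs, f_calc, is_free)
print(f"R_work = {r_work:.3f}, R_free = {r_free:.3f}")

# If 2875 reflections are about 5% of the data, the full set holds roughly:
approx_total = round(2875 / 0.05)
print(f"approximate total reflections: {approx_total}")
```

Because the test-set reflections are never used in refinement, Rfree is expected to be somewhat higher than R, as it is here (24.6 versus 21.1%).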

RESULTS AND DISCUSSION
Structural Organization of Cab-The overall fold of the Cab monomer (Fig. 2A) consists of a four-stranded parallel β-sheet core with strand order 2-1-3-4. Monomers in each dimer are related by a 2-fold axis centered between strands β2. The overall dimensions of the dimer are ~40 × 45 × 50 Å (Fig. 2B). Three dimers, designated AB, CD, and EF, are present in the asymmetric unit. In the structurally conserved β-sheet region (residues 26-32, 52-57, 80-88, 149-157, and 163-167), the r.m.s. deviations in Cα positions between the dimers average 0.24 Å. Using Cα positions for residues 24-170, the corresponding r.m.s. deviations are 0.34 Å between dimers AB and CD, 1.0 Å between dimers AB and EF, and 1.1 Å between dimers CD and EF. The relatively large r.m.s. deviations for the latter two pairs of dimers reflect the poor ordering of residues 91-126 in monomer E. Excluding these residues, the r.m.s. deviations drop to ~0.28 Å. The r.m.s. deviation in the β-sheet region between the A and B, C and D, and E and F subunits of Cab is ~0.6 Å, while the r.m.s. deviation for all Cα atoms between the subunits is ~0.8 Å.
The regions with the largest conformational variation include the N-terminal residues 1-23 (involved in crystal packing), residues 92-95, and residues 120-125. While the N-terminal residues 1-23 are well defined in monomers A, C, and E, residues 13-23 are disordered in monomers B, D, and F. Residues 92-95 and 120-125 form a hinge region for helices α4 and α5 and are well ordered in monomers B and D, slightly disordered in monomers A, C, and F, and disordered in monomer E. The variability in region 90-125 between the six crystallographically independent monomers suggests that these residues are conformationally mobile. Overall, with the exception of the noted regions, the three dimers in the asymmetric unit are very similar. Unless stated otherwise, only dimer AB is used in the following discussion.
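The r.m.s. deviations quoted above compare equivalent Cα positions after the two structures have been superposed. As a minimal sketch of the final step only, the function below computes the r.m.s. deviation between two coordinate lists that are assumed to be already superposed and listed in the same residue order; the coordinates are hypothetical, and the superposition itself is not shown.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (Å) between paired 3D coordinates.

    Assumes the two sets are already optimally superposed and that
    coords_a[i] and coords_b[i] refer to equivalent atoms.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical Cα coordinates (Å); the second set is displaced by 0.3 Å along x.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
displaced = [(x + 0.3, y, z) for x, y, z in reference]

print(f"r.m.s. deviation = {rmsd(reference, displaced):.2f} Å")
```

Restricting the input lists to a residue subset, such as the conserved β-sheet core, gives the region-specific values of the kind reported in the text.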
Six calcium ions were located on the surface of Cab and at the crystal packing interfaces between Cab dimers. Most likely, these calcium ions do not play any structural or catalytic role and result from the crystallization conditions, which contained 50 mM calcium acetate.
Cab Oligomerization-β-CAs have been found in different oligomeric states ranging from dimers (Oryza sativa, P. purpureum) to tetramers (E. coli) to octamers (P. sativum) (20,23,24,33). Although analytical ultracentrifugation results suggest that Cab exists as a homotetramer (21), Cab appears to form a dimer in the crystal. There are numerous interactions stabilizing the dimer involving residues in β-strand β2 and helices α2, α3, α4, and α5. Hydrogen bonds between residues 56 and 57 from strand β2 in both subunits result in the formation of a 10-stranded β-sheet. Helices α4 and α5 extend out and make extensive contacts with the other monomer in the dimer (Fig. 2B). The interface area between these two subunits was found to be ~2110 Å²/subunit, which is ~21% of the total (~10,000 Å²) monomer accessible surface area as calculated with GRASP (34). Residues A2-A23 pack against a symmetry-related molecule, burying 860 Å² and resulting in the formation of a continuous ribbon through the crystal (Fig. 3). Residues B2-B12 most likely fold back and pack against the B monomer in the same way that A2-A12 pack against the symmetry-related molecule. Residues B13-B23 have no visible electron density and are disordered. A similar type of crystal packing forming a continuous ribbon has been observed in the sterile α motif domain of the human EphB2 receptor and, together with other evidence, was proposed to be functionally important (35). The crystal packing interaction seen in Cab can also be described as a linear (open-ended) domain-swapped oligomer, where the swapped domain consists of a 12-residue α-helix (36,37). It is unclear whether residues 1-24 could facilitate the formation of a higher oligomerization state under physiological conditions or whether residues A2-A23 also fold back and pack against monomer A. Based on modeling considerations, although the N-terminal region is highly flexible, formation of a Cab tetramer similar to the one seen in the P. purpureum structure is unlikely in the absence of conformational changes due to steric clashes of helices α4 and α5. Other contacts between Cab dimers in the crystal are relatively small (≤540 Å²) and, based on their complementarity and size (38,39), are also unlikely to support formation of stable tetramers.
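The buried surface areas quoted above can be put on a common scale by expressing each as a fraction of the monomer accessible surface area. This is simple arithmetic on the reported values; the variable names are illustrative, and the comparison is a sketch rather than an analysis from the original work.

```python
MONOMER_SURFACE = 10000.0   # Å^2, approximate total accessible surface per monomer

# Buried areas reported in the text, in Å^2.
contacts = {
    "dimer interface (per subunit)": 2110.0,
    "N-terminal packing (A2-A23)": 860.0,
    "largest other crystal contact": 540.0,   # reported as an upper bound
}

fractions = {name: area / MONOMER_SURFACE for name, area in contacts.items()}
for name, fraction in fractions.items():
    print(f"{name:<32} {contacts[name]:>6.0f} Å^2  ({fraction:.1%} of monomer surface)")
```

The dimer interface buries several times more surface than any other contact in the crystal, consistent with the dimer being the stable unit observed here.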
Comparison of Cab and P. sativum and P. purpureum Structures-The fold of Cab is similar to that of the P. sativum and P. purpureum β-CAs (Fig. 4). While the β-sheet core is conserved, significant secondary structure differences are evident, mostly in the regions at the N terminus, at the C terminus, and in the region containing residues 90-125. The N termini of the P. sativum and P. purpureum β-CA structures extend out, forming a long helix that packs against a second monomer, making additional dimer interactions. In Cab, the N terminus is involved in crystal packing and does not adopt the same conformation observed in the P. sativum and P. purpureum β-CA structures. Cab is one of the smallest β-CAs known and lacks an extended C terminus. In the P. sativum β-CA structure, the C terminus forms a long β-strand that mediates octamerization. In Cab, residues 90-125 form two helices (α4, α5) that project out to cover the second monomer (Fig. 2B) and fold back to start helix α6. In both the P. sativum and P. purpureum β-CA structures, this segment is longer, forms three helices instead of two, and folds back earlier to create two additional turns of helix α6. In the central β-sheet region, the r.m.s. deviations in Cα positions between Cab and the P. sativum and P. purpureum structures are 0.62 and 0.56 Å, respectively.
Active Site-The active site cleft is located at the C terminus of the parallel β-sheet and is largely sequestered from solvent. Each Cab subunit contains one zinc atom that resides at the interface of the two monomers (Fig. 2B), although the coordinating residues (Cys32, Cys90, and His87) originate from the same monomer. One water molecule completes the tetrahedral coordination sphere of the zinc. Although the crystallization conditions contained 50 mM calcium acetate, no acetate was found in the active site, unlike in the P. sativum β-CA structure. Rather unexpectedly, the two active sites A and B in the Cab dimer exhibit significant differences, which are reflected in the r.m.s. deviations between monomers in a dimer (~0.65 Å) being consistently larger than those between the equivalent monomers in different dimers, such as A and C, or B and D (0.24 Å). The active sites in monomers A, C, and E have a Hepes buffer molecule bound near the active site. The sulfate group of the Hepes is located ~8 Å from the zinc atom, and the sulfate oxygens hydrogen bond with Lys53 Nζ, Ser35 Oγ, and the amide nitrogen of Ser35 (Fig. 5B). The equivalent of Ser35 is present in both plant-type β-CAs and in Cab, while Lys53 is unique to the cab-type β-CAs. Asp34, which makes a hydrogen bond to the zinc-coordinating water molecule, is also within hydrogen bonding distance (~3.0 Å) of the Hepes sulfate group. Another difference between the two active sites is the conformation of residues 13-24. In the active site of subunit A, residues B13-B24 are disordered and have no visible electron density. In the active site of subunit B, however, residues A13-A24 are well defined and extend out to participate in crystal packing. Superposition of residues in the active sites of subunits A and B (Fig. 5B) indicates that the Hepes molecule would sterically clash with residues Arg16 and Asp17 if residues 13-24 adopted the same conformation as in active site B.
Comparison of P. sativum β-CA and Cab Active Sites-The zinc-coordinating residues (Cys32, His87, and Cys90) of Cab and P. sativum β-CA superimpose closely (Fig. 6A). The water molecule serving as the fourth zinc ligand in Cab adopts a position similar to the O1 of the acetate ligand in the P. sativum β-CA structure. The conserved residue Asp34 is held in place by forming two hydrogen bonds to Arg36, as do the corresponding residues Asp162 and Arg164 in the P. sativum β-CA. The following five active site substitutions distinguish the cab-type and plant-type β-carbonic anhydrases: H23Q, M33A, K53F, A58V, and V72Y. The superposition of Cab and P. sativum β-CA structures clearly shows the distinctions in active site organization by these residues. His23, found on a flexible loop, is ~25 Å away from the catalytic zinc and probably is not a part of the active site. On the other hand, the equivalent residue Gln151 in the P. sativum β-CA structure occupies a position similar to the Hepes sulfate group that is ~8 Å away from the zinc atom and is possibly involved in ligand binding (23). The hydrophobic pocket of Cab is quite different from that of plant-type β-CA (Fig. 6A). In the P. sativum β-CA structure, the hydrophobic pocket is formed by Phe179, Val184, and Tyr205. The corresponding residues in Cab (Lys53, Ala58, and Val72) constitute a more open and less hydrophobic pocket.
Comparison of P. purpureum β-CA and Cab Active Sites-Again, the protein ligands to the zinc in the two structures superpose closely (Fig. 6B). The fourth ligand is different, since the side chain of Asp151 in the P. purpureum β-CA structure coordinates the zinc instead of the water molecule seen in the Cab structure. Asp151 of P. purpureum β-CA is equivalent to Asp34 of Cab. As a consequence of the zinc coordination by Asp151 in the P. purpureum β-CA structure, this residue cannot pair with the conserved Arg, and Arg153 has flipped away from the active site. The adjacent Ser152 has moved by ~4 Å and adopts a different conformation in the P. purpureum β-CA structure (Fig. 6B). The hydrophobic pocket arrangement of P. purpureum β-CA is very similar to that of P. sativum β-CA, and the differences between the cab-type and plant-type hydrophobic pockets have already been discussed.
Mechanism-The catalytic mechanism of carbonic anhydrases has been most extensively studied for the α-CA class (5, 6, 40-42). The zinc hydroxide mechanism established for this class provides an appropriate framework for discussing the catalytic mechanism of Cab. In the first part of the CO₂ hydration reaction, CO₂ binds in the hydrophobic pocket and probably interacts with the amide nitrogen of Thr199. This threonine is known as the "gatekeeper," and the side chain plays an important role in the α-CAs, together with Glu106, in orienting the CO₂ molecule for attack by the zinc-bound hydroxide. In the P. sativum β-CA structure, Asp162, Gly224, and Gln151 are thought to play the same role in orienting CO₂ for this attack (23). In Cab, Asp34 and Gly91 are in the same orientation as Asp162 and Gly224 in the P. sativum β-CA structure and might also help to orient CO₂. His23, the equivalent of P. sativum β-CA Gln151, is, however, disordered in active sites A, C, and E, and in active sites B, D, and F it is at the beginning of a segment of residues that pack against the symmetry-related molecule and lies 25 Å away from the active site zinc.
The second, rate-limiting, step in the CO₂ hydration reaction involves the regeneration of a hydroxide ion from the zinc-bound water molecule. In α-CAII, the zinc ion is located in a deep funnel and requires a proton shuttle to transfer the proton to the bulk solvent. His64 of α-CAII adopts multiple conformations, which facilitates accepting the proton from the zinc-bound water molecule and delivering it to buffer in bulk solution (43). In γ-CA, Glu84 exhibits multiple conformations and has been proposed to participate in a proton shuttle (18,44). Residues with multiple conformations have not been described in the active site of any β-CA structure determined so far. Since the β-CA active site is closer to the surface of the protein than the α-CA active site, a protein-mediated proton shuttle might not be necessary. The β-CA reaction rate depends on buffer concentration, implying that proton transfer can be rate-limiting under certain conditions (21). In the Cab structure, a Hepes buffer molecule was found near the active sites A, C, and E. The Hepes sulfate group is located ~8 Å away from the zinc atom and lies within hydrogen bonding distance of Asp34, which makes a hydrogen bond to the zinc-bound water molecule. In the P. purpureum β-CA structure, the equivalent of Asp34 acts as the fourth zinc ligand, and in the proposed mechanism it plays a role in the proton transfer (24). Therefore, the most plausible pathway for proton transfer in Cab is from the zinc-bound water molecule to Asp34 and then to the sulfate group of the bound Hepes molecule or a solvent molecule. The conformation of residues 1-25 in active sites B, D, and F is incompatible with Hepes binding, and residues 13-25 must adopt different conformations for Hepes to bind (Fig. 5B). The mobility of residues 1-25 and 92-125 might allow buffer molecules to diffuse into the active site and serve as the proton acceptor necessary to regenerate the zinc-bound hydroxide.