X-ray structure of beta-carbonic anhydrase from the red alga, Porphyridium purpureum, reveals a novel catalytic site for CO(2) hydration.

The carbonic anhydrases (CAs) fall into three evolutionarily distinct families designated alpha-, beta-, and gamma-CAs based on their primary structure. beta-CAs are present in higher plants, algae, and prokaryotes, and are involved in inorganic carbon utilization. Here, we describe the novel x-ray structure of beta-CA from the red alga, Porphyridium purpureum, at 2.2-Å resolution using intrinsic zinc multiwavelength anomalous diffraction phasing. The CA monomer is composed of two internally repeating structures, being folded as a pair of fundamentally equivalent motifs of an alpha/beta domain and three projecting alpha-helices. The motif is clearly distinct from that of either alpha- or gamma-CAs. This homodimeric CA appears like a tetramer with a pseudo 222 symmetry. The active site zinc is coordinated by a Cys-Asp-His-Cys tetrad that is strictly conserved among the beta-CAs. No water molecule is found in a zinc-liganding radius, indicating that the zinc-hydroxide mechanism in alpha-CAs, and possibly in gamma-CAs, is not directly applicable to the case in beta-CAs. Zinc coordination environments of the CAs provide an interesting example of the convergent evolution of distinct catalytic sites required for the same CO(2) hydration reaction.

Carbonic anhydrase (CA,¹ EC 4.2.1.1) is a zinc-containing enzyme that catalyzes the reversible hydration of CO₂. CA is ubiquitously distributed in nature and is involved in fundamental biological processes such as photosynthesis, respiration, pH homeostasis, and ion transport (1,2). Based on sequence homologies, CAs are classified into three evolutionarily distinct groups, designated α-, β-, and γ-CAs (3). α-CAs are found in mammals, algae, and prokaryotes; β-CAs in higher plants, algae, and prokaryotes; and γ-CA has been identified in the archaeon Methanosarcina thermophila, although homologous genes have recently been found in higher plants and prokaryotes (4).
The x-ray structures of α-CAs from several mammals (5, 6) have revealed a common overall structure dominated by β-sheets. The catalytically active zinc is liganded by three histidine residues with a hydroxide or a water molecule as a fourth ligand, giving a tetrahedral coordination geometry. For the CO₂ hydration reaction, the zinc-bound hydroxide initiates a nucleophilic attack on the substrate CO₂ to form zinc-bound HCO₃⁻, which is then displaced by a water molecule (7). The regeneration of zinc-bound hydroxide requires an intramolecular proton shuttle from the zinc-bound water to the bulk solvent for each cycle of catalysis. The proton transfer is rate-limiting in the catalysis (8).
The x-ray structure of γ-CA from M. thermophila has been reported (4). This is a trimeric CA with a peculiar left-handed β-helical folding motif. The active site zinc is located at subunit interfaces and coordinated by two histidines from one subunit and one histidine from a neighboring one. In addition, a putative water molecule has been assigned to an electron-dense region that occupies the fourth coordination site. Although the polypeptide assembly forming the active site of the γ-CA is different from those of the α-CAs, the geometry around the zinc of γ-CA is quite similar to those of α-CAs.
Kinetic studies of β-CAs from higher plants show high catalytic efficiency at high pH, with kcat values between 10⁵ and 10⁶ s⁻¹ per subunit and kcat/Km values between 10⁷ and 10⁸ M⁻¹ s⁻¹ (9, 10), indicating that the higher plant β-CAs are catalytically as efficient as human CA II. On the other hand, extended x-ray absorption fine structure suggests that the coordination sphere of zinc in β-CAs of higher plants should have one or more sulfur ligands (10,11), in contrast to those in α- and γ-CAs. Moreover, the circular dichroic spectrum of a higher plant β-CA is distinct from that of human CA II (9). However, no x-ray structure of a β-CA has yet been elucidated.
In photosynthetic CO₂ fixation, CO₂ and not HCO₃⁻ is the substrate for ribulose-1,5-bisphosphate carboxylase/oxygenase. However, HCO₃⁻ is the dominant species in alkaline chloroplast stroma, and the rate of spontaneous interconversion between CO₂ and HCO₃⁻ is insufficient to cope with the metabolic demand (2). It has thus been proposed that CA plays an essential role in the fixation of CO₂ in the Calvin-Benson cycle. CA in the red alga, Porphyridium purpureum, is thought to be involved in the CO₂-concentrating mechanism which maintains a favorable CO₂ level at the carboxylation site and compensates for the relatively lower affinity of algal ribulose-1,5-bisphosphate carboxylase/oxygenase (12,13). The CA monomer has a molecular mass of ≈55 kDa and contains two atoms of zinc per monomer. We have isolated cDNA clones of the CA (14). The clones encode a 571-residue polypeptide in which two domains, each equivalent to that of other β-CAs (≈25-30 kDa), are arranged in tandem and exhibit ≈70% identity with each other. Thus, it is suggested that the CA gene of P. purpureum has been formed by duplication and fusion of a primordial β-CA gene. In this report, we describe the x-ray structure of P. purpureum CA. The study not only provides an understanding of the protein fold of the β-CA family but also reveals the novel architecture of the catalytic site, which is quite distinct from those of the α- and γ-CA enzymes.
* This work was supported in part by the New Energy and Industrial Technology Development Organization (NEDO). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

EXPERIMENTAL PROCEDURES
Crystallization-Protein crystallization is described elsewhere.² Briefly, the hanging-drop vapor diffusion method was used to crystallize P. purpureum CA, wherein 5 µl of the purified protein preparation (30 mg ml⁻¹ in 20 mM NaCl, 20 mM Tris-HCl, pH 8.5) was equilibrated against 5 µl of the reservoir solution (24% polyethylene glycol 4000, 300 mM ammonium sulfate, 50 mM sodium cacodylate, pH 6.75) at 20°C.
Data Collection and Processing-For data collection under cryogenic conditions, crystals were soaked for 1 h in the reservoir solution supplemented with 5% (v/v) glycerol and then flash-frozen in liquid nitrogen. Diffraction data for the frozen crystal were collected at 100 K using RIKEN beamline I (BL45XU) at SPring-8, Japan (15). Multiwavelength anomalous diffraction (MAD) data for intrinsic zinc atoms were collected at remote (1.0000 Å), peak (1.2823 Å), and edge (1.2825 Å) wavelengths.
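The three MAD wavelengths can be related to photon energies with E = hc/λ. The following is a minimal sketch, assuming hc ≈ 12.39842 keV·Å; the function and dictionary names are illustrative and not part of the original work. It shows that the peak and edge wavelengths correspond to roughly 9.67 keV, close to the zinc K absorption edge, whereas the remote wavelength lies well above it.

```python
# Convert x-ray wavelengths to photon energies: E [keV] = hc / lambda [Å].
# hc ≈ 12.39842 keV·Å is a standard physical constant (assumed precision).
HC_KEV_ANGSTROM = 12.39842


def wavelength_to_kev(wavelength_angstrom: float) -> float:
    """Return the photon energy in keV for a wavelength given in Å."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


# Wavelengths quoted in the text for the zinc MAD experiment.
MAD_WAVELENGTHS = {"remote": 1.0000, "peak": 1.2823, "edge": 1.2825}

for label, wavelength in MAD_WAVELENGTHS.items():
    energy = wavelength_to_kev(wavelength)
    print(f"{label:>6}: {wavelength:.4f} Å -> {energy:.4f} keV")
```

The peak and edge settings differ by only 0.0002 Å, which is why both data sets sample the steep part of the zinc absorption edge.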
Diffraction data were recorded on an R-AXIS IV image-plate detector with a crystal-to-detector distance of 250 mm. Two data sets at remote and edge wavelengths were simultaneously recorded on the same image plate. Diffraction data were processed with DENZO and intensity data were scaled together using SCALEPACK (16). Results of data collection are given in Table I.
Structure Determination and Refinement-The anomalous Patterson map showed very clear peaks for all vectors among the four zinc atoms in the asymmetric unit. The x-ray structure of CA was determined at 2.2-Å resolution using intrinsic zinc MAD phasing (17). Refinement of the atomic parameters of zinc and MAD phase determination were performed at 2.2-Å resolution with the program SHARP (18). The figure of merit was 0.48 for 50,294 reflections. The electron density map calculated with the MAD phases was improved by solvent flattening and by noncrystallographic symmetry averaging using the program SOLOMON (19). The R-factor, R = Σ|Fo − Fc|/ΣFo, for the refined electron density map was 0.246, where Fo and Fc are the observed and calculated structure factors, respectively. An initial model was built using the programs O (20) and TURBO-FRODO. The model was refined at 2.2-Å resolution with the program X-PLOR (21) using noncrystallographic symmetry restraints for the two monomers in the asymmetric unit. The model was revised manually in omit maps or Fo − Fc maps after each cycle of X-PLOR refinement. Twenty cycles of simulated annealing, positional, and temperature factor refinement reduced the R-factor to 0.208 and the Rfree to 0.274, where Rfree is an R-factor calculated for the 5% of reflections excluded from the refinement. In the final model, interpretable electron density begins with residue Val84 and ends with Gly564 without interruption. The geometry of the final model, as calculated using the program PROCHECK (22), is satisfactory, with 91.7% of the residues falling into the most favored regions and 8.3% into the additional allowed regions of a Ramachandran plot. The final model also includes 4 zinc ions and 613 water molecules for the dimer. Results of MAD phasing and refinement statistics are summarized in Table I.
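The R-factor and Rfree quoted above can be illustrated with a short sketch. It assumes the conventional form of the residual with absolute differences, R = Σ|Fo − Fc|/ΣFo, and a random 5% test set; the function names, the seed, and the example amplitudes are hypothetical and are not taken from the refinement programs used in this work.

```python
import random


def r_factor(f_obs, f_calc):
    """Crystallographic residual R = sum(|Fo - Fc|) / sum(Fo)."""
    if len(f_obs) != len(f_calc) or not f_obs:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(f_obs)
    return numerator / denominator


def split_work_free(n_reflections, free_fraction=0.05, seed=0):
    """Randomly assign reflection indices to a working set and a free set.

    The free set is excluded from refinement and is used only for Rfree.
    """
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = round(n_reflections * free_fraction)
    return sorted(indices[n_free:]), sorted(indices[:n_free])


# Hypothetical structure factor amplitudes, for illustration only.
f_obs = [100.0, 80.0, 60.0, 40.0]
f_calc = [90.0, 85.0, 66.0, 38.0]
print(f"R = {r_factor(f_obs, f_calc):.4f}")

work, free = split_work_free(1000)
print(f"{len(work)} working reflections, {len(free)} free reflections")
```

Computing the same residual separately over the working and free index sets gives R and Rfree, respectively; a gap between the two, such as 0.208 versus 0.274 here, is the usual check against overfitting.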

RESULTS AND DISCUSSION
Structural Duplication in a Monomer-The monomeric structure of P. purpureum CA is shown in Fig. 1A. There are essentially two symmetrical structural motifs in one monomer, resulting from two homologous repeats (≈70% sequence identity) in the CA polypeptide (Ref. 14 and Fig. 2). The two motifs are related to each other by a pseudo 2-fold axis. Upon superposition of these two halves (N-terminal half, residues 86-309, and C-terminal half, residues 340-563), the root mean square deviations between the Cα atoms are 0.73 Å in monomer A and 0.74 Å in monomer B.
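The root mean square deviation quoted for the two halves is computed over paired Cα positions once the halves have been superimposed. The sketch below assumes the two coordinate lists are already superimposed and paired residue by residue; the optimal superposition step itself is omitted, and the coordinates shown are hypothetical rather than taken from the structure.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (in the input units) between paired atoms.

    Both inputs are equally long lists of (x, y, z) tuples that have
    already been superimposed; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared_sum = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical Cα positions (Å) standing in for the two halves.
n_half = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
c_half = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0), (0.0, 1.0, -1.0)]
print(f"RMSD = {rmsd(n_half, c_half):.3f} Å")
```

With 224 paired residues per half (86-309 and 340-563), a value near 0.7 Å means the two repeats are nearly identical in backbone conformation.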
Each motif includes an α/β domain and three projecting α-helices. The α/β domain consists of α-β-α units exhibiting a Rossmann fold (23) and an anti-parallel β-strand (Fig. 1A). The motif is clearly distinct from that of either α- or γ-CAs. The three projecting α-helices run on the surface of the α/β domain formed by the opposite half of the same monomer. Six main chain hydrogen bonds are formed between the N- and C-terminal halves of a monomer. This means that the N- and C-terminal halves contact each other so intimately that they cannot be divided into two independent structural domains. Internal sequence repeats suggest that the structure of the monomer (≈55 kDa) has evolved through gene duplication and fusion of an ancestral CA monomer. On the other hand, β-CA of pea has been reported to be an octameric enzyme with monomers of ≈27 kDa (9). It is likely that two monomers of the unduplicated β-CA oligomers, such as pea CA, contact each other in the same way as the two halves of P. purpureum CA.
Dimeric Structure with a Pseudo 222 Symmetry-The asymmetric unit of the crystal contains a dimer of two identical subunits, each with a molecular mass of 55 kDa (Fig. 1B). This CA also exists as a dimer in solution, as judged by gel filtration.³ The dimer has approximate dimensions of 90 Å × 70 Å × 60 Å. The two monomers are related by a 2-fold axis perpendicular to the pseudo 2-fold axis in the monomer. Consequently, the dimer has pseudo 222 symmetry. Each monomer makes close contacts with its symmetry counterpart, with a contact surface area of 4,000 Å² per monomer (19% of a monomer's surface area). A long turn segment (residues 310-339) connecting the N- and C-terminal halves sticks out from one monomer toward the surface of the counter monomer (Fig. 1B).
Zinc Binding Environment-The two zinc-binding sites reside in two clefts (Fig. 3A) on both sides of a monomer. They are located at the C-terminal ends of the parallel β-sheets, as often found in α/β enzymes with Rossmann folds. One of the catalytic zinc atoms is coordinated in a tetrahedral manner by the S-γ atom of Cys149, the O-δ2 of Asp151, the N-ε2 of His205, and the S-γ of Cys208 in the N-terminal half, and the other by the equivalent atoms of Cys403, Asp405, His459, and Cys462 in the C-terminal half (Fig. 3B). These residues are strictly conserved among all β-CAs sequenced to date (Fig. 2). Here, zinc coordinated by the former tetrad is designated Zn-N and that by the latter Zn-C.
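A tetrahedral coordination can be checked numerically by computing the six ligand-zinc-ligand angles, which equal about 109.5° for an ideal tetrahedron. The sketch below is only an illustration: the ligand labels follow the Zn-N tetrad named in the text, but the coordinates are those of an ideal tetrahedron chosen for the example, not the refined atomic positions.

```python
import math
from itertools import combinations


def angle_deg(center, p, q):
    """Return the angle p-center-q in degrees."""
    v = [a - c for a, c in zip(p, center)]
    w = [a - c for a, c in zip(q, center)]
    dot = sum(x * y for x, y in zip(v, w))
    norm = math.sqrt(sum(x * x for x in v)) * math.sqrt(sum(y * y for y in w))
    if norm == 0:
        raise ValueError("ligand coincides with the metal position")
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))


def ligand_angles(center, ligands):
    """Map each pair of ligand labels to its ligand-metal-ligand angle."""
    return {
        (a, b): angle_deg(center, ligands[a], ligands[b])
        for a, b in combinations(sorted(ligands), 2)
    }


# Idealized positions (arbitrary units); NOT coordinates from the structure.
ZN = (0.0, 0.0, 0.0)
IDEAL_LIGANDS = {
    "Cys149 S-gamma": (1.0, 1.0, 1.0),
    "Asp151 O-delta2": (1.0, -1.0, -1.0),
    "His205 N-epsilon2": (-1.0, 1.0, -1.0),
    "Cys208 S-gamma": (-1.0, -1.0, 1.0),
}

angles = ligand_angles(ZN, IDEAL_LIGANDS)
for pair, value in angles.items():
    print(f"{pair[0]} - Zn - {pair[1]}: {value:.2f} deg")
```

Replacing the idealized positions with refined coordinates would show how far each site departs from the ideal angle.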
No electron density for a water molecule is found within coordination radius of the zinc atoms, either in the initial MAD map or in the Fo − Fc map calculated after the final refinement cycle, indicating that no water or hydroxide participates in zinc coordination in the present x-ray structure. This is the most remarkable difference from the structures of α- and γ-CAs, in which a hydroxide or water molecule occupies the fourth liganding site and the fourth ligand has been suggested to be a nucleophile of the CO₂ hydration reaction. Thus, we propose that the zinc-hydroxide mechanism is not directly applicable to β-CAs. Previously, two research groups proposed a Cys-His-Cys-H₂O ligand scheme to bind zinc at the active site in higher plant β-CAs based on extended x-ray absorption fine structure (10,11), presuming that a zinc-hydroxide catalytic mechanism similar to that of α-CAs exists in β-CAs.
The three zinc-liganding residues other than the Asp suggested in these earlier studies are in agreement with those found in the structure of P. purpureum CA. Kinetic studies of β-CAs of higher plants have revealed that a basic form of the enzyme is the active species and the pKa for the activity-linked group is approximately 8.5 (10). The present x-ray structure has been determined at pH 6.75. Considering that the side chain carboxyl groups of Asp151 and Asp405 would be more deprotonated at higher pH values, these residues should firmly bind the zinc under more alkaline conditions. In other words, zinc liganding by these Asp residues at neutral pH may not be a crystallization artifact and can also be observed in the basic forms of the enzyme. Based on the structure of P. purpureum CA, the extended x-ray absorption fine structure data indicate that one of the nitrogen/oxygen atoms involved in the zinc coordination is not that of a water molecule, but of the Asp residue. It is also of interest to note the results of the site-directed mutagenesis analysis of potential zinc ligands in the β-CA of spinach (15), in which the Asn residue was substituted for Asp152 (corresponding to Asp151 and Asp405 in P. purpureum CA). The mutant enzyme retained little CO₂ hydration activity, although the mutant enzyme could bind 80% of the zinc compared with the wild type CA. This result indicates that only one of the two carboxyl oxygen atoms of Asp152 is necessary for zinc binding, and the other carboxyl oxygen or its negative charge is essential for catalytic activity.
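The relation between the crystallization pH (6.75) and the pKa of about 8.5 can be made concrete with the Henderson-Hasselbalch equation. The sketch below is a simplification that assumes a single titratable group with ideal behavior; it is offered only as an illustration and is not an analysis performed in this work. The function name is hypothetical.

```python
def fraction_deprotonated(ph: float, pka: float) -> float:
    """Fraction of a single titratable group in its basic (deprotonated) form.

    Henderson-Hasselbalch: [A-] / ([HA] + [A-]) = 1 / (1 + 10**(pKa - pH)).
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


PKA_ACTIVITY_GROUP = 8.5  # approximate value quoted in the text (10)

for ph in (6.75, 8.5, 9.5):
    fraction = fraction_deprotonated(ph, PKA_ACTIVITY_GROUP)
    print(f"pH {ph:>4}: {100 * fraction:5.1f}% in the basic form")
```

Under this assumption only about 2% of the activity-linked group would be in its basic form at pH 6.75, which is why the text considers how the observed zinc ligation relates to the basic, active form of the enzyme.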
The zinc-binding geometries of the N- and C-terminal halves of the P. purpureum CA monomer are shown superimposed on each other in Fig. 3C. Using the zinc atoms and all atoms of the four ligands as references, a root mean square deviation of 0.73 Å is obtained. Coordination geometries of the two zinc atoms are very similar to each other. However, side chain conformations of the two equivalent Asps are different from each other. These residues have higher temperature factors than the other three zinc ligands and may have lower occupancy and flexibility resulting in conformational variations. The similarity in the coordination geometries is consistent with the results obtained with mutants of P. purpureum CA. When each of the zinc-binding residues in the N- and C-terminal halves was mutated, each of the mutants retained nearly half of the wild-type CA activity.³ However, results with these mutants differ from previous observations, in which the C-terminal half-polypeptide lost activity but the N-terminal half-polypeptide retained activity (14). Since the C-terminal half-polypeptide was expressed with the long turn segment, this segment may have seriously impaired folding around the active site.
Active Site Clefts-Of the 23 residues that are strictly conserved in the β-CAs in Fig. 2, including those from higher plants, algae, and prokaryotes, 14 residues are clustered on concave surfaces of the clefts (Fig. 3A). For clarity, only the cleft surrounding Zn-C is discussed here, but that surrounding Zn-N is basically equivalent. It should be noted that Zn-C is not exclusively surrounded by residues from the C-terminal half of the protein. The conserved residues in the cleft are classified into two groups: group I includes residues that are located around Zn-C and reside in the C-terminal half of the protein, whereas group II includes residues that surround Zn-C and are present in the N-terminal half of the CA polypeptide. Group I residues are Ser406, Arg407, Gly463, Tyr543, and the 4 zinc ligands (Cys403, Asp405, His459, and Cys462), while group II residues are Gln140, Pro142, Gly165, Phe168, Tyr190, and Leu195.
The four zinc ligands are involved in a number of hydrogen bonds with residues in the immediate vicinity of the active site cleft (Fig. 3D). Each of the zinc-liganding cysteine residues is fixed by two NH-S hydrogen bonds. Tyr190 (group II) and Tyr543 (group I) are each hydrogen-bonded to zinc ligands, Asp405 and His459, respectively, through water(s). These hydrogen bonds may provide a possible proton pathway between an activated water and the bulk solvent (see below). Pro142, Phe168, and Leu195 (all from group II) provide a hydrophobic environment by being positioned just beside Asp405. This may act as a site for possible CO₂ association (see below). If these conserved residues of group II are actually essential for catalytic activity, they may provide an explanation for internally duplicated structures of the CA involving close contact between the N- and C-terminal halves in forming the active site clefts.

FIG. 3. The arrangement of the amino acid residues around Zn-C is depicted here, but that around Zn-N is fundamentally equivalent. Conserved residues in the cleft are shown. Group I and group II residues (see Fig. 2) are colored blue and red, respectively. Pro142, Phe168, and Leu195 form a hydrophobic environment adjacent to the zinc ligand, Asp405. B, stereo diagram of the zinc-binding ligands of P. purpureum CA. Experimental electron density derived by MAD phasing, obtained after solvent flattening and 2-fold noncrystallographic symmetry averaging, shows the zinc-liganding participants (orange, zinc; green, sulfur; red, oxygen; and blue, nitrogen atoms). Cages of electron density are drawn at 1.9 σ in blue. In addition, the (Fo − Fc) difference map calculated before assignment of the water molecule is drawn at 3.0 σ in green. A water molecule is hydrogen-bonded to O-ε1 of Asp405. Only the Zn-C binding site is shown here, but the nature of the site around Zn-N is fundamentally equivalent (see C). C, stereo diagram showing superposition of the zinc ligands of the N-terminal half on those of the C-terminal half, colored blue and green, respectively. Using the zinc atom and all atoms of the four ligands as references, the root mean square deviation obtained is 0.73 Å. Coordination geometries of Zn-N and Zn-C are very similar. However, side chain conformations of the two equivalent Asp residues differ from each other. These residues, with higher temperature factors, may have flexibility affecting conformational variations.
Proposed CO₂ Hydration Mechanism-Site-directed mutagenesis of zinc ligands of higher plant β-CAs (11, 24) and of P. purpureum CA³ has demonstrated that zinc is essential for catalysis. Although there is no water bound directly to zinc in the x-ray structure, a water molecule exists near each of the zinc sites in an Fo − Fc electron density map (Fig. 3B). The water molecule is hydrogen-bonded to O-ε1 of the zinc-liganding Asp151 and Asp405, but the water does not lie within the direct zinc coordination radius. The CO₂ hydration reaction should require a catalytic water molecule which acts as a nucleophile and which must be at least transiently bound to the zinc. In addition, averaged temperature factors of main chain atoms for tripeptides starting from the zinc-liganding Asp are 37 Å² and 39 Å² for Asp151-Ser152-Arg153 and Asp405-Ser406-Arg407 of monomer A, respectively, and 37 Å² and 40 Å² for those of monomer B, respectively, while those for all residues of monomers A and B are 21 Å² and 21 Å², respectively. The tripeptide sequence is strictly conserved in β-CAs. The ligands other than the Asps have temperature factors of main chain atoms as low as around 12 Å². Higher temperature factors for the tripeptide segments, including the Asp ligands, suggest that the segments are mobile.
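The temperature factors quoted above can be translated into approximate atomic displacements using the standard isotropic relation B = 8π²⟨u²⟩, where ⟨u²⟩ is the mean-square displacement along one direction. The sketch below applies this relation to the B values in the text. It assumes purely isotropic, harmonic motion and ignores contributions such as static disorder, so the numbers are indicative only; the function name and dictionary labels are illustrative.

```python
import math


def rms_displacement(b_factor: float) -> float:
    """Root mean square displacement (Å) from an isotropic B-factor (Å^2).

    Uses B = 8 * pi**2 * <u^2>, so sqrt(<u^2>) = sqrt(B / (8 * pi**2)).
    """
    if b_factor < 0:
        raise ValueError("B-factor must be non-negative")
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))


# Main chain temperature factors (Å^2) quoted in the text.
B_VALUES = {
    "Asp-Ser-Arg tripeptide (Asp151, monomer A)": 37.0,
    "Asp-Ser-Arg tripeptide (Asp405, monomer B)": 40.0,
    "all residues (monomer average)": 21.0,
    "Cys/His zinc ligands": 12.0,
}

for label, b in B_VALUES.items():
    print(f"{label}: B = {b:.0f} Å^2 -> rms u ≈ {rms_displacement(b):.2f} Å")
```

On this estimate the Asp-containing tripeptides move by roughly 0.7 Å, compared with about 0.4 Å for the other zinc ligands, which is in line with the suggestion that these segments are mobile.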
We propose a possible mechanism for the CO₂ hydration cycle as shown in Fig. 4. Hydrophobic pockets formed by Pro142, Phe168, and Leu195 beside Zn-C and Pro396, Phe422, and Leu449 beside Zn-N are candidates for the site of CO₂ association. Presumably, the CO₂ association triggers the subsequent catalytic steps. The zinc-bound aspartate functions as a base to accept a proton from its hydrogen-bonded water and yields a nucleophilic hydroxide (Fig. 4, step 1). As a consequence, the protonated aspartate will be released from the zinc and the resulting nucleophilic hydroxide moves toward and binds the zinc (Fig. 4, step 2). In the next step, the hydroxide attacks the CO₂ molecule to generate zinc-bound HCO₃⁻. The proton is transferred from the protonated aspartate to the bulk solvent or buffer, possibly through one of the hydrogen-bonded pathways immediately surrounding the zinc ligands (Fig. 3D and Fig. 4, step 3). Then the zinc-bound HCO₃⁻ is replaced with a deprotonated Asp, releasing the HCO₃⁻ and leaving a zinc-bound Asp (Fig. 4, step 4). Finally, a water molecule binds O-ε1 of the zinc-bound Asp to regenerate the initial stage (Fig. 4, step 5).
The present results show that β-CA differs from α- and γ-CAs, not only in overall protein folding, but also in the nature of the architecture coordinating and surrounding the catalytic zinc. The absence of a water molecule in the zinc coordination sphere suggests that the zinc-hydroxide mechanism in α-CAs, and possibly in γ-CAs, is not directly applicable in the case of β-CAs. Knowledge of the x-ray structure of P. purpureum CA should provide a more detailed description of the catalytic mechanism of β-CAs.