Expression, assay, and structure of the extracellular domain of murine carbonic anhydrase XIV: implications for selective inhibition of membrane-associated isozymes.

Carbonic anhydrase (CA) XIV is the most recently identified mammalian carbonic anhydrase isozyme, and its presence has been demonstrated in a number of tissues. Full-length CA XIV is a transmembrane protein composed of an extracellular catalytic domain, a single transmembrane helix, and a short intracellular polypeptide segment. The amino acid sequence identity of human CA XIV relative to the other membrane-associated isozymes (CA IV, CA IX, and CA XII) is 34-46%. We report here the expression and purification of both the full-length enzyme and a truncated, secretory form of murine CA XIV. Both forms of this isozyme are highly active, and both show an abrogation of activity in the presence of 0.2% SDS, in contrast to the behavior of murine CA IV. We also report the crystal structure of the extracellular domain of murine CA XIV at 2.8 Å resolution and of an enzyme-acetazolamide complex at 2.9 Å resolution. The structure shows a monomeric glycoprotein with a topology similar to that of other mammalian CA isozymes. Based on the x-ray crystallographic results, we compare and contrast known structures of membrane-associated CA isozymes to rationalize the structural elements responsible for the SDS resistance of CA IV and to discuss prospects for the design of selective inhibitors of membrane-associated CA isozymes.

based on amino acid sequence identity to active isozymes (3). The CAs are ubiquitous in mammalian tissues, but individual isozymes display tissue-specific distributions (1). Further distinctions in isozyme localization are due to the cytosolic, membrane-associated, or secretory nature of specific CAs. The varying tissue distributions have been exploited for the development of CA inhibitors targeted to specific regions of the body, the most notable example being topically applied compounds such as dorzolamide and brinzolamide for the treatment of glaucoma (4). Subtle structural differences among the CA isozymes also hold promise for the development of isozyme-specific inhibitors, certain examples of which have been demonstrated (5).
The most recently identified mammalian CA isozyme is CA XIV. Through the use of Northern blotting and reverse transcriptase-polymerase chain reaction techniques, CA XIV mRNA has been demonstrated in kidney, liver, brain, skeletal muscle, heart, and lung (6-8). The protein itself has been identified in murine and human brain, murine liver, and rat and murine kidney (8-10). CA XIV is a bitopic membrane protein with an extracellular N-terminal catalytic domain, a single membrane-spanning segment, and a small intracellular C-terminal polypeptide containing potential phosphorylation sites (6,7). The first 15 amino acids are hydrophobic and constitute a signal sequence, and the catalytic domain contains one putative N-glycosylation site (6,7). This topology is similar to that of the other transmembrane isozymes, CA IX and CA XII. A fourth isozyme, CA IV, is also membrane-associated, but the post-translational attachment of a glycosylphosphatidylinositol group to the C terminus of CA IV serves as the membrane anchor rather than the polypeptide itself (11). The amino acid sequence identity of human CA XIV relative to the other three membrane-bound CA isozymes is 34-46%. CA XIV also shares 38% sequence identity with CA VI, an extracellular, secreted isozyme found in saliva.
Despite similarities in amino acid sequences and overall topology, the membrane-associated CAs differ in tissue distribution. CA XIV is found in regions of liver cells distinct from the location of CA IV (10), although certain regions of the kidney show positive immunostaining for both of these isozymes, suggesting redundant function (8). Intriguingly, the presence of an extracellular carbonic anhydrase has long been suspected in mammalian brain (12,13). Known CA inhibitors, including compounds that are impermeable to cells, were shown to enhance the extracellular alkaline shift observed in slices of hippocampus after synaptic transmission (12-14). Recent immunostaining results identify CA XIV on neurons and axons in both mouse and human brain, suggesting that this isozyme is responsible for modulating pH shifts during excitatory synaptic transmission (9). The other two transmembrane isozymes, CA IX and CA XII, show a varied tissue distribution, but both are overexpressed in certain cancers, and their transcription is regulated by the von Hippel-Lindau tumor suppressor (15-22).
We report here the expression, purification, and assay of the soluble, extracellular domain of murine CA XIV and its structure determination by x-ray crystallographic methods. The x-ray structure confirms that CA XIV is a glycoprotein and helps define its quaternary structure relative to its solution behavior and its similarity to the related CA XII isozyme. The structure of murine CA XIV complexed with acetazolamide is also presented. Based on the structures of CAs IV, XII, and XIV, we rationalize the structural elements responsible for the unique SDS resistance of CA IV. This resistance allowed CA IV to be solubilized from tissues by SDS and purified in the presence of SDS, conditions under which other known CAs were inactive (11); subsequently, resistance to SDS became a practical means of determining the contribution of CA IV to total activity in tissues. Finally, prospects for the design of inhibitors selective for the extracellular CA isozymes are discussed based on the structural data.

EXPERIMENTAL PROCEDURES
Expression and Purification-The cDNA was cloned by PCR from C57BL6 mouse kidney mRNA using primers designed by Mori et al. (6). The sequence² differs from that reported for CA XIV from BalbC (6) and predicts a His instead of a Gln at residue 108 (see Fig. 3) and a 3-base pair in-frame deletion of Ala-263. Mammalian expression vectors containing the cDNA of the wild-type, full-length membrane form and secretory form (I261X) of murine CA XIV were constructed as described (9,23). Stable Chinese hamster ovary clones expressing the secretory or the full-length membrane form of murine CA XIV were isolated and characterized by CA activity following established procedures (23). A full-length, membrane form of murine CA XII cDNA was subcloned into the mammalian expression vector pCXN (24) and transiently expressed in COS-7 cells (19). The enzyme expression was analyzed by CA activity measurements (25).
The I261X secretory form of murine CA XIV, designated CA14x, was purified from secretion medium using CA inhibitor affinity chromatography (11). The secretion medium was applied to a CA inhibitor affinity column equilibrated with 10 mM HEPES (pH 7.5). The unbound protein was removed by washing with equilibration buffer containing 0 mM and then 150 mM NaCl. The bound enzyme was eluted with 0.1 M sodium acetate containing 0.5 M sodium perchlorate (pH 5.5). The eluted enzyme was concentrated and dialyzed against 10 mM Tris-SO₄ (pH 7.5). The homogeneity of the enzyme was assessed by size exclusion chromatography using Sephacryl S-300, SDS-PAGE, and specific activity (25).
Activity Assays-CA activity was measured by the procedure of Maren (26), as described (25). SDS-resistant CA activity was determined on affinity pure CA samples or membrane-bound CA samples preincubated with 0.2% SDS at room temperature for 30 min prior to activity measurements. The protein concentration was determined by the micro Lowry procedure (27). CA activity is expressed in enzyme units/mg of cell protein for unpurified enzyme or in enzyme units/mg of affinity pure CA.
Crystallization and Data Collection-The CA14x protein was crystallized at room temperature by the hanging drop vapor diffusion method. Drops containing 1.8 µl of 7 mg/ml enzyme in 20 mM sodium phosphate (pH 7.2) and 150 mM NaCl were mixed with 1.8 µl of precipitant buffer (5.5% (w/v) polyethylene glycol 4000, 0.1 M sodium acetate, pH 4.8, 20 mM NaCl) and equilibrated over a well containing 1.0 ml of precipitant buffer. Crystals appeared in the drops within 48 h and grew as long, thin rods to maximum dimensions of 0.7 × 0.03 × 0.03 mm³. For data collection, a microspatula was used to break the rods into shorter pieces that were subsequently harvested into a stabilizing buffer containing 10% (w/v) polyethylene glycol 4000, 0.1 M sodium acetate (pH 4.8), 20 mM NaCl, and 10% (v/v) glycerol. After sequential transfers to stabilizing solutions containing 15 and 25% (v/v) glycerol, the crystals were flash cooled in liquid nitrogen. Diffraction data to 2.8 Å resolution were collected from a single CA14x crystal at beamline X25 of the National Synchrotron Light Source at Brookhaven National Laboratory. The data were processed with the HKL suite (28) and TRUNCATE (29). The crystals belonged to space group P2₁ with unit cell dimensions a = 59.0 Å, b = 75.6 Å, c = 73.2 Å, and β = 98.9°; with two molecules in the asymmetric unit, the Matthews coefficient V_M = 2.65 Å³/Da (53% solvent content). For preparation of the CA14x-acetazolamide complex, the crystals were soaked in a stabilizing solution containing 10% (w/v) polyethylene glycol 4000, 22% (v/v) glycerol, 0.1 M sodium acetate (pH 5.5), 20 mM NaCl, and 5 mM acetazolamide for 90 h prior to flash cooling in liquid nitrogen. The data were collected at beamline X12C of the National Synchrotron Light Source and processed as described. The data collection statistics are recorded in Table I.
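The Matthews coefficient quoted above can be approximated from the reported cell constants. The following is a minimal sketch, not the authors' calculation: it assumes the standard monoclinic volume formula, four molecules per unit cell (two symmetry-related positions in P2₁ times two molecules per asymmetric unit), and the common approximation that the solvent fraction is 1 - 1.23/V_M. The molecular mass used by the authors is not stated in this section; the 29.5-kDa polypeptide mass is used here purely as an assumption, so the result is expected to be close to, not identical with, the reported value.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume (A^3) of a monoclinic unit cell: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(cell_volume, molecules_per_cell, mass_da):
    """V_M (A^3/Da) = cell volume / (molecules in the cell * molecular mass)."""
    return cell_volume / (molecules_per_cell * mass_da)


def solvent_fraction(v_m):
    """Approximate solvent fraction from V_M: 1 - 1.23 / V_M."""
    return 1.0 - 1.23 / v_m


# Cell constants reported in the text.
volume = monoclinic_cell_volume(59.0, 75.6, 73.2, 98.9)

# P2(1) has 2 symmetry-equivalent positions; 2 molecules per asymmetric unit.
molecules_per_cell = 2 * 2

# ASSUMPTION: polypeptide mass of 29.5 kDa (carbohydrate ignored); the mass
# actually used for the published V_M is not given in this section.
ASSUMED_MASS_DA = 29_500.0

v_m = matthews_coefficient(volume, molecules_per_cell, ASSUMED_MASS_DA)
print(f"V = {volume:.0f} A^3, V_M = {v_m:.2f} A^3/Da, "
      f"solvent = {solvent_fraction(v_m):.0%}")
```

Because V_M scales inversely with the assumed mass, including the mass of the attached carbohydrate would lower the computed value somewhat.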
Structure Determination and Refinement-The molecular replacement calculations were performed with AMoRe (30) using the atomic coordinates of CA XII (Protein Data Bank code 1JCZ) as a search probe (31). Rotation searches using diffraction data from 15-3.5 Å resolution yielded two clear solutions, as ranked by correlation coefficient. Subsequent translation searches placed two molecules in the asymmetric unit, and rigid body refinement lowered the R factor to 0.453. Model building was performed with the program O (32), and simulated annealing, conjugate gradient positional refinement, and temperature factor refinements were performed with CNS (33). Strict noncrystallographic symmetry constraints were applied during initial refinement cycles and were subsequently released into heavily weighted restraints (300 kcal/mol Å²) for all atoms except those that displayed obvious differences between the two molecules, e.g. due to crystal lattice contacts. The data were refined against a maximum likelihood target function as implemented in CNS, and a bulk solvent correction was employed (k_sol = 0.35 e Å⁻³ was defined) (34). Automatic B factor corrections were not used. Solvent molecules were built into the model at positions where the F_o − F_c maps contained peaks of ≥3.0σ that displayed appropriate hydrogen bonding interactions. The final model had R_cryst = 0.234 (R_free = 0.274) and included two copies of the CA14x polypeptide, two zinc ions, two acetate ions, four N-acetylglucosamine rings, one mannose ring, and 90 solvent molecules. Geometric parameters were analyzed with PROCHECK (35); a total of 85 and 14% of the backbone conformations adopt most favorable and additionally allowed conformations, respectively.
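For readers less familiar with the refinement statistics quoted above, the sketch below illustrates the conventional definition of the crystallographic R factor, the sum of the absolute differences between observed and calculated structure factor amplitudes divided by the sum of the observed amplitudes. It is an illustration only and not the CNS implementation: the amplitudes are hypothetical, they are assumed to be on a common scale already, and the split into working and held-out (free) reflections is hard-coded.

```python
def r_factor(f_obs, f_calc):
    """Conventional R factor: sum(||Fo| - |Fc||) / sum(|Fo|)."""
    if len(f_obs) != len(f_calc) or len(f_obs) == 0:
        raise ValueError("need two non-empty amplitude lists of equal length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# HYPOTHETICAL amplitudes (not data from this study), assumed pre-scaled.
work_obs, work_calc = [100.0, 50.0, 80.0, 40.0], [90.0, 55.0, 70.0, 42.0]
free_obs, free_calc = [60.0, 30.0], [50.0, 36.0]

r_cryst = r_factor(work_obs, work_calc)  # reflections used in refinement
r_free = r_factor(free_obs, free_calc)   # reflections held out of refinement
print(f"R_cryst = {r_cryst:.3f}, R_free = {r_free:.3f}")
```

R_free is computed from reflections excluded from refinement, which is why it is normally somewhat higher than R_cryst, as in the values reported here.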
The structure of the CA14x-acetazolamide complex was solved using the difference Fourier method starting from the wild-type CA14x structure less all zinc ions, acetate ions, solvent molecules, and sugar moieties. Rigid body, positional, and grouped temperature factor refinements in CNS resulted in a final model having R_cryst = 0.207 (R_free = 0.253). The refinement protocol was the same as that used for the native CA14x structure, except that an initial B factor correction was applied to the data. The final model contained two CA14x polypeptide chains, two zinc ions, two acetazolamide molecules, four N-acetylglucosamine rings, two mannose rings, and 32 water molecules. The data refinement statistics for both structures are recorded in Table I.

RESULTS AND DISCUSSION
Molecular Characterization and Enzyme Activity-Affinity-purified CA14x migrates as a 44-kDa polypeptide on SDS-PAGE (Fig. 1). However, the calculated molecular mass of CA14x deduced from its amino acid sequence is 29.5 kDa. Therefore, the increase in apparent molecular mass of CA14x and the presence of one consensus sequence for N-glycosylation suggest that murine CA XIV is a glycoprotein. Accordingly, the affinity-purified mouse CA14x produced in the glycosylation-defective Lec-1 cell line shows a slightly smaller apparent molecular mass on SDS-PAGE (data not shown). Moreover, CA14x eluted as a single peak of 120-kDa mass during size exclusion chromatography on Sephacryl S-300. Because the mass of the monomeric glycopeptide estimated by SDS-PAGE was 44 kDa, these results suggest either that native CA14x exists as a multimer in solution or that its migration on an S-300 column deviates from that of an ideal globular protein. X-ray crystallographic results support the latter explanation (see below).
The specific activity of pure recombinant glycosylated CA14x produced in Chinese hamster ovary cells was compared with unglycosylated murine CA II and glycosylated murine CAs IV and XII. The results are presented in Table II. The specific activity of CA XIV (3284 enzyme units/mg) is higher than that of any of the other isozymes investigated, including CA II. Unlike murine CA IV, CA14x and the secretory form of CA XII were SDS-sensitive. Membrane-associated CA XII and CA XIV were also SDS-sensitive, indicating that CA IV is more stable than even the wild-type transmembrane isozymes. Because CA IV, CA XII, and CA XIV share a common disulfide bond between Cys-23 and Cys-203, the increased stability of CA IV must be ascribed to the additional disulfide bond between Cys-6 and Cys-13 of this isozyme.
Overall Structure of Murine CA XIV-The structure of the extracellular catalytic domain of murine CA XIV reveals a polypeptide fold characteristic of the α-CA isozymes in which a 10-stranded β-sheet forms the core of the molecule (Fig. 2). A single disulfide linkage is present between residues Cys-23 and Cys-203 that is identical to disulfide bonds found in the membrane-associated isozymes CA IV and CA XII. This disulfide bond helps to stabilize a polypeptide loop in the active site containing Thr-199, a residue that promotes efficient catalysis by orienting the nucleophilic zinc-bound solvent molecule through a hydrogen bonding interaction (36,37). Additionally, because CA14x was produced in Chinese hamster ovary cells, the molecule is glycosylated. Both molecules in the asymmetric unit exhibit electron density consistent with N-glycosylation of Asn-195. The CA14x-acetazolamide structure exhibits the highest quality electron density for the carbohydrate. In molecule B, four of the sugar rings that form the core pentasaccharide commonly found in N-glycosylation are visible: two N-acetylglucosamine and two mannose moieties. Molecule A exhibits electron density for only the two N-acetylglucosamines.
Table II footnotes:
a Affinity pure carbonic anhydrase equivalent to 1 enzyme unit, or cell membrane suspension equivalent to 0.5 enzyme unit, was exposed to 0.2% SDS at room temperature for 30 min before activity assay.
b Murine CA IV has lower activity than murine CA II, as shown, although human CA IV and CA II both have comparable, high level activity (51).

The catalytic domain of CA XIV exhibits only minor differences compared with the structure of CA XII (31), the most closely related transmembrane isozyme. Two loop regions vary between these isozymes; CA XIV has an insert in the Gly-151-Glu-153 loop and a deletion in the Thr-233-Pro-240 loop relative to CA XII. A structure-based sequence alignment is presented in Fig. 3. The catalytic domains of CA XII and CA XIV contain N-glycosylation sites at differing locations in their sequence, but the protein backbones still superimpose well (root mean square deviation = 1.1 Å for 255 Cα atoms). The most striking difference between CA XII and CA XIV is quaternary structure; CA XII is a dimer with 2200 Å² buried surface area between monomers, whereas CA XIV appears monomeric. The largest surface area buried between adjacent CA XIV molecules in the crystal lattice is 730 Å² (365 Å²/monomer), which includes the surface area of the carbohydrate, whereas statistical analyses of the buried surface between a large sampling of biological dimers suggest a minimum buried surface area of 1700 Å² (860 Å²/monomer) (38). The packing of these two CA14x molecules also occludes the active site of one, further arguing against the existence of a functional CA XIV dimer, and the majority of amino acids present in the human CA XII dimer interface are not conserved in human or murine CA XIV. Even so, the proposed transmembrane α-helix of full-length CA XIV contains one GXXXG sequence motif, a feature that has been shown to promote helix-helix interactions in membranes (39,40). The CA XII transmembrane sequence contains a similar motif that is proposed to facilitate dimerization (31). Although the crystal structure of CA14x presents no evidence for a dimeric protein, we cannot exclude the possibility that the transmembrane α-helix of CA XIV promotes homodimerization or heterodimerization of the full-length enzyme in the membrane.
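The buried-surface argument for a monomeric CA XIV can be written out as a simple threshold comparison. The sketch below uses only the total buried areas quoted in the text and the minimum area suggested for biological dimers (38); the function name is hypothetical, and buried area alone is a heuristic, which is why the text also cites active site occlusion and the lack of interface conservation.

```python
# Minimum total buried surface area (A^2) suggested for biological dimers (38).
MIN_DIMER_BURIED_AREA = 1700.0

# Total buried surface areas (A^2) quoted in the text.
interfaces = {
    "CA XII dimer interface": 2200.0,
    "largest CA XIV lattice contact": 730.0,
}


def looks_like_biological_dimer(total_buried_area,
                                threshold=MIN_DIMER_BURIED_AREA):
    """Hypothetical helper: True if the interface meets the area threshold."""
    return total_buried_area >= threshold


for name, area in interfaces.items():
    verdict = "consistent with" if looks_like_biological_dimer(area) else "below"
    print(f"{name}: {area:.0f} A^2 is {verdict} the dimer threshold")
```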
Two other membrane-associated isozymes exist with similarity to CA XIV, namely CA IV and CA IX. The structure of the extracellular catalytic domain of CA IX is not known, but crystal structures of human and murine CA IV have been reported (41,42). Comparing the structures of CAs IV and XIV reveals an overall similarity of the β-sheet superstructure but highlights two regions of notable difference. First, the N terminus of CA IV has an insert of 5 amino acids relative to other α-CAs. This insert contains an additional, short α-helix that is not present in the CA XII or XIV structures. As a result of this extra sequence, the N-terminal portion of the CA IV structure makes van der Waals contact with the Leu-230-Gln-238 loop region of the molecule. Such interactions are absent in other CA structures (Fig. 4). CA IV also contains an extra disulfide bond within this small insert between Cys-6 and Cys-13. This extra disulfide bond and the contact between the N-terminal insert and the Leu-230-Gln-238 loop appear to stabilize this isozyme against inactivation by SDS, resistance to which is unique to CA IV. The second notable difference between the CA IV and CA XIV structures is found in the loop region from Ser-125 to Gly-140. In CA XIV and in other CA isozymes, this region contains a short α-helix that flanks the top edge of the active site cleft; however, in CA IV this same region exists as an extended loop exhibiting substantial disorder (41,42). Indeed, the difference in this loop region has been suggested to account for differences in inhibitor K_i values between human CAs II and IV (42,43).
Active Site Structure-As in other CA isozyme structures, the catalytic zinc ion is located at the bottom of a conical active site cleft where it is ligated by His-94, His-96, and His-119. The trigonal bipyramidal zinc coordination polyhedron is completed by a solvent molecule and an acetate ion from the crystallization buffer (Fig. 5A). Thr-199 accepts a hydrogen bond from the zinc-bound solvent molecule (presumably water at pH 4.8). One acetate oxygen atom coordinates to zinc, whereas the second acetate oxygen atom accepts hydrogen bonds from the zinc-bound solvent molecule and the backbone NH group of Thr-199. This arrangement is analogous to the binding of acetate to CA II and CA XII (31,44), and it mimics the binding of the bicarbonate ion product (37,45).
Amino acids in the immediate environment of the catalytic zinc ion are conserved across all active α-isozymes, and CA XIV is no exception. CA XIV also contains the proton shuttle residue His-64, which is present in all enzymes with high activity. Because proton release is the rate-limiting step in catalysis (46,47), isozymes without this residue display substantially lower activity (48). In CA XIV, His-64 adopts the "out" conformer (data not shown). This orientation is typically observed in CA structures solved at low pH (31,41,49). Because of the moderate 2.8 Å resolution of the CA XIV structure, no ordered solvent network is observed between His-64 and the zinc-bound solvent molecule. Such a solvent network was observed in the 1.54-Å resolution structure of CA II (50).
Inhibitor Binding to CA XIV-Structural similarity between CA XIV and other isozymes in the active site cleft adjacent to the catalytic zinc ion suggests that aromatic sulfonamides such as acetazolamide, a potent class of CA inhibitors, will bind tightly to CA XIV. Notably, 1 µM acetazolamide reduces CA XIV activity in extracts from COS-7 cells by 84% (7). Acetazolamide and a membrane-impermeable CA inhibitor, benzolamide, enhance the extracellular pH shifts in neurons (12,14), in which immunostaining has recently demonstrated the presence of CA XIV (9). To examine the binding of acetazolamide to murine CA XIV, we determined the crystal structure of the CA14x-acetazolamide complex at 2.9-Å resolution. Following x-ray crystallographic refinement and model building of the CA14x polypeptide, zinc ions, and carbohydrate, a clear, continuous feature of electron density was present in the active site in difference Fourier maps (Fig. 5B). As with acetazolamide binding to other isozymes, the ionized sulfonamide NH⁻ group coordinated to the zinc ion and donated a hydrogen bond to Thr-199. Additional hydrogen bond contacts were made between the backbone nitrogen atom of Thr-199 and a sulfonamide O atom and between the hydroxyl group of Thr-200 and a nitrogen atom on the 1,3,4-thiadiazole ring. The contacts made between acetazolamide and the CA XIV active site are depicted schematically in Fig. 5C.
Among the extracellular CA isozymes, unique variations in CA XIV occur at several positions near the top of the active site cleft (Fig. 6). This region lies adjacent to the end of the acetazolamide inhibitor in the CA XIV active site, and these differences could potentially be exploited in the design of isozyme-specific inhibitors. Notably, the residues lining the murine CA XIV active site cleft are identical to those in human CA XIV, allowing us to draw valid inferences about the human isozyme from the murine structure. In CA XIV, Tyr-204 is found in place of Asp or Asn residues in the human CA IV and XII sequences, a difference that eliminates a potential hydrogen bond interaction on one face of the active site cleft. Gln-67 from CA XIV replaces either Lys-67 or Met-67 from CA XII or CA IV, respectively, providing another unique feature to the cleft surface. Finally, the combination of Ala-91 and Leu-131 creates a hydrophobic patch in CA XIV where Thr-91 and Ala-131 are found in CA XII. CA IV also differs in this region because of the disorder of the "130's segment," which is found as an ordered α-helix in CA XII, CA XIV, and other isozymes.
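The residue differences described above can be collected into a small lookup table, which makes it easy to list the cleft positions at which one isozyme differs from the others. This is a sketch built only from the residues named in the text; entries the text does not specify are left as None, position 204 in CA IV and CA XII is recorded as "Asp or Asn" because the text does not say which isozyme carries which residue, and the function name is hypothetical.

```python
# Residues at variable active site cleft positions, as named in the text.
# None marks a residue that the text does not specify.
ACTIVE_SITE_VARIANTS = {
    67: {"CA XIV": "Gln", "CA XII": "Lys", "CA IV": "Met"},
    91: {"CA XIV": "Ala", "CA XII": "Thr", "CA IV": None},
    131: {"CA XIV": "Leu", "CA XII": "Ala", "CA IV": None},
    204: {"CA XIV": "Tyr", "CA XII": "Asp or Asn", "CA IV": "Asp or Asn"},
}


def unique_positions(isozyme, table=ACTIVE_SITE_VARIANTS):
    """Positions where `isozyme` differs from every other specified residue."""
    positions = []
    for position, residues in sorted(table.items()):
        own = residues.get(isozyme)
        if own is None:
            continue  # nothing stated for this isozyme at this position
        others = [res for name, res in residues.items()
                  if name != isozyme and res is not None]
        if others and all(res != own for res in others):
            positions.append(position)
    return positions


for pos in unique_positions("CA XIV"):
    print(f"CA XIV {ACTIVE_SITE_VARIANTS[pos]['CA XIV']}-{pos} is unique")
```

A table of this kind could be extended with further isozymes or positions when screening for substituents that contact the variable rim of the cleft.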
Potent inhibitors of CA isozymes bind to the catalytic zinc ion. Because of the absolute identity of amino acids adjacent to the catalytic metal ion in membrane-associated CA IV and transmembrane CA XII and CA XIV, identical intermolecular interactions are expected between the protein and the zinc-binding group for inhibitors of these enzymes. To achieve isozyme specificity, potential CA inhibitors should exhibit differences in regions removed from the zinc-binding head group that exploit the variations in the active site cleft discussed above. Compounds that bind with higher affinity to CA II versus CA IV exhibit just such differential interactions with the active site clefts in those enzymes (43). The extracellular placement of the membrane-associated CA active sites also facilitates their selective targeting by charged and/or membrane-impermeable compounds. This combination of extracellular active sites, subtle differences in molecular structure of the active site cleft, and tissue-specific expression profiles for the membrane-associated CA isozymes holds promise for their selective inhibition by small molecules.