The Active Site of a Lon Protease from Methanococcus jannaschii Distinctly Differs from the Canonical Catalytic Dyad of Lon Proteases

ATP-dependent Lon proteases catalyze the degradation of various regulatory proteins and abnormal proteins within cells. Methanococcus jannaschii Lon (Mj-Lon) is a homologue of Escherichia coli Lon (Ec-Lon) but has two transmembrane helices within its N-terminal ATPase domain. We solved the crystal structure of the proteolytic domain of Mj-Lon using multiwavelength anomalous dispersion, refining it to 1.9-Å resolution. The structure displays an overall fold conserved in the proteolytic domain of Ec-Lon; however, the active site shows uniquely configured catalytic Ser-Lys-Asp residues that are not seen in Ec-Lon, which contains a catalytic dyad. In Mj-Lon, the C-terminal half of the β4-α2 segment is an α-helix, whereas it is a β-strand in Ec-Lon. Consequently, the configurations of the active sites differ due to the formation of a salt bridge between Asp-547 and Lys-593 in Mj-Lon. Moreover, unlike Ec-Lon, Mj-Lon has a buried cavity in the region of the active site containing three water molecules, one of which is hydrogen-bonded to catalytic Ser-550. The geometry and environment of the active site residues in Mj-Lon suggest that the charged Lys-593 assists in lowering the pKa of the Ser-550 hydroxyl group via its electrostatic potential, and the water in the cavity acts as a proton acceptor during catalysis. Extensive sequence alignment and comparison of the structures of the proteolytic domains clearly indicate that Lon proteases can be classified into two groups depending on active site configuration and the presence of DGPSA or (D/E)GDSA consensus sequences, as represented by Ec-Lon and Mj-Lon.

In all cells, energy-dependent proteolysis plays a key role in the rapid turnover of short-lived regulatory proteins and in the elimination of defective and denatured proteins (1). Bacterial cells possess a number of ATP-dependent proteases, which are complex enzymes containing both ATPase and proteolytic activity as separate domains within a single polypeptide or as individual subunits within complex assemblies. Escherichia coli, for example, expresses five different ATP-dependent proteases: Lon, ClpAP, ClpXP, HslUV (ClpYQ), and FtsH (2). Homologous proteases have also been identified in archaea and eubacteria, as well as in numerous eukaryotes. Some archaeal Lons have one or two putative transmembrane regions, suggesting that they are membrane-associated (3). The proteolytic components of ATP-dependent proteases include several different types of active sites. For instance, ClpP is a classical serine protease (4), whereas HslV has a catalytic N-terminal Thr residue (5).
Lon was the first ATP-dependent protease to be described (6). Like molecular chaperones, Lon recognizes a broad range of proteins, and it mediates the turnover of abnormal and short-lived normal proteins. Indeed, through degradation of various specialized proteins, Lon is involved in the regulation of a number of biological functions (7). Moreover, it also reportedly acts as a DNA-binding protein, influencing the regulation of DNA replication and gene expression (8). E. coli Lon (Ec-Lon) is an 87-kDa protein containing N-terminal ATPase and C-terminal protease domains on a single polypeptide chain that is found as a homooligomer (9), although the oligomeric state of intact Lon has not been reported consistently. Still, recent observations suggest that Lon isoforms from several sources self-associate into hexameric or heptameric rings (10-12). Sequence comparisons suggest that Lon contains a catalytic Ser-Lys dyad, and the first crystal structure of the proteolytic domain from Ec-Lon confirmed the presence of a catalytic Ser-Lys dyad within a unique structural fold, distinct from that of the classical serine proteases (13,14). In the present study, however, we found that the Mj-Lon proteolytic domain employs a unique catalytic Ser-Lys-Asp triad. Extensive sequence alignment and comparison of the structures of their proteolytic domains clearly indicate that Lon proteases can be classified into two groups depending on the configuration of the catalytic residues in the active site, as represented by Ec-Lon and Mj-Lon.

MATERIALS AND METHODS
Protein Expression and Purification-The full-length lon gene was amplified by PCR using a Methanococcus jannaschii genomic library as a template. The resultant product was cloned into the NcoI and XhoI sites of the pET28b vector (Novagen). The coding region for the proteolytic domain of Mj-Lon (residues 456-649) was then amplified from a plasmid harboring the full-length lon gene and cloned into the NdeI and XhoI sites of pET28b. The resultant construct provides for an N-terminal His6 tag separated from the protein by a thrombin protease recognition site (LVP(R/G)S). The fusion protein was overexpressed in E. coli BL21(DE3). The cells were grown in LB at 37°C to an A600 of ~0.8, at which time expression was induced using 1 mM isopropyl-1-thio-β-D-galactopyranoside. Growth was continued at 30°C for 6 h, after which the cells were harvested, resuspended in binding buffer (500 mM NaCl and 50 mM NaH2PO4 (pH 7.5)), and broken by sonication. The lysate was then run on an immobilized nickel affinity column, after which the column was washed with binding buffer, and the fusion protein was eluted in binding buffer plus 500 mM imidazole. After concentration, the tag was cleaved from the protein by treating it with thrombin protease at room temperature overnight, and the cleaved protein was subjected to size exclusion chromatography on a Superdex 200 column (Amersham Biosciences) equilibrated with 10 mM Tris-HCl (pH 8.0) plus 50 mM NaCl. The fractions containing the recombinant protein were pooled and concentrated to 9 mg/ml. To prepare the Se-Met-enriched protein, the protease domain of Mj-Lon was expressed in E. coli BL21(DE3) using a protocol based on saturation of the Met biosynthetic pathway, because the Met auxotroph strain E. coli B834(DE3) grew poorly in supplemented M9 medium (15).
Crystallization-The proteolytic domain of Mj-Lon was crystallized at room temperature (20 ± 1°C) using the hanging drop vapor diffusion method. Initial seeds of multiple twinned crystals were grown on a siliconized coverslip by equilibrating a mixture containing 2 µl of protein solution (4.5 mg/ml protein in 50 mM NaCl and 10 mM MES-NaOH (pH 8.0)) and an equal volume of well solution (2.4 M ammonium sulfate and 100 mM MES-NaOH (pH 7.0)) against 1.0 ml of well solution. Under the same crystallization conditions, crystals were grown by microseeding with or without subsequent macroseeding to increase the size of single crystals. Within 1 week, single crystals grew to dimensions of 0.1 × 0.2 × 1.0 mm and were flash-frozen by direct transfer to Paratone-N cryoprotectant solution (Hampton Research).
Crystallographic Analysis-Se-Met multiwavelength anomalous diffraction data (Table I) were collected to 2.3-Å resolution from a single frozen crystal with an ADSC Quantum 4R CCD detector at beamline BL-18B at the Photon Factory, Tsukuba, Japan. The data set was processed and scaled using the HKL2000 package (16) and then handled with the CCP4 program suite (17). A second native data set from a Se-Met-labeled crystal was collected to 1.9-Å resolution with an ADSC Quantum Q210 CCD detector at beamline AR-NW12. Multiwavelength anomalous diffraction phasing was carried out using the programs SOLVE and RESOLVE (18). Despite the low proportion of anomalous scatterers and the presence of only two Se-Mets among 388 residues, the resultant electron density map, showing a dimer in the asymmetric unit, was readily interpretable. Automatic model building was then carried out using the program MAID (19), with which 88% of the structure was modeled. A partial model containing 342 residues was refined using the program CNS (20) for several cycles of B-factor refinement and simulated annealing. Model phases from the partial model were then applied to the 1.9-Å data set. Subsequent density modification with CNS yielded an electron density map of excellent quality showing a clear trace of all the residues except for 9 residues at the C termini. The remainder of the model was built manually into the density-modified map using the program O (21). The refinement of the native structure was completed with CNS to a final crystallographic R-factor of 22.6% and an Rfree of 26.3%. The final model covers residues 456-640. The stereochemistry of the model was analyzed using PROCHECK (22), which showed no residue to be in a disallowed region.
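For reference, the crystallographic R-factor and the free R value quoted above follow the standard definitions, sketched here in LaTeX. The symbols (F_obs, F_calc, and the test set T) are conventional notation that the text itself does not define, and any scale factor between observed and calculated amplitudes is assumed to be absorbed into F_calc:

```latex
R = \frac{\sum_{hkl} \bigl|\, |F_{\mathrm{obs}}(hkl)| - |F_{\mathrm{calc}}(hkl)| \,\bigr|}
         {\sum_{hkl} |F_{\mathrm{obs}}(hkl)|},
\qquad
R_{\mathrm{free}} = \frac{\sum_{hkl \in T} \bigl|\, |F_{\mathrm{obs}}(hkl)| - |F_{\mathrm{calc}}(hkl)| \,\bigr|}
                         {\sum_{hkl \in T} |F_{\mathrm{obs}}(hkl)|}
```

The first sum runs over the reflections used in refinement; the second runs only over the test set T, which is withheld from refinement so that it gives an independent check on the model.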

RESULTS AND DISCUSSION
Description of the Structure-Like other Lon proteases from various sources, the 71.9-kDa Mj-Lon contains an N-terminal ATPase domain and a C-terminal protease domain. The ATPase domain of Mj-Lon, which contains two transmembrane helices, shows low overall similarity to its bacterial and eukaryotic counterparts. As compared with the ATPase domains, the proteolytic domains are well conserved over all Lon families, with a minimal pairwise homology of 40%. The proteolytic domain of Mj-Lon shares 29% identity and 49% similarity with the proteolytic domain of Ec-Lon over 193 amino acids. In the present study, we determined the crystal structure of the proteolytic domain of Mj-Lon (residues 456-649) using multiwavelength anomalous dispersion and refined it to 1.9-Å resolution. The structure of the proteolytic domain consists of five α-helices and nine β-strands (Fig. 1A). The N-terminal β1 strand and antiparallel β2 strand form a long β-hairpin loop. The parallel β3 and β4 strands, which are connected by the longest helix (α1), form the first large β-sheet with the β1 and β2 strands. The subsequent helix α2 is kinked at Ser-550, which is a catalytic residue in this enzyme. Helices α1 and α2 interact with the first β-sheet, forming a compact N-terminal subdomain. The other subdomain, which is connected by a random loop (α2-β5), is composed of three parallel β-strands (β5, β8, and β9) surrounded by three α-helices. A β-hairpin loop composed of strands β6 and β7 connects strand β5 with helix α3, and subsequent strands and helices alternate along the primary structure.
The crystal structure of Mj-Lon showed that each asymmetric unit contains two molecules forming a dimer with a noncrystallographic 2-fold axis, presumably corresponding to the dimer in solution during purification. The dimeric structure of the proteolytic domain of Mj-Lon resembles a pair of lungs when viewed from the vertical with respect to the 2-fold axis (Fig. 1B). The dimer interface consists of a β-hairpin loop connecting strand β5, helices α3 and α4, the C-terminal end of helix α3, and 5 residues in the N-terminal subdomain from each monomer. The surface area buried upon dimer formation is 655 Å², or 7.5% of the total surface area of each monomer. The interface is composed of 64.0% nonpolar atoms and 35.0% polar atoms, and within the dimer, the monomers are held together by two hydrogen bonds and numerous van der Waals interactions. It is questionable, however, whether the crystal structure of the dimer is consistent with the assembly of proteolytic domains in the intact Lon protease. This is because the active sites of the two monomers are both completely buried within the interface, preventing exposure of the catalytic residues to the solvent unless the interface undergoes a marked conformational change. Considering that there are hydrophobic patches around the active sites, it may be that dimerization of these proteolytic domains through hydrophobic interactions occurs in the absence of the ATPase domains. Also, given that obstruction of the active sites of the Mj-Lon proteolytic domains by dimerization is not reasonable in vivo, it seems likely that the observed dimeric structure is artifactual. Therefore, despite the presence of an intact active site in the proteolytic domain, the dimerization might lead to loss of proteolytic activity.
Truncation of the N-terminal domain in Brevibacillus thermoruber Lon protease resulted in the failure of oligomerization and led to the inactivation of proteolytic and ATPase activities, indicating that the N-terminal domain is essential for the correct oligomerization of the protein (23). Intact Lon protease is reportedly active as a homooligomer comprised of 4-8 subunits. Lon proteases from several sources, including bovine (12), yeast (24), and Thermus thermophilus (25), self-associate into hexameric rings. Moreover, Botos et al. (14) reported that the recombinant catalytic domain of Ec-Lon protease assembled into a hexameric ring structure within the crystal, suggesting a hexameric configuration of the holoenzyme.

Comparison of the Mj-Lon and Ec-Lon Protease Domains-The overall fold of the Mj-Lon proteolytic domain is similar to that of Ec-Lon; the r.m.s. deviation is 1.22 Å for 140 equivalent Cα atoms. The configurations of the secondary structure elements are almost identical in the two structures, except for the segment between strand β4 and helix α2 (Fig. 2A). The residues extending from the middle of helix α2 to helix α3 (residues 549-594 in M. jannaschii), which contain 2 catalytic residues, are structurally well conserved, with a Cα r.m.s. deviation of 0.77 Å, and the catalytic residues, Ser-550 and Lys-593, are located at almost identical positions within each protease. The structural differences mainly originate from the loops connecting the secondary structure elements. The loop β1-β2 near the active site shows the highest conformational difference among the connecting loops. The most significant difference is in the N-terminal portion of helix α2, where the catalytic Ser residue is located. Mj-Lon protease has two additional turns of α-helix bent by 35 degrees at the N-terminal end of helix α2. In Ec-Lon, by contrast, the equivalent segment is a β-strand spanning 6 residues. Sequence comparison clearly shows that Mj-Lon has helix-forming residues in the segment between residues 542 and 550, whereas Ec-Lon has 2 Pro residues hindering the formation of an α-helix (Fig. 2B). Therefore, the conserved variation in the sequence of the β4-α2 segment results in a significant difference in the configuration of the active site in the two groups of Lon proteases represented by Mj-Lon and Ec-Lon, respectively. Although both proteases have a conserved Asp residue within the β4-α2 segment, only Asp-547 of Mj-Lon is located within salt bridge distance of the catalytic Lys. Consequently, the α-helical structure of the β4-α2 segment in Mj-Lon not only changes the shape of the active site but also significantly alters the properties of the catalytic Lys through the formation of a salt bridge.
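The r.m.s. deviations quoted for equivalent Cα atoms can be illustrated with a minimal Python sketch. It assumes the two structures have already been superimposed and that the equivalent atoms are supplied as paired coordinate lists; the function name and input layout are illustrative and are not taken from the programs used in this study.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in the units of the input, e.g. Å)
    between two equally long lists of (x, y, z) coordinates.

    Assumption: the two coordinate sets are already superimposed and
    listed in the same order of equivalent atoms.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between one pair of equivalent atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

With 140 paired Cα positions from the two proteolytic domains as input, a function of this form would return a single number comparable with the 1.22 Å reported above.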
Another striking feature of the Mj-Lon proteolytic domain is the presence of a cavity in each protomer containing three water molecules aligned consecutively from within hydrogen-bonding distance of the catalytic Ser-550 toward the interior of the protein (Fig. 3A). This cavity is completely buried and inaccessible from the surface of a monomeric protease domain. With a volume of 65 Å³, it contacts Ser-550 and is situated in a cleft between helix α2 and strands β1 and β5. The Ec-Lon proteolytic domain has a small cavity with a volume of 13 Å³, which contains one water molecule at a position corresponding to Wat3 of Mj-Lon. There is no contact with the active site residues, however. The known structure of the Ec-Lon proteolytic domain is that of an active site mutant (S679A), which does not faithfully represent the active site structure of the wild-type enzyme due to disruption of the hydrogen-bonding network caused by replacing the catalytic Ser-679 with an Ala. Still, the presence of additional water molecules in the cavity of wild-type Ec-Lon is not probable because the C-terminal half of the β1 strand of Ec-Lon is closer to helix α2 than that of Mj-Lon, resulting in shrinkage of the cavity surrounded by helix α2 and strands β1 and β5.
Active Site-The structure of the active site shows that Mj-Lon employs a pseudocatalytic triad comprised of Ser-550, Lys-593, and Asp-547 (Fig. 3B). Asp-547 and the catalytic residue Ser-550 are located on the same face of helix α2, oriented toward Lys-593 in helix α3. Superposition of the structures of the Ec- and Mj-Lon proteolytic domains shows that Ser-550 and Lys-593 share almost identical positions in the two enzymes, but Mj-Lon has an additional residue, Asp-547, that is located in the N-terminal end of helix α2 and interacts with the catalytic residues (Fig. 2A). The carboxyl group of Asp-547 is located at the first turn of helix α2 and makes a salt bridge with Lys-593 and a hydrogen bond with a water molecule. In Ec-Lon, the Asp residue is also conserved at the sequence level, but it is exposed to the solvent and not involved with the active site residues because the segment corresponding to the N-terminal end of helix α2 is a β-strand, which puts the Asp residue at a position distant from the active site (Fig. 3C).
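The notion of an Asp carboxylate lying within salt bridge distance of a Lys side chain can be expressed as a simple distance test. The Python sketch below is illustrative only: the 4.0-Å cutoff is a commonly used convention assumed here, not a value stated in this study, and the function name and coordinate inputs are hypothetical.

```python
import math

# Assumed cutoff (Å) between a carboxylate oxygen and the Lys NZ atom.
# This is a common convention, not a criterion given in the text.
SALT_BRIDGE_CUTOFF = 4.0

def is_salt_bridge(asp_carboxylate_oxygens, lys_nz, cutoff=SALT_BRIDGE_CUTOFF):
    """Return True if any Asp carboxylate oxygen (OD1/OD2) lies within
    `cutoff` Å of the Lys side-chain nitrogen (NZ).

    Coordinates are (x, y, z) tuples in Å.
    """
    return any(math.dist(oxygen, lys_nz) <= cutoff
               for oxygen in asp_carboxylate_oxygens)
```

Applied to the deposited coordinates, a test of this kind would be expected to pass for Asp-547 and Lys-593 of Mj-Lon and to fail for the equivalent, solvent-exposed Asp of Ec-Lon.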
The catalytic mechanism of the classical Ser-His-Asp triad is known to begin with the polarization of the Ser residue by the His residue, which abstracts the hydroxyl proton of the Ser residue. The Asp stabilizes the charged imidazole of the His, promoting its role as a base (26). Conceivably, the Lys in the Ser-Lys dyad may serve the same function as the His in serine proteases. Although Ec-Lon protease has no sequence or structural similarity to signal peptidase or the LexA family of serine proteases, the Ec-Lon proteolytic domain is believed to utilize a Ser-Lys dyad mechanism similar to that used by signal peptidase, in that it has conserved geometries of the Ser-Lys dyad and a Thr residue assisting the dyad (14,27). The structures of signal peptidases suggest that the hydrophobic environment surrounding the catalytic Lys ε-amino group is likely essential for lowering its pKa so that it can reside in the deprotonated state required for its role as a general base (28,29). It is supposed from the structural comparison that Lys-722 in Ec-Lon is not charged and acts as a general base, as there is no acidic residue in the vicinity of the catalytic Lys and the side chain Oγ of the conserved Thr might be within hydrogen-bonding distance of the catalytic Lys in the wild-type enzyme (14). It appears that the Ser-Lys dyad mechanism requires a third residue (Thr/Ser Oγ) for optimal activity, possibly serving to help align the general base Lys with the nucleophilic Ser hydroxyl via a hydrogen-bonding network. The role of such a Thr (Thr-704 in Ec-Lon) might be similar to that of the Asp present in the classic catalytic triad of serine proteases (30).
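The importance of lowering the pKa of the catalytic Lys can be made concrete with the Henderson-Hasselbalch relation, sketched below in Python. The pKa and pH values in the usage note are illustrative assumptions (a typical solution value of about 10.5 for a Lys side chain and a hypothetical lowered value of 8.0); none of them was measured in this study.

```python
def fraction_deprotonated(pka, ph):
    """Fraction of a titratable group present in its deprotonated form
    at a given pH, from the Henderson-Hasselbalch relation:

        [base] / ([acid] + [base]) = 1 / (1 + 10**(pKa - pH))
    """
    return 1.0 / (1.0 + 10.0 ** (pka - ph))
```

Under these assumptions, `fraction_deprotonated(10.5, 7.5)` is about 0.001, whereas `fraction_deprotonated(8.0, 7.5)` is about 0.24. A Lys with an unperturbed pKa would therefore rarely be available as a general base near neutral pH, which is why an environment that lowers its pKa is thought to be required in the dyad mechanism.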
The same mechanism is not applicable to the Ser-Lys-Asp triad of Mj-Lon because the chemical properties of the Lys-Asp pair are quite different from those of a deprotonated Lys. In the environment of the active site of Mj-Lon, Lys-593 forms a salt bridge with Asp-547 and, presumably, is therefore positively charged. Moreover, unlike in Ec-Lon, the conserved Thr-575 is not hydrogen-bonded to Lys-593 but to a water molecule (Wat1). Consequently, Lys-593 is unsuitable to function as a proton acceptor to abstract a proton from Ser-550. It is expected that the role of Lys-593 is to lower the pKa of the hydroxyl group of Ser-550 via its electrostatic potential so that the Oγ of Ser-550 can act as a nucleophile during proteolysis, as is seen in the structures of 20 S proteasomes (31).
The most probable proton acceptor is Wat1, which is located in the cavity and hydrogen-bonds with Ser-550. Wat1 is ideally positioned to act as the general base and promote the abstraction of a proton from the Ser-550 hydroxyl group, initiating nucleophilic attack on the carbonyl carbon of the peptide bond. The oxygen atom of Wat1 points directly at the hydroxyl of Ser-550 as if ready to deprotonate it, whereas the two hydrogen atoms of Wat1 are firmly hydrogen-bonded to the carbonyl oxygen of Leu-465 and the hydroxyl group of Thr-575, as is shown in the 2Fo − Fc electron density map (Fig. 4A, 4B). Coordination of Wat1 by carbonyl and hydroxyl oxygens suggests that it might act as a proton acceptor in the initial polarization of Ser-550 during catalysis. The presence of an additional carbonyl oxygen of Ser-550 near Wat1 may stabilize the protonated water molecule. The conserved Thr residue (Thr-575 in Mj-Lon), which assists the catalytic dyad in Ec-Lon by hydrogen bonding with the deprotonated Lys, hydrogen-bonds instead with Wat1 in Mj-Lon, implying that its function differs from that in Ec-Lon.
An important component of the catalytic machinery in serine proteases is the oxyanion hole that works by neutralizing the developing negative charge on the scissile carbonyl oxygen during the formation of the tetrahedral intermediates. Typical oxyanion holes are formed by two main-chain amide hydrogens that serve as hydrogen bond donors to the developing oxyanion (32). The oxyanion hole in Mj-Lon is formed by two main chain amide hydrogens, one originating from the catalytic Ser-550 and the other from the preceding residue, Asp-549.
Despite the similar geometries of Ser and Lys residues and the similarity of the oxyanion holes in signal peptidases and the proteolytic domain of Mj-Lon, it is conceivable that Mj-Lon does not employ a Ser-Lys dyad because the hydrogen-bonding environment for the catalytic Lys is quite different. Instead, a structural comparison suggests that the mechanism of Mj-Lon protease is comparable with the autolytic mechanism of prosequences in the 20 S proteasomes, which belong to the N-terminal threonine protease family. In 20 S proteasomes, a conserved Lys charged by the salt bridge with a nearby Asp lowers the pKa of the Thr-1 Oγ via its electrostatic potential, and a base close to Thr-1 plays a role in the proton transfer (33). Because the N-terminal amino group is not available to serve as a proton acceptor before activation of the protease through cleavage of the prosequence, a water molecule close to Thr-1 Oγ presumably serves as the proton acceptor in the initial autolysis step (31). Although the folds of the 20 S proteasome and Mj-Lon proteolytic domains are completely different overall, the presence of equivalent catalytic residues (Ser, Lys, and Asp) suggests that they share a similar proteolytic mechanism. However, the exact catalytic mechanism of Mj-Lon cannot be deduced from its structure. Further biochemical and structural analysis will be needed to reveal the precise mechanism.
Classification of Proteolytic Domains of Lon Proteases-A search using the proteolytic domain of Mj-Lon against the Swiss-Prot database with the BLAST algorithm identified over 100 different Lon proteases from 90 different organisms. Extensive sequence alignment of homologous proteins showed that the segment containing the catalytic residues (residues extending from helix α2 to α3, residues 545-596 in M. jannaschii) is well conserved among all Lon protease families; moreover, the catalytic residues, Ser-550 and Lys-593 in M. jannaschii, are strictly conserved, without exception. On the other hand, the residues preceding the catalytic Ser are clearly divided into two groups: Asp (Asp-549 in M. jannaschii) preceded by α-helix-forming residues and Pro (Pro-678 in E. coli) preceded by β-strand-forming residues (Fig. 2B). A third putative residue in the active site of Mj-Lon, Asp-547, is also well conserved at the sequence level as either an Asp or a Glu among all proteases but has a different structural configuration for each group of proteases as described earlier. Thus, the shape of the active site pocket will significantly differ, depending on whether a Pro or an Asp precedes the catalytic Ser. Consequently, the protease domains of Lon can be classified into two groups with consensus sequences of DGPSA or (D/E)GDSA in the active site, which are represented by the proteolytic domains of Ec- and Mj-Lon in the present study. Hereafter, these two types of Lon proteases will be referred to as type I and type II proteases, respectively.
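The two-group classification by consensus sequence lends itself to a simple pattern match. The Python sketch below is a minimal illustration that uses only the two motifs given above; the function name, the handling of sequences containing both motifs, and the example fragments in the usage note are assumptions made for illustration and are not real Lon sequences.

```python
import re

# Consensus motifs around the catalytic Ser, as given in the text.
TYPE_I_MOTIF = re.compile(r"DGPSA")      # Ec-Lon-like: Pro precedes the Ser
TYPE_II_MOTIF = re.compile(r"[DE]GDSA")  # Mj-Lon-like: Asp precedes the Ser

def classify_lon(sequence):
    """Classify a protein sequence (one-letter code) as 'type I',
    'type II', 'ambiguous' (both motifs found), or 'unclassified'.
    """
    seq = sequence.upper()
    has_type_i = TYPE_I_MOTIF.search(seq) is not None
    has_type_ii = TYPE_II_MOTIF.search(seq) is not None
    if has_type_i and has_type_ii:
        return "ambiguous"
    if has_type_i:
        return "type I"
    if has_type_ii:
        return "type II"
    return "unclassified"
```

For example, the made-up fragment "MKVDGPSAGIT" is classified as type I and "LIEGDSASIS" as type II. A bare motif search of this kind ignores the surrounding alignment, so in practice it would need to be restricted to the aligned active site segment of a confirmed Lon homologue.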
Comparison of the structures of the Mj- and Ec-Lon proteolytic domains revealed that these enzymes likely have different catalytic mechanisms, although they share the same overall fold and conserved dyad residues (Ser-550 and Lys-593 in Mj-Lon). The difference in the shapes of the active sites also implies that the substrate specificity and the catalytic activity of these proteases differ. Most type II Lon proteases are archaeal, but some have been identified in eubacteria (e.g. E. coli and Haemophilus ducreyi). Some microorganisms (e.g. E. coli, Bacillus subtilis, and Pseudomonas aeruginosa) express both Lon types. Lon homologues located in a second open reading frame found in various bacteria have been designated LonB and are not necessarily type II Lon proteases. It is clear that LonA (e.g. Ec-Lon and Mj-Lon) and LonB are distinct, as the latter contains an active site region (the catalytic Ser residue) but not an ATP-binding site (34). The presence of different types of Lon homologues within a cell suggests that they have distinct metabolic functions and substrate specificities. Type II Lon is not found in eukaryotes (e.g. yeast, mouse, human, and other higher organisms), implying that dyad proteases were selected evolutionarily for eukaryotic cellular function.