The steroid side-chain–cleaving aldolase Ltp2–ChsH2DUF35 is a thiolase superfamily member with a radically repurposed active site

An aldolase from the bile acid-degrading actinobacterium Thermomonospora curvata catalyzes the C–C bond cleavage of an isopropyl-CoA side chain from the D-ring of the steroid metabolite 17-hydroxy-3-oxo-4-pregnene-20-carboxyl-CoA (17-HOPC-CoA). Like its homolog from Mycobacterium tuberculosis, the T. curvata aldolase is a protein complex of Ltp2 with a DUF35 domain derived from the C-terminal domain of a hydratase (ChsH2DUF35) that catalyzes the preceding step in the pathway. We determined the structure of the Ltp2–ChsH2DUF35 complex at 1.7 Å resolution using zinc-single anomalous diffraction. The enzyme adopts an αββα organization, with the two Ltp2 protomers forming a central dimer and the two ChsH2DUF35 protomers at the periphery. Docking experiments suggested that Ltp2 forms a tight complex with the hydratase but that each enzyme retains an independent CoA-binding site. Ltp2 adopted a fold similar to those in thiolases; however, instead of forming a deep tunnel, the Ltp2 active site formed an elongated cleft large enough to accommodate 17-HOPC-CoA. The active site lacked the two cysteines that served as the nucleophile and general base in thiolases and replaced a pair of oxyanion-hole histidine residues with Tyr-294 and Tyr-344. Phenylalanine replacement of either of these residues decreased aldolase catalytic activity at least 400-fold. On the basis of a 17-HOPC-CoA-docked model, we propose a catalytic mechanism where Tyr-294 acts as the general base abstracting a proton from the D-ring hydroxyl of 17-HOPC-CoA and Tyr-344 as the general acid that protonates the propionyl-CoA anion following C–C bond cleavage.

for the removal of steroid waste in the environment, and the associated catabolic pathways have been exploited for the synthesis of steroidal pharmaceuticals (3). More recently, a cholesterol-degradation pathway has been found to be important for the persistence of Mycobacterium tuberculosis in host macrophages. Although drug-susceptible tuberculosis is treatable through a 6-month course of four first-line antibiotics, patient noncompliance in antibiotic use has resulted in the emergence of multidrug-resistant and extensively drug-resistant strains of the bacteria (4). The steroid degradation pathways and associated enzymes have therefore received attention as potential targets for the development of new antibiotics against tuberculosis (5,6).
Bacteria degrade the side chains attached to the D-ring of steroids utilizing a series of reactions analogous to fatty acid β-oxidation (7). In a typical β-oxidation cycle, a thiolase is responsible for a C-C bond cleavage, releasing a molecule of acetyl-CoA or propionyl-CoA from a 3-ketoacyl-CoA intermediate, resulting in the shortening of the side chain by 2-3 carbon atoms. However, in the last cycle of β-oxidation of the side chains of cholesterol or bile acids, the hydroxyl substituent on the tertiary D-ring carbon cannot be oxidized to form a 3-ketoacyl-CoA; thus, C-C bond cleavage cannot be performed by a thiolase. Instead, it has been determined recently that in M. tuberculosis a protein named Ltp2 is responsible for catalyzing a retro-aldol cleavage of the isopropyl side chain from the hydroxyl-substituted cholesterol metabolite, 17-HOPC-CoA (Fig. 1) (8). Ltp2 is encoded by the last gene in a cluster of five genes within an intracellular growth (igr) operon that is so named because of its importance for intracellular growth of M. tuberculosis in mouse macrophages (9,10). Ltp2 was also found to associate with the hydratase ChsH1-ChsH2 encoded by upstream genes in the igr operon (8). ChsH1-ChsH2 is a heteromeric enzyme related to the MaoC family of proteins that catalyzes the hydration reaction preceding the retro-aldol reaction in the pathway (11). The hydration reaction has an unfavorable equilibrium, and therefore, coupling of this reaction with the retro-aldol cleavage reaction catalyzed by Ltp2 enables the hydration reaction to proceed to completion. By   (8). The 3′-DNA fragment of chsH2 encoding the DUF35 domain can be co-expressed with ltp2 in the heterologous host, Rhodococcus jostii RHA1, and the overproduced Ltp2 and ChsH2 DUF35 proteins formed a catalytically competent complex with high aldolase activity. Proteins containing DUF35 domains have also been found in many bacteria and archaea, including those that are not steroid degraders (12).
These proteins are typically fused to, or associate with, thiolases, 3-hydroxyacyl-CoA dehydrogenases, crotonases, or acyl transferases, but the function of DUF35 proteins in these enzymes/protein complexes is currently not clear. Homologs of ltp2 and chsH2 are found in sterol/bile acid degradation gene clusters of other bacteria, suggesting that similar enzymes are used by diverse bacterial species to remove the last isopropyl side chains attached to the D-ring of various steroids (1,8,13). Here, we report the first structure of an Ltp2 in complex with the DUF35 domain of ChsH2. The enzyme was derived from the bile acid-degrading thermophilic actinobacterium, Thermomonospora curvata. Ltp2, ChsH1, and ChsH2 from T. curvata share 79, 65, and 55% sequence similarity, respectively, with the homologous proteins from M. tuberculosis. Together with supporting biochemical data from enzyme variants, the structure revealed the molecular basis for the divergence of functions between Ltp2 and the related SCP-2 family of thiolases.

Expression, purification, and characterization of the Ltp2-ChsH2 DUF35 complex of T. curvata
The genes encoding Ltp2 (Tcur3479) and the DUF35 domain of ChsH2 (ChsH2 DUF35, Tcur3482) from T. curvata were PCR-amplified, inserted into rhodococcal expression vectors, and transformed into R. jostii RHA1 for co-expression. The overexpressed, untagged ChsH2 DUF35 co-purified with the C-terminal His-tagged Ltp2 on a Ni-NTA column, indicating that the two proteins form a complex. Because the 3-carbon side-chain bile acid metabolites are not available commercially, steady-state kinetic parameters were determined for the Ltp2-ChsH2 DUF35 complex-catalyzed retro-aldol cleavage of the analogous cholesterol metabolite 17-HOPC-CoA (Table 1). Ltp2 can also be expressed by itself, without the DUF35 domain, in recombinant R. jostii RHA1, in good yield (4.8 mg/liter of culture) and is catalytically competent, albeit with a low specific activity of (4.2 ± 2.6) × 10⁻⁴ µmol·min⁻¹·mg⁻¹ toward 17-HOPC-CoA (21 µM). After a 10-min incubation of a 1:5 ratio of Ltp2/ChsH2 DUF35, specific activity increased to (1.4 ± 0.079) × 10⁻¹ µmol·min⁻¹·mg⁻¹. DUF35 by itself did not display any aldolase activity.
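The size of the activation by ChsH2 DUF35 follows directly from the two specific activities quoted above. The short Python sketch below computes the ratio and a first-order propagated uncertainty; it assumes the two reported errors are independent standard deviations, which the text does not state, and the function name is illustrative only. Because the relative error on the Ltp2-alone value is large, the propagated uncertainty should be read as a rough guide rather than a confidence interval.

```python
import math

def fold_change(low, low_err, high, high_err):
    """Return (high/low, first-order propagated error).

    Assumes the two measurements are independent (an assumption,
    not something stated in the text).
    """
    ratio = high / low
    relative_error = math.sqrt((low_err / low) ** 2 + (high_err / high) ** 2)
    return ratio, ratio * relative_error

# Specific activities as reported in the text (same units for both values).
ltp2_alone = (4.2e-4, 2.6e-4)
ltp2_with_duf35 = (1.4e-1, 0.079e-1)

ratio, ratio_err = fold_change(*ltp2_alone, *ltp2_with_duf35)
print(f"activation: about {ratio:.0f}-fold (propagated error about {ratio_err:.0f})")
```

On these numbers the DUF35 domain raises the specific activity by a little over 300-fold, with most of the uncertainty coming from the weak activity of Ltp2 alone.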

Structure of the Ltp2-ChsH2 DUF35 complex
The T. curvata Ltp2-ChsH2 DUF35 complex was crystallized, and the structure was determined from a dataset collected at the zinc anomalous peak (to 2.3 Å) using single wavelength anomalous diffraction. This preliminary structure was then used to refine a high-resolution native dataset collected to 1.7 Å. Table 2 lists the data collection and model refinement statistics. As anticipated from the clear sequence homology, Ltp2 structurally resembles thiolases (14). Ltp2 is built around a pair of pseudo-symmetric αβ domains (Fig. 2, A and B). The central mixed β-sheet of the N-terminal domain has topology 1,−5,4,2,3, whereas the C-terminal domain has topology 6,−9,8,7; sandwiched between these two β-sheets are the helices α3 (from the β3-β4 loop) and α11 (from the β7-β8 loop). The β4-β5 loop from the N-terminal domain is elaborated into an extended (~95 amino acids) subdomain with four α-helices (plus extensive loops); this subdomain packs predominantly on the C-terminal domain.
The ChsH2 DUF35 domain is built from two subdomains: an N-terminal zinc finger domain, and a C-terminal domain with an oligonucleotide/oligosaccharide binding (OB)-fold. The zinc finger domain has a three-stranded antiparallel β-sheet (topology 2,−1,3) supplemented by an N-terminal α-helix. The zinc ion is ligated by two pairs of cysteine residues contributed by the β1-β2 and β3-β4 loops. The OB-fold domain forms a five-stranded β-barrel, with topology 4,−5,6,8,−7.
As observed in other thiolase family proteins, the Ltp2 proteins interact extensively and predominantly through their N-terminal αβ domains to form a tight homodimer with an interface area of 2055.9 Å² (as determined by PISA (15)). The DUF35 domains are positioned at opposite ends of this central Ltp2 dimer, forming an overall αββα heterotetramer. The N-terminal α-helix and the OB subdomain of ChsH2 DUF35 interact with the α4- and α5-helices of one Ltp2 protomer (interface ~1500 Å²), whereas the zinc ribbon domain makes limited interactions with the α1- and α2-helices of the opposite Ltp2 protomer (interface area ~460 Å²). These interactions are predominantly hydrophobic in nature.

Structure of Ltp2
The observed organization of the Ltp2-ChsH2 DUF35 heterotetramer, with both DUF35 domains exposed along one edge of the complex, is consistent with the idea that DUF35 domains act as molecular staples mediating the formation of bifunctional enzyme complexes with thiolase family enzymes as (at least) one key component. The first precedent for this organization is the acetoacetyl-CoA thiolase/3-hydroxy-3-methylglutaryl (HMG)-CoA synthase complex from M. thermolithotrophicus (16). In this complex, a thiolase dimer of dimers occupies the center of the complex, whereas the HMG-CoA synthase occupies the periphery. The DUF35 domain is sandwiched between these two proteins, stabilizing the overall complex. Although the overall architecture of the Ltp2-ChsH2 DUF35 complex resembles these bifunctional complexes, the Ltp2 aldolase itself most closely resembles monofunctional SCP-2 thiolases (18) such as Leishmania mexicana thiolase (PDB 3zbn; r.m.s.d. 1.8 Å). Interestingly, Ltp2 is significantly more distantly related to FadA5, which also degrades steroids, mediating C-C bond cleavage during the β-oxidation of 8- or 5-carbon side-chain cholesterol metabolites in M. tuberculosis (19) (PDB 4ubw; r.m.s.d. 2.45 Å). Together, these observations suggest that steroid specificity may have evolved separately in these two thiolase superfamily members.
It is also interesting to note that the OB-fold of DUF35 resembles the C-terminal domain of a thiolase-like protein (TLP) from Mycobacterium smegmatis (PDB 4egv) (20). TLP is a monomeric protein that contains a C-terminal β-barrel domain linked to an N-terminal thiolase-fold. Ltp2 together with the associated OB-fold domain of DUF35 superimposes well on TLP (r.m.s.d. of 2.1 Å). This suggests a possible precursor to the evolution of the DUF35 domain, which so far has been found to occur only as an accessory domain to thiolase family members.

Modeling the full Ltp2-ChsH1-ChsH2 complex
Attempts to crystallize the full Ltp2-ChsH2-ChsH1 complex of T. curvata were unsuccessful. Because the structure of ChsH1-ChsH2 from M. tuberculosis containing the MaoC hydratase domains but lacking the DUF35 domain is available, we modeled the complex using docking. The structure of the M. tuberculosis ChsH1-ChsH2 MaoC complex (PDB code 4wnb) shows a heterotetrameric organization, where one ChsH1 and one ChsH2 MaoC chain form a functional heterodimer with one catalytic site; these heterodimers then interact through a series of α-helices. The native molecular mass of the T. curvata ChsH1-ChsH2 was determined to be 131 kDa by size-exclusion chromatography (data not shown), consistent with this complex also having a heterotetrameric organization. Symmetry provides a strong constraint on the bifunctional complex as oligomers almost universally interact by aligning symmetry elements so that the interfaces are also symmetric (21); here, both sub-complexes have only a single 2-fold symmetry axis, and these axes are very likely aligned. The ChsH2 DUF35 domains are fused to the C terminus of the ChsH2 MaoC hydratase domain (ChsH2 MaoC), with an ~16-amino acid linker separating the domains; this linker dictates which faces of the two sub-complexes can approach one another, and it also constrains how far the respective termini of the two domains can separate. Finally, both proteins have a saddle-like shape along this axis, suggesting that optimal interactions will involve fitting these saddles to one another. We manually docked the Ltp2-ChsH2 DUF35 to the ChsH1-ChsH2 MaoC complex and then used a customized script to generate a series of variants of this configuration that conform to the above constraints; RosettaDock was then used to optimize these candidate starting poses and score the refined complexes. The lowest energy solutions all closely resemble that shown in Fig. 3A, where the two protein complexes interact intimately. It should be noted that the ChsH2 MaoC-DUF35 linker and the Ltp2 groove-covering loop may also pack between these domains and influence optimal packing, whereas the exposed, extended loops β4-β5 and β7-β8 within the DUF35 domain are also likely to shift so as to maximize interactions. Nevertheless, this complex as depicted buries a respectable total surface of 1370 Å² with a total solvation energy change of −12.1 kcal/mol (PISA). We therefore anticipate that this model is reasonably accurate in terms of the overall complex geometry, although details including side chain and loop packing are likely not fully captured.

Table 1 footnote: The Y294F/Y344F variant has no detectable activity. The Y294F and G82P variants had very low activity, which precluded the determination of K m and k cat values. At the respective concentrations of 25 and 23 µM 17-HOPC-CoA substrate, the Y294F variant had a specific activity of (5.3 ± 2.3) × 10⁻⁴ µmol·min⁻¹·mg⁻¹, and the G82P variant had a specific activity of (5.8 ± 0.59) × 10⁻⁴ µmol·min⁻¹·mg⁻¹. For comparison, the V max of the wildtype enzyme is 7.4 ± 0.21 µmol·min⁻¹·mg⁻¹. ND means not determined; NA means no detectable activity.
The generated model suggests that the active sites of the ChsH1-ChsH2 MaoC and Ltp2-ChsH2 DUF35 aldolase complexes face one another and may make minor contributions to one another's active site. In particular, the N-terminal helix and the β7-β8 loop of the DUF35 domain may directly contribute to binding the A-ring of 3-OPDC-CoA by ChsH1-ChsH2 MaoC. In return, ChsH1-ChsH2 MaoC may possibly help organize the substrate groove-covering loop of Ltp2. It is worth noting that the CoA-binding site of ChsH1-ChsH2 MaoC is too distant to allow the steroid ring/side chain to reach the Ltp2 active site (Fig. 3A). Ltp2 is therefore suggested to bind CoA independently, as supported by the observation that the hydratase and aldolase can each function efficiently without the partner subcomplex present. This observation also implies that the substrate needs to fully release and rebind between reactions, meaning that this complex will likely not exhibit the robust substrate channeling between active sites observed in the archaeal HMG-CoA synthase complex (16).

Active-site organization and enzymatic mechanism of Ltp2
In canonical thiolases, e.g. L. mexicana thiolase, the active site is located at the bottom of a long, relatively narrow tunnel that envelops the substrate and connects to the adenosine moiety on the enzyme surface via the pantetheinyl arm (Fig. 4A). A surface-rendering of Ltp2-ChsH2 DUF35, in contrast, shows that the active site takes the form of a long (~45 Å), deep (~15 Å), and fairly wide (8-15 Å) groove that spans the full distance between both CoA-binding sites (Fig. 4B). This groove is built entirely from Ltp2 residues (with no contributions from the ChsH2 DUF35 domain) and is quite hydrophobic, at least along the bottom of the channel. This groove is less open than it appears, as it is bridged by the loop composed of residues 119-125, which is disordered in this apo structure. This loop contains several conserved nonpolar residues; we expect that this region becomes ordered either upon substrate binding, by packing on the extended nonpolar surface offered by 17-HOPC-CoA, or by packing on the ChsH1-ChsH2 MaoC hydratase domains. It is worth noting, however, that these disordered motifs are considerably shorter than the equivalent motifs in e.g. L. mexicana thiolase, where they form an extended, well-ordered helix (α6) that encloses the active site, whereas the shifting and shortening of other motifs also contribute to the formation of this extended open groove in Ltp2. The openness of this substrate-binding groove is therefore likely an adaptive feature that is needed to accommodate a large sterol substrate, an idea that is buttressed by the similarly open active site of the sterol thiolase, FadA5 (19).
The thiolase reaction has been intensively studied and is critically dependent upon a pair of conserved catalytic cysteines (22). Considering the reaction in the thiolytic cleavage direction, the nucleophilic cysteine attacks the β-carbonyl carbon of the β-ketoacyl-CoA substrate, forming an acyl-enzyme intermediate. The key to the energetic accessibility of this reaction step is resonance stabilization of the negative charge on the acetyl-CoA leaving group by the thioester. A second conserved cysteine residue then serves as a general acid that donates a proton to the α-carbon of the acyl-CoA anion. Finally, the acyl-enzyme intermediate is resolved by a second CoA molecule attacking the carbonyl group. In addition to the nucleophile and base, the active site also requires two oxyanion holes, which accommodate negative charges that develop on the two carbonyl groups in different steps in the reaction; these oxyanion holes are typically formed by His and Asn residues whose identity varies among thiolase sub-families. Comparison of the Ltp2 catalytic site to L. mexicana thiolase reveals that none of the key residues that mediate the thiolase reaction are conserved (Fig. 4C). In particular, the nucleophilic Cys-123 has been replaced by glycine, Gly-82. The acid/base Cys-340, which protonates the α-carbon of the leaving group, has been replaced by His-296. Finally, His-338 and His-388, which together form the oxyanion hole that stabilizes the CoA-acyl oxygen (O1), are replaced by a pair of tyrosine residues, Tyr-294 and Tyr-344. These radical changes in all elements of the catalytic machinery suggest that the aldolase is likely to utilize the basic thiolase superfamily active-site architecture very differently than thiolases proper.
Extensive attempts to co-crystallize or soak Ltp2-ChsH2 DUF35 with the substrate, 17-HOPC-CoA, or the products androstenedione and propionyl-CoA did not yield structures containing ligands. The position of 17-HOPC-CoA was therefore manually modeled in the Ltp2 structure to obtain insights into the binding mode and mechanism (Fig. 3, B and C). In most protein families, co-factor binding is quite conservative and so can be reliably modeled where co-structures of other family members serve as a guide. The details of the CoA-binding mode are, however, quite variable among thiolase superfamily members, but the HMG-CoA thiolase structure seems to serve as a reasonable model, placing the negative charges associated with the adenosine group in a positively charged pocket that includes Arg-203 (absolutely conserved) and His-253 (ChsH2 DUF35 domain) near the pyrophosphate, and Arg-192 and Arg-178 (conserved as Arg or Lys) adjacent to the 3′-phosphate. There are fewer a priori constraints on the steroid ring placement; in addition, the stereochemistry around C17 and C20 of 17-HOPC-CoA has not been experimentally determined. However, the hydrophobic-binding groove is only just wide enough to accommodate the ring, and the D-ring is constrained to be positioned adjacent to the CoA-binding site. In this orientation, the steroid ring only fits well with its methyl-free side pointing toward the protein surface, packing intimately against two of a trio of absolutely conserved glycine residues, Gly-82 and Gly-83. Puckering the A-ring maximizes nonpolar interactions while placing O3 in hydrogen-bonding distance to Thr-55 (conserved as Ser/Thr in Ltp2 homologs). The B- and D-rings form extensive nonpolar interactions with Met-255, Met-256, and Val-380, which form an extended hydrophobic wall of the pocket. This model places the critical 17-HOPC C17 and C20 carbons in proximity to a set of four absolutely conserved polar residues (Fig. 3C), including Tyr-294 and Tyr-344 (recall that these replace the thiolase oxyanion hole histidine residues). O17 makes hydrogen bonds with Gly-83N and the phenolic hydroxyl of Tyr-294, whereas the Tyr-344 phenolic hydroxyl interacts closely with the 17-HOPC-CoA C20 hydrogen atom. The edge of the His-346 ring stacks on the C-ring of the modeled substrate, whereas His-296 stacks on the thioester group, with neither residue seeming suitably positioned to form hydrogen bonds (Fig. 3C). Note that this model predicts that 17-HOPC-CoA has a 17(S), 20(S) absolute configuration.

substrate 17-HOPC-CoA (Table 1). All Ltp2 variants can be purified as complexes with ChsH2 DUF35, indicating proper folding. Our model predicts that the 17-HOPC A-, B-, and C-rings wrap around Gly-82; replacing this residue with proline reduced activity by about 4 orders of magnitude, consistent with the introduction of significant steric clashes. Replacement of His-296 and His-346 with alanine led to a modest reduction in k cat of about 40-fold, and a decrease in K m by about 3-4-fold, suggesting that these residues, although contributing, are not critical to the mechanism. Replacement of Tyr-344 by phenylalanine led to a 400-fold reduction in specific activity, suggesting that the phenolic oxygen is critical for activity. Replacement of Tyr-294 by phenylalanine reduced catalytic activity by some 4 orders of magnitude. Finally, the Y294F/Y344F double mutant abolished activity.
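The "4 orders of magnitude" figures for the G82P and Y294F variants can be checked against the specific activities and the wildtype V max given in the Table 1 footnote. The Python sketch below does this arithmetic. It compares variant activities measured at roughly 25 µM substrate with the wildtype V max, which is the comparison the footnote itself offers but is only an approximation, since the variant values are not saturating rates. The dictionary and function names are illustrative.

```python
import math

# Values quoted in the Table 1 footnote (all in the same activity units).
WILDTYPE_VMAX = 7.4
VARIANT_SPECIFIC_ACTIVITY = {
    "Y294F": 5.3e-4,
    "G82P": 5.8e-4,
}

def fold_reduction(wildtype, variant):
    """Ratio of wildtype to variant activity."""
    return wildtype / variant

for name, activity in VARIANT_SPECIFIC_ACTIVITY.items():
    ratio = fold_reduction(WILDTYPE_VMAX, activity)
    orders = math.log10(ratio)
    print(f"{name}: ~{ratio:,.0f}-fold lower ({orders:.1f} orders of magnitude)")
```

Both variants come out at a little over 10,000-fold below wildtype, in line with the description in the text.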
On the basis of our model and mutagenesis data, we propose that the catalytic mechanism for the Ltp2-catalyzed retro-aldol cleavage of 17-HOPC-CoA revolves around a pair of catalytic tyrosine residues, Tyr-294 and Tyr-344 (Fig. 5). In particular, Tyr-294 acts as a base, abstracting a proton from the hydroxyl group on the D-ring (O17) of the steroid substrate, leading to C-C bond cleavage. Tyr-344 acts as a general acid, donating a proton to the Cα of propionyl-CoA (C20 of 17-HOPC-CoA). The histidine residues play relatively minor roles, and their exact contribution is unclear; possibly His-296 could reorient to interact with the thioester carbonyl during the enzyme's functional cycle, forming an oxyanion hole to stabilize the carbonyl oxygen. The His-346 imidazole is oriented toward the phenolic ring of Tyr-344, and the resulting cation-π interaction may play a role in positioning this residue, and possibly fine-tuning its pKa. Enzymes that utilize tyrosine residues as general acids/bases include vanillyl alcohol oxidase and fructose-1,6-bisphosphate aldolases. In vanillyl alcohol oxidase, two tyrosine residues are capable of deprotonating the phenolic substrate during catalysis (23). In Thermoproteus tenax fructose-1,6-bisphosphate aldolase, a tyrosyl residue was proposed to be the catalytic base that abstracts the proton from the C4-OH of the substrate (24). In rabbit muscle fructose-1,6-bisphosphate aldolase, however, a tyrosyl residue acts as a general acid to donate a proton to the carbinolamine intermediate (25). The fructose-1,6-bisphosphate aldolases are, however, evolutionarily, structurally, and mechanistically distinct from Ltp2-ChsH2 DUF35, and they do not utilize β-hydroxy-CoA thioesters as substrates.
It is also interesting to note that although 17-HOPC-CoA is a functional in vitro substrate, the likely in vivo substrate of this specific enzyme is a closely analogous bile acid degradation metabolite, consistent with the ability of T. curvata to grow on bile acids as sole carbon sources (1). Bile acids differ from cholesterol in the steroid ring by hydroxyl substituents at C7 (B-ring) and/or C12 (C-ring), rendering them significantly more polar. Inspection of the 17-HOPC-CoA docked model suggests that an oxygen atom on C12 may be able to form hydrogen bonds with His-346 and Tyr-344. A hydroxyl group in this position might be expected to improve binding and may explain the conservation of His-346, despite its modest contribution in the 17-HOPC-CoA reaction. A hydroxyl substituent on C7 would, however, point into a shallow, wholly nonpolar pocket. This hydroxyl could possibly be partially accommodated by shifting Met-255 into its observed alternate conformation; while preventing steric clash, this interaction would still exact an enthalpic penalty upon binding, due to the cost of desolvating the hydroxyl group. Interestingly, other Ltp2 homologs often replace this methionine with serine or threonine, residues that could potentially form favorable hydrogen bonds with O7. We therefore predict that T. curvata Ltp2 may prefer deoxycholate metabolites over cholate or chenodeoxycholate metabolites.

Discussion
Ltp2-ChsH2 DUF35 shows clear structural and sequence homology to SCP-2 thiolases and seems to have evolved by repurposing the thiolase machinery to catalyze an aldolase reaction. In the canonical thiolase reaction, facile breaking of C-C bonds depends on the stabilization of the anionic thioester intermediate by resonance within the thioester group. Ltp2 presumably exploits the same stabilization effect to allow an aldolase reaction. Aldolases are generally divided into two groups: class I aldolases stabilize the leaving group by forming an amine adduct, and class II aldolases do so by forming a divalent metal ion complex (26). The Ltp2-type aldolases potentially represent an alternative solution to the problem of aldol leaving group stabilization, using, in this case, the thioester group. Other enzymes that also catalyze retro-aldol cleavage of β-hydroxythioesters include malyl-CoA lyase (27) and citrate synthase (28).
Ltp2 almost certainly evolved from an SCP-2-like thiolase, but this process has substantively remodeled the protein. In particular, not only has every critical catalytic residue been replaced, but the active site has also been dramatically opened up to accommodate the large, rigid steroid substrate. Replacing multiple catalytic residues and remodeling the active site are unlikely to have occurred in a single step; specific features of this transition suggest that the aldolase reaction evolved first, and sterol specificity second. First, the evolution of aldolase activity in the context of SCP-2 thiolases in particular may have been facilitated by the use of a pair of histidine residues to form the oxyanion hole in this family (18). These histidine residues are necessarily positioned to stabilize the anionic intermediate in SCP-2 thiolases; this also ideally positioned them to potentially initiate an aldolase reaction by deprotonating a similarly positioned hydroxyl group, exploiting the inherent facility of histidine for performing acid/base reactions near neutral pH. The use of histidine in SCP-2 thiolases may therefore have predisposed this family to evolve aldolase activity, with the tyrosines possibly a later optimization. Second, in Ltp2 the nucleophilic cysteine of thiolases has been replaced with a glycine; this glycine is intolerant of bulkier substitutions and appears, in modeling, to stack directly on the steroid ring. This suggests an evolutionary history where aldolase activity originally evolved in the context of a different (possibly smaller) substrate, and steroid specificity only arose once the cysteine was dispensable. The alternative evolutionary history, where Ltp2 evolved from a steroid-specific thiolase similar to FadA5, would require that a new steroid-binding site evolved closer to the backbone after the switch to an aldolase reaction permitted by replacement of the cysteine.
In this context, it is worth noting that the SCP-2 family includes thiolase-like proteins of unknown function where the nucleophilic cysteine has been lost, but the active site-covering helix appears to be retained (Fig. 4C) (22). The Pseudomonas protegens enzyme has a thiolase ancestor-like histidine in place of Tyr-344 (17), whereas the Trypanosoma brucei protein has an alanine replacing this critical residue, but retains the thiolase acid/base cysteine (29). It will be interesting to determine whether these enzymes represent additional thiolase superfamily aldolases, albeit ones that may differ in the details of their catalytic mechanism, and act on less bulky substrates; if so, these enzymes would be strong candidates from which the sterol aldolases evolved.

Bacterial strains and plasmids
T. curvata DSM 43183 was obtained from the Leibniz Institute German Collection of Microorganisms and Cell Cultures. R. jostii RHA1 was obtained from Dr. Lindsay Eltis (Department of Microbiology and Immunology, University of British Columbia, Canada).

DNA manipulation
The genes encoding Ltp2 (Tcur3479) and the DUF35 domain of ChsH2 (Tcur3482) were amplified from T. curvata DSM 43183 genomic DNA using primers indicated in Table 3. The 3′ primer for ltp2 omits the stop codon to append a C-terminal His-tag in the expressed protein. The genes encoding Ltp2 and the ChsH2 DUF35 were inserted into the NdeI/HindIII sites of pTIP-Qc2 and pTIP-RT2 (30), respectively. Both plasmids were transformed into R. jostii RHA1 for co-expression. Site-specific mutagenesis was carried out according to the modified QuikChange method (31) using primers listed in Table 3, except for the G82P variant that was created by gene synthesis (Biobasic Inc.).
Recombinant R. jostii RHA1 containing pTIPQC2/ltp2 and pTIPRT2/chsH2 duf35 was grown at 30°C in 4 liters of LB medium supplemented with chloramphenicol (25 µg/ml) and tetracycline (8 µg/ml). Expression of the recombinant proteins was induced with thiostrepton (1 µg/ml) at mid-log phase (OD600 = 0.4-0.6). Cultures were grown for a further 24 h at 30°C, and cells were subsequently collected by centrifugation at 9605 × g for 10 min.
R. jostii RHA1 cell pellets were resuspended in 20 mM HEPES, pH 7.5, and cells were lysed through a French press at 20,000 p.s.i. Insoluble fractions were removed by centrifugation at 39,191 × g twice for 15 min. Cell-free extracts were filtered through a 0.45-µm filter and incubated for 1 h at 4°C with Ni²⁺-NTA resin in buffer containing 50 mM sodium phosphate, 300 mM sodium chloride, and 20 mM imidazole, pH 8.0. The mixture was poured into a gravity column and washed with the same buffer. The His-tagged proteins were eluted with buffer containing 150 mM imidazole, pH 8.0. The buffer was exchanged into 20 mM sodium HEPES, pH 7.5, by dilution in a stirred cell equipped with a YM10 filter (Amicon). Purified enzymes were stored at −80°C.

Protein concentration, purity, and molecular weight determination
Concentrations of proteins were determined by the Bradford assay using BSA as the standard (34). Coomassie Blue-stained SDS-PAGE and Image Lab Software (Bio-Rad Inc) were used to assess purity of the enzymes. The native molecular weight of ChsH1-ChsH2 MaoC was determined by gel filtration on a Superdex 200 column (GE Biosciences) with 20 mM HEPES buffer, pH 7.5, containing 0.15 M NaCl as the equilibration and elution buffer. The standard curve consisted of the proteins cytochrome c (Mr = 12,400), carbonic anhydrase (Mr = 29,000), BSA (Mr = 66,000), alcohol dehydrogenase (Mr = 150,000), and β-amylase (Mr = 200,000) (all from Sigma).
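A gel-filtration standard curve of this kind is commonly analyzed by fitting log10(Mr) of the standards against elution volume and reading off the unknown. The Python sketch below shows that calculation using the five standards listed above. The elution volumes are not reported in the text, so the sketch generates synthetic volumes from an arbitrary straight line purely to demonstrate the fit; they are not measured data, and the helper names are illustrative.

```python
import math

# Standards and Mr values as listed in the text.
STANDARDS_MR = {
    "cytochrome c": 12_400,
    "carbonic anhydrase": 29_000,
    "BSA": 66_000,
    "alcohol dehydrogenase": 150_000,
    "beta-amylase": 200_000,
}

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def estimate_mr(elution_volume, slope, intercept):
    """Convert an elution volume to Mr using the fitted calibration line."""
    return 10 ** (slope * elution_volume + intercept)

def synthetic_volume(mr):
    """SYNTHETIC placeholder: assumes log10(Mr) = 8.0 - 0.25 * Ve (arbitrary)."""
    return (8.0 - math.log10(mr)) / 0.25

volumes = [synthetic_volume(mr) for mr in STANDARDS_MR.values()]
log_mr = [math.log10(mr) for mr in STANDARDS_MR.values()]
slope, intercept = fit_line(volumes, log_mr)

# With real data, replace `volumes` with measured elution volumes and pass
# the sample's elution volume to estimate_mr.
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

With measured elution volumes in place of the synthetic ones, the same two functions would return the native Mr estimate for the sample.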

Steady-state kinetic assays
The retro-aldol activity of the aldolase Ltp2-ChsH2 DUF35 toward 17-HOPC-CoA was determined spectrophotometrically using the previously reported coupled assay (8). Briefly, assays were performed at least in triplicate in a total volume of 1 ml at 25°C, using a Varian Cary 3 spectrophotometer equipped with a temperature-controlled cuvette holder (ε = 6200 M⁻¹ cm⁻¹). Assays were performed in 100 mM HEPES, pH 7.5, in the presence of 200 µM NADH and 16 µM of the aldehyde dehydrogenase TTHB247 from Thermus thermophilus (8).
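In a spectrophotometric assay of this type, an absorbance slope is converted to a specific activity through the Beer-Lambert law. The Python sketch below shows the conversion for the 1-ml assay volume described above. It treats the quoted coefficient of 6200 as a molar absorption coefficient in M⁻¹ cm⁻¹ and assumes a 1-cm path length; both are assumptions, as are the example absorbance slope and enzyme amount, which are hypothetical and not taken from the paper.

```python
def specific_activity(delta_a_per_min, enzyme_mg,
                      epsilon=6200.0, path_cm=1.0, volume_ml=1.0):
    """Convert an absorbance slope to specific activity (umol/min/mg).

    epsilon   : molar absorption coefficient, assumed to be in M^-1 cm^-1
    path_cm   : cuvette path length (1 cm assumed)
    volume_ml : assay volume (1 ml, as stated in the text)
    """
    # Beer-Lambert: concentration change per minute, in mol/L/min.
    rate_molar_per_min = delta_a_per_min / (epsilon * path_cm)
    # Amount converted per minute in the cuvette, in micromoles.
    umol_per_min = rate_molar_per_min * (volume_ml / 1000.0) * 1e6
    return umol_per_min / enzyme_mg

# Hypothetical example: slope of 0.062 absorbance units/min with 1 ug enzyme.
example = specific_activity(delta_a_per_min=0.062, enzyme_mg=0.001)
print(f"{example:.2f} umol/min/mg")
```

The background rate without enzyme would normally be subtracted from the slope before conversion.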

Crystallization and structure determination
Conditions for crystallization of Ltp2-ChsH2 DUF35 were screened using the JCSG+ kit (Molecular Dimensions Inc.). Crystals used for data collection were grown using the sitting drop method at 10°C, with 2 µl of reservoir solution (0.2 M sodium thiocyanate, 20% PEG 3350) combined with 2 µl of 16 mg/ml Ltp2-ChsH2 DUF35. Crystals were soaked in Paratone N as cryoprotectant prior to freezing. The datasets were collected at the Canadian Light Source, Canadian Macromolecular Crystallography Facility (CMCF-BM), and processed using XDS (35). The structure was initially phased using zinc-single anomalous diffraction (Zn-SAD) in Phaser (36), with an initial experimental figure of merit of 0.147. A native dataset was also collected (from a different crystal) at 1.7 Å and was then refined using the lower resolution Zn-SAD structure as a starting model. Refinements were carried out in PHENIX refine, with manual building in Coot (37). The structure proved to have one full heterotetramer in the asymmetric unit, with the only breaks being in Ltp2, where residues 120-126 (chain A) and 118-128 (chain C) of the active site-covering loop were disordered. Figures of protein structure were generated using PyMOL version 2.0 (38).

Docking and modeling
The structure of the ChsH1-ChsH2 MaoC substrate complex (PDB code 4wnb) was initially placed manually on the Ltp2-ChsH2 DUF35 complex. Weakly ordered loops (with high atomic displacement parameters or that differed significantly between chains) were removed after inspection to avoid steric clashes. Initial molecular placement sought to maximize the overall buried surface area while aligning the two complexes along their respective 2-fold symmetry axes. After initial manual placement, variants of this complex were systematically generated by rotating ±5° (in 1° increments) and translating ±2 Å (in 1/3 Å increments) using a Perl script previously reported (39). These initial starting poses were then optimized in RosettaDock 3.5 (40), and the lowest energy pose is reported. Docking of 17-HOPC-CoA was performed manually in PyMOL version 2.0.
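The systematic perturbation step can be illustrated by enumerating the grid of starting poses. The Python sketch below is not the Perl script cited above. It assumes a single rotation about, and a single translation along, the shared 2-fold axis, which is one reading of the symmetry constraint; the text does not specify which axes were sampled. It only lists the (rotation, translation) offsets and does not transform coordinates.

```python
from itertools import product

ROT_RANGE_DEG = 5       # rotations of +/-5 degrees, as stated in the text
ROT_STEP_DEG = 1        # in 1-degree increments
TRANS_RANGE_A = 2       # translations of +/-2 angstrom
TRANS_STEPS_PER_A = 3   # in 1/3-angstrom increments

def rotation_offsets():
    """Rotation offsets in degrees, from -5 to +5 inclusive."""
    return [float(d) for d in
            range(-ROT_RANGE_DEG, ROT_RANGE_DEG + 1, ROT_STEP_DEG)]

def translation_offsets():
    """Translation offsets in angstrom, from -2 to +2 in steps of 1/3."""
    n = TRANS_RANGE_A * TRANS_STEPS_PER_A
    # Integer stepping avoids accumulating floating-point error.
    return [k / TRANS_STEPS_PER_A for k in range(-n, n + 1)]

def starting_poses():
    """All (rotation, translation) pairs; ASSUMES one axis for each."""
    return list(product(rotation_offsets(), translation_offsets()))

poses = starting_poses()
print(len(rotation_offsets()), len(translation_offsets()), len(poses))
```

Under this single-axis assumption the grid contains 11 rotations and 13 translations, or 143 starting poses; sampling additional axes would multiply the count accordingly.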