Atomic Structure of Mycobacterium tuberculosis CYP121 to 1.06 Å Reveals Novel Features of Cytochrome P450

The first structure of a P450 to an atomic resolution of 1.06 Å has been solved for CYP121 from Mycobacterium tuberculosis. A comparison with P450 EryF (CYP107A1) reveals a remarkable overall similarity in fold with major differences residing in active site structural elements. The high resolution obtained allows visualization of several unusual aspects. The heme cofactor is bound in two distinct conformations while being notably kinked in one pyrrole group due to close interaction with the proline residue (Pro346) immediately following the heme iron-ligating cysteine (Cys345). The active site is remarkably rigid in comparison with the remainder of the structure, notwithstanding the large cavity volume of 1350 Å³. The region immediately surrounding the distal water ligand is remarkable in several aspects. Unlike other bacterial P450s, the I helix shows no deformation, similar to mammalian CYP2C5. In addition, the positively charged Arg386 is located immediately above the heme plane, dominating the local structure. Putative proton relay pathways from protein surface to heme (converging at Ser279) are identified. Most interestingly, the electron density indicates weak binding of a dioxygen molecule to the P450. This structure provides a basis for rational design of putative antimycobacterial agents.

The cytochromes P450 (P450s) are a superfamily of heme-containing mono-oxygenases (1). They are famed for their roles in drug metabolism/detoxification and steroid synthesis in mammals (e.g. Refs. 2 and 3) but are also found in virtually all other life forms. The vast majority of P450s bind and cleave molecular oxygen in a two-step reaction involving delivery of two electrons from a redox partner. In eukaryotic P450s, this is typically the diflavin enzyme NADPH cytochrome P450 reductase (4). However, in prokaryotes the reducing equivalents are usually supplied via two enzymes: NAD(P)H flavodoxin/ferredoxin reductase and either a ferredoxin or a flavodoxin mediator of electrons to the P450 (5,6). Many bacterial P450s are of particular biotechnological and environmental relevance, given their role in degradation of recalcitrant organic molecules for use as energy sources (e.g. Refs. 7 and 8). Bacterial P450s have also proven to be experimentally tractable systems for elucidation of P450 structure and mechanism, due largely to the fact that they are soluble enzymes, as opposed to their eukaryotic membrane-associated counterparts (5). A breakthrough was made recently with the atomic structure of the first mammalian P450 (a form of P450 2C5). In this case, extensive protein engineering was required to remove the N-terminal membrane anchor peptide, to promote release from the cellular membrane and to prevent aggregation of the resultant solubilized P450 (9).
Whereas the genome sequence of Escherichia coli is devoid of P450-encoding genes (10), the genomes of the industrially important bacterium Streptomyces coelicolor and the pathogen Mycobacterium tuberculosis (Mtb) indicate that there are 18 and 20 P450s encoded, respectively (11,12). The genome sequence of Mtb revealed several interesting features, most notably the preponderance of genes involved in lipid metabolism; ~250 enzymes involved in lipid metabolism have been identified in Mtb (cf. ~50 for the similarly sized E. coli genome). The importance of lipid metabolism in Mtb may explain the high P450 content, since typical cellular functions for P450s in eukaryotes and other bacteria are as mono-oxygenases for lipophilic fatty acids, polyketides, steroids, and xenobiotics (e.g. see Ref. 2). Sterols are rare in bacteria, but the P450 product of the Mtb Rv0764c gene (CYP51) is highly related to eukaryotic sterol demethylases and has been shown to catalyze demethylation of plant and eukaryotic sterols (13). The crystal structure of the Mtb CYP51 was solved in complex with the azole antifungal drug fluconazole (14). This structure is of biotechnological importance, given the importance of the eukaryotic CYP51s in physiology and also in view of the fact that the yeast and fungal CYP51 enzymes (lanosterol demethylases) are validated targets for the azole antifungal class of drugs of which fluconazole is a member (15). In recent studies, the potency of these drugs as antibiotics against growth of Mtb's relative Mycobacterium smegmatis and the related actinomycete S. coelicolor has been shown (16).
The biochemical functions of the remaining P450 enzymes in Mtb are not immediately obvious. This is primarily because of limited similarity to other members of the P450 superfamily with known catalytic properties. Most show highest similarity to other P450 enzymes in Mtb, perhaps suggesting co-evolution as oxygenases of related, complex lipids in the bacterium (17). We decided to focus efforts on the product of gene Rv2276, a P450 enzyme classified as CYP121 (1). This P450 shows amino acid sequence similarity to a number of polyketide mono-oxygenase P450s, including the structurally characterized P450 EryF (CYP107A1) from Saccharopolyspora erythraea (18), suggesting a potential role in polyketide metabolism in Mtb. In addition, the known high affinity of P450 EryF for the azole antifungal class of drugs suggests that CYP121 could be a potential drug target in the pathogen, particularly since the role of CYP51, its validity as a drug target, and whether sterol metabolism is really a feature of the pathogen remain to be established. Indeed, the fact that the ΔCYP51 mutation in S. coelicolor is nonlethal (16,19) and that the mutant strain exhibits no significant alteration in sensitivity to azole antifungal drugs (16) confirms that CYP51 is not the target in this actinomycete.
Our initial expression and biophysical studies have established that the Mtb P450 CYP121 has properties typical for members of this class (20). Whereas the physiological role remains unclear, the enzyme binds bulky azole antifungal drugs with high affinity, and the binding constants for these drugs are perturbed by the presence of erythromycin and other large polyketides, suggesting that CYP121 may metabolize polyketides or bulky polycyclics in vivo. However, what is most remarkable is that in comparison with Mtb CYP51, the binding of a variety of azole antifungal drugs to Mtb CYP121 is tighter and that the order of their Kd values correlates well with the minimal inhibitory concentration values for these drugs for M. smegmatis and S. coelicolor (16). Since the drugs retain their high potency against a S. coelicolor CYP51 knockout strain, other P450 enzyme(s) must be the true targets in this bacterium; thus, the fact that potency of azole drugs versus mycobacteria correlates with CYP121 Kd values makes this enzyme a prime drug target candidate (20). We have previously reported crystallization of CYP121 (21), and in this paper we report the crystal structure of CYP121 to 1.06 Å. This unprecedented level of resolution for a P450 system allows, for the first time, a truly atomic description of the oxygen scission site.

EXPERIMENTAL PROCEDURES
Expression and Purification of CYP121-The Rv2276 gene encoding CYP121 was amplified by PCR from a Mtb chromosomal DNA library (cosmid MTCY339, obtained from Prof. Stewart Cole at the Institut Pasteur, Paris, France) and cloned into expression plasmid pET11a to produce clone PKM2b, as described previously (20). The CYP121 protein was produced in E. coli strain HMS174 (DE3) with growth of the culture after isopropyl-1-thio-β-D-galactopyranoside induction performed at low temperature (18 °C) to promote production of soluble enzyme. The cells were broken using a French press and sonication, and CYP121 was purified to homogeneity by ammonium sulfate fractionation (30-70% fraction retained), followed by successive column chromatography steps on phenyl-Sepharose, Q-Sepharose, and hydroxyapatite resins, as described (20).
Crystallization and Structural Elucidation of CYP121-CYP121 crystals belonging to space group P6₅22 (unit cell dimensions a = 77.81 Å, c = 263.82 Å) were obtained as described previously (21). A complete native data set up to 1.06 Å could be measured on a single flash-cooled crystal on ID14-4 at the European Synchrotron Radiation Facility. Different sections of the crystal were exposed to avoid excessive radiation damage due to the intense beam. Whereas these crystals clearly diffract further, both the long cell axis of 263 Å and the rapid decay of diffraction quality hampered the collection of higher resolution reflections. Data were processed and scaled with the HKL package programs DENZO and SCALEPACK (22). Despite the availability of several P450 structures, the crystal structure could not be solved using molecular replacement and was solved using a Multiple Isomorphous Replacement with Anomalous Scattering (MIRAS) approach instead. Data were collected on station 14-1 at the Synchrotron Radiation Source for two independent heavy metal derivatives. Initial heavy atom sites were found using the program RSPS (23) and used for detection of minor sites by difference Fourier analysis. Heavy atom sites were refined using the program MLPHARE (24), and the weak anomalous signal of the iron was included in the final phase calculations. Data collection and phasing statistics are given in Table I. The program ARP-WARP (25) was used to build an initial model in the solvent flattened maps. This initial model was manually adjusted using TURBO-FRODO (26) and refined using REFMAC5 (27). The structure of the uncomplexed enzyme was refined with anisotropic B-factors for all atoms including a riding hydrogen atom model. With successive rounds of refinement and model building, the entire CYP121 sequence was fitted to the electron density map with the exception of the N-terminal three residues. Multiple conformations were observed for 33 of the 395 residues defined.
The final R-factor was 0.132 (R-free = 0.153) for the model consisting of the protein, 800 water molecules, and 3 sulfates per asymmetric unit. The enzyme was complexed with hydrophilic compounds by soaking in the mother liquor (2.3 M ammonium sulfate, pH 6.5) supplemented with a large excess of the ligand. In order to obtain data on complexes with hydrophobic compounds, the crystals were placed in 30% polyethylene glycol 8000, 0.2 M ammonium sulfate, pH 6.5, supplemented with the compound of interest. Only the hydrophilic iodopyrazole proved to bind in the active site, and data were collected for complexed crystals to 1.8 Å. Final refinement statistics for the two crystal structures are given in Table II.
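As a rough illustration of the crystal geometry described above, the following sketch computes the hexagonal unit cell volume from the reported cell dimensions and divides it by the 12 general positions of space group P6₅22 to estimate the volume available per asymmetric unit. The function and constant names are illustrative only and do not come from any of the crystallographic programs cited here.

```python
import math


def hexagonal_cell_volume(a, c):
    """Volume of a hexagonal unit cell (a = b, gamma = 120 degrees), in cubic angstroms."""
    return a * a * c * math.sin(math.radians(120.0))


# Unit cell dimensions reported in the text (angstroms).
A_AXIS = 77.81
C_AXIS = 263.82

# Space group P6(5)22 has 12 symmetry-equivalent general positions per cell.
ASYMMETRIC_UNITS_PER_CELL = 12

cell_volume = hexagonal_cell_volume(A_AXIS, C_AXIS)
asu_volume = cell_volume / ASYMMETRIC_UNITS_PER_CELL

print(f"unit cell volume: {cell_volume:.3e} cubic angstroms")
print(f"volume per asymmetric unit: {asu_volume:.3e} cubic angstroms")
```

The per-asymmetric-unit volume is the quantity one would normally divide by the protein molecular mass to estimate solvent content; that mass is not stated in this section, so the calculation stops here.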

RESULTS AND DISCUSSION
General Structural Features-Globally, the CYP121 structure resembles a triangular prism, as observed for all other P450 structures to date. As with other P450s, there are two distinct structural "domains," with the heme sandwiched between a smaller β sheet-rich module and a larger helix-rich domain (Fig. 1). The P450s have relatively few definitive amino acid motifs, and evolutionarily distinct P450s may exhibit as little as 15% amino acid identity. However, the cysteine ligand to the heme iron and the region immediately surrounding this amino acid (the "heme-binding" motif) at the C-terminal section of the P450 are strongly conserved (Fig. 2).
Overall Enzyme Fold-Despite the elucidation of several P450 crystal structures, new cytochrome P450 models often reveal distinct structural features that are unpredictable from primary sequence alignments. Given the relatively low homology (25.3% identity over 396 amino acids) with P450 EryF (CYP107A1; a macrolide monooxygenase) and the unsuccessful molecular replacement, the high three-dimensional similarity of the substrate-free CYP121 with that of substrate/inhibitor-bound CYP107A1 comes as a surprise (28,29). Notwithstanding this high similarity in fold (Z score 42.3, root mean square deviation 2.6 Å for 379 Cα atoms), several distinct features can be observed in CYP121 for polypeptide stretches lining the active site cavity.
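The root mean square deviation quoted above summarizes how far paired Cα atoms lie from each other after the two structures have been superposed. A minimal sketch of that final step is given below; it assumes the coordinates are already superposed and paired atom by atom, and it does not reproduce the alignment procedure used to obtain the Z score. The toy coordinates are hypothetical and are not taken from either structure.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two paired lists of (x, y, z) points.

    Assumes both coordinate sets are already superposed and in the same order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical toy coordinates in angstroms: every point is shifted by 2 along z.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
toy_b = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0), (0.0, 1.0, 2.0)]

print(rmsd(toy_a, toy_b))
```

Because the deviation is averaged over squared distances, a few strongly displaced regions (such as the loops lining the active site) can dominate the value even when most of the fold coincides closely.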
Upon comparison of CYP121 with CYP107A1 from N to C terminus, the first region that is drastically different is that immediately following the conserved B helix (Fig. 3). Starting at residue Met62, the polypeptide chain proceeds first with a 3₁₀ helix and then with an extended loop forming the dome of the active site relative to the heme plane. This loop proceeds into a second 3₁₀ helix that is immediately followed by helices B′ and C. The corresponding polypeptide region in CYP107A1 is longer and bears little or no resemblance, having no regular 3₁₀ helical structure, whereas helix B′ is separated from C by several residues. The C helix in CYP121 is immediately adjacent to helix B′ and makes a different angle with other conserved structural elements. It is separated from the short D helix by a flexible region that precedes Glu110. From this position onward, both polypeptide chains again assume very similar positions. After helices D and E, the helices F and G are slightly reoriented with respect to the remainder of the structure, and the FG loop that forms a part of the active site dome is bent inward with respect to the CYP107A1 structure. Whereas the CYP121 helix H still occupies a distinct conformation, as does the beginning of helix I, the polypeptide chain trace is once again highly similar for the main part of helix I and remains so for the rest of the structure. Two minor exceptions are the loop following the K helix and the C-terminal region, which are both slightly reoriented and form part of the dome of the active site. These reorganizations in the secondary elements and loops lining the active site cavity result in a slightly increased volume for the CYP121 active site of 1337 Å³ compared with 1115 Å³ for the substrate-bound CYP107A1. Whereas in CYP121 the main part of the active site is more compressed due to the closer packing of loops forming the dome of the active site, a significant part of the cavity extends between helices F and G (Fig. 1), accounting for the increased volume in CYP121.
As has been observed for the majority of P450 crystal structures (with the notable exceptions of Mtb CYP51 and Bacillus megaterium P450 BM3) (14,30), the active site is apparently inaccessible from the surface in this substrate-free form of the enzyme. This obviously raises the question of how the putative substrate reaches the active site. As has been proposed for other P450s, we envisage the CYP121 structure as oscillating between open and closed conformations whereby the "breathing motions" of the protein, specifically movements of the secondary elements and loops lining the active site, allow for transient exposure of gaps allowing substrate access. It is especially interesting that the active site seems to protrude in between helices F and G, allowing small movements in the relative positions of these helices to open an active site channel (Fig. 1). Similar motions in helices F/G and the interconnecting loop region have been demonstrated to occur for the CYP119 monooxygenase from Sulfolobus solfataricus (31).
P450 Heme Binding in Atomic Detail-As in other P450 enzymes, a cysteine (Cys345) provides the proximal axial ligand to the heme iron, with a well defined water molecule as the distal axial ligand. The Fe-S distance is 2.30 Å, whereas the Fe-O distance is 2.21 Å, both values being highly similar to those in other P450 structures obtained.
An unusual aspect of the CYP121 structure is the fact that the heme is clearly bound in one of two distinct conformations, related by a 180° rotation through an axis of symmetry across the CHα-Fe-CHδ atoms of the molecule. In short, the heme is bound either in the "a" or "b" conformer, a phenomenon well known for other heme proteins (e.g. cytochrome b5) (e.g. see Ref. 32). This mixture of conformers does not otherwise impact on the overall fold of the enzyme. One of the conformers predominates, and the relative occupancy is refined to a 70:30 mixture of the species (Fig. 4a). In cytochrome b5, there is dynamic exchange of the heme, such that the proportions of the a- and b-type conformers can be perturbed considerably by temperature (e.g. see Ref. 33). The situation is extremely unlikely to be the same in the cytochromes P450, where the heme is bound deep in the protein (not superficially as in b5) and requires protein denaturation to be released. Whereas insertion of heme and modified hemes into P450 has been demonstrated, this was achieved only very slowly by denaturation of the P450 and refolding by gradual removal of denaturant after the addition of the exogenous heme (e.g. see Ref. 34). Such exchange is highly unlikely to occur in vivo, and we infer that the heme-binding step in CYP121 (and other P450s) occurs at a relatively early stage in the protein folding process, and the position of the heme as bound commits the P450 to this conformation throughout its lifetime in vivo.
One of the four pyrrole rings of the heme is distorted out-of-plane by the close interaction with the side chain of Pro346. The pyrrole is kinked toward the distal face by an angle of ~30° (Fig. 4b). Pro346 immediately follows the cysteinate heme ligand (Cys345), and this Cys-Pro motif is found in a number of other cytochrome P450 enzymes, with several of the Mtb P450s having this feature (Fig. 2). This structural perturbation to heme planarity may therefore well be a feature common in other P450 enzymes but previously unrecognized due to the absence of the Cys-Pro motif in P450 structures solved to date. This structural element could indeed be essential to the specific electronic properties of the heme in these oxidases. Recently, we have demonstrated in the P450 BM3 system that a phylogenetically conserved Phe residue (Phe393, located 7 amino acids before the cysteine ligand) that both stacks with the heme plane and packs against the cysteinate ligand exerts a large effect on the heme electronic properties (Fig. 2) (35). It is therefore highly likely that this proline residue, being closer to both heme and cysteinate ligand than even the phenylalanine (Phe338 in CYP121), has a considerable impact on the heme catalytic and thermodynamic properties.
The Oxygen Scission Site in Atomic Detail-The water ligand to the heme in the resting state of the enzyme is hydrogen-bonded to Ser237. This residue replaces the more common threonine residue found in cytochromes P450, which has been implicated in oxygen binding and/or proton delivery (36,37). In turn, Ser237 hydrogen-bonds to the backbone carbonyl of Ala233 and to the side chain of Arg386 (Figs. 4b and 5). Whereas a distinctive deviation from conventional α-helix hydrogen bonding patterns in the region of the water ligand-contacting residues can be seen in all P450s of known structure, with the exception of CYP2C5 (9), the I helix in CYP121 is devoid of any significant alteration of hydrogen bonding patterns and resembles the eukaryotic CYP2C5 in this region. The presence of the bulky and positively charged residue Arg386 immediately adjacent to the oxygen scission site is unique to CYP121 and makes it one of the key residues in this enzyme (Fig. 4b). The side chain of Arg386 is wedged in between Ile236 and Phe280, while hydrogen bonding to Ser237, Gln385, and two water molecules in the active site cavity. The close proximity of the arginine and serine residues may define a binding site for a negatively charged functional group, possibly a carboxylate, as observed previously for, e.g., P450 BM3 (CYP102A1) with its fatty acid substrates (38,39).
In the last stages of refinement, additional density became apparent immediately adjacent to the heme water ligand. The shape of this density is nearly spherical, and the center is located 1.2 Å from the sixth ligand (Fig. 4b). Only two models seem to fit this density equally well: either the sixth ligand occupies two distinct and mutually exclusive conformations, or a diatomic species with ~1.2-Å bond length (e.g. dioxygen) is bound to a proportion of the heme groups in the crystal, effectively superposing the densities of water ligation and dioxygen ligation. The presence of hydrogen peroxide can be excluded on the basis that the bond length of the observed species is significantly shorter than that of hydrogen peroxide (1.5 Å). The immediate environment of the additional density does not, however, correlate with the first model (i.e. two distinct water conformations), since there are no significantly stabilizing structural elements present to explain why the ligating water molecule would preferentially bind outside, but close to, the ligation sphere of the iron. Therefore, we favor the second model (i.e. bound dioxygen), with an angle of nearly 120° between the dioxygen and the heme plane, highly similar to that in other dioxygen-heme structures (e.g. see Ref. 40). It is unprecedented that dioxygen binds to ferric heme, and the possibility exists that the crystal might have been partially reduced in the intense beam. Two observations argue against this. First, during data collection, several parts of the crystal were each exposed for less than 10 s, so that repeatedly exposing fresh, fully oxidized parts of the crystal minimized the chance of significant reduction. Additionally, several data collections at medium resolution involving significantly longer exposures and higher irradiation doses in similar conditions did not reveal any additional density. Second, the putative bond length for the dioxygen-ferric ion is 2.2 Å, which is highly similar to the water-ferric ion distances but deviates significantly from the dioxygen-ferrous ion distances of 1.8 Å. Therefore, we conclude that P450 enzymes might weakly bind dioxygen in the ferric form, a consequence of the electron-rich character conferred upon the heme by the cysteinate ligation.
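The bond-length argument used above to distinguish dioxygen from hydrogen peroxide can be written as a simple nearest-reference comparison. The sketch below uses only the two O-O distances quoted in the text (~1.2 Å for the diatomic species and 1.5 Å for hydrogen peroxide); the function and variable names are hypothetical, and a real assignment would also have to weigh coordinate error, partial occupancy, and the chemical environment, none of which is modeled here.

```python
# Reference O-O bond lengths in angstroms, as quoted in the text.
REFERENCE_BOND_LENGTHS = {"dioxygen": 1.2, "hydrogen peroxide": 1.5}


def closest_species(observed_length, references=REFERENCE_BOND_LENGTHS):
    """Return the reference species whose bond length is nearest the observed one."""
    return min(references, key=lambda name: abs(references[name] - observed_length))


# Distance from the sixth-ligand position to the center of the extra density.
observed = 1.2
print(closest_species(observed))
```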
The putative dioxygen species forms an ideal model for analyzing the initial steps of the catalytic cycle, during which both electrons and protons need to be consecutively passed onto the ligand to break the dioxygen bond and create the highly reactive oxidizing intermediate. Upon initial reduction of the heme, the dioxygen molecule can accept a proton either from proximal water molecules or from Ser237. However, we assume that the active site will be devoid of water molecules in the vicinity of the heme plane when complexed with substrate. Ser237, in turn, can easily receive a proton from Arg386. This residue contacts a water molecule buried in the interior of the protein. At this point, there are several other residues contacting this water molecule that could provide a proton. Most notable is Ser279, which occupies two distinct conformations in the CYP121 structure. Each of these conformations is in contact with a different series of residues and buried water molecules that eventually lead up to the surface of the molecule. Thus, it appears plausible that there may be a bifurcated proton delivery pathway from the P450 surface, with pathways converging at Ser279. This would have the potential to transport one proton at a time through each individual path (Fig. 6).
Structure of the Substrate Binding Site-The active site cavity of 1350 Å³ is filled with water molecules and is remarkably rigid. Whereas a significant proportion of both surface-exposed and interior residues exhibit multiple conformations, all of the residues involved in lining the active site adopt a single conformation. The dome of the active site chamber is 12 Å above the heme plane, being most open above pyrrole ring D (main conformation). The heme is 62 Å² solvent-exposed with access to pyrroles B and C limited by residues from the I helix. The majority of residues lining the active site are hydrophobic with the notable exceptions of Arg386 and Gln385. Whereas several antifungal agents bind tightly to CYP121 in solution (20), we have been unable to date to obtain crystallographic complexes with these compounds by either soaking native crystals or conducting co-crystallization trials. This is most likely due to the extreme insolubility of the polycyclic azoles in the ammonium sulfate mother liquor but could also be due to restricted access of the azoles to the heme due to obstruction by Arg386. The single azole compound that was proven to bind into the active site cavity by soaking the crystals was iodopyrazole. This clearly demonstrates that, even in the tight crystal packing, breathing motions must allow transient access from the outside into the otherwise inaccessible active site cavity. Surprisingly, the pyrazole compound does not ligate to the heme iron but binds instead in the small channel region between helices F and G. The pyrazole plane stacks at ~4 Å with Trp182 on one side and Phe168 on the other side (Fig. 7, a and b). The bulky iodine atom points toward the active site cavity and is buried between the hydrophobic side chains of Phe168, Thr229, and Ala233. A similar unexpected observation was made upon complexation with 4-phenylimidazole. In this case, the compound bound to the surface of the protein instead of acting as the sixth heme ligand. In spectral titrations, 4-phenylimidazole was shown to bind to the heme iron, giving a typical (and complete) shift of the Soret maximum to ~423 nm. This clearly demonstrates that CYP121 can adopt conformations in solution distinct from that observed in the crystal and that these solution conformations are compatible with ligand binding to the iron.

FIG. 6 (legend, partial). Both pathways are formed mainly by buried water molecules in addition to a single amino acid each (Thr244 and Glu310, respectively). These pathways merge at Ser279, which occupies two distinct conformations, each contacting a water molecule that is hydrogen-bonded to Arg386. The latter residue is proposed to donate protons either directly or via Ser237 (not shown) to the oxygen intermediates.

CONCLUSIONS
CYP121 is similar in overall fold to other structurally characterized P450s. It contains unique structural elements involved in lining the active site. The heme binding and oxygen scission sites are particularly interesting, since these reveal several new features that could be general to P450s. First, the heme cofactor is bound in two distinct, mutually exclusive conformations. Furthermore, it is contorted due to close interaction with the Cys345-Pro346 motif, a common structural element in Mtb P450s and present also in other prokaryotic and eukaryotic forms. A series of hydrogen-bonded amino acids and water molecules define two potential proton delivery pathways, and these apparently converge at Ser279, which has distinct conformations in the CYP121 structure.
Most interestingly, the oxygen scission site is markedly different from other bacterial P450s, and, while resembling the eukaryotic CYP2C5 structure, it contains a unique arginine residue (Arg386). The atomic resolution obtained has allowed visualization of a species (probably dioxygen) weakly bound to the ferric heme iron.
Future studies on CYP121 will involve establishing its substrate selectivity through screening for activity against mycobacterial lipid extracts (and possibly infected tissue samples) and use of rational mutagenesis to define roles of key amino acids identified from this structural study in determining the biophysical and catalytic properties of the P450. A further priority is the solution of an azole-ligated CYP121 structure, since this may provide critical information required to facilitate de novo design of novel and more highly specific azole drugs that are potent anti-mycobacterials but that lack cross-reactivity with human isoforms. Through such an approach, anti-P450 drugs may prove useful new agents in the war against multidrug-resistant strains of M. tuberculosis, which the World Health Organization describes as having the potential to cause a "global catastrophe."