Identification and Analysis of Conserved Sequence Motifs in Cytochrome P450 Family 2

Using a multiple alignment of 175 cytochrome P450 (CYP) family 2 sequences, 20 conserved sequence motifs (CSMs) were identified with the program PCPMer. The functional importance of the CSMs in CYP2B enzymes was assessed from available data on site-directed mutants and genetic variants. These analyses suggested an important role for CSM 8, which corresponds to 187RFDYKD192 in CYP2B4. Further analysis showed that residues 187, 188, 190, and 192 have a very high rank order of conservation compared with 189 and 191. Therefore, eight mutants (R187A, R187K, F188A, D189A, Y190A, K191A, D192A, and a negative control K186A) were made in an N-terminal truncated and modified form of CYP2B4 with an internal mutation, which is termed 2B4dH/H226Y. Function was examined with the substrates 7-methoxy-4-(trifluoromethyl)coumarin (7-MFC), 7-ethoxy-4-(trifluoromethyl)coumarin (7-EFC), 7-benzyloxy-4-(trifluoromethyl)coumarin (7-BFC), and testosterone and with the inhibitors 4-(4-chlorophenyl)imidazole (4-CPI) and bifonazole (BIF). Compared with the template and K186A, the mutants R187A, R187K, F188A, Y190A, and D192A showed ≥2-fold altered substrate specificity, kcat, Km, and/or kcat/Km for 7-MFC and 7-EFC and 3- to 6-fold decreases in differential inhibition (IC50,BIF/IC50,4-CPI). In addition, these mutants displayed 5-12 °C decreases in thermal stability (Tm) and 2-8 °C decreases in catalytic tolerance to temperature (T50) compared with the template and K186A. Furthermore, when R187A and D192A were introduced in CYP2B1dH, the P450 expression and thermal stability were decreased. In addition, R187A showed increased activity with 7-EFC and decreased IC50,BIF/IC50,4-CPI compared with 2B1dH. Analysis of long range residue-residue interactions in the CYP2B4 crystal structures indicated strong hydrogen bonds involving Glu149-Asn177-Arg187-Tyr190 and Asp192-Val194, which were significantly reduced or abolished by the Arg187→Ala and Asp192→Ala substitutions, respectively.

One of the most intriguing recent discoveries about mammalian cytochromes P450 (CYP) is the remarkable conformational plasticity exhibited by a number of the enzymes (1-4). This plasticity allows a single P450 to adapt its ligand binding site to a wide variety of compounds of different size, shape, and chemistry. The rabbit CYP2B4 provides some of the most striking examples of an adaptable active site, as inferred from x-ray crystal structures of the ligand-free, 4-(4-chlorophenyl)imidazole (4-CPI)-bound, 1-(4-chlorophenyl)imidazole-bound, and bifonazole (BIF)-bound enzyme as well as solution thermodynamics derived from isothermal titration calorimetry (5-10). In particular, comparisons of three structures identified five plastic regions in CYP2B4 (8), which account for almost one-third of the protein and contribute to broad substrate specificity by allowing different conformations in response to ligands. These results suggest an important role of non-active site regions/residues in ligand-induced conformational transitions and differential substrate binding and catalysis. However, predicting the role of non-active site regions/residues is currently very difficult.
Recently, directed evolution of CYP2B1 has been carried out to locate important non-active site amino acid residues. Residues Val187, Phe202, Lys236, Asp257, His295, and Ser334 were found to contribute to enzyme catalysis and/or stability (11,12). As a complement to directed evolution, amino acid sequences were compared among CYP2B6 (low heterologous expression in bacteria) and CYP2B1, CYP2B4, and CYP2B11 (high P450 expression), which led to the design of CYP2B6 L264F. This mutant showed enhanced expression and stability compared with the wild type (13).
CYP2B6 is one of the most polymorphic P450 enzymes in humans, with 28 alleles described to date. Several genetic variants are linked to altered plasma levels and/or in vitro metabolism of bupropion (14-17). Mutations in all the variants are located in non-active site regions. Two of the non-synonymous changes in particular, Q172H and K262R, are found in multiple haplotypes. Frequencies of the three most common variants range from 14 to 49% for Q172H, 17 to 63% for K262R, and 0 to 14% for R487C depending on the ethnicity of the population studied (17). At present, the structural basis for the altered function of CYP2B6 variants or for species differences relative to CYP2B1, CYP2B4, or CYP2B11 with regard to the oxidation of steroids (18) or inhibition by imidazoles (16) is largely unknown.
Based on the x-ray crystal structure of bacterial P450cam, Osamu Gotoh in 1992 analyzed 52 P450 family 2 (CYP2) sequences and identified six substrate recognition sites (SRSs) (19). These SRSs have been used extensively to guide site-directed studies in CYP2B enzymes (20). Subsequent x-ray crystal structures have verified most of the active site residues inferred from mutagenesis studies (1,7,10,21). However, investigating the role of non-active site regions/residues requires an additional approach. One such approach is conserved sequence motif (CSM) analysis. In previous work, a multiple sequence alignment from representative alphaviruses was used to determine physical chemical property motifs (likely functional areas) with our PCPMer program. Information on residue variability, propensity to be in protein interfaces, and surface exposure on the model was combined to predict surface clusters likely to interact with other viral or cellular proteins. Mutagenesis of these clusters indicated that the predictions accurately detected areas crucial for virus infection (22). In addition, we successfully used this approach to locate regions far from the active site that modulate substrate binding and processivity in apurinic/apyrimidinic endonuclease (APE1) and related nucleases (23,24). We also showed that CSMs as defined by the software package PCPMer can find functionally important residues in surface-exposed regions of viral proteins (25,26). In the present study, we examined a set of 175 P450 sequences from family 2 and identified 20 CSMs. Based on existing structural and functional information on the various CSMs, the role of CSM 8 (187RFDYKD192 in CYP2B4) in enzyme catalysis, inhibition, and stability was studied by site-directed mutagenesis.
Identification of Conserved Sequence Motifs-We selected 175 CYP2 sequences from human, mouse, rat, dog, fugu, and zebrafish and generated a multiple sequence alignment with ClustalW (28), as presented in supporting data (supplemental Table S1). This multiple sequence alignment was further analyzed with our program PCPMer. In PCPMer each of the 20 natural amino acids was represented by a 5-dimensional vector. The basis vectors of this 5-dimensional space were derived by multidimensional scaling of a 237-dimensional physicochemical property (PCP) space (23). The eigenvector with the highest eigenvalue correlated very well (r = 0.95) with the hydrophilicity scale. However, every eigenvector was a linear combination of different PCPs of the 237-dimensional space. PCPMer generated a profile for the alignment, which included the standard deviation and relative entropy (30) for each position and each component of the 5-dimensional space. PCPMer then used these profiles to identify clusters of high relative entropy (highly conserved regions), which are the CSMs.
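The idea of scanning an alignment for runs of high relative entropy can be illustrated with a minimal sketch. It is not the PCPMer algorithm: PCPMer scores each of five physicochemical property components, whereas this simplified version scores plain residue frequencies against a uniform background. The function names, the pseudocount, the threshold, and the toy alignment are all assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_relative_entropy(column, pseudocount=1.0):
    """Relative entropy (bits) of one alignment column versus a uniform background.

    Simplification: residue identities are used directly, not PCP vectors.
    Gap characters and unknown symbols are ignored.
    """
    background = 1.0 / len(AMINO_ACIDS)
    counts = Counter(res for res in column if res in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    score = 0.0
    for aa in AMINO_ACIDS:
        p = (counts[aa] + pseudocount) / total
        score += p * math.log2(p / background)
    return score


def conserved_windows(alignment, threshold, min_length=4):
    """Return (start, end) pairs (0-based, end exclusive) for runs of at least
    min_length consecutive columns whose relative entropy is >= threshold."""
    n_cols = len(alignment[0])
    scores = [
        column_relative_entropy([seq[i] for seq in alignment])
        for i in range(n_cols)
    ]
    windows, start = [], None
    # A sentinel below any threshold closes a run that reaches the last column.
    for i, score in enumerate(scores + [float("-inf")]):
        if score >= threshold and start is None:
            start = i
        elif score < threshold and start is not None:
            if i - start >= min_length:
                windows.append((start, i))
            start = None
    return windows


# Hypothetical toy alignment: variable flanks around a fully conserved block.
toy_alignment = ["ACRFDYKL", "GDRFDYMV", "SERFDYNI", "THRFDYQW"]
motif_windows = conserved_windows(toy_alignment, threshold=0.15)
```

With only four sequences the pseudocount dominates the scores, so the threshold here is specific to the toy data; a real alignment of 175 sequences would need its own cutoff.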
Mutagenesis, Expression, and Purification of P450 Enzymes-All the site-directed single mutants in CSM 8 and a negative control K186A were created using CYP2B4dH/H226Y (H226Y) as the template and appropriate forward and reverse primers as presented in supplemental Table S2. In addition, R187A and D192A were created in CYP2B1dH. To confirm the desired mutation and verify the absence of unintended mutations, all constructs were sequenced at the University of Texas Medical Branch Protein Chemistry Laboratory (Galveston, TX). H226Y and CSM 8 mutants were expressed as His-tagged proteins in Escherichia coli TOPP3 and purified using a nickel-affinity column as described previously (7). The P450 content was measured by reduced CO-difference spectra. Protein concentrations were determined using the Bradford protein assay kit (Bio-Rad, Hercules, CA).
Enzyme Assays-The standard NADPH-dependent enzyme assay with 7-MFC, 7-EFC, 7-BFC, and testosterone was carried out essentially as described previously (12,31). To determine the substrate specificity, 200 μM 7-MFC, 7-EFC, and testosterone and 75 μM 7-BFC in 2% methanol were used. For steady-state kinetic analysis, a concentration range of 10-200 μM 7-MFC and 7-EFC was used; above 200 μM the substrates were not soluble in 2% methanol. The reconstituted system contained P450 (0.05 μM), CPR, and b5 at a molar ratio of 1:4:2. Cumene hydroperoxide (CuOOH)-supported reactions were carried out at 1 mM CuOOH and were devoid of NADPH, CPR, and b5. Steady-state kinetic parameters were determined by regression analysis using SigmaPlot (Jandel, San Rafael, CA). The kcat and Km values were determined using the Michaelis-Menten equation. Each kinetic experiment included H226Y, K186A, and the CSM 8 mutants simultaneously for more accurate comparison of the data.
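For readers who want to see how kcat and Km follow from rate data, the sketch below fits the Michaelis-Menten equation, v = kcat[S]/(Km + [S]), using the Hanes-Woolf linearization ([S]/v against [S]). This is a swapped-in technique chosen because it needs only the standard library; the study itself used nonlinear regression in SigmaPlot, which weights experimental error differently. The function names and the synthetic data are assumptions.

```python
def michaelis_menten(substrate, kcat, km):
    """Turnover rate at a given substrate concentration."""
    return kcat * substrate / (km + substrate)


def fit_hanes_woolf(substrate, rate):
    """Estimate (kcat, Km) by least squares on the Hanes-Woolf form:

        [S]/v = [S]/kcat + Km/kcat

    so slope = 1/kcat and intercept = Km/kcat.
    """
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, rate)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kcat = 1.0 / slope
    km = intercept * kcat
    return kcat, km


# Synthetic, noise-free rates (hypothetical values, not data from this study)
# over the 10-200 uM range described above.
concentrations = [10.0, 25.0, 50.0, 100.0, 200.0]
rates = [michaelis_menten(s, kcat=2.0, km=50.0) for s in concentrations]
kcat_fit, km_fit = fit_hanes_woolf(concentrations, rates)
catalytic_efficiency = kcat_fit / km_fit  # kcat/Km
```

On real, noisy measurements a nonlinear fit is generally preferable, because the linearization distorts the error structure at low substrate concentrations.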
Inhibition Studies-Enzyme inhibition was measured using the 7-EFC O-deethylation assay in a final reaction volume of 100 μl at 0.005-0.1 μM 4-CPI and 0.1-2.5 μM BIF as described previously (31). BIF and 4-CPI inhibition studies for all the mutants utilized 0.025 μM P450. Nonlinear regression analysis was performed to fit the data using a four-parameter logistic function to derive the IC50 values. Each experiment included H226Y, K186A, and the CSM 8 mutants simultaneously for more accurate comparison of the data.
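A four-parameter logistic function of the kind used for IC50 determination can be written down in a few lines. The sketch below assumes the common parameterization with bottom, top, IC50, and Hill slope; the text does not state which parameterization the fitting software applied, and all numerical values are hypothetical. The second function computes the differential inhibition ratio, IC50,BIF/IC50,4-CPI, used throughout this study.

```python
def four_parameter_logistic(conc, bottom, top, ic50, hill):
    """Remaining activity at inhibitor concentration conc.

    Activity falls from `top` (no inhibitor) toward `bottom` (saturating
    inhibitor) and is exactly midway between them when conc == ic50.
    """
    return bottom + (top - bottom) / (1.0 + (conc / ic50) ** hill)


def differential_inhibition(ic50_bif, ic50_4cpi):
    """Ratio IC50,BIF / IC50,4-CPI; a lower ratio means a relatively
    stronger preference for the larger inhibitor (BIF)."""
    return ic50_bif / ic50_4cpi


# Hypothetical parameters, for illustration only.
half_activity = four_parameter_logistic(0.5, bottom=0.0, top=100.0, ic50=0.5, hill=1.2)
ratio = differential_inhibition(ic50_bif=1.0, ic50_4cpi=0.05)
```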
Thermal Stability-Inactivation of P450 was carried out essentially as described (11,13). The reaction mixture contained 1 μM protein in 100 mM Hepes buffer, pH 7.4, in a 1-ml semi-micro spectrophotometric cell with constant stirring using a Shimadzu-2600 spectrophotometer. Thermal inactivation was followed by measuring a series of absorbance spectra in the 340-700 nm range as a function of temperature between 30 and 70 °C at intervals of 2.5-5 °C, with a 3-min equilibration at each temperature. The total concentration of the heme protein was determined by non-linear least-squares approximation of the spectra using a linear combination of spectral standards of the CYP2B4 low spin, high spin, and P420 states. All data treatment and fitting of the titration curves were performed with our SpectraLab software as described (32). Fitting of the temperature profile curves was performed by regression analysis using SigmaPlot (Jandel, San Rafael, CA). The inactivation profiles were fit to a sigmoidal curve using a four-parameter logistic function to obtain the mid-point of the thermal transition temperature (Tm) as described (11,13).
Catalytic Tolerance to Temperature-The catalytic tolerance to temperature was studied by incubating enzyme (20 pmol in 20 μl dialysis buffer for two assays) at different temperatures (30-70 °C) at intervals of 2.5-5 °C for 10 min. The samples were then chilled on ice for 15 min and brought to room temperature prior to measuring enzyme activity using a 7-EFC O-deethylation assay as described earlier (13). The temperature at which the enzyme retains 50% of its activity (T50) was calculated by fitting the data to a sigmoidal curve using a four-parameter function by regression analysis using SigmaPlot.
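As a rough cross-check on a fitted T50, the temperature of 50% residual activity can also be estimated by linear interpolation between the two measurements that bracket it. This is a cruder alternative to the four-parameter sigmoidal fit described above, not the method used in the study; the function name and the activity values are hypothetical.

```python
def t50_by_interpolation(temperatures, activities):
    """Estimate the temperature at which 50% activity remains.

    temperatures: ascending incubation temperatures (deg C).
    activities: residual activity as percent of the unheated control.
    Returns None if the data never cross 50%.
    """
    points = list(zip(temperatures, activities))
    for (t_lo, a_lo), (t_hi, a_hi) in zip(points, points[1:]):
        if a_lo >= 50.0 >= a_hi and a_lo != a_hi:
            # Linear interpolation between the bracketing points.
            return t_lo + (a_lo - 50.0) * (t_hi - t_lo) / (a_lo - a_hi)
    return None


# Hypothetical residual activities, for illustration only.
temps = [30.0, 40.0, 45.0, 50.0, 55.0, 60.0]
residual = [100.0, 98.0, 90.0, 60.0, 20.0, 5.0]
t50_estimate = t50_by_interpolation(temps, residual)
```

Because the estimate depends only on two points, it is sensitive to noise in either of them, which is one reason a full curve fit is preferred for reported values.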

Identification of CSM in P450 Family 2-
The program PCPMer identified 20 CSMs from the multiple alignment of the selected 175 CYP2 sequences (supplemental Table S1). Because we included sequences from diverse subfamilies, we did not expect to pick up motifs that are specific for subfamilies but rather CSMs that are important and common for the whole CYP2 family. These CSMs were conserved structurally, functionally, and dynamically because of the general features of the family, and they were spread throughout the sequences (Fig. 1). Furthermore, individual motifs and residues within the motifs were analyzed from the 175 sequences to rank the level of conservation (Table 1). The rank order for conservation as defined by the relative entropy level of the motifs is presented in column 3, whereas the rank order of the residues within the motifs is blue > gray > red.
Structural and Functional Analyses of the CSM-Using the reported data, detailed structural and functional analyses of the CSMs were performed. Structural analysis was performed by comparing the CYP2B4 ligand-free (open, 1PO5) and 4-CPI-bound (closed, 1SUO) structures, which differ in their secondary structures and backbone positions (movements of 1-8 Å) (6, 7), a property referred to as enzyme plasticity (8). We analyzed each CSM, and the results describing altered regions of CYP2B4 are presented in Table 1 and Fig. 1. It was not surprising that regions that interact directly with the heme are conserved, such as CSM 12 in the middle of helix I or CSM 18 at the beginning of helix L. CSM 9 and 10 (helices F′ and G′ and the beginning of helix G) are likely involved in enzyme opening/closing and alter their conformation drastically upon ligand binding. On the other hand, CSM 1, a loop before helix A′, and CSM 2, a β strand after helix A, are also conserved without any visible involvement in function.
A probable functional role of these CSMs was assessed by reviewing the reported mutagenesis data on CYP2B enzymes, including genetic variants of human CYP2B6. Detailed observations are presented in Table 1 (1, 6, 13, 17, 33-45). It was intriguing to see that many of these CSMs had previously been subjected to experimental studies, which showed effects on structure, function, expression, and/or stability (Table 1). Based on the literature, many residues in helices C and D are involved in CPR and/or b5 binding. Similarly, the role of SRS/active site and substrate access channels mainly through B/C, F/G, H, and I loops/helices has been studied extensively, especially in CYP2B1 (13, 33, 34, 39). CSMs 3, 10, and 12 are part of SRS1, SRS3, and SRS4, respectively. In contrast, CSMs 1, 2, 7, 15, 16, 18, and 19 have not been studied earlier and have the least dynamic structures. The most intriguing finding was that the CYP2B6 variants, whose altered functions cannot be explained based on existing structure-function information, are found in the CSMs. For example, K262R*, which is the most frequent variant (17-63%) and shows altered activity, is located in motif 11 (37). Interestingly, a mutant in the same CSM, L264F, showed increased stability and expression (13). A significant correlation was found between the rank orders of the residues within the motifs and their known functions (Table 1). This correlation suggests that CSM analysis can be used as one approach to identify the functional role of the residues within motifs. Two interesting regions to test this concept were CSM 8 (residues 187-192 in the E-F loop) and CSM 10 (residues 227-234 in the F-G or G′-G loop), because these loops are juxtaposed to the most dynamic region of the protein, the region involving helices F/G (6, 7) (Table 1).
In addition, these loops are within regions (residues 165-300) that harbor 80% of the beneficial mutations in CYP1A1, CYP2A6, CYP2B1, CYP2B6, CYP2B11, and CYP3A4 enzymes (11-13, 46, 47, 49-53, 60) identified by directed evolution or rational mutagenesis of non-active site residues. Although the rank orders of conservation of CSM 8 and 10 are not very high, they contain an appropriate blend of rank orders of conservation of the residues within the motifs (Table 1). For further functional analysis we selected CSM 8, because it is located between plastic regions 3 (177-188) and 4 (203-298) (8). In CSM 8, Arg187, Phe188, Tyr190, and Asp192 have the highest rank order of conservation with ≤10 non-identical residues (blue), whereas Asp189 has an intermediate rank order (gray) and Lys191 has the lowest rank order (red) of conservation (data not shown). Next we tested the hypothesis that the rank orders of conservation of residues within motif 8 are related to their importance in CYP2B4 structure and function.
Role of CSM 8 in CYP2B4 Substrate Specificity-We created R187A, R187K, F188A, D189A, Y190A, K191A, and D192A in addition to K186A, which serves as a negative control. The mutants were then characterized using coumarin substrates (7-MFC, 7-EFC, and 7-BFC) of variable sizes (Fig. 2). Changes ≥2-fold in substrate specificity, kcat, Km, and/or kcat/Km compared with H226Y are considered significant in the discussion (Table 2). R187A showed a ~2-fold higher 7-EFC/7-MFC activity ratio than H226Y. In contrast, R187A, Y190A, and D192A showed >4-fold lower 7-BFC/7-MFC ratios.

TABLE 1 legend
In column 2 the colors of the residues in the motifs represent the rank order of sequence conservation as a function of relative entropy; red is the least conserved, black is intermediate, and blue represents the most conserved residues. The residues in bold represent ones for which genetic variants or site-directed mutants are known. In column 3 the rank order for conservation of the motifs is presented. In column 4 "altered" stands for either a change in secondary structure or displacement of the backbone by ≥1 Å when open and closed structures are compared. In column 5, the asterisks represent genetic variants in the human CYP2B6, and "P450" represents the P450 in which the mutants, indicated prior to parenthesis, were created. Please note that some of the residues in column 4 are different from column 2 because of the sequence differences among the CYP2B enzymes. SRS: substrate recognition site. ND: not determined. Activity is shown for 7-BR, 7-EFC, benzphetamine, testosterone, progesterone, PCB, CPA, IFA, or specific CYP2B6 substrate(s).

Investigation of Conserved Sequence Motifs in P450
Role of CSM 8 in CYP2B4 Steady-state Kinetics-In steady-state kinetic analysis, R187A showed an unchanged kcat/Km with 7-MFC but a >2-fold increase in kcat/Km (0.20 versus 0.09 min⁻¹ μM⁻¹) with 7-EFC (Table 3). In contrast to the template, R187K and F188A showed ~2.5-fold lower kcat and Km with 7-MFC than 7-EFC. In addition, whereas Y190A showed a ~2-fold decrease in the Km for 7-MFC, D192A showed >2- and >3-fold decreases in kcat/Km for 7-MFC and 7-EFC, respectively (0.04 and 0.025 versus 0.08 and 0.09, respectively), compared with H226Y. K186A, D189A, and K191A did not show significant changes in the kcat, Km, or kcat/Km values compared with H226Y (Table 3).
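The fold changes quoted in this paragraph follow directly from the kcat/Km values given in parentheses, as the short sketch below shows. The ≥2-fold cutoff is the significance criterion stated earlier in the Results; the function names are assumptions, and the calculation uses the rounded values printed in the text, so the results may differ slightly from those derived from the unrounded data in Table 3.

```python
def fold_change(mutant, template):
    """Return (fold, direction) with fold >= 1 for a mutant/template comparison."""
    ratio = mutant / template
    if ratio >= 1.0:
        return ratio, "increase"
    return 1.0 / ratio, "decrease"


def is_significant(mutant, template, cutoff=2.0):
    """Apply the >=2-fold criterion used in this study."""
    fold, _ = fold_change(mutant, template)
    return fold >= cutoff


# kcat/Km values quoted in the text (mutant versus H226Y).
r187a_efc = fold_change(0.20, 0.09)    # R187A, 7-EFC
d192a_mfc = fold_change(0.04, 0.08)    # D192A, 7-MFC
d192a_efc = fold_change(0.025, 0.09)   # D192A, 7-EFC
```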
Functional Role of CSM 8 in CYP2B1-To investigate whether the functional role of CSM 8 is conserved in other CYP2B enzymes, we created R187A and D192A in CYP2B1dH and characterized the mutants for P450 expression, substrate specificity, differential inhibition, and thermal stability. R187A and D192A showed ~10-fold lower expression and higher P420 than CYP2B1dH (data not shown), suggesting decreased protein stability. The purified D192A showed no holo P450 based on CO-difference spectra and therefore was omitted from further studies. R187A showed a 1.6-fold higher 7-EFC/7-MFC activity ratio than CYP2B1dH (Table 6). Compared with CYP2B1dH, R187A showed ~3-fold decreased differential inhibition (IC50,BIF/IC50,4-CPI). Finally, the Tm values for R187A and D192A were 11.4 and 9 °C, respectively, lower than 2B1dH, and the T50 of R187A was 3.2 °C lower than 2B1dH.
Molecular Modeling of the CSM Mutants Based on the CYP2B4-4-CPI Structure-To examine the role of the Arg187→Ala, Phe188→Ala, Tyr190→Ala, and Asp192→Ala substitutions in decreased protein stability, long range residue-residue interactions were analyzed in the energy-minimized CYP2B4-4-CPI structure using MolMol (54). We observed the residue-residue interaction sites Glu149-Asn177-Arg187-Tyr190, Asp192-Val194, and Phe188-Phe195. Furthermore, we generated models of the mutants by energy minimization using AMBER (55) to investigate the role of the substitutions in protein stability through altered hydrogen bonds, charge interactions, and π-stacking (Fig. 4). Fig. 4A clearly shows that Arg187 forms two strong H-bonds with Glu149 (<1.9 Å) and an ionic interaction with Tyr190 (3.2 Å). Although the Arg187→

TABLE 2 O-deethylation of 7-MFC, 7-EFC, and 7-BFC by H226Y, K186A, and CSM mutants
Results are representative of two or three independent determinations carried out simultaneously for more accurate comparison between the template and the mutants. The variation between the experiments is approximately ±20%.
Conserved motifs are well known in several proteins and have been documented in ProSite (www.expasy.ch/prosite/). However, there is only one P450 ProSite motif, which describes the cysteine heme-iron ligand signature (56). Additional so-called "conserved motifs" in P450s include 1) a membrane-binding motif (57) and 2) a Pro-rich motif (30PPGPTPFP37 in P450 2C2) (58). However, these motifs do not shed light on important P450 functions such as differential or overlapping substrate specificity, strict stereo- and regioselectivity, or protein stability. In this study we used a quantitative mathematical description of the 20 naturally occurring amino acids, performed statistical analysis of a multiple sequence alignment of 175 different CYP2 sequences, and identified 20 CSMs. This approach may have been overlooked previously, because the enzymes share similar secondary or tertiary structural features irrespective of their diverse sequence identity (15-99%).
Although the mechanism by which CSM 8 modulates enzyme function is not clear, it can be speculated that R187A, R187K, F188A, Y190A, and D192A exhibit altered dynamics, leading to opening of the active site either directly or by influencing the PR4 region (F/G helices; a proposed substrate access channel), which confers a preference for the larger ligand (BIF) over the smaller ligand (4-CPI). This interpretation is supported by the increase in differential substrate specificity (7-EFC/7-MFC) and kcat values, especially with R187A, R187K, and F188A. Furthermore, R187A, which showed the greatest alteration in substrate specificity (7-EFC/7-MFC, Tables 2 and 3), showed a >10-fold increased kcat for testosterone 16α-hydroxylation (6.6 min⁻¹ versus 0.6 min⁻¹) and a novel 16β-hydroxylation activity (1.5 min⁻¹) (data not shown), suggesting that the Arg187→Ala substitution allows the enzyme to accommodate testosterone in both orientations. However, a reduced preference for the larger substrate 7-BFC over 7-MFC or 7-EFC in R187A, F188A, Y190A, and D192A is difficult to explain. In any event, the changes in substrate specificity, stereo- and regioselectivity, and inhibitor selectivity are in many ways as striking as, or more striking than, those of a recent study of active site substitutions in CYP2B4 (31). An altered interaction of P450 mutants with CPR or b5, leading to altered enzyme catalysis, can be ruled out, because most CSM 8 mutants showed similar changes in relative enzyme activity with 7-MFC, 7-EFC, and 7-BFC in CuOOH-supported and NADPH-supported reactions (data not shown). CuOOH is an alternate oxidant, which does not require redox partners for substrate metabolism.
The speculation that CSM 8 regulates a conformational transition is supported by our studies with a bacterial enzyme (59). The basis of the decreased thermal stability and catalytic tolerance to temperature in the CSM mutants R187A and D192A can be explained by reduced charge pairs and hydrogen bonds, whereas the reduced protein stability of F188A can be explained by the absence of π-stacking. An interaction between Glu149 and Tyr190 in CYP2B1 was postulated earlier based on the decreased thermal stability of Y190A, Y190F, E149Q, and E149A, presumably as the result of a disrupted H-bond (45). In all these cases, the interacting residues are far from the active site, and therefore it is speculated that the reduced interactions lead to decreased thermal stability through unfolding of the protein on the surface rather than destabilizing the active site through increased heme dissociation. This hypothesis is also supported by the fact that temperature inactivates the catalytic activity (T50) and heme protein (Tm) in a similar manner in most of the CSM mutants.

FIGURE 3. A, thermal inactivation of CYP2B4dH mutants (1 μM) monitored by a decrease in the amount of total heme protein as a function of temperature as described under "Experimental Procedures." Determination of the total concentration of the heme protein was done by non-linear least-squares approximation of the spectra by a linear combination of spectral standards of the CYP2B4 low spin, high spin, and P420 states. The data were fit to a sigmoidal curve to obtain Tm. B, catalytic tolerance to temperature monitored by a decrease in the enzyme activity with 7-EFC as a function of temperature as described under "Experimental Procedures." The data were fit to a sigmoidal curve to obtain T50.
Comparison of the P450 expression and thermal stability of the CSM mutants reveals interesting similarities and differences. The findings suggest that the decreased thermal stability of R187A, Y190A, and D192A is associated with decreased P450 expression. In contrast, the Phe188→Ala substitution destabilizes the protein without altering P450 expression. P450 can be thermally inactivated by protein unfolding, inactivation of the heme moiety, and degradation into P420 (an inactive form of P450), whereas P450 expression can be influenced by the rate of incorporation of the heme in addition to the above mechanisms (11,13). The results for 2B4dH are consistent with earlier observations for 2B6dH. Thus, although M198L and L390P showed enhanced expression, thermal stability was decreased, whereas L264F showed both enhanced P450 expression and stability (13).
Of the six residues tested in CSM 8, substitutions at four (Arg187, Phe188, Tyr190, and Asp192) lead to significantly altered enzyme functions, which correlate well with their rank orders of sequence conservation. Prior results from another laboratory on CYP2B1 suggest that Tyr190 is important for enzyme catalysis and protein stability (45). The negative control mutant K186A (juxtaposed to CSM 8), which does not show altered enzyme catalysis, inhibition, or stability, further suggests that the highly conserved residues of CSM 8 are critical for P450 structure-function relationships. Further verification of the importance of specific residue positions is provided by the lack of functional alterations in D189A and K191A in contrast to the marked changes in R187A and D192A. In previous work, we successfully used the CSM approach followed by Ala substitutions to locate regions far from the active site that modulate substrate binding and processivity in apurinic/apyrimidinic endonuclease (APE1) and related nucleases (22-24). A comparative study with random Ala mutagenesis found fewer residues important for the function (27).
The CSM 8 mutants R187A and D192A, which showed decreased P450 expression and stability in CYP2B4dH, also exhibited similar effects in CYP2B1dH. In addition, the activity and inhibitor potency in R187A were increased for the larger compounds in both CYP2B4dH and CYP2B1dH. These results suggest that CSM 8 has an important and conserved functional role in CYP2B enzymes. For more than a decade the structural basis of substrate specificity and stereo- and regioselectivity of CYP2B enzymes has been studied by altering substrate recognition sites or substrate access channels (1,20). However, growing evidence based on crystal structures and directed evolution of CYP2B enzymes and enzymes from other CYP subfamilies suggests that plastic regions and/or regions that regulate the ligand-induced enzyme plasticity also determine differential substrate specificity and stereo- and regioselectivity (1,29,46,48). The enzymes from different CYP subfamilies differ in the extent of plasticity, which could be associated with differential substrate specificity and stereo- and regioselectivity. For example, in contrast to the highly plastic CYP2B4 structures, the CYP2A6 structure shows a compact, hydrophobic active site with one hydrogen bond donor, Asn297, which orients coumarin for regioselective oxidation. In addition, methoxsalen effectively fills the active site cavity without substantially perturbing the structure (48). Moreover, the CYP2C5 structures show an intermediate level of plasticity, with backbone motions on the order of 1-3 Å. Furthermore, to test the hypothesis that there are similarities and differences in CSMs within the CYP2 family, we analyzed the CYP2A, CYP2B, CYP2C, CYP2D, and CYP2J subfamilies separately (supplemental Table S3) and compared the CSMs with the motifs for the whole CYP2 family. We identified motifs that are specific for each subfamily and those common to the whole CYP2 family. CSMs 1, 8, and 14 were found in all the subfamilies.
Although CYP2E was not analyzed separately because only two sequences were available, these sequences also contain CSM 8. The results suggest that CSM 8 is one of the most important motifs in all the subfamilies analyzed and is critical for structural stability and regulating enzyme functions. These analyses suggest that future functional studies are needed to investigate the role of CSM 1 and 14 and of the specific motifs of each subfamily in protein stability, differential substrate specificity, stereo- and regioselectivity, and enzyme plasticity.
In conclusion, we analyzed CSMs in CYP2 enzymes. Detailed structural and prior functional analyses suggested that many of the 20 motifs identified are important for structure and function. Furthermore, study of motif 8 (187RFDYKD192) in CYP2B4 and CYP2B1 suggested that this is an important functional motif in helix E, perhaps mediating a structural change by an effect on the flexible helices F/G region, leading to an altered rate of substrate entry to the active site, ligand binding specificity, stereo- and regioselectivity, and/or dynamics of the protein.
The study also tested an important hypothesis that the rank orders of the residue conservation within the motif are related to their functional importance. This first rational approach to analyze structure-function in P450 with regard to non-active site regions/residues is an important step forward for investigating the functional role of CSM in other P450 enzymes, such as families 1 and 3, which contain some of the most important human xenobiotic-metabolizing enzymes. In addition, the approach can be used to engineer other P450 enzymes with altered substrate specificity, stereo-and regioselectivity, or protein stability by site-directed mutagenesis.