Membrane-embedded substrate recognition by cytochrome P450 3A4

Cytochrome P450 3A4 (CYP3A4) is the dominant xenobiotic-metabolizing enzyme in the liver and intestine and is involved in the disposition of more than 50% of drugs. Because of its ability to bind multiple substrates, its reaction kinetics are complex, and its association with the microsomal membrane confounds our understanding of how this enzyme recognizes and recruits diverse substrates. Testosterone (TST) hydroxylation is the prototypical CYP3A4 reaction, displaying positive homotropic cooperativity with three binding sites. Here, exploiting the capability of accelerated molecular dynamics (aMD) to sample events in the millisecond regime, I performed >25-μs aMD simulations in the presence of three TST molecules. These simulations identified high-occupancy surface-binding sites as well as a pathway for TST ingress into the CYP3A4 active site originating in the membrane. Adaptive biasing force analysis of the latter pathway revealed a metastable intermediate that could constitute a third binding site at high TST concentrations. Prompted by the observation that interactions between TST and the G′-helix mobilize the ligand into the active site, a free-energy analysis of TST distribution in the membrane was conducted and revealed that the depth of the G′-helix is optimal for extracting TST. In summary, these simulations confirm separate, but adjacent substrate-binding sites within the enzyme and the existence of an auxiliary TST-binding site. The broader impact of these simulations is that they support a mechanism in which cytochromes P450 directly recruit membrane-solubilized substrates.

The cytochromes P450 metabolize nearly all drugs and environmental xenobiotics to which humans are exposed. Of the 57 cytochromes P450 encoded by the human genome, cytochrome P450 3A4 (CYP3A4) 2 is involved in the clearance of more than half of approved drugs and presents significant challenge to optimizing pharmacokinetics in drug development and avoiding adverse effects in the clinic (1,2). CYP3A4 enzyme kinetics can demonstrate marked cooperativity (3). Positive cooperativity is characterized by a continuous, positive increase in the rate of catalysis with increasing substrate concentration, whereas positive heterotropic cooperativity occurs when the rate of substrate turnover is enhanced by a second distinct effector molecule. Cooperativity in CYP catalysis was first recognized in microsome preparations (4 -6) and later confirmed using purified CYP3A4 with several substrates including testosterone (TST), 17␤-estradiol, amitriptyline, and aflatoxin B1 (7). A shift in the regioselectivity of midazolam (MDZ) oxidation in patients by concurrent fluconazole administration supports that heterotropic CYP3A4 interactions are also relevant in vivo (8). Cooperativity in CYP3A4 catalysis has been explained by mechanisms invoking simultaneous occupation of the active site by multiple ligands or binding to nearby allosteric sites (9 -15).
The membrane environment can have a profound effect on membrane protein function. With the exception of the N-terminal transmembrane helix, only the FЈ-and GЈ-helices of CYP3A4 are purported to interact with the membrane. Notwithstanding an only superficial interaction with the catalytic domain, the membrane imparts dramatic changes in the physical properties of the protein. Incorporation of CYP3A4 into lipid nanodiscs induces a clear increase in its thermal stability. Furthermore, the structural and dynamic changes imparted by the membrane likewise influence ligand binding affinities. Dissociation constants of some ligands decrease by an order of magnitude, and the kinetics of ligand binding are tunable by changing the lipid composition (5)(6)(7). In view of the evidence that membrane interactions modulate CYP3A4 function, it is important that they be included in models, experimental or theoretical, aimed at understanding the functional dynamics of this important enzyme.
The amphipathic nature of CYP3A4 substrates has led to speculation that the enzyme harvests them from the membrane; however, neither experimental nor theoretical evidence in direct support of this mechanism are available. Understanding how substrates are recognized by CYP3A4 and the locations of allosteric sites are required to improve predictive tools of CYP3A4 metabolism that would be invaluable to optimizing drug therapy. The ideal computational approach to help fill this knowledge gap would permit simulation of the necessary time scales for a ligand to sufficiently explore the protein surface to identify optimal binding site(s). Accelerated molecular dynamics (aMD) is particularly advantageous for this purpose because, in combination with the recent advances in graphical processing unit-enabled hardware, it can provide access to ligandbinding events that occur on the millisecond time scale without requisite knowledge of the binding pathway (16). Hundreds-ofnanosecond aMD simulations have captured the activation of the M2 and M3 muscarinic G-protein-coupled receptors (17)(18)(19) and ligand binding to the M3 receptor (20). In this work, aMD is extended to CYP3A4 to simulate TST binding to the enzyme free of a predefined reaction coordinate. aMD successfully identifies probable locations of auxiliary binding sites that reflect experimental observations and reveals a membrane-embedded pathway for TST entrance into the active site. Freeenergy analyses of the ingress pathway provides a structural basis for the kinetic behavior of CYP3A4 mutants and insight into the mechanism for exclusion of progesterone from the active site in CYP3A4 crystal structures.

Conventional molecular dynamics
The initial configuration for the CYP3A4-membrane system was generated by inserting the N-terminal transmembrane domain into a 75-Å 2 POPC bilayer such that the GЈ-helix was oriented just above the POPC headgroups, or 8 Å above the average position of the phosphate (PO 4 ) units. In the 50-ns simulation that followed, the A-anchor (residues 44 -47), FЈ-helix (residues 217-226), and GЈ-helix (residues 227-238) associated with the membrane headgroups in 10 ns. A similar orientation of the catalytic domain and time for insertion were observed by Baylon et al. (22) using only the soluble catalytic domain of CYP3A4 and the highly mobile membrane-mimetic (HMMM) representation that replaces native lipids at the aqueous interface with those having truncated tails and the interior with 1,1-dichloroethane (DCE) to mimic the bulk hydrocarbon environment.
Following the initial 50 ns of conventional molecular dynamics (cMD) simulations, three TST molecules were randomly added to the water surrounding the CYP3A4 catalytic domain, and the system was equilibrated for an additional 100 ns in the constant temperature and pressure (NPT) ensemble. Two of the TST molecules were observed to partition into the POPC headgroups. Consistent with the observations of Baylon et al. (22), excluding the N-terminal helix, the A-anchor was observed to insert deeper into the membrane followed by the GЈ-and FЈ-helices. The end point of this simulation, the structure of POPC highlighting the relevant fragments, and the positions of the relevant domains are illustrated in Fig. 1. The average positions of the acylglycerol (AG) and PO 4 units were 13.8 Ϯ 0.7 and 17.4 Ϯ 0.2 Å above the membrane center, respectively (Fig. 1C). The FЈ-helix (z ϭ 15.9 Å) sits slightly higher in the membrane than the GЈ-helix (z ϭ 14.2 Å) and therefore, on average, is bounded below by the AG and above by the PO 4 units. The A-anchor (z ϭ 12.9 Å) persists below the AG centers of mass and into the upper boundary of the hydrocarbon tails. Although the relative insertion depths of these domains are consistent with previous simulations, the absolute depths are greater than those reported previously (22). For example, in the simulations reported by Baylon et al. (22), the FЈ-helix was not observed to penetrate beyond the choline fragment of the headgroups. In the simulation described herein, insertion depths of these domains are converged prior to the addition of TST molecules to the system, so the observed differences are not attributable to the extended simulation time used in the present study. Conversely, the variations in insertions are likely attributable to differences in the properties of the HMMM and a full membrane. First, the DCE used to mimic the hydrophobic tails in the HMMM extends well into the AG layer of the headgroups (21). The result is a membrane-mimetic representation with greater density in this region than in a full Figure 1. A, end point of the 100-ns cMD simulation following the addition of TST. The A-anchor, FЈ-helix, and GЈ-helix are illustrated in green, purple, and yellow, respectively. TST molecules are red, and the POPC membrane, which is partially cut away to reveal the immersed domains, is represented as a gray contour surface. The putative auxiliary TST-binding site is highlighted with a dashed circle. B, structure of POPC illustrating the hydrophobic tails (black), AG (orange), PO 4 (blue), and choline (red). C, average distances above the membrane center of lipid components and protein domains in the leaflet bearing the CYP3A4 catalytic domain. The AG and PO 4 units are represented by orange and blue dashed lines. Positions of the A-anchor, FЈ-helix, and, GЈ-helix are represented with green, purple, and yellow solid lines, respectively. CYP3A4 substrate-binding dynamics membrane representation, thereby posing a density barrier preventing deeper penetration of the protein helices. In addition, the interspersion of DCE also decreases the polarity at this depth relative to the full membrane, imposing an energetic penalty on further penetration of the amphipathic helices. Nevertheless, experience with the HMMM is still limited, and caution should be exerted when interpreting simulations where there is limited experimental data describing the extent of the proteinmembrane interaction.

Accelerated molecular dynamics
Following cMD in the presence of TST, 50 0.5-s aMD simulations were seeded to extensively sample TST interactions with CYP3A4. Typical of all replicas, the TST distribution from one copy is illustrated in Fig. 2A. It is clear that TST molecules sample the surface of the CYP3A4 catalytic domain as well as partition into and persist in the POPC bilayer. In two replicas, TST entered the active site from the membrane face of the catalytic domain. Snapshots of the TST positions in the window surrounding these binding events are illustrated in Fig. 2B. In both instances, TST initially interacts with the GЈ-helix and is subsequently captured between the groove formed by the FЈ-/ GЈ-helices and the B-C loop. To illustrate the timescales of TST binding in each replica, the distances between each TST and the heme iron are plotted over the length of the simulations (Fig. 2, C and D). TST is ensconced in the active site by 150 and 50 ns in the first and second replicas, respectively, with an average iron-TST distance of 5-8 Å (Fig. 2, C and D, blue traces). The length of the simulation in the first replica was extended to 1 s wherein the TST was observed to remain bound in the active site adjacent to the heme.
In the cMD simulations, one TST was found in a crevice bordered by the D-, JЈ-, K-, and L-helices adjacent to the proximal face of the heme (Fig. 2, C and D). TST persisted in this site for 645 ns in the first replica where concomitant active site binding was observed with the exception that from 225 to 275 ns the TST molecule dissociated from the site, diffused more than 40 Å away, and returned to its original position. Beyond 645 ns, TST moved into a second stable binding site 5 Å from its initial position and remained there until the end of the trajec- The POPC membrane is represented as a gray contour surface. C and D, distances between the TST centers of mass and the heme iron versus time for the two replicas where TST was observed to enter the CYP3A4 active site through the membrane. Solid lines represent a 30-ns moving average of the distances. E, volumetric map (red) illustrating the highest TST mass-weighted density over the 25 s of aMD simulation. Representative TST positions from trajectory snapshots where it was observed to occupy these sites are illustrated in green. Both the TST positions and volumetric map are superimposed onto initial CYP3A4 geometry used in the aMD simulations. F and G, magnified views of the volumetric map highlighting the auxiliary TST-binding sites. H, representative cvSMD pathway for TST egress from CYP3A4.

CYP3A4 substrate-binding dynamics
tory. In the second replica, TST remained in the same auxiliary site occupied in the first replica through the entire trajectory. In both, the third TST molecule remained more than 20 Å from the heme iron.
Inspection of the remaining replicas found 28 (56%) where TST was bound in the crevices bordered by the D-, JЈ-, K-, and L-helices similar to that illustrated in Fig. 1A. TST binding to this site was reversible in many of the aMD replicas, so the frequent occupation of this site was not an artifact of the cMD end point. To obtain a cumulative view of their occupancy over all replicas, the trajectories were superimposed using the C␣ atoms, and an atomic weighted density map for TST was generated using the VolMap tool in VMD (23). The density isocontour surface reveals the greatest TST density at two neighboring peripheral sites, bordered by the D-, JЈ-, K-, and L-helices near the heme but not in the active site (Fig. 2E).
Like the high-occupancy site observed in the simulations, there is experimental evidence for a high-affinity TST-binding site that bears the features of being outside the active site (24). Using the CYP3A4 embedded in POPC nanodiscs, occupation of the high-affinity site induces only a minor shift in the iron spin state (22%), and under catalytic conditions NADPH oxidation is increased severalfold without coupled product formation. Both of these observations are consistent with a TSTbinding site that is too far from the heme iron to efficiently displace the iron-coordinated water or undergo hydrogenatom abstraction by the compound I oxoferryl species.
In both of the auxiliary sites, TST is predominately surrounded by hydrophobic residues (Fig. 2, F and G). Among these, Arg 446 lies at the interface of both sites where its aliphatic side chain contacts TST. Using cross-linking mass spectrometry and mutagenesis, Zhao et al. (25) demonstrated that Arg 446 contributes to the binding surface and mediates the electrostatic interaction between CYP3A4 and cytochrome b 5 . Like occupation of the high-affinity site by TST, the superficial interaction with cytochrome b 5 can likewise induce a partial spin transition of the heme iron (26). Loss of catalytic activity in the R446A mutant is not surprising because elimination of the cationic binding site would undoubtedly be detrimental to productive interactions with redox partners. However, this mutant is still capable of binding TST albeit with reduced affinity and cooperativity.

Adaptive biasing force simulations of TST binding
Although aMD has pushed the simulation of events in biomolecules into the millisecond regime, reweighting the trajectories to recover free energies from the original ensemble has proven more difficult because the trajectories are dominated by large values of the boost potential, resulting in noisy energy statistics (27). To assign free energies to states along the pathway for TST ingress, the pathway observed in the aMD simulation was reconstructed using constant velocity-steered molecular dynamics (cvSMD) (28,29). The cvSMD pathway is illustrated in Fig. 2H. cvSMD provides a finely resolved ensemble of structures along the ingress coordinate () from which a set of equally spaced representatives were selected for an adaptive biasing force (ABF) calculation (30) of the potential of mean force (PMF). Spanning 22 Å, was partitioned into 2-Å bins that extended from TST burial in the CYP3A4 active site to immersion in the POPC headgroups. Each bin was populated with a trajectory snapshot from the cvSMD simulation corresponding to a value of within that bin. Sampling in each bin was initially performed for 100 ns and in 10-ns increments thereafter. The complete PMF (⌬G()) converged after 150 ns with an overall free-energy change (⌬G U 3 B ) of Ϫ12.9 kcal⅐mol Ϫ1 . The 100-ns PMF, those from the subsequent 10-ns increments, and the final converged PMF are illustrated in Fig.  3A. Although the PMF is rugged as expected for ligands binding to proteins, it can be divided into three general regions: 1) TST immersed in the POPC headgroups including initial contact with the GЈ-helix, 2) a metastable intermediate, and 3) TST bound in the active site.
From total immersion in the POPC headgroups at Ӎ 20 -22 Å, the initial minimum occurring at Ӎ 18 Å corresponds to the first interaction of TST with the GЈ-helix. In a representative snapshot from Ӎ 18 Å (Fig. 3B), the lower face of TST is bounded by the GЈ-helix, and the upper face is bounded by the AG fragments of POPC. This initial interaction suggests a mechanism in which the GЈ-helix serves as the initial contact as well as a "ramp" to load TST directly from the membrane.
From the initial interaction at Ӎ18 Å, TST subsequently encounters small energy barriers (Ͻ1 kcal⅐mol Ϫ1 ) corresponding to movement over the GЈ-helix before encountering the intermediate configuration at Ӎ 10 Å and 4.9 kcal⅐mol Ϫ1 (⌬G U 3 I ) below the unbound state. At this point in the PMF, TST is flanked by the FЈ-and GЈ-helices and the B-C loop (Fig.  3C). The origin of the minimum on the PMF is apparently derived from stabilization through close packing of "Phe cluster" residues (Phe 108 , Phe 213 , Phe 220 , Phe 241 , and Phe 304 ) as well as Ile 120 , Ile 223 , and Val 240 around TST. The intermediate TSTbinding site is reminiscent of that for progesterone (PRG) in crystal structures with CYP3A4; however, there are important differences (31,32). In Fig. S1, an overlay of the intermediate structure with the CYP3A4-PRG crystal structure shows that PRG sits closer to the FЈ-helix and is packed against Phe 213 , Phe 219 , and Phe 220 . PRG is precluded from entering the active site by the interdigitation of the Phe 108 , Phe 213 , Phe 215 , Phe 241 , and Phe 304 side chains. By contrast, TST leverages the aperture open in the intermediate, disrupting the Phe cluster with concomitant movements of the B-C, F-FЈ, and G-GЈ loops, thereby resulting in deeper penetration into the entrance channel.
The PRG site has been speculated to be both an allosteric site and/or a result of PRG capture in the entrance channel by crystallography. These and previous MD simulations have established that interactions with the membrane induce remarkable conformational changes in the FЈ-an GЈ-helices compared with those observed crystallographically (22,33,34). The interaction with the membrane unwinds the N terminus of the G-helix (residues 243-245) and extends the G-GЈ loop, causing Phe 241 to disengage from the hydrophobic cluster. In light of these conformational changes, it is not surprising that the positions of TST in the intermediate state and PRG in the crystal structure are not superimposable. In the CYP3A4-PRG structures, the GЈ-helix packs against the same helix of a second CYP3A4 monomer in the unit cell, and the pair are related by an S 2 axis perpendicular to the GЈ-helical axis (Fig. S2). Sevrioukova and CYP3A4 substrate-binding dynamics Poulos (32) recognized that the interface created by this interaction creates a hydrophobic patch to accommodate PRG. This interaction apparently stabilizes the closed configuration of the Phe cluster, precluding passage of this ligand into the active site. Assuming PRG enters the enzyme using the same mechanism as TST, it is unlikely that the PRG site represents a native transport intermediate but rather a non-native state resulting from distortion of the native potential-energy surface by the crystal lattice.
From the intermediate state, TST surmounts a 2.2 kcal⅐ mol Ϫ1 energetic barrier (⌬G I 3 B ) to enter the active site. Conversely, passage of TST back into the intermediate state requires overcoming a 10.6 kcal⅐mol Ϫ1 barrier, and it is thereby kinetically trapped in the active site. The TST ␣-face is packed against the heme, whereas the side chains of Phe 304 , Phe 213 , and Phe 215 contact the ␤-face (Fig. 3D). Orientation of the ␣-face of TST toward the heme is inconsistent with the dominant 6␤hydroxylated product (35). Nevertheless, this configuration would not afford adequate volume to accommodate O 2 ; hence, TST reorientation would also be required for productive catalysis. Because the TST C6 lies ϳ3.7 Å from the heme iron ( Fig. 4A), only a minor reorganization of the TST would be required to accommodate O 2 .
The CYP3A4 active site conforms to TST with Phe 213 and Phe 215 in the F-FЈ loop and Phe 304 in the I-helix, restraining TST close to the heme. Leu 482 , Ala 305 , Arg 105 , and Ser 119 surround the TST periphery, limiting translation (Fig. 3D). In the recent structure obtained with MDZ, the ligand contacted both Leu 482 and Ser 119 where the latter residue apparently contributes a functionally important hydrogen bond to MDZ (36). The relative positions of TST, Ser 119 , Ile 301 , Phe 304 , and Arg 372 are illustrated in Fig. 4A, and the distances between those that directly contact TST and the TST center of mass are plotted in Fig. 4B. In contrast to the CYP3A4 structure with MDZ, hydrogen bonds involving Ser 119 were not observed; rather the backbone carbonyl of Ile 301 is hydrogen-bonded to the 17␤-hydroxyl of TST. At the opposing end, the 3-keto oxygen is hydrogen-bonded to water molecules that bridge the heme propionate to the backbone carbonyl of Arg 372 .
Numerous residues have been targeted for mutagenesis to identify regions of the protein involved in substrate specificity and kinetic cooperativity (37)(38)(39)(40). Mutations have successfully

CYP3A4 substrate-binding dynamics
restored hyperbolic reaction kinetics as well as mirrored the stimulation induced by heterotropic effectors. One residue, Phe 304 , has demonstrated importance in substrate specificity and catalytic cooperativity. The V max is increased for PRG hydroxylation, and the Hill coefficient is slightly reduced in the F304A mutant, consistent with a decrease in cooperativity (38). Hyperbolic 6␤-hydroxylation kinetics are restored in the F304W mutant, and stimulation by ␣-naphthoflavone is diminished (41). This evidence supports that Phe 304 sits near the interface of the active site as well as an effector-binding site. Indeed, Phe 304 does directly interact with TST in both the intermediate and fully bound states (Fig. 3, C and D). In the intermediate state, the mean distance between TST and the Phe 304 phenyl is 5.8 Ϯ 0.4 Å and is therefore closely packed against the steroid nucleus. In the bound state, at 9.8 Ϯ 0.4 Å, Phe 304 is more distant, although it is positioned to prevent TST from reverting back into the intermediate state (Fig. 4B). To the extent that the active and intermediate sites could be simultaneously occupied at high concentrations, the intermediate site could constitute the third TST-binding site and is consistent with neighboring active and allosteric sites.

ABF simulations of TST in a POPC bilayer
It is clear from examining the distribution in individual aMD trajectories that TST predominately persists among the POPC headgroups ( Fig. 2A). Similar results were obtained in umbrella-sampling studies of cortisone in a POPC bilayer where the minimum was also within the AGs (42). Considering that the FЈ-and GЈ-helices are likewise inserted at a depth dominated by the AG units and ABF calculations support a mechanism where TST is harvested from this region by the GЈ-helix, it was hypothesized that the insertion of the FЈ-GЈ region was optimized by evolution to harvest amphipathic substrates from the membrane. This hypothesis was tested using a pair of unrestrained simulations of TST in a POPC bilayer and by calculation of the PMF for TST to traverse the bilayer. First, in separate, symmetrically hydrated (Ϯ10 Å) 50-Å 2 POPC bilayers, TST was placed at the center of the membrane (among the lipid tails) and at 22 Å above the membrane center (surrounded by water). Following equilibration of the system with TST fixed in these two positions, the TST restraints were released, and the simulations were conducted for 50 ns. The composite-normalized, mass-weighted densities for TST and the membrane headgroup components over the simulations are illustrated in Fig. 5A. Despite migrating to opposing leaflets, from these analyses it is evident that TST almost exclusively persists among the AG units of the POPC bilayer.
To determine whether the AG units represented the energy minimum along the one-dimensional reaction coordinate for TST crossing the membrane ( T ), a series of constrained simulations were performed where TST was moved incrementally along the membrane normal between the center and the sol-

CYP3A4 substrate-binding dynamics
vated position to establish initial configurations for an ABF calculation. The resulting PMF for TST crossing the POPC bilayer is illustrated in Fig. 4B. The minimum of the PMF lies at T ϭ 12.0 Å above the membrane center, 6.9 kcal⅐mol Ϫ1 below TST in water ( T ϭ 20 -22 Å). The lipid tails present a 4.5 kcal⅐mol Ϫ1 barrier for TST to cross the bilayer. The PMF confirms the result of the unconstrained simulation that dissolution among the AG groups corresponds to the minimum in the PMF. To compare the basin in the PMF with the positions of the FЈ-and GЈ-helices, distributions of insertion depths from the 1-s aMD replica were superimposed in Fig. 5B. The mass density distribution of the GЈ-helix distance could be fit to a Gaussian centered at 12.8 Å, which nearly matches the position of the PMF basin. Thus, the GЈ-helix occupies a nearly optimal position for contacting membrane-embedded TST. Because the distributions were derived from long, enhanced-sampling MD simulations, it is not surprising that the distributions of these domains were broad (full width at half-maximum ϭ 11.6 Å), demonstrating that the GЈ-helix occasionally dips into the upper vicinity of the lipid tails as well as into the choline layer. This behavior would allow CYP3A4 to have great flexibility in the polarity of substrates it could recruit from the membrane. This characteristic would also permit the enzyme to function in membranes with somewhat different and more complex environments akin to those found in vivo compared with the relatively simple model POPC membrane used in these studies. Indeed, the distance separating the preferred dissolution layer for a particular substrate from the optimum position of the GЈ-helix position in a given membrane environment may influence the kinetics of substrate oxidation by CYP3A4.

Conclusion
The results presented herein represent the first simulation of ligand binding to a cytochrome P450 from solution. The positive homotropic cooperativity in the oxidation of TST by CYP3A4 is well-documented and has been attributed to three TST-binding sites (13,24,43). The TST-CYP3A4 pair is an ideal system to capture the structural details of these aspects of cytochrome P450 activity by simulation. aMD frequently and reproducibly identified binding sites in close proximity to the heme that are consistent with the partial heme spin shift and NADPH oxidation without coupled product formation previously observed at low TST concentrations (24). Albeit with lower frequency, aMD also captured the entry of TST into the active site through an entirely membrane-embedded pathway. Free-energy analysis of this path proved it to be a two-step mechanism with an intermediate state where TST rests in the access channel stabilized by residues of the Phe cluster. Taken together with existing site-directed mutagenesis data for this region, it is probable that the active and intermediate sites could be simultaneously occupied at high TST concentrations. Finally, these studies demonstrated that the GЈ-helix is near optimally positioned within the model membrane to contact TST, although its distribution spans the complete headgroup space as well as into the hydrocarbon tails, affording access to more or less polar substrates. The broader impact of these simulations is that they support a viable mechanism whereby CYP3A4 can recruit substrates directly from the membrane.
However, given the vast number of CYP3A4 substrates and their broad range of physicochemical properties, it would be irresponsible to claim that all CYP3A4 substrates use the same mechanism to enter the active site. Nonetheless, it is possible to use the approach described herein to predict how other substrates gain access to the CYP3A4 active site.

System preparation and conventional MD
An N-terminally modified version of the 2.05-Å resolution CYP3A4 structure (Protein Data Bank code 1TQN) was used as the starting point for all simulations (44). Because the extent of the fluctuations observed in the simulations far exceed the structural deviations observed in the available crystal structures, the conclusions are expected to be independent of the crystal structure used to construct the initial model (see Fig.  S3). Following removal of crystallographic water, an ␣helix with the sequence 1 MA-13 LLLAVFLVLLYLYGT 27 was appended to the N terminus of the protein. This N terminus lacks residues 3-12 of the full-length protein and bears an S18F mutation (underlined). The overwhelming majority of biophysical studies of CYP3A4-ligand interactions in model membranes have been performed with this version of the protein so it was adopted here to avoid confounding comparisons between simulation predictions and past experimental data (24,(45)(46)(47)(48)(49)(50)(51). Protonation states of amino acids were assigned using Poisson-Boltzmann-based PROPKA calculations at pH 7.4 (52). Curated CYP3A4 was then inserted into a 75-Å 2 POPC membrane constructed using the Membrane Builder plugin of VMD (23). POPC and water molecules overlapping and within 0.8 Å of the protein were removed. The system was solvated with a 10-Å layer of TIP3P water in the Ϯz directions. Potassium and chloride ions were added to an ionic strength of 0.15 M, thereby ensuring electrical neutrality.
All molecular dynamics procedures were performed with the NAMD 2.12 code (53) using the CHARMM36 (54) force field for POPC and the CHARMM27 (55) force field for the remainder of the system. The complete system was minimized for 10,000 steps followed by equilibration of the lipid tails for 500 ps with the coordinates of the remaining atoms in the system frozen. Melting of the lipid tails around the protein was followed by 2 ns at NPT simulation with a harmonic restraint (5 kcal⅐mol Ϫ1 ⅐Å Ϫ2 ) applied to the protein C␣ atoms using the Langevin piston method with a target pressure of 1.01325 bars, decay period of 100 fs, and piston temperature of 310 K. After completion of the preparative simulations, the complete CYP3A4 system was equilibrated in the NPT ensemble for an additional 50 ns. This end point served as the initial configuration for the addition of three TST molecules placed randomly in the water on the membrane face bearing the CYP3A4 catalytic domain. Parameters for TST were from the CGenFF force field (56,57). Following the addition of TST, the system was minimized for 10,000 steps and equilibrated for an additional 100 ns in the NPT ensemble.

Accelerated MD
aMD is a convenient approach to sample rare conformational transitions in proteins by increasing the escape rates CYP3A4 substrate-binding dynamics from potential energy basins through reduction of energetic barriers (27). In an aMD simulation, when the system potential energy, V(r), is greater than a user-defined value E, the simulation proceeds as in a conventional simulation. Conversely, when V(r) falls below E, a continuous, positive "boost" energy is added such that the simulation is performed on a modified potential energy surface, V*(r).
The value of the boost energy depends on a tuning parameter, ␣, that determines the extent that energy basins on the modified potential energy surface, V*(r), are elevated, effectively smoothing the potential energy surface.
Fifty replicas based on the end point of the conventional simulation containing TST were created, and their initial velocities were independently randomized at 310 K. aMD simulations were performed in "dual-boost" mode meaning that boost potentials were added to both the dihedral and total system potential energies with E dihed ϭ V dihed_avg ϩ 0.3V dihed_avg , ␣ dihed ϭ 0.3V dihed_avg /5, E total ϭ V total_avg ϩ 0.2N atoms , and ␣ total ϭ 0.2N atoms . aMD simulations were performed for 0.5 s in each replica. The simulation was extended to 1 s in one of the replicas where TST entered the active site.

Steered molecular dynamics
Following identification of the binding pathway in aMD simulations, cvSMD simulations were initiated from the first aMD trajectory snapshot that revealed occupation of the CYP3A4 active site by TST. The binding pathway was approximated by a linear path defined by the TST center of mass and a second point on the cvSMD trajectory where TST was immersed in the lipid headgroups adjacent to the GЈ-helix. The vector defined by these points served as the pulling vector. A moving harmonic restraint was attached to the TST center of mass and pulled with a velocity of 1.25 ϫ 10 Ϫ5 Å⅐fs Ϫ1 along the pulling vector. C␣ atoms of residues 19,39,167,194,285,341, and 417, all distant from the entrance channel, were harmonically restrained (600 kcal⅐mol Ϫ1 ⅐Å Ϫ2 ) to avoid translation of the CYP3A4 system. The force on the harmonic constraint was recorded every 400 fs. Steered molecular dynamics simulations were performed for a total of 4 ns and repeated 10 times to generate an ensemble for PMF calculations.

Adaptive biasing force simulations
The constant velocity pulling vector from cvSMD was adopted as the reaction coordinate for the ABF simulations (30). The reaction coordinate was divided into 2-Å bins, and trajectory snapshots with ligand positions as close as possible to the beginning of the bins were selected as initial positions. These frames were minimized for 2000 steps with the TST coordinates frozen. Those C␣ atoms restrained in the cvSMD simulations were maintained in the ABF calculations. The ABF data were accrued in bins 0.1 Å wide with upper and lower wall constants of 10 kcal⅐mol Ϫ1 ⅐Å Ϫ2 . Simulations were performed until the overall PMF was observed to converge, defined as a mean PMF difference of less than 0.3 kcal⅐mol Ϫ1 . Because the biasing force rapidly varies, no bias was applied during the first 1200 fs steps of ABF simulation to avoid perturbation by nonequilibrium effects.

ABF simulations of TST in a POPC bilayer
A 50-Å 2 POPC membrane was constructed using the Membrane Builder plugin of VMD (23) and symmetrically hydrated with an additional 10-Å layer of TIP3P water in the Ϯz directions. The hydrated membrane was minimized for 5000 steps and equilibrated in the NVT ensemble for 0.5 ns and in the NPT ensemble for 5 ns. In separate copies of the equilibrated membrane, TST was placed at the center (0 Å; in the POPC tails) and in 2-Å increments to 22 Å above the center (in water). With TST's position frozen, the systems were minimized for 10,000 steps and equilibrated for 4 ns in the NVT ensemble. Each constrained simulation served as an initial configuration for an ABF calculation of the PMF for TST entry into the membrane with the bilayer normal as reaction coordinate. The 22-Å reaction coordinate was partitioned into 2-Å bins, and each was populated with the end point geometry from each of the corresponding constrained simulations. Within each of the major bins, ABF data were accrued in bins 0.1 Å wide with upper and lower wall constants of 10 kcal⅐mol Ϫ1 ⅐Å Ϫ1 . The ABF simulations were performed for 50 ns. The equilibrated systems having TST frozen at 0 and 22 Å were used as the initial configurations for two 50-ns NPT simulations where TST was unrestrained. Atomic density profiles were calculated using the Density Profile Tool for VMD (58).