Determination of Ligand Pathways in Globins

Background: O2 pathways in animal hemoglobins and myoglobins are controversial. Results: Ligands enter and exit sperm whale Mb and Cerebratulus lacteus Hb by completely different pathways. Conclusion: Rational mutagenesis mapping can identify ligand migration pathways and provides experimental benchmarks for testing molecular dynamics simulations. Significance: Globins can use either a polar gate or an apolar tunnel for ligand entry. Although molecular dynamics simulations suggest multiple interior pathways for O2 entry into and exit from globins, most experiments indicate well defined single pathways. In 2001, we highlighted the effects of large-to-small amino acid replacements on rates for ligand entry and exit onto the three-dimensional structure of sperm whale myoglobin. The resultant map argued strongly for ligand movement through a short channel from the heme iron to solvent that is gated by the distal histidine (His-64(E7)) near the solvent edge of the porphyrin ring. In this work, we have applied the same mutagenesis mapping strategy to the neuronal mini-hemoglobin from Cerebratulus lacteus (CerHb), which has a large internal tunnel from the heme iron to the C-terminal ends of the E and H helices, a direction that is 180° opposite to the E7 channel. Detailed comparisons of the new CerHb map with expanded results for Mb show unambiguously that the dominant (>90%) ligand pathway in CerHb is through the internal tunnel, and the major (>75%) ligand pathway in Mb is through the E7 gate. These results demonstrate that: 1) mutagenesis mapping can identify internal pathways when they exist; 2) molecular dynamics simulations need to be refined to address discrepancies with experimental observations; and 3) alternative pathways have evolved in globins to meet specific physiological demands.

Most animal myoglobins (Mbs) and red cell hemoglobins (Hbs) bind O2 reversibly for storage or transport. The ligand covalently coordinates to the heme iron atom and is fully enclosed by amino acid side chains to prevent exposure to solvent and oxidation. In crystal structures, there is often no obvious pathway for ligand migration into the distal portion of the heme pocket shown in Fig. 1A. In 1966, after publication of the first high resolution globin structure, Perutz and Matthews (1) suggested that O2 enters Hb and Mb by a short channel gated by the distal histidine at the E7 helical position, and this idea was later supported by outward movement of the distal histidine when large ligands were bound to Mb (2,3). However, in 1969, internal cavities were discovered in sperm whale myoglobin (SwMb) capable of binding xenon at moderate pressures (≤10 atm (4-9)). This observation led Petsko and co-workers (5) and others in the 1980s to suggest that instead of passing through the His(E7) gate, diatomic gases may enter and exit globins through apolar pathways between helices and the loops connecting them.
The current evidence for movement of photodissociated ligands out of the distal pocket and into the Xe1 and Xe4 sites in SwMb is clear and comes from time-resolved spectroscopic methods (10-13) and from direct observations by time-resolved x-ray crystallography (14-20). However, these measurements cannot detect whether ligands enter and exit the protein from these transient positions. Ligand movements out of the protein involve overcoming larger activation barriers, and the resultant unstable transition states cannot be seen directly. Consequently, entry and exit pathways have to be inferred either theoretically from molecular dynamics (MD) simulations or experimentally from the effects of mutagenesis on kinetic parameters for ligand entry and escape.
In 1990, Elber and Karplus (21) carried out locally enhanced simulations of ligand movement in SwMb and observed multiple trajectories for ligand migration out of the protein. In 2010, Elber (22) reviewed the most recent computational literature, which still indicates that there are multiple ligand pathways in animal myoglobins and hemoglobins and that the channel gated by the distal histidine is only a minor route for entry and exit (23-31).
In 1994, Huang and Boxer (32) used an elegant random mutagenesis protocol to create and screen a library of ~1,500 single point mutants. Mutations were identified from colonies that produced lysates containing Mbs with kinetic parameters significantly different (>50%) from those of wild-type (WT) controls. Roughly 10% of the mutants examined showed differences, and a structural model of Mb was constructed to highlight the positions of the mutants showing altered kinetic parameters. In addition to positions near the bound ligand, other regions were highlighted, including interior positions near the Xe cavities and the C-terminal portion of the H helix. These results were interpreted to mean that multiple pathways for CO and O2 entry and exit exist, supporting the conclusions from the initial MD simulations (32).
Building on Huang and Boxer's strategy, our group explored the E7 gate and Xe cavity pathways in SwMb by a rational mutagenesis mapping strategy in which small (Gly or Ala) and large (Phe or Trp) mutations were constructed in ligand-accessible cavities, along the major trajectories specified by the MD simulations and at some of the positions found in the random mutagenesis screen (13). Each of 90 different mutants was produced in milligram quantities; geminate and bimolecular rate parameters for ligand binding were measured with pure proteins; and in many cases, crystal structures were determined (33-38).
Our mutagenesis mapping results strongly indicate the following: 1) although O2 can move from the binding pocket into the Xe cavities, it must return to the distal pocket to exit the protein, and 2) the major pathway (≥75%) for O2 entry and exit is the short channel that opens by outward movement of the distal histidine side chain. The discrepancy between our mutagenesis map and that of Huang and Boxer (32) is not as great as it first appears. The largest changes in their random mutagenesis map were observed at or near the E7 gate, and most of the effects observed at remote positions can be rationalized as false positives due to mutations that destabilize the Mb structure, leading to lysates containing partially unfolded proteins with exposed or weakly bound heme groups (i.e. many of the Pro substitutions (32)). The multiple trajectories obtained from MD simulations are harder to reconcile and imply that either we cannot "find" internal pathways with a rational mutagenesis mapping strategy or the models and interaction potentials used in the simulations cannot accurately define ligand movements into solvent.
In 2002, Pesce et al. (39) reported the structure of the mini-hemoglobin from the sea worm Cerebratulus lacteus (CerHb). CerHb is a neuronal O2 storage protein, which surrounds the axons and brain of the nemertean worm, and it rapidly releases O2 when the worm burrows into sand and becomes anaerobic while searching for prey (40). The analysis of the CerHb crystal structure revealed the presence of an apolar tunnel leading from the distal pocket to solvent, with an exit point between the E and H helices created by the loss of the N-terminal A helix (Fig. 1B). This tunnel was suggested to be the major route of entry and exit for O2 binding, and later work indicated that this "open" channel accounts for the very large association rate constants for O2 and NO binding to CerHb (200-300 μM⁻¹ s⁻¹), which are close to the values expected for diffusion-controlled processes (41).
Our initial mutagenesis and Xe binding studies of CerHb supported the original idea of Pesce et al. (39) that the internal tunnel is the major ligand pathway (39,42,43). Thus, we have used CerHb as a counterexample to SwMb, where the histidine gate mechanism for O2 binding appears to apply. However, Orlowski and Nowak (44) carried out simulations with CerHb and, as with SwMb, obtained multiple trajectories with the apolar tunnel being only one of them.
To resolve this discrepancy and determine whether the experimental mutagenesis mapping strategy can identify internal pathways, we constructed and examined the O2, CO, and NO ligand binding properties of 120 different mutants of CerHb. Rates of ligand entry and exit for this library of CerHb variants were compared with results for an expanded library of SwMb mutants, which extends the results of Scott et al. (13) to regions near the ends of the E and H helices.
Our goals were: 1) to verify the mutagenesis mapping strategy; 2) to define experimentally the ligand pathways in the two proteins; 3) to compare the results to MD simulations; and 4) to provide benchmarks for future theoretical work.

EXPERIMENTAL PROCEDURES
Hb and Mb Sample Preparations-WT and mutant recombinant CerHbs were expressed and purified as described previously using a synthetic gene with codon usage optimized for expression in Escherichia coli (39,41). The WT SwMb gene and expression plasmid (pMb413) were constructed by Springer and Sligar (45) to optimize codon usage in E. coli. All new mutants were constructed using this gene in a pUC19-based expression vector and the Stratagene PCR-based QuikChange site-directed mutagenesis kit (Stratagene, La Jolla, CA). Purification was achieved as described previously by Springer and Sligar (45,46) and modified by Carver et al. (47). The WT control and the mutant Mbs have an extra N-terminal Met residue and a D122N substitution for crystallization in the P6 space group, neither of which has an effect on ligand binding (48). All experiments to determine rate constants were carried out in 0.1 M phosphate buffer, pH 7.0, 1.0 mM EDTA, 20°C.
Measurement of Rate Constants for Ligand Association (k′O2, k′CO, and k′NO) and Dissociation (kO2 and kCO)-CO association time courses were measured after complete laser photolysis of SwMbCO and CerHbCO samples containing various [CO] under pseudo first order conditions as described previously (42). The association rate constants for NO binding to deoxy-SwMb and deoxy-CerHb were measured using a flow-flash multimixing apparatus and a 500-ns dye laser (42,43,49). In these experiments, SwMbCO or CerHbCO samples in deoxygenated buffer were mixed with various concentrations of NO and then photolyzed ~10-20 ms after mixing, allowing measurement of time courses for NO binding to the newly formed deoxy-SwMb or deoxy-CerHb without the limitations of the low quantum yields of CerHbNO and SwMbNO complexes. O2 association and dissociation time courses were measured after complete laser photolysis of SwMbCO and CerHbCO samples in O2/CO mixtures using a 500-ns dye laser as described previously (42,49,50). Experimental procedures and fitting routines to obtain values for k′O2 and kO2 (k′O2 indicates the oxygen association rate constant, and kO2 indicates the oxygen dissociation rate constant) from analyses of the fast bimolecular binding and slow O2 displacement phases have been described in previous work (42,49,50).
Measurement of Geminate Recombination-Time courses for internal geminate CO and O2 recombination within CerHb and SwMb were measured at 436 nm after excitation with a 7-ns excitation pulse from a Lumonics YAG-laser system, using a Tektronix TDS3052 digitizing oscilloscope with nanosecond time resolution. Experimental procedures and fitting routines were followed as described previously (13,42). Sample time courses are shown in Figs. 5-7. The results were analyzed and interpreted in terms of the previously published mechanisms for ligand binding to SwMb (13) and CerHb (42,43), as described below.
Analysis of Bimolecular and Geminate Rate Parameters-In Scott et al. (13), bimolecular and geminate O2 binding to SwMb was analyzed in terms of the three-step, side-path model shown in Fig. 1A. The rate constants for intramolecular rebinding, escape, and movement through the protein between sites B and C were obtained by fitting sets of time courses to this model using numerical integration algorithms described by Scott and Gibson (12). The geminate rebinding time courses for WT SwMb and many of the mutants can also be fitted well by the sum of two exponential processes and an offset reflecting the amount of escape to solvent. In this analysis, the rate of the fast phase equals kbond + kBC + kescape; the total fraction of geminate recombination, Fgem,total, equals kbond/(kbond + kescape); and the fraction of the slow phase, Fgem,slow, approximately equals kBC/(kbond + kBC + kescape), allowing an estimation of all three rate parameters (13). It should be noted that a simple two-step reaction analysis, using the correct Fgem,total, provides the same value of k′entry and an only slightly larger kescape value.
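The two-exponential relations above can be inverted algebraically. The following is a minimal sketch, not the fitting routine used in this work: the function name and the input values are hypothetical, and it simply solves the three stated expressions for kbond, kBC, and kescape, treating the approximate relation for Fgem,slow as exact.

```python
import math

def side_path_rates(k_fast, f_gem_total, f_gem_slow):
    """Invert the two-exponential relations for the side-path scheme.

    Assumed relations (taken from the text, with the approximate one
    treated as exact):
      k_fast      = k_bond + k_BC + k_escape
      f_gem_total = k_bond / (k_bond + k_escape)
      f_gem_slow  = k_BC / (k_bond + k_BC + k_escape)
    All rates share the units of k_fast.
    """
    if not (0.0 <= f_gem_total <= 1.0) or not (0.0 <= f_gem_slow < 1.0):
        raise ValueError("fractions must lie in [0, 1] (f_gem_slow < 1)")
    k_bc = f_gem_slow * k_fast      # share of the fast decay that goes to site C
    k_sum = k_fast - k_bc           # k_bond + k_escape
    k_bond = f_gem_total * k_sum    # portion of the remainder that rebinds
    k_escape = k_sum - k_bond       # portion that escapes to solvent
    return k_bond, k_bc, k_escape

# Illustrative numbers only (not measured values).
k_bond, k_bc, k_escape = side_path_rates(k_fast=20.0, f_gem_total=0.5, f_gem_slow=0.25)
print(k_bond, k_bc, k_escape)
```

Because the Fgem,slow relation is only approximate, these estimates would be starting values for a full numerical fit rather than final parameters.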
The side-path mechanism was chosen for analysis of SwMb data because both Xe binding and mutations that fill the Xe1 and Xe4 cavities collapse the time courses to a single rapid phase with little or no change in the total fraction of geminate rebinding (12,13). Similar results were obtained for geminate ligand rebinding to WT Scapharca inaequivalvis HbI, where again, Xe binding and Phe and Trp replacements in internal cavities caused little change in the total fraction of geminate recombination (51). In the case of HbA subunits, single geminate phases for CO rebinding are observed at room temperature on nanosecond time scales and indicate that there are no side path C states, and again, the E7 channel appears to be the path for ligand entry and escape (49,52). Thus, the scheme shown in Fig. 1A appears to apply to most animal hemoglobins with a tertiary and distal pocket structure similar to that of SwMb.
The fraction of photodissociated ligands that geminately rebind after photolysis is defined solely by kbond/(kbond + kescape) in the SwMb scheme in Fig. 1A. There is no dependence on the rates of interconversion of the C and B sites, which only affect the rates and amplitudes of the slower geminate phases. Thus, regardless of the complexity of the internal movements, the overall bimolecular rate constant for binding of ligand X to SwMb under physiological conditions, k′X, is given by Equation 1,

$$k'_{X} = k'_{\mathrm{entry}} F_{\mathrm{gem}} = k'_{\mathrm{entry}} \frac{k_{\mathrm{bond}}}{k_{\mathrm{bond}} + k_{\mathrm{escape}}} \quad \text{(Eq. 1)}$$

The corresponding rate parameters are given in Table 3 for new mutants and in Scott et al. (13) for the remaining 90 variants.
It is important to note that the experimental values of Fgem are model-independent and depend only on an accurate definition of the absorbance change for the total amount of geminate recombination. Thus, Equation 1 can also be used empirically to estimate the rate of entry into any globin, including CerHb, where in Fig. 1B binding involves a linear sequential set of ligand movements. As discussed by Scott et al. (13), the steady-state expression for k′X using a linear scheme is still k′X = k′entry·Fgem, but in this case k′entry represents the bimolecular rate of entry into the first discrete site in the protein, which, in the case of CerHb, would be the apolar tunnel. However, if the barriers along the tunnel and from the tunnel into the distal pocket are small compared with bond formation, the mechanism reduces to a two-step process with a single geminate intermediate. Under these conditions, Fgem can again be approximated as kbond/(kbond + kescape).
In the case of CO binding to CerHb, there is evidence of a barrier between the tunnel and the distal pocket implying a linear three-step scheme. However, this extra complexity is only evident at low temperatures (53) or at room temperature when restrictions in the tunnel are completely removed as is observed in the L86A mutant (43).
When movement into and through the tunnel is restricted by mutagenesis, the geminate time courses become single exponentials and are readily analyzed as a simple two-step scheme (Equation 2).

SEPTEMBER 28, 2012 • VOLUME 287 • NUMBER 40

JOURNAL OF BIOLOGICAL CHEMISTRY 33165
We used the expressions in Equations 1 and 2 to estimate values of k′entry, kescape, and kbond for CO binding to all of the mutants of CerHb examined (Tables 1 and 4 and supplemental Tables S1-S4). These estimations, although empirical, are the simplest and closest to the observed experimental data, making the maps shown in Fig. 4 experimentally defined and not model-dependent.
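As an illustration of how these estimates follow from the measured quantities, the sketch below assumes a two-step scheme in which the observed geminate rate is kgem = kbond + kescape and Fgem = kbond/(kbond + kescape), together with k′X = k′entry·Fgem. The function name and the numerical inputs are hypothetical and are not taken from the tables.

```python
import math

def two_step_rates(k_assoc, k_gem, f_gem):
    """Estimate entry, escape, and bond-formation rates for a two-step scheme.

    Assumptions (a sketch, not the authors' fitting code):
      k_gem   = k_bond + k_escape            (observed geminate rate)
      f_gem   = k_bond / (k_bond + k_escape) (fraction of geminate rebinding)
      k_assoc = k_entry * f_gem              (overall bimolecular rate constant)
    """
    if not (0.0 < f_gem <= 1.0):
        raise ValueError("f_gem must be in (0, 1]")
    k_bond = k_gem * f_gem            # internal rebinding to the iron atom
    k_escape = k_gem * (1.0 - f_gem)  # first-order escape to solvent
    k_entry = k_assoc / f_gem         # bimolecular entry from solvent
    return k_entry, k_escape, k_bond

# Illustrative inputs only: k_assoc in uM^-1 s^-1, k_gem in us^-1.
k_entry, k_escape, k_bond = two_step_rates(k_assoc=30.0, k_gem=100.0, f_gem=0.2)
print(k_entry, k_escape, k_bond)
```

Dividing by Fgem is what makes k′entry sensitive to small errors when geminate recombination is weak, so inputs with Fgem near zero should be treated with caution.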
NO Association and the Validity of Using k′entry and kescape-In principle, the calculated values of k′entry and kescape should be used to map the ligand pathways quantitatively because these parameters factor out the differences in reactivity (kbond values) among O2, CO, and NO. The values of these entry and exit parameters should be the same for all three diatomic gases because of their similar sizes and diffusion constants at room temperature. In the case of CO binding to WT CerHb, the relative errors in these calculated parameters are very large because the extent of CO geminate recombination is very small, ≤0.05, and it has an absolute error of approximately ±0.03, which is roughly the same for all values of Fgem measured. To reassure ourselves that the calculated k′entry values are valid for the CerHb mutants, we examined the correlations between k′entry, k′NO, and k′O2 for some of the key variants (Table 1). In contrast to CO, the fraction of internal NO recombination after photolysis is very large, ≥0.99, due to its high reactivity with iron. Once inside the protein, NO reacts immediately with the iron atom, with almost every internal collision between them being productive. As a result, the rate-limiting step for NO binding from solvent is the bimolecular rate of ligand entry into the protein, and the observed association rate constant, k′NO, should equal the calculated value of k′entry.

As shown in Fig. 2A, there is a linear correlation between k′entry, calculated using k′CO and Fgem,CO, and the experimentally measured value of k′NO for a series of Val-7(B6), Gln-44(E7), Thr-48(E11), Ala-55(E18), and Leu-86(G12) CerHb mutants (Table 1). The error in k′entry is very large for those mutants with large k′CO and very small Fgem,CO values, making it hard to determine the slope of the correlation between k′entry and k′NO.
In addition, there appears to be a systematic overestimation of k′entry for the CerHb variants with Fgem,CO < 0.1 (Fig. 2A). However, the correlation is strong for mutants where Fgem is larger due to partial blockage of the tunnel. Similar linear correlations between k′entry and k′NO were observed for all of the SwMb mutants reported in Scott et al. (13), verifying the k′entry values for myoglobin.
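The sensitivity of k′entry to the uncertainty in Fgem can be made concrete with a first-order error propagation. This is a sketch under stated assumptions: k′entry = k′CO/Fgem,CO, a constant absolute error of ±0.03 in Fgem, and independent errors combined in quadrature. The function name and the optional error term for k′CO are hypothetical.

```python
import math

def relative_error_k_entry(f_gem, abs_err_f_gem=0.03, rel_err_k_assoc=0.0):
    """First-order relative error in k_entry = k_assoc / f_gem.

    Assumes independent errors, so the relative errors of the quotient
    add in quadrature. abs_err_f_gem defaults to the +/-0.03 absolute
    error quoted in the text; rel_err_k_assoc is an optional,
    hypothetical relative error in the measured association rate constant.
    """
    if f_gem <= 0.0:
        raise ValueError("f_gem must be positive")
    rel_err_f = abs_err_f_gem / f_gem
    return math.sqrt(rel_err_f ** 2 + rel_err_k_assoc ** 2)

for f in (0.05, 0.10, 0.15, 0.50):
    print(f"F_gem = {f:.2f}: relative error in k_entry ~ {relative_error_k_entry(f):.0%}")
```

With Fgem = 0.05 the estimate carries an error of about 60%, which falls to about 20% at Fgem = 0.15. This is why the correlation in Fig. 2A is better defined for mutants in which partial blockage of the tunnel raises Fgem.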
As shown in Fig. 2B, there is a 1:1 correlation between k′NO and k′O2 for the CerHb variants in Table 1, implying that O2 binding is also primarily limited by the bimolecular rate of entry into the protein. The only exceptions are the T48V mutant, where the loss of γ-OH at the E11 position releases the phenol side chain of Tyr(B10), allowing it to sterically inhibit iron-ligand bond formation, and the Q44F mutation, where the larger Phe(E7) side chain directly inhibits O2 binding at the iron atom. For both these mutants, overall O2 binding is slowed because bond formation is markedly inhibited. As a result, kbond becomes similar to kescape, causing a decrease in k′O2 due to a decrease in Fgem and not the rate of entry (Equations 1 and 2). In the case of NO, its high reactivity with the iron atom is still much greater than that of ligand escape, keeping ligand entry as the rate-limiting step even for the T48V and Q44F CerHb mutants. Fig. 2C shows the correspondence between the ligand entry rate calculated from overall and geminate CO binding data and the bimolecular rate constant for O2 binding. In this case, there is an ~1:1 correspondence with all the mutants examined in this work, even though the scatter is fairly large. All of the results in Fig. 2 ...

... and CO forms of this multiple mutant were obtained by reduction of the ferric crystals in the presence of either air or 1 atm CO. These manipulations often caused high mosaicities and reversion to the ferric or completely deoxygenated reduced form. As a result, crystal manipulations and data collection were repeated several times.
In all cases crystals were transferred to appropriate storage solutions containing 2.8 M ammonium sulfate and 25% sucrose (w/v) immediately before data collection at 100 K. For CO complexes, the mounting solution was saturated with 1 atm of pure carbon monoxide.

Table footnote a: The numbers for WT CerHb are based on the average of all determinations for the past 6 years (≥10 separate preparations). As described in the text, the error in k′entry is very large for WT CerHb because Fgem is close to 0.0, poorly defined, and has an error of ±60%. When Fgem is ≥0.1, the error in k′entry diminishes greatly to approximately ±20%. Complete sets of O2 and CO binding parameters for these and all other variants examined are given in supplemental Tables S1-S4. Some of the parameters in this table were taken from previous work (42,43) or were re-measured.
A cryo-cooling N2 system was used to maintain low temperature (100 K) in the environment of the mounted crystals to reduce radiation damage. Complete x-ray diffraction data sets for each of the CerHb variants were obtained from single crystals using CuKα radiation (λ = 1.5418 Å) from a Rigaku RUH3R rotating anode x-ray generator operated at 50 kV and 90 mA and a Rigaku R-AXIS IV++ image plate detector (Rigaku Americas Co.). Data were collected, scaled, and reduced using d*TREK software (54).
The program PHENIX (55) was used for both structure solution and refinement. The structures of the H100F and H100W CerHb derivatives (PDB codes 4F6B, 4F6D, 4F6F, 4F6G, 4F6I, and 4F6J) were solved by difference Fourier syntheses using the structure of wild-type CerHb (PDB code 1KR7) with the ligand and solvent molecules omitted. The structures of the Y11F/Q44L/T48V/A55W CerHb mutants (PDB codes 4F68 and 4F69) were also solved by difference Fourier syntheses using the structure of the A55W CerHb mutant (PDB code 2VYY). An initial round of simulated annealing was used to calculate unbiased electron density maps to confirm the correct placement of the mutated residues. This process was followed by several refinement macrocycles with maximum likelihood as the target and included bulk solvent correction and anisotropic scaling of the data, individual coordinate refinement with minimization, and individual isotropic ADP refinement interspersed with manual fitting. The ligands were modeled into the electron density maps at this point, and refinement was continued with restraints imposed on the expected Fe2+-CO, Fe2+-O2, or Fe3+-H2O geometries. Solvent molecules were then included, and in a few cases, combined TLS and individual ADP refinements were carried out in the final stages of refinement.
Map fitting and other manipulations with molecular models were performed using the graphic software COOT (56). The accession codes for the models, crystal parameters, and statistics of x-ray data collection and refinement are provided in Table 2. Figs. 1, 4, 5, and 6 were prepared using the PyMOL Molecular Graphics System, version 1.2r3pre (Schrödinger, LLC).

Mutagenesis Mapping Strategy-The distal pockets, internal Xe-binding cavities, and tunnels within SwMb and CerHb are shown in Fig. 1. Key amino acid side chains near the E7 gate (Gln-44 in CerHb and His-64 in SwMb), at the Xe-binding sites, and along the apolar tunnel in CerHb between the E and H helices are shown as sticks and labeled in the upper panels of Fig. 1. From these side views, it is clear that the continuous apolar tunnel in CerHb is not present in SwMb due both to the presence of the A helix and to the occurrence of larger amino acid side chains located along the interiors of the E, G, and H helices in SwMb, which separate the various internal Xe cavities.
To define experimentally the ligand entry/exit pathway in CerHb, we constructed a series of replacements with small (Ala and Val) to large (Trp and Phe) amino acids at positions in or along the E7 channel, the apolar tunnel, and other regions reported to be important, based on previous simulations and experimental work with recombinant Mbs and Hbs. We examined 101 single point mutations at 22 different positions in CerHb, and the overall rate constants for O2 and CO binding were measured using rapid mixing, laser photolysis methods with O2/CO mixtures, and flow-flash techniques as described in Salter et al. (42) and Pesce et al. (43). Internal CO recombination within each mutant was also measured using a 7-ns YAG laser. The rate, kgem, and the fraction, Fgem, of geminate recombination and the overall bimolecular rate constant for CO binding, k′CO, were used to estimate the first order rate constant for ligand escape (kescape = kgem,CO(1 − Fgem,CO)) following its photodissociation from the heme iron and the bimolecular rate of ligand re-entry from solvent (k′entry = k′CO/Fgem,CO) (42,43). In key cases, the calculated value of k′entry was verified by measuring the bimolecular rate constant for NO association as described under "Experimental Procedures" (Table 1 and Fig. 2). Complete sets of O2 and CO kinetic parameters for all the CerHb variants examined are listed in supplemental Tables S1-S4.
Underlying Assumptions-The first premise of our mutagenesis approach is that larger and smaller apolar amino acid side chains should impose larger and smaller kinetic barriers, respectively, for both ligand entry and escape if they are located along the primary pathway for ligand binding (13,42). Thus, the overall bimolecular rate constant of ligand entry for a mutant with a small amino acid, k′entry,small, should be significantly larger than the entry rate constant for a mutant with a large amino acid, k′entry,large, at positions located along the migration pathway that leads to bond formation. Similar relationships should occur for the ligand escape rate constants for the same set of mutants. To quantify these effects, we computed the overall rate enhancement effect, Renhance, of a large-to-small mutation as shown in Equation 3 (13),

$$R_{\mathrm{enhance}} = \log\left(\frac{k'_{\mathrm{entry,small}}}{k'_{\mathrm{entry,large}}}\right) + \log\left(\frac{k_{\mathrm{escape,small}}}{k_{\mathrm{escape,large}}}\right) \quad \text{(Eq. 3)}$$

Large-to-small replacements at positions on the major ligand pathway will increase both k′entry and kescape, which will lead to positive values for both logarithmic terms in Equation 3 and a large value of Renhance when added together. In contrast, mutations at positions distant from the ligand trajectory should have little or no effect on entry and escape.
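Equation 3 is simple to evaluate once entry and escape rates are available for a small and a large side chain at the same position. The sketch below assumes base-10 logarithms, which the text does not state explicitly. The function name and the rate values are hypothetical illustrations, not data from the tables.

```python
import math

def r_enhance(k_entry_small, k_entry_large, k_escape_small, k_escape_large):
    """Rate enhancement of a large-to-small replacement (Equation 3).

    Sum of the log ratios for entry and escape. Base-10 logarithms are
    an assumption of this sketch.
    """
    rates = (k_entry_small, k_entry_large, k_escape_small, k_escape_large)
    if any(r <= 0.0 for r in rates):
        raise ValueError("all rate constants must be positive")
    return (math.log10(k_entry_small / k_entry_large)
            + math.log10(k_escape_small / k_escape_large))

# Hypothetical position on the pathway: both rates rise 10-fold with the small residue.
print(r_enhance(100.0, 10.0, 50.0, 5.0))
# Hypothetical position off the pathway: no change in either rate.
print(r_enhance(10.0, 10.0, 5.0, 5.0))
```

Summing the two logarithms means that a position scores highly only when the small side chain accelerates both entry and escape, as expected for a residue that forms part of the kinetic barrier on the pathway itself.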
The second premise of our mapping approach is that ligand binding and single point mutations have only minor effects on overall globin structure. Ligand binding to Mb does produce structural changes involving replacement of distal pocket water, small outward movements of the distal histidine, and alterations of the proximal pocket and minor tilting of the heme to accommodate in-plane movement of the iron atom. However, none of these changes have a significant effect on the overall tertiary structure and the internal Xe cavities shown in Fig. 1A (14, 19, 36, 37). Recently, Germani et al. (57) have shown that CO and O2 binding to deoxy-CerHb causes even smaller changes in the heme cavity and has no effect on the size or shape of the apolar tunnel.

[Fig. 2 legend, fragment: ... Table S2). C, k′entry versus k′O2; open circles, all single mutants are located in and along the apolar tunnel (supplemental Table S1); closed circles, multiple mutants are located in the apolar tunnel (supplemental Table S4).]
Our previous work with distal pocket mutants of CerHb also showed very little change in overall structure (41), and crystallographic studies of the mutations at the 55(E18) (42) and 100(H11) positions (this report) near the solvent entrance to CerHb have little effect on overall globin structure or the tunnel. In contrast, Xe addition to WT CerHbO2 crystals causes loss of diffraction, and we could only determine the structure of Xe bound to L86A CerHb where the tunnel cavity near the internal G12 position is enlarged. Similarly, we have been unable to crystallize V7F, V7W, L86F, and L86W variants, presumably because the larger aromatic side chains expand the tunnel and unfavorably alter inter-protein contacts in the crystal lattice. However, all standard UV-visible, FTIR, and CD spectroscopic properties of these tunnel mutants are very similar to those of the WT protein.
In addition, mutations near the bound ligand can have large effects on overall ligand binding parameters due to either direct steric hindrance of bond formation or electrostatic stabilization of the bound ligand. In most cases, these effects are manifested as large changes for the internal rate of iron-ligand bond formation or dissociation (kbond and k−bond in Fig. 1) and do not significantly alter the rates for entry and escape (13,38). In some cases, the results can be very complex, particularly for mutants containing Q44H and Q44W replacements (supplemental Tables S2 and S3) (42). However, in general, the assumption that single point mutations are conservative with respect to overall globin tertiary structure appears to hold for both CerHb and SwMb (33, 36-38, 41-43, 48).
We also constructed and characterized eight new SwMb mutants to augment the data from Scott et al. (13) and to better complement the set of CerHb mutants located near the exit point of its apolar tunnel and at position CD3 near the His(E7) gate (Table 3). These SwMb mutations included R45W, L61W, L69A, L69W, I75A, I75W, L135A, and L135W at the CD3, E4, E12, E18, and H12 helical positions. Estimates of the rates of entry and escape in SwMb were measured using O2 instead of CO, as described in Scott et al. (13). The iron atom in Mb is less reactive due to proximal constraints, and as a result, very little internal CO recombination is observed at room temperature in ordinary buffers. In contrast, WT Mb shows ~50% geminate O2 recombination, and the rate is slow enough (kgem ≈ 10-20 μs⁻¹, t1/2 ≥ 30-60 ns) to be measured readily using a 7-ns YAG laser excitation pulse. In the case of CerHb, O2 geminate recombination is too fast to measure accurately with WT and most mutant proteins (kgem ≥ 130 μs⁻¹, t1/2 ≤ 5 ns) (42).
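The link between the geminate rate and the laser pulse width can be checked with the first-order half-time, t1/2 = ln 2/kgem. The sketch below assumes single-exponential decay and rate constants expressed in μs⁻¹. The function name is hypothetical, and the 7-ns value is the excitation pulse width given above.

```python
import math

def half_time_ns(k_gem_per_us):
    """Half-time in ns of a first-order decay with rate k_gem in us^-1.

    Assumes a single exponential process: t_half = ln(2) / k.
    """
    if k_gem_per_us <= 0.0:
        raise ValueError("rate constant must be positive")
    return math.log(2.0) / k_gem_per_us * 1000.0  # convert us to ns

PULSE_WIDTH_NS = 7.0  # excitation pulse width of the YAG laser

for label, k in (("SwMb O2, lower bound", 10.0),
                 ("SwMb O2, upper bound", 20.0),
                 ("CerHb O2", 130.0)):
    t_half = half_time_ns(k)
    resolved = "longer" if t_half > PULSE_WIDTH_NS else "shorter"
    print(f"{label}: t1/2 ~ {t_half:.1f} ns ({resolved} than the pulse)")
```

Under these assumptions the SwMb half-times are several pulse widths long, whereas the CerHb value falls below the pulse width, which is consistent with the statement that O2 geminate recombination in CerHb is too fast to measure accurately with this laser.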
Rate Enhancement Maps-Bar graphs of the rate enhancements caused by large-to-small mutations, R enhance , versus sequence position are shown in Fig. 3 for both SwMb and CerHb. In both cases, significant effects are observed at the active site positions where the amino acid side chains interact directly with bound ligands (see B10, CD1, E7 and E11 bars labeled with blue diamonds in Fig. 3). However, beyond those regions the differences between Mb and CerHb are dramatic.
In the case of Mb, large rate enhancements are only observed at or near the E7 channel or in the back of the distal pocket where ligands are initially captured before binding to the heme iron atom, i.

a The values of KO2 were calculated from k′O2/kO2, and its standard deviation was calculated from the standard propagation of error formula. When fitting the O2 geminate data, single and double exponential fits were examined. In the case of kgem values, estimates from one exponential fits were used initially. In the case of Fgem values, two exponential fits were used to estimate the total Fgem,O2. Similar values for k′entry and kescape were obtained from two exponential analyses as described under "Experimental Procedures." Because no significant Renhance values were observed for these mutants, further more sophisticated analyses were not warranted.
b Rate constants in bold are taken from Ref. 13.

connects to the apolar tunnel (bars labeled with green asterisks, Fig. 3B).
The magnitudes of the Renhance parameters are mapped onto the three-dimensional crystal structures of WT SwMb and CerHb in Fig. 4 (13). The order of Renhance effects is red > orange > yellow spheres at native amino acid positions (see legend to Fig. 4). The white or gray sticks in Fig. 4 indicate positions where little or no effect is observed on k′entry and kescape for large-to-small amino acid substitutions.
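The color coding used for the structural maps can be written as a simple binning rule. The sketch below uses the thresholds given in the legend to Fig. 4 (red for Renhance ≥ 1.5, orange for 1.0 to 1.5, yellow for 0.5 to 1.0, and white or gray otherwise); the function name and the example values are hypothetical and are not data from the paper.

```python
def enhancement_color(r_enhance: float) -> str:
    """Map an R_enhance value to the color class described in the Fig. 4 legend."""
    if r_enhance >= 1.5:
        return "red"
    if r_enhance >= 1.0:
        return "orange"
    if r_enhance >= 0.5:
        return "yellow"
    return "gray"  # white or gray sticks: no significant effect


# Hypothetical values, for illustration only
for value in (2.0, 1.2, 0.7, 0.1):
    print(value, enhancement_color(value))
```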
The results are quite clear for both proteins. Only positions near the E7 gate or in the distal heme pocket are highlighted in SwMb, and no rate enhancements are observed along alternative pathways, including the Xe pockets or other interior apolar regions (Fig. 4, A-C). In contrast, mutagenesis mapping clearly highlights the internal tunnel in CerHb (Fig. 4, D-F), proving experimentally that ligands enter and exit through this pathway. This conclusion is particularly evident in Fig. 4F where the tunnel between the E and H helices is clearly seen and circumscribed by red, orange, and yellow side chains. The same view for SwMb (Fig. 4C) shows that this pathway is completely blocked by the A helix.
Identifying the Solvent Exit of the Apolar Tunnel-Three further tests of the apolar pathway in CerHb were performed. First, we explored the solvent-exposed exit aperture of the apolar tunnel at the ends of the E and H helices; second, we examined the influence of polar interactions at the E7 gate; and third, we filled the apolar channel with aromatic amino acid side chains to completely block this pathway.
In WT CerHbO2, the polar imidazole side chain of His-100 at the H11 helical position is extended out into solvent and does not appear to block entry into the protein (Figs. 1B and 5B) (42).
As expected, removing the imidazole at this position by a H100A mutation has little effect on ligand binding; however, both Phe and Trp mutations at this position cause ~3-fold decreases in both k′O2 and kO2, with little change in affinity (Table 4). These same aromatic amino acid mutations also cause large increases in the fraction of CO geminate recombination, from ~0.05 for WT CerHbCO to ~0.34 for H100W (Fig. 5A), and a marked decrease in kgem,CO due to an ~10-fold decrease in kescape (Table 4). All of these effects suggest that large Phe and Trp side chains at the H11 helical position in CerHb rotate back to the protein surface to partially cover the solvent exit of the ligand tunnel and restrict ligand escape.
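The link between Fgem, kgem, and the rates of entry and escape can be illustrated with a minimal sketch. It assumes the simple two-step scheme commonly used for this kind of analysis, in which a photodissociated ligand either rebinds (kbond) or escapes (kescape), so that kgem = kbond + kescape, Fgem = kbond/kgem, and k′entry = k′overall/Fgem. The equations actually used by the authors are given under "Experimental Procedures", which is not reproduced here, so these relations, the names, and the input numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TwoStepRates:
    k_bond: float    # internal bond formation, same units as k_gem
    k_escape: float  # ligand escape to solvent, same units as k_gem
    k_entry: float   # bimolecular entry, same units as k_overall


def two_step_rates(k_gem: float, f_gem: float, k_overall: float) -> TwoStepRates:
    """Split observed parameters into elementary rates (assumed two-step scheme)."""
    if not 0.0 < f_gem < 1.0:
        raise ValueError("f_gem must lie strictly between 0 and 1")
    return TwoStepRates(
        k_bond=k_gem * f_gem,
        k_escape=k_gem * (1.0 - f_gem),
        k_entry=k_overall / f_gem,
    )


# Hypothetical inputs: small changes in a small F_gem shift k_entry strongly
for f_gem in (0.04, 0.05, 0.06):
    print(f_gem, two_step_rates(k_gem=100.0, f_gem=f_gem, k_overall=20.0))
```

Because k′entry scales as 1/Fgem in this scheme, a small absolute uncertainty in a small Fgem translates into a large uncertainty in the computed entry rate, whereas mutants that raise Fgem give better-determined values.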
The imidazole ring of the naturally occurring His-100 side chain is poorly defined in the WT CerHbO2 structure, with only a portion of the expected electron density of the ring apparent in the calculated map (Fig. 5B). In the structures containing the A55W replacement, the His-100 side chain is better defined but in a different orientation due to the large indole ring at the E18 position, which restricts the motion at the H11 helical position (see Fig. 6C). Fig. 5C shows the conformation of His-100 in seven different CerHb structures in both reduced and oxidized forms.
The side chain appears to show free rotation about the Cα-Cβ bond with no specific energy minimum, presumably because of equally favorable interactions with solvent water molecules. As a result, the entrance into the apolar channel is kept open. When Phe or Trp is inserted into this position, the more apolar aromatic rings appear to be forced into the entrance of the channel by a hydrophobic effect and restrict ligand entry and escape (Table 4 and Fig. 5A).
To verify this interpretation, we crystallized and determined structures of the H100F and H100W CerHb mutants in   Table 2). In both cases, the aromatic side chains rotated back onto the protein surface, partially entering and blocking the tunnel from the H11 helical position (Fig. 5D). The Trp-100 and Phe-100 side chain conformations have ~100% occupancy, are well defined by electron density (Fig. 5D), and account for the decreases in rates of ligand entry and exit and the increase in geminate CO recombination observed for H100F and H100W mutants (Table 4 and Fig. 5A).
In WT CerHb, the small residue Ala-55 at E18 is located on the opposite side of the entrance to the apolar channel from His-100 at H11. Substitution of a Trp for Ala at this position also blocks ligand entry, decreases k′entry and kescape, and traps photodissociated ligands inside the protein, increasing Fgem (Table 4 and Figs. 3B and 6C) (42). Similar large decreases in k′entry and kescape and increases in Fgem occur when Ala-101 at H12 is replaced with Trp (Fig. 3B and supplemental Table S1). Combined, these mutagenesis results unambiguously locate the entrance for ligand movement into CerHb between the ends of the E and H helices at positions E18, H11, and H12.
Constructing an Apolar Distal Pocket-Pesce et al. (39, 43) suggested that extensive polar interactions between Tyr-11(B10), Gln-44(E7), Lys-47(E10), and Thr-48(E11) limit opening of the E7 gate, causing ligands to use the apolar tunnel for entry and exit. To test this idea, we examined ligand binding to a triple CerHb mutant with a completely apolar active site containing Y11F/Q44L/T48V mutations at the B10, E7, and E11 helical positions, respectively. As shown in Table 4, rates of O2 binding to and dissociation from this triple mutant increase by only 50%, and there is little change in the computed values of k′entry and kescape compared with WT CerHb. The extent of geminate recombination does increase to ~0.40 (Fig. 6A) due to a marked ~7-fold increase in the rate of covalent bond formation (kbond) compared with WT CerHb (from ~4.0 to ~30 μs⁻¹ for the mutant, see supplemental Table S4). However, when the A55W mutation is added to this triple mutant to block exit from the apolar tunnel (Fig. 6C), there are marked ~5-fold decreases in k′O2 and kO2; the fraction of geminate CO rebinding increases to 0.85, and the calculated value of k′entry decreases from ~400 to ~30 μM⁻¹ s⁻¹ (Table 4 and Fig. 6A). Thus, even when the distal pocket is made apolar, ligands still appear to enter and exit the protein through the apolar tunnel, and blocking the entrance to the tunnel markedly inhibits bimolecular binding.

1KR7). The protein backbones are represented as silver ribbons, and the heme groups are shown as blue sticks. A-C, side, top, and back views of SwMb. D-F, side, top, and back views of CerHb and its ligand tunnel. Red spheres signify Renhance ≥ 1.5; orange spheres signify 1.5 > Renhance ≥ 1.0; yellow spheres signify 1.0 > Renhance ≥ 0.5; white or gray sticks signify 0.5 > Renhance (no significant effect). Numerical values are shown in Fig. 3 (13) and can be computed from the supplemental tables.

a Complete sets of association and dissociation rate parameters and equilibrium constants for CO and O2 binding to these variants and others are given in supplemental Tables S1-S4.
b The values in parentheses in the second-to-last column are the values for k′NO, the bimolecular association rate constant for NO binding, which should approximate the computed rate of ligand entry, k′entry (see supplemental Table S1). The errors in k′entry and kescape for WT CerHb are very large due to the small value of Fgem, and this problem and the validity of the calculations are discussed under "Experimental Procedures."
We determined the crystal structure of the Y11F/Q44L/T48V/A55W mutant to examine whether the apolar substitutions cause significant changes in the tertiary structure of CerHb. Electron density maps for the active site amino acids of the quadruple mutant and Trp-55 at the tunnel entrance are shown in Fig. 6, B and C, respectively. A comparison of the active site of the quadruple mutant to that of the single A55W mutant is shown in Fig. 6D. The structures overlay remarkably well with a root mean square deviation of 0.17 Å for the Cα atoms, and the side chains of the B10, E7, and E11 amino acids have very similar conformations despite the changes in polarity (note that the root mean square deviation was 0.15 Å for the Cα atoms of WT CerHb and the quadruple mutant). The major difference is that there is more "free" space surrounding the bound O2 atoms due to the loss of the O atom of Tyr-11(B10) and the Nε or Oε atoms of Gln-44(E7). This loss of steric hindrance adjacent to the bound ligand accounts for the increase in the rate and fraction of geminate recombination after laser photolysis (Fig. 6A and Table 4). Instead of being pushed away by the larger Tyr(B10) and Gln(E7) side chains, photodissociated CO can remain near the center of the heme group and more rapidly rebinds than in WT CerHb. However, the large, ≥10-fold decrease in the rate of entry caused by the A55W mutation at E18, even when the active site is apolar, argues strongly that most ligands (≥90%) still enter CerHb through the apolar channel (k′entry values in Table 4).

Completely Blocking the Tunnel-As a final test of the map for CerHb, we attempted to completely block ligand movement through the apolar tunnel. To prove that entry into the distal pocket from the tunnel is at the B6 position, we mutated Val-7(B6) to Ala, Phe, and Trp. As shown in Fig. 7, introducing an indole ring at this position traps CO inside the distal pocket, increasing both Fgem and kgem dramatically.
At the same time, the V7W replacement markedly decreases k′entry, k′O2, and k′NO and the rate of O2 dissociation (Table 4), demonstrating that this position is on the pathway for ligand binding. This conclusion is verified by the large increase in Fgem and a decrease in k′entry when the V7W mutation is added to the apolar pocket (Y11F/Q44L/T48V) CerHb mutant (Fig. 6A and Table 4).
The V7A/L86A CerHb double mutant was constructed as a control for a completely open channel with Ala at the three key positions along the apolar tunnel, 7(B6), 55(E18), and 86(G12), respectively (WT CerHb has Ala at position 55(E18)). This double mutant has kinetic parameters very similar to those of WT CerHb (Table 4). In contrast, the V7F/A55F/L86F and V7W/A55W/L86W triple mutants show dramatic ≥10-fold decreases in k′O2, kO2, and k′entry and ≥10-fold increases in Fgem,CO, from ~0.05 for WT CerHb to 0.80 and 0.93 for the triple Phe and Trp mutants, respectively (Table 4 and Fig. 7). The computed Renhance value comparing Ala-7/Ala-55/Ala-86 versus Trp-7/Trp-55/Trp-86 is 3.1, indicating a 3-fold increase in the free energy barrier to ligand escape and entry. These data show unambiguously that the apolar tunnel in CerHb can be closed by mutagenesis, reducing the measured bimolecular rate constants for O2 and NO binding and the calculated rate for ligand entry to less than 10% of the WT values. If the triple aromatic mutations at the B6, E18, and G12 helical positions completely close the tunnel, the residual bimolecular rate of 25 μM⁻¹ s⁻¹ applies to ligand binding through alternative routes. This value suggests that in WT CerHb ≥92% of the ligand molecules enter through the apolar tunnel and only 8% by alternative routes. Interestingly, when the tunnel is blocked, the value of k′entry is only slightly smaller than the estimated rates of ligand entry into WT (His-64(E7)) and Gln-64(E7) SwMb, which are 34 and ~60 μM⁻¹ s⁻¹, respectively (13). Thus, the limiting value of 25 μM⁻¹ s⁻¹ for the triple aromatic CerHb mutants could represent ligand movement through the Gln(E7) gate, although this conclusion is speculative.

DISCUSSION
In the case of SwMb, all the positions that regulate entry and escape are located in the distal pocket or at the heme propionate-solvent interface (Figs. 3 and 4, A-C). As pointed out by Scott et al. (13) and Olson et al. (38), when the effects of Xe binding and selected mutations are examined individually, the arguments in favor of ligand movement through the E7 gate are even stronger. There are no changes in the fraction of geminate O2 recombination, the calculated bimolecular rate constants for ligand entry, or the observed association rate constants for O2 and CO binding to WT SwMb when the protein solution is pressurized with 5-10 atm of Xe. Similarly, when Trp substitutions are made in the Xe1 and Xe4 sites, there are small or no changes in k′entry, kescape, and Fgem (13, 38). The E7 gate pathway in SwMb is also supported strongly and independently by time-resolved x-ray crystallography measurements with mutants at the 29(B10) position. The lifetimes of CO in the Xe1 pockets of WT, L29F, and L29W SwMb are ~2-10 μs (19), 100-300 μs (14, 18), and 1-2 ms (16), respectively, and the double mutant L29Y/H64Q SwMb shows intermediate behavior (15). If ligands moved directly to solvent from the Xe1 site, the size of the amino acid at the B10 position should have had no effect on the lifetime for ligand escape from this position. Instead, ligands appear to move back to the distal pocket, pass between the B10 side chain and the plane of the heme, and then exit through the E7 channel.
When His-64(E7) is replaced with Trp in SwMb, the calculated bimolecular rate of entry decreases from 34 to 7 μM⁻¹ s⁻¹, implying that up to 25% of the ligand molecules may take alternative routes to the active site (13). However, this calculation assumes that the H64W substitution completely blocks the E7 channel. Recently, Birukou et al. (52) have shown that the indole side chain in the crystal structure of H64W deoxy-Mb does not enter the E7 channel but just partially blocks the entrance, occupying a position between the heme propionates. A similar "blocked" conformation is observed in the crystal structure of Trp-63(E7) β human HbCO subunits (52). In contrast, the E7 indole side chain in the crystal structure of Trp-58(E7) deoxy-α human Hb subunits is located in the distal pocket, completely closing the E7 channel and sterically restricting access to the heme iron (52). In this completely "closed" conformation, the Trp(E7) side chain causes ≥100-fold decreases in both k′NO and the calculated value of k′entry for isolated mutant α subunits (49, 52). Thus, it is possible that the percentage of ligands entering through the E7 channel in Mb is significantly larger than the 75% value estimated from the rate of entry for Trp-64(E7) MbO2.
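The pathway percentages discussed here follow from a simple ratio of entry rates. The sketch below shows that arithmetic under the stated assumption that the blocking mutation closes the main pathway completely and leaves all other routes unchanged; the function name is illustrative, and the inputs are the entry rates quoted above for WT and H64W SwMb.

```python
def pathway_fractions(k_entry_open: float, k_entry_blocked: float) -> tuple[float, float]:
    """Return (fraction via the blocked pathway, fraction via alternative routes).

    Assumes the mutation fully closes the main pathway and does not
    perturb the alternative routes.
    """
    if k_entry_open <= 0 or k_entry_blocked < 0:
        raise ValueError("rates must be positive (open) and non-negative (blocked)")
    if k_entry_blocked > k_entry_open:
        raise ValueError("blocked rate exceeds open rate; the model does not apply")
    alternative = k_entry_blocked / k_entry_open
    return 1.0 - alternative, alternative


# Entry rates quoted in the text for WT and H64W SwMb (same units for both)
main, alternative = pathway_fractions(34.0, 7.0)
print(f"E7 gate: {main:.0%}, alternative routes: {alternative:.0%}")
```

With these inputs, about 21% of ligands would use alternative routes, consistent with the "up to 25%" upper bound. If the mutation only partially blocks the gate, the residual rate overestimates the alternative flux and the true share of the main pathway is larger.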
Discrepancies with MD Simulations-The mutagenesis mapping results in Figs. 3 and 4 are not in agreement with either past or recent molecular dynamics simulations, all of which suggest multiple pathways. The latest theoretical work for both CerHb and SwMb suggests 3-5 pathways with roughly equal probabilities, one of which is the E7 channel in the case of SwMb and the apolar tunnel in CerHb. Thus, the theoretical calculations predict that ~20-33% of the ligands enter and exit through the paths identified experimentally by mutagenesis, implying that alternative routes show similar rates of entry. In contrast, the experimental data indicate that ligand movements through the E7 gate in SwMb and the apolar channel in CerHb are ~4-10-fold faster than through the alternative routes, based on the much smaller measured bimolecular rate constants when the gate or tunnel is blocked by Trp side chains. However, these differences are relatively small on a free energy of activation scale. Thus, small inaccuracies in representing chemical and structural barriers along the individual pathways could easily affect the computed results.
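To put the 4-10-fold rate differences on a free energy scale, one can use the relation ΔΔG‡ = RT ln(rate ratio). The sketch below assumes transition-state-like behavior with equal pre-exponential factors and a temperature of 298.15 K; both are assumptions for illustration and are not values taken from the paper.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol K)


def barrier_difference_kcal(rate_ratio: float, temperature_k: float = 298.15) -> float:
    """Difference in activation free energy (kcal/mol) implied by a rate ratio."""
    if rate_ratio <= 0:
        raise ValueError("rate ratio must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(rate_ratio)


for ratio in (4.0, 10.0):
    print(f"{ratio:g}-fold -> {barrier_difference_kcal(ratio):.2f} kcal/mol")
```

Under these assumptions, 4-fold and 10-fold rate differences correspond to roughly 0.8 and 1.4 kcal/mol, respectively, which illustrates how modest errors in computed barrier heights could redistribute the predicted flux among pathways.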
One key problem is representing the barrier to apolar ligand movement through clathrate-like water clusters that cover apolar patches at the protein surface. For example, well defined crystallographic water molecules surround the highly conserved Leu-89(F4) side chain in SwMb, which is apolar and protects the iron-His-93(F8) bond from being hydrated (33). We mutated this amino acid to Gly to open a hole into the Xe1 cavity and observed that a string of 3 to 4 water molecules entered this site and were connected to well defined surface waters, forming an extended chain of "ice-like" structures (33). The rates of bimolecular ligand binding and unimolecular release were relatively unaffected by this mutation, and there was little or no effect on the total fraction of O2 geminate recombination, even though the Xe1 cavity was completely open to solvent. The only major change was the loss of the slow phase for geminate rebinding, which in WT SwMb reflects movement into and then back from the Xe1 cavity (13). These structural and computational results demonstrate that water enters into the exposed Xe1 cavity in the Gly-89(F4) SwMb mutant and forms a barrier that prevents diatomic apolar ligands from entering this site.
Another example of how the hydrophobic effect can influence ligand binding is the effect of the His-100(H11) to Phe and Trp mutations on ligand binding to CerHb (Fig. 5). In most of the published structures of WT and mutant CerHb, the imidazole side chain of His-100 is pointing directly out into the solvent (see Figs. 1B and 5, B and C). However, when this polar amino acid is replaced by Phe and Trp, the hydrophobic effect pushes the more apolar phenyl and indole rings back on the surface of the protein, which partially blocks the tunnel entrance, markedly increases Fgem,CO by trapping ligands within the CerHb tunnel, and decreases both k′entry and kescape (Fig. 5 and Table 4).
A key test of the accuracy of theoretical calculations would be to examine the effects of the L89G SwMb and H100W CerHb mutations on simulated structures, starting from the WT proteins, and on ligand trajectories, carefully looking at water structure in and around the Mb Xe1 cavity near the F4 helical position and the exit aperture of the CerHb tunnel near the H11 helical position. Other valuable tests would be simulations of the effects of the F46V SwMb mutation at the CD4 helical position on movement of the His-64(E7) side chain and of the motions of the indole side chain in Trp-64(E7) Mb. The F46V mutation accelerates ligand entry and release in SwMb and causes outward movement of the distal histidine (34). The H64W mutation slows ligand entry ~4-fold, even though only the entrance to the E7 channel is blocked (13, 52). The effects of all four of these mutations on ligand entry rates are well defined experimentally and should be replicated in any molecular dynamics simulation.
Physiological Relevance-Ironically, the fastest rates of ligand uptake and release are observed when the longer apolar tunnel in CerHb is used rather than the shorter E7 gate channel in Mb. The rates of ligand binding to CerHb are close to diffusion-controlled and very similar to model heme compounds with no distal steric hindrance or to SwMb and human Hb mutants in which the distal histidine is replaced by Gly or Ala. Thus, the Nemertean mini-Hb from C. lacteus is able to take up and release O2 roughly 10 times more rapidly than mammalian Mbs and Hbs. CerHb surrounds the sea worm's brain and axons and acts as an O2 storage protein. High rates of O2 release give the worm an advantage for high neuronal activity under hypoxic conditions in the sea floor when it hunts for prey (40).
Apolar tunnels have also evolved in the truncated HbN from Mycobacterium tuberculosis (MtTrHbN) to facilitate rapid dioxygenation of NO as part of the bacterium's defense against host macrophages (58-63). In this case, there appear to be two channels that allow rapid access to the heme iron atom, a short one between the G and H helices and a longer one leading from the solvent interface at the AB and GH loops into the distal pocket (59). The rate constants for bimolecular CO and O2 binding to MtTrHbN become similar to or greater than those for CerHb when steric restrictions due to distal pocket water are removed by apolar distal pocket mutations, and the rate of NO dioxygenation of wild-type MtTrHbN-O2 is ~750 μM⁻¹ s⁻¹, which is ~10-fold greater than that for either human HbO2 or mammalian MbO2 (61). Both results indicate that, as in CerHb, ligands can move freely into the active site of MtTrHbN. Simulations have suggested that Phe-62(E15) closes the shorter tunnel when O2 is bound, but NO can still rapidly enter the protein through the longer pathway (60). These ideas are controversial (62, 63) but could be verified or refuted by mutagenesis mapping approaches similar to those shown in Figs. 3 and 4.
Conclusions-The experimental mutagenesis mapping results shown in Figs. 3 and 4 are unambiguous. Ligands enter and exit CerHb through the interior apolar tunnel at least ~90% of the time, whereas ligands enter SwMb from the opposite direction ≥75% of the time through the E7 gate. The significance of these results is 3-fold. 1) Our rational mutagenesis mapping strategy is robust and can identify interior apolar pathways. 2) Globins have evolved different routes for ligand entry, and the longer but more open apolar pathway allows more rapid rates of entry and exit. 3) The SwMb and CerHb results provide experimental benchmarks for testing the validity of molecular dynamics trajectories, one with a short pathway between the heme propionates that is gated by a polar residue and another with an internal route from the heme pocket to an opening at the opposite end of the protein.