Path to Collagenolysis

Collagenolysis is essential in extracellular matrix homeostasis, but its structural basis has long been shrouded in mystery. We have developed a novel docking strategy guided by paramagnetic NMR that positions a triple-helical collagen V mimic (synthesized with nitroxide spin labels) in the active site of the catalytic domain of matrix metalloproteinase-12 (MMP-12 or macrophage metalloelastase) primed for catalysis. The collagenolytically productive complex forms by utilizing seven distinct subsites that traverse the entire length of the active site. These subsites bury ∼1,080 Å² of surface area, over half of which is contributed by the trailing strand of the synthetic collagen V mimic, which also appears to ligate the catalytic zinc through the glycine carbonyl oxygen of its scissile G∼VV triplet. Notably, the middle strand also occupies the full length of the active site where it contributes extensive interfacial contacts with five subsites. This work identifies, for the first time, the productive and specific interactions of a collagen triple helix with an MMP catalytic site. The results uniquely demonstrate that the active site of the MMPs is wide enough to accommodate two strands from collagen triple helices. Paramagnetic relaxation enhancements also reveal an extensive array of encounter complexes that form over a large part of the catalytic domain. These transient complexes could facilitate the formation of collagenolytically active complexes via directional Brownian tumbling.

Collagen fibrils are bundles of staggered triple helices. Each triple helix comprises three extended strands twisted into a right-handed superhelix with hydrogen bonding between chains (9). Each collagen chain is a long series of GXY triplets in which the glycine is in the interior, X is often proline, and Y is often a stabilizing 4-hydroxyproline (Hyp). Because each chain is offset by one residue, the chains have been named leading, middle, and trailing (10). Lack of Pro and Hyp in four GXY triplets on the C-terminal side of the MMP scissile bond in interstitial collagens (types I, II, and III) was proposed to loosen the triple helix (1); this was later observed as 10-fold symmetry (11).
Protease substrates are labeled by increasing distance from the scissile peptide bond, with a prime indicating the C-terminal side: P4-P3-P2-P1∼P1′-P2′-P3′-P4′, where ∼ denotes the scissile peptide bond. We refer to the residues on the N-terminal side of the scissile bond as "unprimed" and those on the C-terminal side as "primed." The active site of MMPs is funnel-shaped with a wide end of unprimed subsites and a narrow end of primed subsites. The ∼13-Å diameter of the collagen triple helix appears too wide to fit the 6-Å-wide channel through the primed subsites, prompting the longstanding question of how the triple helix can fit into the narrow channel (12-14). It has long been assumed that the collagen triple helix must partially unwind or be destabilized prior to formation of the Michaelis complex with a collagenolytic MMP (12-18). The archetypal collagenase MMP-1 may unwind the collagen triple helix (13) or rather destabilize the helical state and stabilize the locally unwound state (19). This may occur more readily in Hyp-poor regions that are less stable and more susceptible to localized unwinding (1,14,20). The classic collagenases MMP-1, -8, and -13 are joined in collagenolytic activity by MMP-2 and -9 and membrane type MT1-MMP and MT2-MMP (21).
Like these collagenolytic MMPs, MMP-12 comprises pro-, catalytic, and hemopexin-like domains (21). The catalytic domain of MMP-12, a key physiological form (22), was reported to hydrolyze the fibril-forming collagens I, II, III, and V (23-25). We observed digestion of collagens I and V by the MMP-12 catalytic domain to be very slow. Nonetheless, prior x-ray crystallographic studies demonstrated the catalytic domain of MMP-12 to be a model system appropriate for collagenolysis (26). In turn, MMP-12 efficiently hydrolyzes a homotrimeric triple-helical peptide that mimics residues 436-450 of the α1 chain of collagen V, abbreviated α1(V)436-450 THP (27,28). This collagen V mimic was first characterized as a substrate selective for the gelatinases MMP-2 and -9 (27). This self-assembling miniprotein was used for the docking studies described below. Collagen digestion by MMPs normally requires the catalytic domain working in tandem with the hemopexin-like domain or fibronectin-like inserts (12). The MMP-12 catalytic domain and its interaction with the homotrimeric collagen V mimic provide a simplified model of triple-helical peptidase activity of collagenolytic MMPs.
The interfaces between proteins can be mapped by NMR peak shifts (chemical shift perturbations) but with the complication that conformational changes can also shift peaks (29-31). Such perturbations of NMR peaks are minimal in encounter complexes (32). NMR detection of the line broadening of one partner introduced by an unpaired electron on the other partner provides an excellent means to characterize both specific and lightly populated nonspecific modes of binding (32-35). The unpaired electron, typically introduced in a nitroxide group, has such a strong magnetic moment that its dipole-dipole broadening of the NMR peaks of neighboring protons reaches up to around 25 Å despite its steep decline proportional to 1/r⁶ (36). These ¹H NMR line broadenings are measured as paramagnetic relaxation enhancements (PREs), most reliably as transverse relaxation rate constants, Γ₂ (35). The insights from PREs into complexes of lower affinity (35) have been exploited to characterize the structures of mixtures of specific and nonspecific protein complexes with DNA (33), proteins (34), redox protein partners (32,37,38), and bilayer mimics (39,40). Incorporation of a nitroxide-containing artificial amino acid at key locations within a triple-helical peptide model of collagens I-III enabled structural capture of its transient complex with a domain of MT1-MMP (41). We adopted this latter paramagnetic NMR approach to generate numerous structural restraints between the triple-helical peptide from collagen V and the MMP-12 catalytic domain despite their moderate affinity (28). The measurement of paramagnetic NMR relaxation also presented an opportunity to look at the sparsely populated encounter complexes that have been proposed to precede the formation of catalytic complexes (37).
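The steep 1/r⁶ distance dependence described above can be illustrated with a short numerical sketch. It assumes a pure r⁻⁶ scaling with no correction for correlation time or internal motion; the function names are hypothetical and are not taken from any software used in this work.

```python
def relative_pre(r_angstrom: float, r_ref_angstrom: float) -> float:
    """Return Gamma2(r) / Gamma2(r_ref) for a pure r**-6 distance dependence."""
    if r_angstrom <= 0 or r_ref_angstrom <= 0:
        raise ValueError("distances must be positive")
    return (r_ref_angstrom / r_angstrom) ** 6


def distance_for_pre(gamma2: float, gamma2_ref: float, r_ref_angstrom: float) -> float:
    """Invert the scaling: distance at which the PRE has fallen to gamma2."""
    if gamma2 <= 0 or gamma2_ref <= 0:
        raise ValueError("PRE rates must be positive")
    return r_ref_angstrom * (gamma2_ref / gamma2) ** (1.0 / 6.0)


# Doubling the electron-proton distance from 12.5 to 25 Å weakens the PRE 64-fold.
print(relative_pre(25.0, 12.5))
```

Because the sixth root compresses errors, even a large uncertainty in Γ₂ translates into a modest uncertainty in the inferred distance, which is one reason PREs are useful as long-range restraints.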
Elucidation of various encounter complexes proved crucial to capturing experimentally, for the first time to our knowledge, the productive complex with the triple helix intact, two of its chains occupying portions of the catalytic channel, and the scissile Gly∼Val peptide bond positioned correctly for catalysis. The encounter complexes form along an extended serpentine surface that centers around the II-III loop (previously designated exosite 2 of elastin interactions (42)). This might suggest a route of guided Brownian tumbling of an MMP on collagen fibrils, which could hasten the diffusive search to position the catalytic cleft around the triple helix.
NMR Spectroscopy-The E219A-inactivated catalytic domain of MMP-12, designated MMP-12 cat, was prepared as described in a buffer of 20 mM imidazole (pH 6.6), 10 mM CaCl₂, and 20 μM ZnCl₂ (31). Table 1 explains the choice of conditions used. NMR was performed on a Bruker Avance III 800-MHz spectrometer equipped with a TCI cryoprobe. Sample temperatures were 25°C with TOAC substitution at P5 (the "unprimed substrate") and 8°C with TOAC substitution at P8′ (the "primed substrate"). These temperatures were chosen to keep the triple helix mostly assembled in each sample. In the nomenclature of protease substrates, P5 is five residues in the N-terminal direction from the scissile Gly∼Val peptide bond, whereas P8′ is eight residues in the C-terminal direction from the scissile bond. Spectra were processed using Bruker TopSpin and analyzed using CcpNmr Analysis (44). Paramagnetic NMR experiments were performed on enzyme samples at ∼300 μM, and triple-helical peptide was present at 0, 0.1, 0.33, 0.67, 1, and 1.5 molar eq. Diamagnetic controls were recorded after reduction with a 4-fold molar excess of ascorbic acid. PREs emanating from the unprimed substrate were measured on ¹⁵N-labeled MMP-12 cat using ¹⁵N heteronuclear single quantum coherence modified with two time evolution periods (45). PREs emanating from the primed substrate were measured on ²H/¹³C,¹H₃/¹⁵N-labeled enzyme using ¹⁵N heteronuclear single quantum coherence and ¹³C heteronuclear multiple quantum coherence with a new improvement that includes a Carr-Purcell-Meiboom-Gill pulse train modified to suppress ³J_HH couplings [...] 47. These unambiguous restraints were coupled with ambiguous restraints derived from the shifts of NMR peaks and scaled by the size of the shifts.
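For the measurement with two time evolution periods, Γ₂ is commonly obtained from peak intensities at two relaxation delays in the paramagnetic and diamagnetic (ascorbate-reduced) samples. The sketch below assumes the standard two-time-point expression, Γ₂ = ln[I_dia(T_b) I_para(T_a) / (I_dia(T_a) I_para(T_b))] / (T_b − T_a); the function and argument names are hypothetical, and the processing actually used may differ in detail.

```python
import math


def gamma2_two_point(i_dia_a, i_dia_b, i_para_a, i_para_b, t_a, t_b):
    """Estimate the transverse PRE rate Gamma2 (s^-1) from two delays.

    i_dia_*  : peak intensities of the diamagnetic (reduced) sample
    i_para_* : peak intensities of the paramagnetic sample
    t_a, t_b : relaxation delays in seconds, with t_b > t_a
    """
    if t_b <= t_a:
        raise ValueError("t_b must be longer than t_a")
    if min(i_dia_a, i_dia_b, i_para_a, i_para_b) <= 0:
        raise ValueError("intensities must be positive (peak not broadened away)")
    ratio = (i_dia_b * i_para_a) / (i_dia_a * i_para_b)
    return math.log(ratio) / (t_b - t_a)


# Synthetic check: R2(dia) = 20 s^-1, R2(para) = 55 s^-1, so Gamma2 = 35 s^-1.
t_a, t_b = 0.0, 0.010
dia = [math.exp(-20.0 * t) for t in (t_a, t_b)]
para = [math.exp(-55.0 * t) for t in (t_a, t_b)]
print(gamma2_two_point(dia[0], dia[1], para[0], para[1], t_a, t_b))
```

Peaks broadened beyond detection in the paramagnetic spectrum cannot be quantified this way and only give a lower bound on Γ₂.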
Rigid body docking was carried out with HADDOCK 2.1 (48) using default parameters with two exceptions: (a) interatomic interaction scaling was set to 0.5 to allow a modest amount of van der Waals violation during rigid body docking, and (b) topology and parameter files were modified to include non-interacting virtual sites ("dummy atoms") that represent the nitroxide group of TOAC in anchoring distance restraints. The homology model of the collagen V-mimetic peptide used during docking was mutated from the sequence and crystal structure of Ref. 17 and modified to include the virtual sites representing the positions occupied by the TOAC nitroxide group, determined to be 4.2 Å from the Cα atom of the modified residue in the N-Cα-C plane (49). To improve satisfaction of PRE-based distance restraints at the active site, rigid body docking was performed after removing Phe-237 through Thr-242 from the S1′ specificity loop of the solution structure of MMP-12 cat (Protein Data Bank code 2POJ) (31). These residues were replaced by virtual sites for use in distance restraints. The rigid body docking protocol (it0) of HADDOCK was used to generate a library of ∼7500 unique structural models of complexes of α1(V)436-450 THP with MMP-12 cat. HADDOCK calculations were repeated 20 times with 20% of the restraints randomly removed each time to ensure a diverse library.
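The repeated docking with 20% of the restraints randomly removed can be pictured with the following minimal sketch. It operates on a plain Python list of restraints and does not use any HADDOCK interface; the function name, the tuple layout of a restraint, and the fixed random seed are illustrative assumptions.

```python
import random


def restraint_subsets(restraints, n_runs=20, drop_fraction=0.20, seed=0):
    """Return n_runs restraint lists, each missing a random drop_fraction."""
    rng = random.Random(seed)  # fixed seed so the subsets are reproducible
    n_drop = round(drop_fraction * len(restraints))
    n_keep = len(restraints) - n_drop
    subsets = []
    for _ in range(n_runs):
        kept = rng.sample(range(len(restraints)), n_keep)
        # Preserve the original ordering of the retained restraints.
        subsets.append([restraints[i] for i in sorted(kept)])
    return subsets


# Hypothetical restraints: (enzyme residue, spin-label site, upper bound in Å).
example = [(f"res{i}", "TOAC", 15.0) for i in range(100)]
runs = restraint_subsets(example)
print(len(runs), len(runs[0]))
```

Each subset would then drive one independent docking run, so that no single restraint dominates the composition of the pooled library.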
Γ₂ values for PREs implied by the coordinates of the complexes were then back-calculated at each MMP-12 residue for each member of the library of models of complexes. Ensembles of structures agreeing well with the measured Γ₂ values were then generated using a metaheuristic algorithm. This program, q_test.py (available upon request), is small (∼500 lines of Python code) and assembles minimal ensembles relatively quickly. The program generates ensembles through random addition of models to the ensemble until there is no further reduction of the Q-factor, which quantifies agreement of the model with the measurements (37). These maximal ensembles are then stripped of the members that do not contribute significantly to satisfying the PREs (Fig. 5). The program uses the Akaike information criterion (50) to validate the retention of a structure in the ensemble as a statistically significant improvement in accounting for the PREs measured. The minimal ensemble that results was used as the seed for further rounds of addition and subtraction. After generating ∼45,000 ensembles, those structures present in at least 1% of the ensembles were clustered using a pairwise root mean square deviation (r.m.s.d.) cutoff of 5 Å (51). Clusters present in >80% of the top 1000 ensembles (ranked by Q-factor) were determined to be major species. The percent contributions of these structures to the overall ensemble were optimized using a grid search. Another round of metaheuristic ensemble building was carried out with the ratios of the major species fixed and the use of cross-validation to prevent overfitting. This chose the 20 best ensembles in terms of Q-factor. Although this suggested the encounters present, it left uncertain the amounts of each mode of binding. The occupancies of the most frequent binding modes were optimized again using a grid search to find the populations that best satisfy the PREs measured, as evident from improved Q-factor.
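The grow-then-prune logic described above can be sketched as follows. This is not q_test.py, which is not reproduced here. The sketch assumes equally weighted ensemble members, a Q-factor of the form sqrt(Σ(Γobs − Γcalc)² / ΣΓobs²), and the least-squares form of the Akaike information criterion, n·ln(RSS/n) + 2k; all function names and the fixed number of random trials are hypothetical.

```python
import math
import random


def q_factor(obs, calc):
    """Agreement between observed and back-calculated PREs (lower is better)."""
    num = sum((o - c) ** 2 for o, c in zip(obs, calc))
    return math.sqrt(num / sum(o ** 2 for o in obs))


def ensemble_average(library, members):
    """Equal-weight average of back-calculated PREs over the chosen models."""
    n_res = len(library[0])
    return [sum(library[m][i] for m in members) / len(members) for i in range(n_res)]


def aic(obs, calc, k):
    """Least-squares AIC (assumed form): n*ln(RSS/n) + 2k, k = ensemble size."""
    n = len(obs)
    rss = sum((o - c) ** 2 for o, c in zip(obs, calc))
    return n * math.log(max(rss, 1e-12) / n) + 2 * k


def build_ensemble(obs, library, seed=0, max_tries=200):
    rng = random.Random(seed)
    members = [rng.randrange(len(library))]
    best_q = q_factor(obs, ensemble_average(library, members))
    # Growth: keep a randomly drawn model only if it lowers the Q-factor.
    for _ in range(max_tries):
        cand = rng.randrange(len(library))
        if cand in members:
            continue
        q = q_factor(obs, ensemble_average(library, members + [cand]))
        if q < best_q:
            members, best_q = members + [cand], q
    # Pruning: drop any member whose removal does not raise the AIC.
    pruned = True
    while pruned and len(members) > 1:
        pruned = False
        current = aic(obs, ensemble_average(library, members), len(members))
        for m in members:
            trial = [x for x in members if x != m]
            if aic(obs, ensemble_average(library, trial), len(trial)) <= current:
                members, pruned = trial, True
                break
    return sorted(members), q_factor(obs, ensemble_average(library, members))


# Toy library of four models; the "measured" PREs are the mean of models 0 and 1.
library = [[100, 0, 0, 0], [0, 100, 0, 0], [0, 0, 100, 0], [0, 0, 0, 100]]
print(build_ensemble([50, 50, 0, 0], library))
```

In the toy case the growth stage may pick up superfluous models depending on the random order, and the AIC-based pruning removes them again, which mirrors the purpose of stripping maximal ensembles down to parsimonious ones.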
This ensemble of ensembles was visualized using the program VMD-XPLOR (52).
The S1′ specificity loop (Phe-237 to Tyr-242) of each major complex was rebuilt and subjected to energy minimization using the Amber99 force field with optimized hydroxyproline parameters (53). This was followed by 100 ps of position-restrained equilibration in a bath of explicit, flexible single point charge water molecules (54) at constant volume and temperature and then at constant pressure and temperature. Buried surface areas of the major complexes were calculated using the program NACCESS (55).
To determine whether primary structure alone accounted for the observed MMP-12 cat activity, a single-stranded peptide analog encompassing the sequence of residues 436-450 of type V collagen (Gly-Pro-Pro-Gly∼Val-Val-Gly-Glu-Gln-Gly-Glu-Gln-Gly-Pro-Pro-NH₂) was utilized (27). MMP-12 cat (17 nM) was incubated with this single-stranded peptide (730 μM) at 37°C. Reversed-phase HPLC showed a single peak with a retention time of 7.108 min, indicating that the single-stranded peptide was not cleaved by MMP-12 cat even after 24 h. Thus, primary structure was not sufficient for MMP-12 hydrolysis of the α1(V)436-450 THP, and the triple-helical conformation of the substrate greatly enhanced hydrolysis rates. These results suggest that MMP-12 can act as a "true collagenase," preferentially cleaving a specific sequence in triple-helical conformation while having little or no activity toward the sequence in a non-triple-helical context. Indeed, recent analysis of hydrolysis among thousands of linear peptide sequences indicates that the single strand of collagen V residues 436-450, apart from Val at P1′, lacks cleavage motifs favored by MMP-12 (56,57).
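The P/P′ labels used throughout follow directly from this sequence and the position of the scissile Gly∼Val bond. The short sketch below assigns the labels to the single-stranded 15-mer; the function name is hypothetical, and an ASCII apostrophe stands in for the prime.

```python
def label_subsites(residues, n_unprimed):
    """Label residues (N to C terminus) relative to the scissile bond.

    n_unprimed is the number of residues on the N-terminal side of the bond.
    """
    labels = []
    for i, res in enumerate(residues):
        if i < n_unprimed:
            labels.append((f"P{n_unprimed - i}", res))       # unprimed side
        else:
            labels.append((f"P{i - n_unprimed + 1}'", res))  # primed side
    return labels


peptide = "Gly-Pro-Pro-Gly-Val-Val-Gly-Glu-Gln-Gly-Glu-Gln-Gly-Pro-Pro".split("-")
for label, residue in label_subsites(peptide, n_unprimed=4):
    print(label, residue)
```

With four residues preceding the scissile bond, Val falls at P1′ and P2′, Glu at P4′, and Gln at P5′ and P8′ of each strand.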
Binding Evident at the Active Site-The collagen V model THP induces small shifts of amide NMR peaks of E219A-inactivated MMP-12 cat, in most cases with Δδ_HN < 0.18 ppm, that map mainly to the full length of the active site cleft (Fig. 1, A and B). Modest peak shifts also mapped to remote exosite 2 implicated in elastin digestion (42). The residue with the largest amide peak shift (Δδ_HN ≈ 0.22 ppm) is His-222, which coordinates the catalytic zinc and is positioned to respond to triple helix entry into the active site. The amide NMR peak of His-222 responds to titration with the triple-helical collagen V peptide in the intermediate to slow exchange regime with the appearance of a second NMR peak representing the bound form (Fig. 1C), suggesting a comparatively slow off-rate from the active site. (Fast, intermediate, and slow chemical exchange regimes refer to the off-rate exceeding, being similar to, or being less than, respectively, the NMR chemical shift differences between free and bound states.) This places the triple helix in the correct vicinity for hydrolysis. Amide NMR peaks of residues throughout the rest of the catalytic cleft exhibit fast-intermediate exchange in titrations because their peak shifts are smaller than those of His-222. No intermediate exchange broadening was observed outside the catalytic cleft. That the peak shifts induced in NMR spectra of MMP-12 cat by the triple-helical peptide were small (Fig. 1A) suggests minimal conformational change in the protease. Consequently, the high resolution solution structure of free MMP-12 cat (31) (Protein Data Bank code 2POJ) was used in rigid body docking calculations with a homology model of the α1(V)436-450 triple-helical peptide.
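The exchange regimes defined parenthetically above can be made concrete by converting a peak shift in ppm to an angular frequency difference and comparing it with an off-rate. The sketch below is illustrative only: the function names are hypothetical, the off-rates in the example are invented, and the 3-fold window used to call two rates "similar" is an arbitrary assumption rather than a value from this work.

```python
import math


def shift_difference_rad_per_s(delta_ppm, larmor_mhz):
    """Convert a shift difference in ppm to rad/s at the given Larmor frequency."""
    return 2.0 * math.pi * delta_ppm * larmor_mhz  # ppm x MHz gives Hz


def exchange_regime(k_off, delta_ppm, larmor_mhz, window=3.0):
    """Classify exchange by comparing k_off (s^-1) with the shift difference."""
    d_omega = shift_difference_rad_per_s(delta_ppm, larmor_mhz)
    if k_off > window * d_omega:
        return "fast"
    if k_off < d_omega / window:
        return "slow"
    return "intermediate"


# A 0.22 ppm amide proton shift at 800 MHz corresponds to 176 Hz.
for k_off in (10.0, 1000.0, 1.0e5):  # hypothetical off-rates in s^-1
    print(k_off, exchange_regime(k_off, 0.22, 800.0))
```

Because the boundary scales with the size of the shift, residues with smaller peak shifts than His-222 can appear to be in faster exchange for the same off-rate, as noted in the text.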
Intermolecular Paramagnetic Relaxation-To investigate how MMP-12 cat recognizes this triple helix from type V collagen, we sought the potentially transient solution structure of the complex by measuring long range distances between the enzyme and nitroxide spin-labeled THP substrates (Fig. 2). The α1(V) triple-helical peptide was synthesized with the unnatural amino acid TOAC that incorporates a nitroxide group into the cyclized side chain (Fig. 2A). This nitroxide harbors an unpaired electron exerting a strongly distance-dependent (r⁻⁶) line broadening effect on NMR peaks of hydrogen atoms within about 25 Å. TOAC was placed at either the P5 position five residues in the N-terminal direction from the scissile Gly∼Val peptide bond or at P8′ eight residues in the C-terminal direction from the scissile bond (Figs. 2B and 3). The triple-helical peptides, which were homotrimers with a one-residue stagger, placed TOAC with this spatial offset in the respective leading, middle, and trailing chains (Fig. 2A). The peptide with TOAC at P5 retained most of the thermal stability typical of a collagen triple helix (Fig. 3A). The TOAC substitution at P8′, however, decreased the Tm to 16°C (Fig. 3B). Consequently, NMR studies using the P8′-substituted peptide were carried out at 8°C to ensure that its triple helix remained fully folded.
The TOAC labels produce distance-dependent PREs, i.e. NMR line broadenings radiating as far as 25 Å from the nearest TOAC (Fig. 2B). The PRE patterns indicated a clear directionality of each TOAC-substituted triple-helical peptide at the catalytic cleft. Gly-105 at the unprimed end of the catalytic cleft mainly exhibits a PRE when TOAC is present at P5 (Fig. 2C). In contrast, Gly-179 at the primed end of the cleft shows the opposite behavior, with a PRE only when TOAC is at P8′ (Fig. 2C).
Paramagnetic Relaxation Implicates Remote Binding-Overshadowing the paramagnetic broadening around the active site emanating from TOAC are many additional PREs (∼200 in total) radiating widely across the enzyme from locations other than the active site. Nonspecific, remote binding of collagen V model peptides was consistently observed in titrations in the form of extensive PREs far from the active site. For example, this is obvious in the broadening of methyl NMR peaks (surveyed only with TOAC at P8′) of a patch of side chains (Leu-160, Val-162, and Ile-191) next to the β-sheet (Fig. 4) and near aforementioned exosite 2. The large number of PREs (from TOAC at either P5 or P8′) remote from the active site means that any standard docking attempting to satisfy all PREs simultaneously with a single mode of binding is inappropriate for representing the mixture present. The "minor states" seemed collectively to be more abundant than the complex with triple-helical peptide bound at the active site.
Computing Ensembles of Complexes Accounting for PREs-The mixture with remote binding motivated us to adapt the ensemble of ensembles strategy pioneered by Clore and coworkers (58). We developed a new docking protocol to identify the triple-helical peptide bound at the active site among the myriad of encounter complexes. Development of the docking protocol resulted in five steps: generation of a library of thousands of potential docked poses of α1(V)436-450 THP with MMP-12 cat, selection of a minimal ensemble of poses, identification of the major docked poses, weighting of the major complexes to match PREs, and identification of a minimal ensemble of ensembles (58) to portray aptly the mixture of complexes present. The interactions of α1(V)436-450 THP with MMP-12 cat are underdetermined in that the number of potential binding modes outstrips the hundreds of distance restraints measured. Consequently, we sought to avoid overfitting through use of our in-house algorithm used to identify parsimonious ensembles of binding poses consistent with the PREs measured (q_test.py; Fig. 5). Typically, the parsimonious ensembles comprise 6-10 members (Fig. 6A). Although the ensembles vary, the ensembles agreeing best with the PREs (low Q-factor) consistently contain the same few models of complexes of triple-helical peptide with MMP-12 (Fig. 6, B and C).
FIGURE 2. The collagen V-mimicking triple-helical peptide was synthesized with a nitroxide-containing amino acid to introduce strongly distance-dependent PREs to MMP-12 cat. A, the modified triple-helical peptide shows the nitroxide-containing TOAC residue with sticks and the leading, middle, and trailing chains in red, yellow, and blue, respectively. The radially decreasing violet color symbolizes the strong distance dependence of PRE that decreases in proportion to 1/r⁶ where r is the distance between unpaired electron and the proton monitored by NMR. B, peptide synthesis placed TOAC at either the P5 or P8′ position in the homotrimer. This introduces paramagnetic relaxation radiating up to 25 Å from each nitroxide group. The paramagnetic relaxation emanating from the TOAC at P5 is symbolized by green, and that from TOAC at P8′ is symbolized by violet. C, relaxation patterns indicate that Gly-105 at the unprimed end of the catalytic cleft is near a nitroxide at P5, whereas Gly-179 at the primed end of the cleft is near a nitroxide at P8′. The relaxation in the left panels was measured by the widely used two-point method (35,45). In the right panels, the new multipoint approach (39-41) was used. Its exponential decays confirm suppression of J coupling. Error bars indicate S.D. of spectral noise. The unpaired electron in TOAC was reduced, and its paramagnetic relaxation was removed by incubation with 6 mM ascorbate, establishing the diamagnetic reference state.
Three major poses are present in over 80% of ensembles with lowest Q-factor (Fig. 6B). These three structures collectively account for the PREs with a reasonable Q-factor of 0.37. One of these occupies the active site (Fig. 7), and the other two are major encounter complexes (Fig. 8).
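The weighting of the three major poses by grid search can be sketched as below. The sketch assumes that the populations of the three poses sum to 1, that the calculated PRE at each residue is the population-weighted mean over poses, and that the Q-factor has the form sqrt(Σ(Γobs − Γcalc)² / ΣΓobs²); the function names, the 5% grid spacing, and the toy data are hypothetical.

```python
import math


def q_factor(obs, calc):
    """Agreement between observed and calculated PREs (lower is better)."""
    num = sum((o - c) ** 2 for o, c in zip(obs, calc))
    return math.sqrt(num / sum(o ** 2 for o in obs))


def grid_search_populations(obs, pose_pres, n_steps=20):
    """Scan populations (w1, w2, w3) summing to 1 in steps of 1/n_steps."""
    best_q, best_w = None, None
    for i in range(n_steps + 1):
        for j in range(n_steps + 1 - i):
            k = n_steps - i - j
            weights = (i / n_steps, j / n_steps, k / n_steps)
            # Population-weighted PRE at each residue.
            calc = [
                sum(w * pose[r] for w, pose in zip(weights, pose_pres))
                for r in range(len(obs))
            ]
            q = q_factor(obs, calc)
            if best_q is None or q < best_q:
                best_q, best_w = q, weights
    return best_q, best_w


# Toy data: each pose broadens a different residue; populations are 50/30/20%.
poses = [[100.0, 0.0, 0.0], [0.0, 100.0, 0.0], [0.0, 0.0, 100.0]]
print(grid_search_populations([50.0, 30.0, 20.0], poses))
```

A coarse grid of this kind is cheap for three species; resolving many minor encounter poses this way would be impractical, which is consistent with treating them through ensembles instead.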
Productive Mode of Binding-Of the major modes of binding identified, the one occupying the active site was of most interest because of the proximity of the scissile bond to the catalytic zinc. Temporary removal of six residues from the S1′ loop during rigid body docking enabled the triple helix to rotate into and fill the catalytic cleft on the unprimed side, thereby satisfying the PRE-based distance restraints better. Once the S1′ specificity loop had been rebuilt, there was a degree of steric clash between the enzyme and the triple-helical peptide in the resultant structure (supplemental Movie S1). This clash was removed by performing a round of energy minimization that resulted in a small amount of distortion to both the enzyme and substrate (Fig. 9). The distortion to MMP-12 cat was limited to the C-terminal ends of the III-IV and V-B loops and part of the S1′ loop (Fig. 9, A and C). The distortions of both of these areas are well within the ranges of experimental variation among the crystallographic and solution NMR coordinates of MMP-12 cat available in the Protein Data Bank (Fig. 9A). ¹⁵N relaxation experiments have shown them to be flexible (59). Intriguingly, R1 measurements (60) showed that both these areas, which together form the upper and lower rims of the narrowest part of the active site (Fig. 9C), undergo sub-ms motions postulated to be consistent with 1-4-Å widening of the cleft. The distortions to the triple helix are minor and are mostly restricted to the scissile triplet of the trailing strand and Glu-20 of the leading and middle strands (Fig. 9, B and D).
The major pose in the active site (cluster 2 in Fig. 6B) places the carbonyl group of the scissile Gly∼Val peptide bond of the trailing strand close enough for attack by the water molecule positioned for catalysis (Fig. 7). Consequently, we consider this the first productive complex of a triple-helical substrate with an MMP protease to be determined experimentally. The subsites of MMP-12 cat occupied by the collagen V triple helix in the productive complex lie near almost all of the largest NMR peak shifts induced in MMP-12 cat (Fig. 1). After refinement of this complex in explicit water, the MMP-12 cat exhibits a small deviation of the backbone from the NMR structure of the free state with r.m.s.d. of 0.56 Å. When hosting the triple-helical peptide, the widening of the cleft via outward shift of the S1′ loop appears <1.7 Å. The deviation of the triple helix from the starting coordinates is also small with r.m.s.d. of 0.65 Å. That is, both partners remain mostly unchanged. The pattern of hydrogen bonds inferred from the coordinates of the triple helix appears unchanged by the docking procedures.
FIGURE 4. Paramagnetic NMR line broadenings introduced by triple-helical peptide with TOAC at P8′ are extreme and remote from the active site. A, i-iv, ¹³C heteronuclear multiple quantum coherence of Ile-Leu-Val-labeled, free MMP-12 cat (cyan) at 800 MHz reveals several methyl peaks broadened away by addition of α1(V) THP with TOAC on the primed side of the scissile bond (purple). Ile-180 is in the catalytic cleft. Ile-191, Val-162, and Leu-160 cluster on the β-sheet, suggesting alternative modes of binding. Each asterisk (*) symbolizes all three degenerate methyl protons. B, i-iv, concentration dependence of paramagnetic broadening. The last column was recorded after reduction with 6 mM ascorbate, which restored NMR peaks broadened by the spin-labeled triple-helical peptide.
Contrary to recent proposals that a single chain of the triple helix must be released for insertion in the catalytic cleft of the MMP (16-18), both the middle and trailing strands fit neatly and intact into the catalytic cleft with extensive subsite usage along both the upper and lower rims of the cleft. This stereospecific and productive complex has extensive interfacial buried surface area (Fig. 7) of ∼1080 Å², which is much greater than the interfaces of the other two binding poses (Fig. 8). The trailing chain contributes around half of the surface area buried in the interface of the productive complex. Its Val side chain at the P1′ position is buried in the main S1′ pocket (Fig. 7B). The Pro at P3 of the trailing chain forms extensive hydrophobic contacts with the S3 subsite formed from the N terminus and the N-terminal portion of the Met loop (Fig. 7B). The trailing chain places its Glu side chain at P4′ into the adjacent channel close enough to Thr-215 for hydrogen bonding. The trailing chain also appears to form hydrogen bonds with the S1′ loop.
The middle strand contributes 40% of interfacial surface area and fills five subsites traversing the entire length of the active site (Fig. 7B). The Hyp at position P5 of the middle strand forms both hydrophobic and polar contacts with subsite S4. Pro at P3 of the middle chain may contact the enzyme with a single atom. The Pro at P2 buries deeply into subsite S2. Both Val side chains of the G∼VV triplet of the middle strand tuck snugly underneath the S-shaped loop into subsites S1 and S2′, respectively (Fig. 7B). Finally, Gln at P5′ fits into the S3′ subsite where it appears to donate a hydrogen bond to Thr-210 of the protease. (The corresponding positions in MMP-1 and -8 are essential to collagenolysis (61, 62).) Around 10 intermolecular hydrogen bonds are suggested by the coordinates of the productive complex. The hydrogen bonding presumably promotes and specifically positions the productive binding. The middle chain forms apparent hydrogen bonds with the β-sheet. In effect, the two chains of the triple helix and the extended segment of the S1′ loop continue the network of hydrogen bonding of the five-stranded β-sheet by the transient addition of these three additional strands.
FIGURE 6. Statistics of parsimonious ensembles generated by the metaheuristic algorithm q_test.py. A, minimal ensembles generated showed a typical Gaussian distribution of between six and 10 members with an ensemble size of 8 being most frequent. B, certain clusters of structures were highly represented among the 1000 best ensembles (when ranked by Q-factor). Three clusters (clusters 1, 2, and 3) appeared in over 80% of ensembles. C, occurrence of clusters among the 1000 best minimal ensembles. 65 clusters were represented in at least 1% of ensembles of which only three (clusters 1, 2, and 3) occurred in more than 80%.
Principal Remote Encounter Complexes-The productive mode of binding accounts only for a minority share of the PREs observed (Fig. 10, top row) with Q-factor of 0.68, strongly implying interactions elsewhere. The PREs centering on the II-III loop, the β-sheet, and the N terminus require explanation. In the ensembles of complexes computed as collectively satisfying the intermolecular PREs, cluster 1, which binds on the "left side" in Fig. 8, is even a bit more frequent than the productive binding (Fig. 6B). This most frequent pose passes by multiple chemical groups with NMR peaks completely broadened by TOAC positioned at P8′ in the triple helix. In the β-sheet, these include the methyl groups of Leu-160 and Val-162 (Fig. 5) and amide groups of Leu-160 to Val-162 and Ala-195 (Fig. 10B). In the loops flanking this channel, the amide peaks of Thr-154, Ala-157, Ala-167, His-172, Ile-191, and Gly-192, joined by the Ile-191 δ-methyl peak, are also nearly completely broadened by the TOAC at P8′ (Γ₂ ≥ 100 s⁻¹ in Fig. 10B). That is, the triple helix here in cluster 1 occupies a hydrophobic surface groove running from the β-sheet to the unprimed end of the catalytic cleft (Fig. 8B). At corresponding locations in this channel, TOAC substitution instead at P5 introduces PREs with Γ₂ of 35-46 s⁻¹ only at Gly-155, Met-156, and Val-162 (Fig. 10A), suggesting only transient approaches by the P5 position of the triple helix. Cluster 3 represents the third most frequent binding pose (Fig. 6B). It binds the back of the catalytic domain over helix A near exosite 3 (Fig. 1A) most distant from the active site (Fig. 8). Combining the productive and two main encounter complexes significantly improves the accounting for PREs (Fig. 10).
Many Lesser Encounters-Nonetheless, many measured PREs remain unexplained by the three main modes of triple helix binding (Fig. 10). Among the parsimonious ensembles selected by the new software, the most frequent sizes are seven to nine members (Fig. 6A). These additional poses occur in dozens of additional clusters of lower populations (Fig. 6C). We evaluated groups of these ensembles to model the encounter complexes (58). This is consistent with a cloud of minor encounters and their dynamic variation. The mesh in Fig. 8 represents the likelihood of a volume containing atoms from the THP, in this case for at least 20% of the distribution. Weak, transient associations are generally accompanied by minimal NMR peak shifts compared with specific complexes (63), which also seems true of the triple helix interactions with MMP-12 cat with the possible exception of the peak shifts of exosite 2 (Fig. 2). Collecting the best 20 ensembles containing 6-10 binding poses each improves the Q-factor to a respectable 0.21 and predicts the majority of measured PREs. However, PREs of residues 213, 214, and 216 to TOAC at P5 (Fig. 10A) remain unexplained, perhaps suggesting a minor population of alternative peptide positioning in the active site not captured by the structural ensembles. One possibility for this minor population is the monomer fraction of this peptide present at the 25°C used (see Fig. 3A, inset). The monomer fraction may have been larger than its normal presence in triple-helical peptides (64,65). Nonetheless, MMP-12 chooses the triple-helical form for hydrolysis but not the monomer as described above.
Overall, the cloud of transient poses encircles the II-III loop (exosite 2), crosses the top of the β-sheet (Fig. 8A), and courses down the back of the catalytic domain (Fig. 8C). This appears to be a continuous, serpentine path of occupation that leads through the active site, around the S-shaped loop, over the β-sheet, and down the spine of helix A (Fig. 8).

Two Chains of a Collagen Triple Helix Fit the Active Site
Two of three chains of a triple helix occupy most of the length of the catalytic cleft of MMP-12 cat in this first experimental measurement of productive triple helix binding (Fig. 7). Dozens of distance restraints from paramagnetic NMR position two chains of the collagen V-mimetic peptide into the cleft of MMP-12. This structural snapshot of the Michaelis complex may precede a subsequent intermediate in which a single chain occupies the narrower primed subsites, perhaps after the first chain is hydrolyzed. Alternatively, appropriate conditions may enable a collagenolytic MMP to draw a single chain from a triple helix into its primed subsites, e.g. high enough temperature or both domains of a full-length collagenase working together to unwind the triple helix partly (16,17). However, the productive mode of binding observed with the triple helix intact suggests that chain separation may not be required to initiate triple helix digestion. With the active site intact (with only modest NMR peak shifts) and an intact collagen triple helix occupying it, the scissile peptide bond is near the water ligand of the catalytic zinc and poised for hydrolytic attack (Fig. 7B), but hydrolysis is prevented by the lack of the general base Glu-219. Another possible implication of the two-stranded insertion captured in this structure is that the middle strand, lying close to the catalytic zinc and water, might need only a minimal and quick reorientation to position its scissile Gly∼Val bond for attack.

Parallels at the Active Site Interface
Trailing Chain-The more superficial binding of a collagen II-based triple-helical peptide to MMP-1 in the crystal structure placed the P1-P5 residues of the trailing chain into unprimed subsites of the active site (17) in a fashion analogous to the positioning of the P1-P5 residues of the middle chain in the productive complex with MMP-12 cat (Fig. 7B). However, the trailing chain in the productive complex with MMP-12 cat additionally positions its P1-P6 residues in contact with the other rim of the unprimed side of the active site (Fig. 7B), whereas the hypothesized productive complex with MMP-1 only predicted Pro at P3 and P6 to make such contacts (17). The position of the trailing strand more deeply inserted in the MMP-12 active site (Fig. 7B) strongly resembles the positioning of a hexapeptide caught immediately after cleavage (26) with an average Cα displacement of 2.8 Å (close to the crystallographic resolution limit of 1.9 Å), consistent with the coordinates of Fig. 7B depicting productive binding.
Middle Chain-The large contribution of the middle strand from this collagen V mimic to buried surface area and to five subsites at the interface suggests this chain to be extremely important in complex formation despite not itself being primed for cleavage. The middle chain in the productive complex with MMP-12 cat is mimicked partly by the hypothetical productive model of the collagen II triple-helical peptide with MMP-1 (17,66) on the wider, unprimed side of the active site. The hypothesized productive complex places the P1-P5 residues of the middle chain in contact with the unprimed subsites (17) similarly to the productive complex with MMP-12 cat (Fig. 7B). However, the middle chain in the productive MMP-12 complex measured in solution runs deeply through the full length of the cleft. In the hypothetical model proposed for MMP-1, in contrast, the middle chain is excluded from the narrow, primed side of the cleft (17). The propeptide fragment crystallized in the active site of MMP-13 (66) is a much better mimic of the deep course of the middle chain of the productive complex of MMP-12 with triple-helical peptide. The MMP-13 propeptide chain from Arg-41 to His-48 is displaced from the middle chain in the productive MMP-12 complex by 2.25 Å on average across the Cα traces of the peptide backbones. Nonetheless, two chains occupying, in parallel, the full length of the cleft is unique among MMP complexes. The dynamic fluctuations of the S1′ specificity loop in solution (59) might allow enough widening for the two chains of the triple helix to enter the full length of the cleft. The more rigidified state of this loop in crystals at cryogenic temperatures might impede entry of two chains.
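An average Cα displacement between two bound peptides can be computed, in a minimal sketch, as the mean distance between paired Cα atoms. The sketch assumes that both complexes have already been superimposed on the enzyme and that the residue pairing between the two chains is known; neither step is shown. The function name and coordinates are hypothetical and are not taken from the structures discussed.

```python
import math

def mean_ca_displacement(trace_a, trace_b):
    """Mean distance (angstroms) between paired C-alpha atoms of two traces.

    trace_a, trace_b: equal-length lists of (x, y, z) coordinates, already in
    a common frame (superposition on the enzyme is assumed, not performed).
    This is a mean of distances, not an RMSD.
    """
    if not trace_a or len(trace_a) != len(trace_b):
        raise ValueError("traces must be non-empty and of equal length")
    total = sum(math.dist(a, b) for a, b in zip(trace_a, trace_b))
    return total / len(trace_a)

# Made-up three-residue traces: the second is the first shifted 2 angstroms in y.
trace_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
trace_b = [(0.0, 2.0, 0.0), (3.8, 2.0, 0.0), (7.6, 2.0, 0.0)]

displacement = mean_ca_displacement(trace_a, trace_b)
print(f"mean C-alpha displacement: {displacement:.2f} angstroms")
```

Because the mean of distances weights all residues equally, it is less dominated by a single outlying residue than an RMSD would be; which of the two measures underlies the values quoted in the text is not stated in this section.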

Contacts for Triple-Helical Peptidase Activity
Mutations of MMP-12 that compromise its collagen triple-helical peptidase activity the most are F185Y, G227F, T239L, and K241H (67). Phe-185 (Phe-166 in MMP-1) contributes both to subsite S2 and to maintaining the conserved hydrophobicity of the groove around the S-shaped loop. The hydroxyl group of F185Y may directly impair triple helix binding. The G227F lesion fills the S3 subsite occupied by the triple helix with the bulky phenyl moiety. The ε-amino group of Lys-241 is suggested by the coordinates of the productive complex to be capable of forming hydrogen bonds with Gln at the P5′ position of the trailing chain (Fig. 7B) and with Glu at P7′ of the leading chain. Thr-239 seems pivotal, and its lesions heavily impact catalytic turnover of triple-helical substrates. Not only does Thr-239 contribute much surface area to S1′, but it also forms hydrogen bonds with the trailing strand and is straddled by the valine pair in the scissile G∼VV triplet. The MEROPS database suggests the MMP-12 preference of collagen substrate sequence (like other MMPs) to be GPXG∼ΦXGX, where Φ is hydrophobic. The occupation of subsites by the trailing chain of the α1(V)436-450 triple-helical peptide is in excellent agreement with these sequence requirements, further supporting its positioning as the starting orientation for digestion of the triple helix.

FIGURE 10. Ensembles of structures are needed to account for the many PREs measured. Theoretical PREs (colored symbols) are compared with measured PREs (gray) introduced by a 1.5-fold excess of the triple-helical peptide with TOAC substitution either at P5 (A) or at P8′ (B). The PREs from triple-helical peptide to amide protons are marked with gray columns for measurements and squares for back-calculations from the productive model in red, from the three frequent binding poses (clusters 1-3) in blue, and the ensemble of parsimonious ensembles in green. Triangles mark PREs from P8′ to methyl groups measured (gray) or back-calculated (color). Theoretical PREs calculated from the productive mode of binding (red) explain broadening in the active site (e.g. residues 102-105 and 221-230 in A and 202-220 in B) but fail to explain other areas experiencing significant broadening (e.g. residues 140-165 in B). Addition of the two frequently sampled remote poses explains the PREs of several other areas (blue). However, an ensemble of ensembles is required to model (green) the widely observed PREs with the comparatively high quality of a Q-factor of 0.21. Error bars indicate S.D. of spectral noise.
The distances estimated from paramagnetic relaxation positioned the peptide bond of the scissile site in the collagen V triple helix, i.e. the G∼VV triplet of the trailing chain, poised for hydrolysis (Fig. 7B). However, digestion in the subsequent GE∼Q triplet was observed in the case of MMP-12 (28). This might reflect hydrolysis after the initiating hydrolysis. The Glu∼Gln peptide bond of the leading chain is only a 120° rotation and a one- to two-residue spatial translation from the Gly∼Val bond positioned for attack (Fig. 7B); it is a short corkscrew rotation apart.

Encounter Complexes and Implications
Encounter complexes have been well characterized among electron transfer proteins but deserve study broadly among other protein associations (32). Minimal ensembles reported to represent the loosely bound, transient encounter complexes observed by paramagnetic NMR in other protein associations have up to 7-10 members (38,58,68). Remote binding of the collagen V miniprotein to the β-sheet, active site, and possibly elsewhere had been suggested by its protection of these surfaces of MMP-12 cat from solvent PREs introduced by a Gd·EDTA probe (28). However, that previous approach lacked the wealth of distance measurements to orient the poses of the collagen V-derived triple helix made possible by spin labeling it.
Encounter complexes have been hypothesized to speed up the process of molecular recognition by reducing the search from three to two dimensions through the formation of weak electrostatic (58) or hydrophobic (38) interactions. The hydrophobic and conserved nature of the large serpentine path filled by transient complexes reinforces its potential to harbor "loading complexes", i.e. complexes that attract the collagen triple helix to the catalytic domain and then often to the catalytic cleft, from where it can proceed to form a hydrolytically active complex (Fig. 11A). In a physiological environment, insoluble collagen fibrils are essentially immobile. Therefore, it is the MMP that diffuses upon the fibrils in the search for productive complexes (69,70). The encounters of multiple surfaces of the MMP-12 catalytic domain with a collagen triple helix may increase the frequency of collisions. Such a process of guided tumbling could expedite the diffusional search to productive engagement, perhaps by a potential route marked by the blue arrow in Fig. 11A.
This potential route between loops over the β-sheet is utilized in other contexts. The extended sA-sB loop of tissue inhibitor of metalloproteinases-2, a physiological inhibitor of MMPs, occupies the hydrophobic portion of this path in its association with MT1-MMP, specifically reaching between the S-shaped, IV-V, and II-III loops over the β-sheet (71) (Fig. 11B). This projection just touches the edge of exosite 2 at the II-III loop. The B-type propeptide of MMP-13 was also observed reaching from the active site into the proximal portion of the corresponding groove between the S-shaped and IV loops of MMP-13 (66). Intriguingly, the predominant pose of the collagen V triple helix crosses this same groove between the S-shaped and IV-V loops of MMP-12 cat (Fig. 11). Thus, this groove appears to serve as a recognition channel for narrow protein partners of MMPs.
Exosite 2 at the II-III loop enhances affinity and catalytic efficiency for soluble elastin fragments despite being ∼20 Å from the catalytic cleft (42). This reinforces the hypothesis that this loop may be important in the molecular recognition of fibrillar substrates en route to the active site. The II-III loop is also critically important for the molecular recognition of membrane bilayers (39,40) and a receptor (72). These precedents and the PREs from spin-labeled triple-helical peptides encompassing the II-III loop are consistent with a possible functional role in initial associations with the collagen triple helix.
The present study was performed utilizing an MMP catalytic domain alone. As the presence of the C-terminal hemopexin-like domain in a full-length MMP enhances affinity for (73) and catalytic turnover of (74) triple-helical collagen, it may be worth considering whether one potential role of the hemopexin-like domain could be to attract and stabilize the productive binding of the triple helix from among the multitude of alternative binding poses.

Conclusions
A productive complex between a collagen triple helix and an MMP catalytic domain has now been observed experimentally, which was made possible by paramagnetic NMR measurement of numerous intermolecular distances. The trailing chain of this triple helix runs deeply through the catalytic channel with positioning very similar to that reported for a linear peptide substrate (26). The accompanying path of the middle chain is similar to the path of a propeptide fragment bound to MMP-13 (66). However, the observation of two chains of a triple helix fitting the full length of the catalytic channel, with both triple helix and active site little changed in structure, is unexpected and novel. Also surprising is the wealth of remote encounter complexes detected winding loosely from the back of the catalytic domain across the β-sheet past exosite 2 and between three loops toward the active site. This serpentine cloud of orientations raises the question of its relevance to enhancing attraction for collagen fibrils and guiding diffusional reorientation of the catalytic domain on collagen. The two-stranded triple helix insertion into the active site and the convoluted array of encounter complexes observed interject new concepts for reflection and experimentation regarding the initiation of collagenolysis. The channel occupied by the most abundant encounter might prove druggable. Regardless, the productive mode of binding is now available as a structural template for rational development of inhibitors acting by molecular mimicry or by competing with binding at outlying subsites.

FIGURE 11. Remote binding to exosite 2 and a hydrophobic channel topping the β-sheet might impart a hypothetical advantage to reload another triple helix into the active site. A, proposed route of diffusion that might reload productive binding (blue arrow) after molecular recognition near exosite 2. B, tissue inhibitor of metalloproteinases-2 (magenta; Protein Data Bank code 2E2D) reaches its long sA-sB loop (magenta surface) around the S-shaped loop to fill the channel over the β-sheet to approach exosite 2. This occupies the hypothetical path of diffusion of the triple helix.