The Peptide-Substrate-binding Domain of Collagen Prolyl 4-Hydroxylases Is a Tetratricopeptide Repeat Domain with Functional Aromatic Residues*

Collagen prolyl 4-hydroxylases catalyze the formation of 4-hydroxyproline in -X-Pro-Gly- sequences and have an essential role in collagen synthesis. The vertebrate enzymes are α2β2 tetramers in which the catalytic α-subunits contain separate peptide-substrate-binding and catalytic domains. We report on the crystal structure of the peptide-substrate-binding domain of the human type I enzyme refined at 2.3 Å resolution. It was found to belong to a family of tetratricopeptide repeat domains that are involved in many protein-protein interactions and to consist of five α-helices forming two tetratricopeptide repeat motifs plus the solvating helix. A prominent feature of its concave surface is a deep groove lined by tyrosines, a putative binding site for proline-rich peptides. Solvent-exposed side chains of three of the tyrosines have a repeat distance similar to that of a poly-L-proline type II helix. The aromatic surface ends at one of the tyrosines, where the groove curves almost 90° away from the linear arrangement of the three tyrosine side chains, possibly inducing a bent conformation in the bound peptide. This finding is consistent with previous suggestions by others that a minimal structural requirement for proline 4-hydroxylation may be a sequence

The prolyl 4-hydroxylases (P4Hs) catalyze the formation of 4-hydroxyproline by the hydroxylation of proline residues in peptide linkages. Two P4H families are known today. The collagen P4Hs (C-P4Hs), enzymes residing within the lumen of the endoplasmic reticulum, have a central role in the synthesis of all collagens, with the resulting 4-hydroxyproline residues being essential for the folding of the newly synthesized collagen polypeptide chains into triple helical molecules (1-3). The P4Hs hydroxylating the hypoxia-inducible factor are cytoplasmic and nuclear enzymes that play a key role in the response of cells to hypoxia (4-6).
All vertebrate C-P4Hs are α2β2 tetramers in which the β-subunit is identical to the enzyme and chaperone protein disulfide isomerase (PDI) (1-3). Three isoforms of the catalytic α-subunit have been characterized from human and mouse sources and shown to form with PDI [α(I)]2β2, [α(II)]2β2, and [α(III)]2β2 tetramers, the type I, II, and III C-P4Hs, respectively (1-3, 7, 8). The type I enzyme is the main form in most cells, but type II is a major form in chondrocytes, osteoblasts, endothelial cells, and some other cell types, whereas type III is expressed in many tissues but at much lower levels than the type I and type II C-P4Hs (7-10). All P4Hs require Fe2+, 2-oxoglutarate, O2, and ascorbate, and their reaction involves a stoichiometric 2-oxoglutarate decarboxylation (1-3). The C-terminal region of the C-P4H α-subunits contains four conserved residues, two histidines and one aspartate that bind the Fe2+ atom and a lysine that binds the C-5 carboxyl group of the 2-oxoglutarate (11).
The C-P4Hs act on -X-Pro-Gly- triplets in collagens and more than 20 other proteins with collagen-like sequences (1-3, 12). Tripeptides with the structure X-Pro-Gly fulfil a minimum requirement for hydroxylation, given that the residue before the proline cannot be glycine, whereas the residue after the proline must be glycine (1-3). Polytripeptides with the structure (Pro-Pro-Gly)n are good substrates, with only the prolines preceding glycine being hydroxylated and the Km values decreasing markedly with increasing chain length (1-3). The interaction of peptide substrates with the C-P4Hs is further affected by the amino acid in the X position of the X-Pro-Gly triplet to be hydroxylated, by other nearby amino acids, and by the peptide conformation, with the triple helical conformation of the collagenous peptides completely preventing hydroxylation (1-3). Poly-L-proline is an effective competitive inhibitor of type I C-P4H but only a very weak inhibitor of the type II enzyme. There are also smaller differences between these two isoenzymes in their binding properties with respect to peptide substrates, whereas the recently identified type III C-P4H appears to be intermediate between the type I and type II enzymes in this respect (1-3, 7).
The peptide-substrate-binding domain of the C-P4Hs is separate from the catalytic domain and is located between residues 140 and 245 in the 517-residue human α(I)-subunit (13). NMR studies have shown that a recombinant human C-P4H α(I) polypeptide Phe144-Ser244 forms a folded domain consisting of five α-helices and one short putative β-strand between the second and third α-helices (14); this structure is quite different from those of other proline-rich peptide binding modules, which consist mainly of β-strands (15-22). Binding of the peptide (Pro-Pro-Gly)2 to this domain caused major chemical shifts in many backbone amide resonances, with the residues showing the largest shifts being located mainly in helix α3 (14). The Kd values determined by surface plasmon resonance and isothermal titration calorimetry for the binding of several synthetic peptides to the α(I) and corresponding α(II) domains were very similar to the Km and Ki values for these peptides as substrates and inhibitors of the type I and type II C-P4H tetramers. The Kd values determined for a 4-hydroxyproline-containing peptide indicated a marked decrease in the affinity of hydroxylated peptides for the domain (14). Many characteristic features of peptide binding to the type I and type II C-P4H tetramers can thus be explained by the binding to this domain rather than to the catalytic domain (14).
We recently obtained crystals of the α(I) peptide-substrate-binding domain (23), and we have now obtained crystals that diffract to 2.3 Å resolution. The domain was found to consist of five α-helices and belong to the family of tetratricopeptide repeat (TPR) domains that are involved in many protein-protein interactions (24-27). The peptide substrates and the competitive inhibitor poly-L-proline are suggested to become bound to a groove lined by tyrosines; a critical role for several tyrosines in peptide binding was verified by site-directed mutagenesis studies.

Recombinant Protein Samples and Site-directed Mutagenesis-The recombinant peptide-substrate-binding domain covers residues 144-244 of the human C-P4H α(I)-subunit. The only cysteine in this domain, Cys150, was mutated to serine, and the domain was extended by a C-terminal tag of six histidines and preceded by a methionine (14). The recombinant domain was expressed in Escherichia coli and purified as described earlier (14, 23). Se-Met derivatization of the protein was performed by expressing it in a methionine-requiring auxotroph E. coli strain B384 (DE3) using induction for 20 h at 20°C. All tyrosine residues in the peptide-substrate-binding domain and the corresponding residues in the full-length α(I)-subunit were converted individually to alanine using a QuikChange™ site-directed mutagenesis kit (Stratagene). The sequences were verified on an automated DNA sequencer (ABI Prism 377; Applied Biosystems). Recombinant type I C-P4H tetramers were expressed in insect cells as described previously (11, 13) by coinfecting them with recombinant baculoviruses encoding the wild-type or mutant α(I)-subunits and the PDI polypeptide. The cells were harvested 72 h after infection, homogenized in a Triton X-100-containing buffer, and centrifuged (11, 13). Aliquots of the soluble proteins were analyzed by nondenaturing PAGE, and P4H activity was assayed by a method based on the hydroxylation-coupled decarboxylation of 2-oxo[1-14C]glutarate (28). Km values for (Pro-Pro-Gly)10 were determined as described (29), and IC50 values for poly-L-proline, Mr 5000-8000, were measured by adding increasing concentrations of this peptide to the enzyme reaction mixture.
Isothermal Titration Calorimetry-The ITC experiments were performed using a VP-ITC titration microcalorimeter (Microcal, Northampton, MA) as described previously (14). In the individual titrations, 5-μl doses of 1-5 mM solutions of (Pro-Pro-Gly)10 or poly-L-proline, Mr 5000-8000 (in 10 mM sodium phosphate buffer, pH 6.8), were injected into a sample cell (volume 1.4476 ml) containing 0.05-0.3 mM of the wild-type or mutant peptide-substrate-binding domain in the above buffer at an interval of 4 min with stirring at 300 rpm. The titration data obtained were corrected by subtracting the heat of dilution of the peptides from the raw data and fitted by the non-linear least squares minimization method using the one set of sites model (Levenberg-Marquardt algorithm) to determine the dissociation constant (Kd) using Origin software (Microcal).
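The equilibrium that underlies a single-site binding fit can be sketched as follows. This is only an illustration of 1:1 binding with a fixed stoichiometry of one; it is not the Origin "one set of sites" routine, which also fits the stoichiometry and the binding enthalpy. The function names and the numerical inputs are hypothetical.

```python
import math


def complex_concentration(protein_total, ligand_total, kd):
    """Equilibrium concentration of a 1:1 complex PL.

    All three arguments must share the same concentration unit.
    Solves Kd = [P][L]/[PL] with mass balance for P and L, taking the
    physically meaningful (smaller) root of the resulting quadratic.
    """
    s = protein_total + ligand_total + kd
    return (s - math.sqrt(s * s - 4.0 * protein_total * ligand_total)) / 2.0


def fraction_bound(protein_total, ligand_total, kd):
    """Fraction of the protein that carries a ligand at equilibrium."""
    return complex_concentration(protein_total, ligand_total, kd) / protein_total


if __name__ == "__main__":
    # Hypothetical inputs (arbitrary units): 2 protein, 2 ligand, Kd = 1.
    print(complex_concentration(2.0, 2.0, 1.0))
    print(fraction_bound(2.0, 2.0, 1.0))
```

In a real titration the heat of each injection is proportional to the change in complex concentration between successive injections, which is what the fitting software compares with the measured heats.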
Crystallization and Data Collection-Crystals of the recombinant C-P4H α(I) peptide-substrate-binding domain were grown by the hanging drop vapor diffusion method as described previously (23). An addition of 10 mM urea to the crystallization drops was found to improve the quality of the crystals. Briefly, 2-μl drops of a protein solution, 10 mg/ml, in 20 mM bis-Tris, 100 mM glycine, pH 6.8, and 10 mM urea, were mixed with 2 μl of a well solution of 1.4 M ammonium phosphate, pH 8.4. The crystals were soaked for 30 s in a solution of 1.6 M ammonium phosphate, pH 8.4, and 20% glycerol as a cryoprotectant prior to flash freezing in a stream of cold (100 K) N2. All data were collected from frozen crystals.
The structure was resolved using a 2.8 Å resolution three-wavelength multiwavelength anomalous dispersion data set collected from crystals grown from the Se-Met-derivatized protein. A 2.3 Å data set was subsequently collected from a crystal presoaked briefly in a mother liquor that also contained a potential ligand peptide, Arg-Pro-Pro-Gly. No ligand was bound, however, and consequently this data set will be referred to as a native one. All the data sets were processed with XDS (30). Further data handling employed programs available in the CCP4 package (31). The data collection statistics are set out in Table I.
Structure Determination and Structural Analysis-The three-wavelength multiwavelength anomalous dispersion data set was analyzed with SOLVE (32), which resulted in finding six Se sites of approximately equal occupancy. The mean figure of merit of the calculated phase sets was 0.65 for the 2.8 Å data set. The asymmetric unit comprises two molecules, each having three methionines, with all six methionines apparently being well ordered. An initial model consisting of 120 residues could be built up using RESOLVE (32). The phases were improved by solvent flattening using DM (31), and a complete model could be built up for both molecules using O (33) and Xfit (34). Refmac (35) was used to refine this structure to a resolution of 2.8 Å. This model was further refined at 2.3 Å resolution against the native data set with weak NCS restraints, with 73 water molecules having been added. No other solvent molecules were found in the electron density maps. The refinement statistics for the native structure are summarized in Table  I. The atomic coordinates and structure factors have been deposited in the Protein Data Bank (accession code 1TJC).
The quality of the model was monitored with PROCHECK (36), and the structure was analyzed with O and ICM (37). The fold of the C-P4H α(I) peptide-substrate-binding domain was compared with known folds in the Protein Data Bank using the DALI server (38). Figs. 2-4 were made with the ICM package (37), which was also used to build a poly-L-proline helix in the type II (PPII) conformation (φ = -69°, ψ = 149°, χ1 = 27°). The secondary structure was calculated using DSSP (39).

RESULTS
Overview of the Structure-The recombinant peptide-substrate-binding domain covers residues Phe144-Ser244 of the 517-residue human C-P4H α(I)-subunit (14) (Fig. 1). The only cysteine in the domain, Cys150, was mutated to serine, and the C terminus contained a six-histidine tag. The recombinant domain produced in E. coli has three methionines: the translation initiation methionine, Met170, and Met235 (numbered according to the α(I)-subunit residues). It therefore seemed possible to resolve the structure by derivatization with Se-Met, performed by expressing the polypeptide in a medium containing this amino acid. The crystal structure of the domain was refined at 2.3 Å resolution with good refinement statistics (Table I). All the residues have φ/ψ values in the allowed regions of the Ramachandran plot. It is an all-helical protein with five α-helices running antiparallel to each other (Fig. 2). There are two molecules/asymmetric unit, with the polypeptide chain of each molecule being well defined from the N terminus until residue 237. There are no regions with high B factors, except for the seven C-terminal residues that are disordered. The two molecules of the asymmetric unit have the same structure, so that on superposition of residues 143-237 of the two molecules, for example, the root mean square difference between the corresponding Cα atoms is 0.2 Å.
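The root mean square difference quoted above is a simple function of paired atom coordinates. The sketch below assumes that the two sets of Cα coordinates have already been superposed and are listed in the same residue order; the superposition step itself is not shown, and the coordinates in the usage example are invented for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square difference between two equally long lists of
    (x, y, z) tuples, taken pairwise in the order given (in Å if the
    inputs are in Å). No superposition is performed here."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Invented coordinates: the second set is the first shifted by 0.2 Å in x.
    mol_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
    mol_b = [(x + 0.2, y, z) for (x, y, z) in mol_a]
    print(rmsd(mol_a, mol_b))
```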
The Helical Bundle of the Domain Is a 2.5-TPR Repeat-The five antiparallel α-helices create a bowl-like surface, with helices α1 and α5 at the rim, α2, α3, and α4 on the bottom, and α3 the central helix (Fig. 2). Comparison of the fold of the domain with other structures present in the Protein Data Bank, using the DALI server (39), indicated that the C-P4H α(I) peptide-substrate-binding domain has a high structural similarity to TPR domains. The TPR motif is a 34-residue repeat consisting of two antiparallel α-helices. Naturally occurring domains that are assembled from TPR motifs often comprise a number of these motifs plus an additional α-helix, referred to as the "solvating" helix (24, 25). The C-P4H domain has two TPR motifs, referred to as TPR-1 and TPR-2, plus the solvating helix (Fig. 2), which are also indicated in Fig. 1. Most characterized TPR domains have at least three motifs (25). Because the sequence similarity between TPR units is very low, it was not known that the C-P4H domain would have this structure. Nevertheless, the TPR consensus sequence consisting of a conserved pattern of small and large hydrophobic residues (25) is mostly present in both TPR-1 and TPR-2 of the C-P4H α(I) peptide-substrate-binding domain. The residues corresponding to the TPR consensus sequence Trp4-Leu7-Gly8-Tyr11-Ala20-Phe24-Ala27-Pro32 (25) [...] complexed with the C-terminal peptide of Hsp70 (27), are shown in Fig. 3. The root mean square distances between corresponding Cα atoms are 1.6 Å for the C-P4H-Pex5 superposition and 1.9 Å for the C-P4H-Hop superposition when the residues of the loop between C-P4H TPR-1 and TPR-2 are excluded from the calculations. The sequence identity between the C-P4H domain and the Pex5 and Hop domains is very low (13% in both cases). Nevertheless, there is remarkable structural similarity between the C-P4H domain and the two other TPR domains [...] (Fig. 4). Below this groove there is a polar pocket filled with a water molecule (Fig. 4A, Wat15), hydrogen bonded to the oxygen side-chain atoms of Asn227, Ser195, and Thr211.
A Possible Mode of Binding of Poly-L-proline and Proline-rich Peptide Substrates-TPR domains are known to be involved in protein-peptide and protein-protein interactions but have not been known to constitute a framework for the binding of proline-rich peptides. Well characterized proline-rich peptide-binding domains include the SH3, WW, EVH1, UEV, and GYF domains and the proline-rich peptide-binding protein profilin (15-22). Proline-rich peptides preferably fold into a poly-L-proline type II (PPII) helix conformation (18, 40-44), and proline-rich peptide ligands are usually bound in this conformation, but several different modes of binding have been observed. A unique feature of the PPII conformation is its ~2-fold axis around a carbonyl CO bond (44). Modes of binding using this property, as seen in the complexes of peptides with the SH3 and WW domains, for example, have also been referred to as recognition of the X-Pro motif. It has been found that proline-rich ligands can become bound to the same binding surface in both directions, from N to C and from C to N (16, 20, 45). In this case the binding groove is rather shallow. A different mode of binding is seen in the case of the EVH1 domain, for example, where two proline side chains of a Pro-X-X-Pro motif bind in a cleft lined with aromatic residues (17, 18). A common feature of all characterized binding sites for proline-rich peptides is the presence of solvent-exposed aromatic residues. The prolines and aromatic residues can interact with each other in several ways, for example through van der Waals stacking interactions or CH···aromatic hydrogen bonding interactions, or both (46).
The aromatic patch on the inner, concave surface of the C-P4H α(I) peptide-substrate-binding domain is dominated by the three parallel, solvent-exposed side chains of Tyr233, Tyr230, and Tyr196 (Figs. 2 and 4). This aromatic surface ends abruptly at Tyr158 (Fig. 4), where the deep groove curves almost 90° away from the linear arrangement of the three tyrosine side chains. The repeat distance of these tyrosine side chains, ~8.5 Å (see above), is similar to the repeat distance of the Pro-X-X-Pro-X-X-Pro-X-X motif of a PPII helix, which is ~9.1 Å for the Cγ atoms. Rigid docking of a PPII helix onto the aromatic groove indicates that the proline side chains of the Pro-X-X repeat can stack on the Tyr233, Tyr230, and Tyr196 side chains of the C-P4H domain when docked in the same N to C direction as seen for the peptide ligands bound to Pex5 and Hop (Fig. 3). The proposed mode of binding is supported by previous data on the role of Tyr233 in the inhibition of the C-P4H activity by poly-L-proline (13). This peptide is an effective competitive inhibitor of type I C-P4H, which contains Tyr233, but not of the type II or type III C-P4H or of the main Caenorhabditis elegans C-P4H form, in which the tyrosine is replaced by a non-aromatic residue (1-3, 13). Replacement of Tyr233 in the human C-P4H α(I)-subunit by a glutamine, a residue present in the corresponding position in the human α(II)-subunit, has been shown to lead to a C-P4H mutant with a markedly increased Ki for poly-L-proline, whereas mutation of the glutamine present in the α(II)-subunit to tyrosine led to a type II C-P4H mutant with a distinctly decreased Ki for poly-L-proline (13). In the predicted mode of binding to the C-P4H α(I) peptide-substrate-binding domain, the proline-rich ligand also interacts with Asn227, Arg223, and Asp192, with these residues also being highly conserved in sequence alignments.
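The match between the two repeat distances can be made concrete with a little arithmetic. The sketch below takes the ~8.5 Å tyrosine spacing and the ~9.1 Å PPII Cγ repeat from the text and asks how far each proline would sit from its tyrosine if both were equally spaced on a common line and the two arrays were centered on each other. This one-dimensional picture is an assumption made for illustration only; it is not the rigid docking procedure used in the study.

```python
TYR_REPEAT = 8.5   # Å, spacing of the tyrosine side chains (from the text)
PPII_REPEAT = 9.1  # Å, Cγ repeat of Pro-X-X in a PPII helix (from the text)


def centered_offsets(n_sites, site_spacing, ligand_spacing):
    """Signed offset (ligand position minus site position) for each of
    n_sites equally spaced sites, with both arrays centered on the same
    midpoint of a common axis (simplified 1D model)."""
    mid = (n_sites - 1) / 2.0
    return [(i - mid) * (ligand_spacing - site_spacing) for i in range(n_sites)]


if __name__ == "__main__":
    for offset in centered_offsets(3, TYR_REPEAT, PPII_REPEAT):
        print(round(offset, 2))
```

Under these assumptions the outer two prolines are displaced by about 0.6 Å in opposite directions, a small mismatch relative to the size of an aromatic ring.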
Binding Experiments with Mutant Variants of the Domain and the C-P4H Tetramer-To investigate further the roles of tyrosines, we mutated all the tyrosine residues of the peptide-substrate-binding domain to alanines and determined the apparent Kd values for the binding of the peptide substrate (Pro-Pro-Gly)10 and the inhibitor poly-L-proline by isothermal titration calorimetry. The corresponding mutations were also generated in the full-length C-P4H α(I)-subunit, and their effects on the Km values for the substrate (Pro-Pro-Gly)10 and the IC50 values for the inhibitor poly-L-proline in the enzyme reaction were analyzed.
The apparent Kd of the wild-type domain for (Pro-Pro-Gly)10 was about 40 μM, whereas the Kd values of the Y196A, Y193A, and Y230A mutants were too high to be measured by isothermal titration calorimetry (Table II). The mutants Y233A, Y199A, Y158A, and Y163A gave slightly increased Kd values, whereas Y164A gave the same value as the wild-type domain (Table II). These data suggested that the eight tyrosines can be divided into three categories in terms of their importance (indicated as Groups 1-3 in Table II), namely highly important, of intermediate importance, and not important. The Kd of the wild-type domain for poly-L-proline, Mr 5000-8000, was 2 μM, whereas the Kd values of the three Group 1 mutants ranged from 120 μM to too high to be measured. Those of the four Group 2 mutants ranged from 15 to 60 μM, and that of the only Group 3 mutant, Y164A, was only 6 μM (Table II).
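Ratios of dissociation constants such as these can be expressed as changes in binding free energy through the standard relation ΔΔG = RT ln(Kd,mutant / Kd,wild-type). The sketch below applies it to the poly-L-proline values quoted in the text. The temperature of 298.15 K is an assumption, since the temperature of the titrations is not stated in this section, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def ddg_binding(kd_mutant, kd_wild_type, temperature_k=298.15):
    """Change in binding free energy (kcal/mol) caused by a mutation.

    Positive values mean weaker binding by the mutant. Both Kd values
    must share the same unit. The default temperature is an assumption.
    """
    return R_KCAL * temperature_k * math.log(kd_mutant / kd_wild_type)


if __name__ == "__main__":
    # Poly-L-proline Kd values from the text, in μM.
    print(round(ddg_binding(6.0, 2.0), 2))    # Group 3 mutant Y164A
    print(round(ddg_binding(120.0, 2.0), 2))  # lower bound for Group 1
```

On this scale the Y164A change is well below 1 kcal/mol, whereas even the tightest-binding Group 1 mutant loses more than 2 kcal/mol.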
The mutant full-length α(I)-subunits were coexpressed with the PDI polypeptide (i.e. the β-subunit of the C-P4H α2β2 tetramer) in insect cells, and the assembly of the mutant tetramers was analyzed by non-denaturing PAGE. Each mutant α(I)-subunit formed an enzyme tetramer with the PDI polypeptide, indicating that none of the replaced tyrosines played a structural role in the α(I)-subunit to the extent that its mutation to alanine would impair tetramer assembly (Fig. 5). The Km of the wild-type C-P4H for (Pro-Pro-Gly)10 was 30 μM, whereas those of the three Group 1 mutants, Y196A, Y193A, and Y230A, ranged from 110 to 350 μM (Table II). The four Group 2 mutants, Y233A, Y199A, Y158A, and Y163A, had distinctly lower Km values, ranging from 40 to 80 μM, whereas the Group 3 mutant Y164A showed no increase in Km (Table II). The three Group 1 mutants also showed the highest IC50 values (Table II). The Vmax values of all the mutant C-P4H-I tetramers were essentially identical to that of the wild-type enzyme (data not shown).
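When IC50 values are compared between enzymes whose Km values differ, it can help to recall how the two quantities are linked for a purely competitive inhibitor under Michaelis-Menten conditions (the Cheng-Prusoff relation, IC50 = Ki (1 + [S]/Km)). The sketch below is a generic illustration of that relation. The substrate concentration used in the assays is not given in this section, so all numbers in the usage example are hypothetical, and no Ki values are implied for the mutants.

```python
def ki_from_ic50(ic50, substrate_conc, km):
    """Inhibition constant of a competitive inhibitor from its IC50
    (Cheng-Prusoff relation). substrate_conc and km share one unit;
    the result has the unit of ic50."""
    return ic50 / (1.0 + substrate_conc / km)


def ic50_from_ki(ki, substrate_conc, km):
    """Inverse of ki_from_ic50 for the same competitive model."""
    return ki * (1.0 + substrate_conc / km)


if __name__ == "__main__":
    # Hypothetical numbers: the same Ki gives a lower IC50 when Km is
    # higher, because the substrate competes less effectively.
    print(ic50_from_ki(5.0, 30.0, 30.0))
    print(ic50_from_ki(5.0, 30.0, 300.0))
```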

DISCUSSION
All attempts to crystallize a vertebrate C-P4H α2β2 tetramer or its β-subunit, i.e. the PDI polypeptide, have so far been unsuccessful, but the structures of three of the four PDI domains have been resolved by NMR and shown to have the thioredoxin fold (47-49). The full-length catalytic α-subunit is not amenable to structural studies as such, as it is totally insoluble and non-functional unless assembled with PDI (1-3). Our data on the peptide-substrate-binding domain provide the very first structural information on the catalytic α-subunit of any C-P4H. Structural information on this subunit could potentially have a major impact on the rational design of inhibitors for the treatment of fibrotic diseases caused by excessive collagen accumulation, as they constitute a major problem in medicine and inhibition of C-P4Hs is regarded as a particularly suitable approach for antifibrotic therapy.
The crystal structure of the C-P4H α(I) peptide-substrate-binding domain reported here shows that it is an entirely helical protein consisting of five antiparallel α-helices. Previous NMR characterization of the domain likewise indicated that it consists of five α-helices and one short putative β-strand between helices α2 and α3 (14). The crystal structure shows, however, that residues 179-186 form a loop instead of a β-strand (Fig. 2). The lengths of helices α2, α3, and α4 are identical to those determined by NMR (14), whereas α1 starts one residue earlier, from Ala147. The N terminus of α5 could not be determined by NMR because of missing assignments, but based on the predicted secondary structure it was expected to start from His221 (14). The crystal structure shows, however, that it starts one residue later, from Gln222. Our structural data indicate that the C-P4H α(I) peptide-substrate-binding domain belongs to the family of TPR domains, consisting of two TPR motifs plus the solvating helix. The most prominent surface feature of the domain is a deep groove on its concave side, lined with tyrosine residues. All eight tyrosines of the domain are at or near its surface, and none of them belongs to the TPR consensus sequence. At least one tyrosine, Tyr164, is far away from the proposed binding groove and completely exposed to solvent, whereas Tyr163 and Tyr199 are much closer to the other five tyrosines, Tyr233, Tyr230, Tyr196, Tyr158, and Tyr193, all of which participate directly in shaping the aromatic groove. Because the structure of the domain in itself does not indicate which surface features are important for binding of the peptide substrates and which features are important for the assembly of the domain in the full-length tetramer, it is important to note that all eight tyrosine to alanine mutant full-length α-subunits formed the α2β2 tetramer with the PDI polypeptide. Thus, none of the tyrosines was critical for tetramer assembly.
Consequently, the mutagenesis data strengthen the hypothesis that the complete groove may also be important for peptide binding in the C-P4H tetramer. This hypothesis is further strengthened by the finding that the binding properties of the domain and the tetramer were highly similar. Furthermore, the only mutant that had essentially no effect was Y164A, which concerns a solvent-exposed side chain far away from the binding groove. The combined mutational data suggest that the three Group [...]

[...] 10 and poly-L-proline, Mr 5000-8000, were determined by isothermal titration calorimetry (14). Similar values were obtained in at least two independent experiments.
b Similar Km values of the wild-type and mutant type I C-P4H for (Pro-Pro-Gly)10 and IC50 values for poly-L-proline, Mr 5000-8000, were obtained in at least three independent experiments.
c The value could not be determined accurately because of weak affinity.
FIG. 5. Non-denaturing PAGE analysis of C-P4H tetramer formation from the wild-type or mutant α(I)-subunits and the wild-type PDI polypeptide (β-subunit) expressed in insect cells by means of recombinant baculoviruses. The cells were homogenized in a buffer containing Triton X-100, and the supernatants were analyzed by nondenaturing PAGE followed by Coomassie Blue staining. Mutant enzymes are indicated by the numbers of the α(I)-subunit tyrosine residues converted to alanine and are shown in the order of importance in the binding of proline-rich peptides (Groups 1-3) defined under "Results" and in Table II. The arrow indicates the position of the α2β2 enzyme tetramer.
[...] between Tyr233 and Tyr193 (Fig. 4) in the binding of the proline-rich peptides, such that Tyr196 and Tyr193 are the most important sites for interaction. It is of interest that these two tyrosines, like Asp192 and Tyr199, protrude out of helix α3. Previous NMR studies also showed that binding of (Pro-Pro-Gly)2 had the greatest effect on the environment of the main chain of this helix (14). Of the seven tyrosines implicated as being important for peptide binding, only Tyr199 and Tyr163 do not line the aromatic groove. Because the side-chain hydroxyls of Tyr199 and Tyr230 are hydrogen-bonded to each other (Fig. 4), Tyr199 may be important for fixing Tyr230 in the right position. Tyr163 sits in a surface groove near Tyr196 and Tyr158 (Fig. 4), and consequently the Y163A mutation may induce structural rearrangements near Tyr196 and Tyr158 that weaken the peptide-protein interactions. Altogether, the mutagenesis data agree very well with the structure-based hypothesis that the proline-rich peptides become bound to the deep groove on the concave surface of the domain.
Previous analyses of conformational features of synthetic peptide substrates and their interaction with the C-P4H tetramer have suggested that a minimal structural requirement for proline 4-hydroxylation may be a sequence in the PPII helix conformation followed by a β-turn in the Pro-Gly segment (50). The PPII helix conformation may be necessary for effective interaction at the substrate binding sites, whereas the β-turn may be essential for hydroxylation at the catalytic site (50). Interestingly, the structure of the peptide-substrate-binding domain shows that its concave surface contains three parallel, solvent-exposed side chains of Tyr233, Tyr230, and Tyr196 (Fig. 4) with a repeat distance similar to that of a PPII helix. Furthermore, the aromatic surface ends at Tyr158 (Fig. 4), where the groove curves almost 90° away from the linear arrangement of the three tyrosine side chains, possibly inducing a bent conformation in the bound peptide.
Crystallographic binding studies of the domain have been initiated, but the quality of the current crystals deteriorates greatly when they are soaked with several proline-rich peptides. In this crystal form the side chains of Tyr230 and Tyr233 are involved in crystal contacts in both molecules. Although the observed crystal damage is consistent with the importance of these tyrosines for the binding of proline-rich peptides, it also shows that a further search for suitable crystallization conditions for the peptide-domain complex is required to study the interactions in detail. The present data highlight the importance and versatility of the TPR unit for the formation of peptide-protein complexes, and they also emphasize the importance of aromatic residues for the formation of binding grooves for proline-rich peptides.