Cloning and Characterization of a Low Molecular Weight Prolyl 4-Hydroxylase from Arabidopsis thaliana

4-Hydroxyproline is found in collagens and collagen-like proteins in animals and in many glycoproteins in plants. Animal prolyl 4-hydroxylases (P4Hs) have been cloned and characterized from many sources, but no plant P4H has been cloned so far. We report here that the genome of Arabidopsis thaliana encodes six P4H-like polypeptides, one of which, a 283-residue soluble monomer, was cloned and characterized here as a recombinant protein. Catalytically critical residues identified in animal P4Hs are conserved in this P4H, and their mutagenesis led to complete or almost complete inactivation. The recombinant P4H effectively hydroxylated poly(L-proline) and many synthetic peptides corresponding to proline-rich repeats present in plant glycoproteins and other proteins. Surprisingly, collagen-like peptides were also good substrates, the Vmax with (Pro-Pro-Gly)10 being similar to that with poly(L-proline). The enzyme acted in this peptide preferentially on prolines in Y positions in the X-Y-Gly triplets. Correspondingly, (Gly-Pro-4Hyp)5 and (Pro-Ala-Gly)5 were poor substrates, with Vmax values less than 5 and 20% of that obtained with (Pro-Pro-Gly)10, respectively, the Km for (Pro-Ala-Gly)5 also being high. Peptides representing the N- and C-terminal hydroxylation sites present in hypoxia-inducible transcription factor α also served as substrates. As these peptides contain only one proline residue, a poly(L-proline) type II conformation was clearly not required for hydroxylation.


4-Hydroxyproline is found in collagens, elastin, and more than 15 additional proteins with collagen-like domains in animal tissues (1-3). Its formation is catalyzed by prolyl 4-hydroxylases (P4Hs), which act within the lumen of the endoplasmic reticulum and hydroxylate -X-Pro-Gly- sequences. The reaction requires Fe2+, 2-oxoglutarate, O2, and ascorbate and involves an oxidative decarboxylation of 2-oxoglutarate (for reviews, see Refs. 3 and 4).
The vertebrate enzymes are α2β2 tetramers with a molecular weight of approximately 240,000, in which the β subunit is identical to the enzyme and chaperone protein-disulfide isomerase (3,4). Two isoforms of the catalytic α subunit have been cloned and extensively characterized from several sources and shown to form [α(I)2]β2 and [α(II)2]β2 tetramers with protein-disulfide isomerase (3-6). Animal P4Hs have also been cloned and characterized from Caenorhabditis elegans (7-11) and Drosophila melanogaster (12). In addition, a family of cytoplasmic P4Hs that hydroxylate proline in -Leu-X-X-Leu-Ala-Pro- sequences has very recently been found to play a critical role in the regulation of the hypoxia-inducible transcription factor (HIF) α (13-16).
4-Hydroxyproline is also found in many plant glycoproteins, especially in extensins, proline-rich glycoproteins, and arabinogalactan proteins (17-20). P4Hs from unicellular and multicellular green algae are 60-kDa monomers (21,22). Those from higher plants are also likely to be monomers, although the variable presence of an additional polypeptide has been reported in partially purified preparations (23,24). No plant P4H has been cloned and characterized in detail so far, however. Plant P4Hs require the same cosubstrates as the animal enzymes, but they differ from them in that they act primarily on poly(L-proline)-like sequences and may require the poly(L-proline) II helix (for a review, see Ref. 25). Very low hydroxylation rates have also been reported with random-coil but not triple-helical forms of (Pro-Pro-Gly)5 and (Pro-Pro-Gly)10, however (25). Recently, a viral P4H has been cloned and characterized from Paramecium bursaria chlorella virus-1 (PBCV-1) (26). This enzyme is a 242-amino acid monomer that resembles plant P4Hs in that it hydroxylates poly(L-proline) and (Pro-Pro-Gly)10, the latter with a much higher Km (26). The recombinant viral P4H also hydroxylates many synthetic peptides corresponding to proline-rich repeats coded by the viral genome (26).
Our sequence homology searches of the Arabidopsis thaliana genome indicated that it contains six open reading frames coding for P4H-like polypeptides. We have now cloned one of these, which encodes a 283-amino acid polypeptide. The recombinant enzyme expressed in insect cells was found to be a monomer that hydroxylated poly(L-proline) and many other proline-rich peptides. Surprisingly, it also effectively hydroxylated the collagen-like peptides (Pro-Pro-Gly)10 and (Ala-Pro-Gly)5 with Km values that are similar to those reported for animal P4Hs. Furthermore, the recombinant A. thaliana P4H resembled the animal enzymes in that it preferentially hydroxylated proline residues preceding glycines in (X-Y-Gly)n peptides.

indicated the presence of six genes encoding polypeptides of 280-332 amino acids (GenBank™ accession nos. AAC64297, AAB80790, AAF88161, NP_197391, AAF08583, and BAB02864) with similarity to the catalytic C-terminal halves of the human P4H α(I) and α(II) subunits (6,28). These amino acid sequences were aligned (29) with those of the human α(I) and α(II) subunits and the PBCV-1 P4H (26), and the cleavage sites of the signal peptides were predicted (30).

Identification of
Cloning and Expression of a Recombinant A. thaliana P4H in Insect Cells - PCR primers 5′-GCGGGATCCCTCCTTGTTACAATTGGCCTTTA-3′ and 5′-CGGGATCCTCAAGAAGTAGCTTTTTGCCTCAT-3′ were synthesized based on the gene encoding the polypeptide AAC64297 (named here At-P4H-1) and used to obtain a 783-base pair PCR product from a whole plant A. thaliana cDNA library (Stratagene). The PCR template was prepared by incubating 2 μl of the cDNA library in a 200-μl final volume in 1% Nonidet P-40, 100 μg/ml proteinase K, 1 mM EDTA, 10 mM Tris, pH 8.0, at 55°C for 45 min, followed by 10 min at 95°C. The mixture was centrifuged at 12,000 rpm for 5 min, and 10 μl of the prepared template was used in a 50-μl PCR reaction. Hot-start PCR with preincubation for 5 min at 94°C and 2 min at 72°C before the addition of 2 μl of Pfu polymerase (Promega) was used, after which 30 PCR cycles were performed as follows: denaturation for 1 min at 94°C, annealing for 2 min at 65°C, and extension for 3 min at 72°C. To increase the amount of the obtained PCR product, a second PCR reaction was performed with 10 μl of 1:50 diluted first PCR reaction product as the template. The PCR cycles were as above, with the exception that the annealing temperature was 58°C. The obtained PCR fragment coding for residues Ser23-Ser283 of At-P4H-1 had BamHI restriction sites at both ends (underlined in the primer sequences) and one cytosine before the codon for Ser23, and it was cloned into a BamHI-digested baculovirus vector pACGP67-A (Invitrogen). The sequences were verified on an automated DNA sequencer (Abi Prism 377, Applied Biosystems).
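The primer design described above can be checked with a short sketch. It is only an illustration written for this text, not part of the original methods: the helper names are hypothetical, and the only inputs are the two primer sequences, the BamHI recognition sequence GGATCC, and the residue range Ser23-Ser283 quoted above. It locates the BamHI site in each primer, shows the single cytosine and the Ser codon that follow the site in the forward primer, and confirms that 261 codons correspond to the stated 783 base pairs.

```python
# Primer sequences as quoted in the text (5'->3').
FORWARD = "GCGGGATCCCTCCTTGTTACAATTGGCCTTTA"
REVERSE = "CGGGATCCTCAAGAAGTAGCTTTTTGCCTCAT"
BAMHI = "GGATCC"  # BamHI recognition sequence


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def coding_length_bp(first_residue, last_residue):
    """Number of base pairs needed to encode an inclusive residue range."""
    return (last_residue - first_residue + 1) * 3


# Forward primer: position of the BamHI site and the sequence after it.
fwd_site = FORWARD.find(BAMHI)
fwd_after_site = FORWARD[fwd_site + len(BAMHI):]

# Reverse primer, read on the coding strand: sequence before the BamHI site.
rev_sense = reverse_complement(REVERSE)
rev_site = rev_sense.find(BAMHI)
rev_before_site = rev_sense[:rev_site]

print("forward primer, after BamHI site:", fwd_after_site[:1], fwd_after_site[1:4])
print("reverse primer, coding strand before BamHI site:", rev_before_site)
print("coding length for Ser23-Ser283 (bp):", coding_length_bp(23, 283))
```

Read on the coding strand, the reverse primer ends with a TCT serine codon followed by a TGA stop codon immediately before the BamHI site, which agrees with a fragment ending at Ser283.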
The recombinant vector was cotransfected into Spodoptera frugiperda Sf9 cells with BaculoGold DNA (PharMingen) by calcium phosphate transfection, and the recombinant viruses were amplified (31). Sf9 or High Five insect cells (Invitrogen) were cultured as monolayers in TNM-FH medium (Sigma) supplemented with 10% fetal bovine serum (BioClear) or in suspension in Sf900IISFM serum-free medium (Invitrogen). The cells were seeded at a density of 5 × 10^6 cells/100-mm plate or 1 ×

Expression of a Recombinant A. thaliana P4H in Escherichia coli - PCR primers 5′-GGAATTCCATATGTCCTTGTTACAATTGGCCTTTAT-3′ and 5′-CGGGATCCTCAAGAAGTAGCTTTTTGCCTCAT-3′ were used to amplify the At-P4H-1 cDNA without the signal sequence and with flanking NdeI and BamHI sites as above, and the product was cloned into a NdeI-BamHI-digested expression vector pET15b (Novagen).
The expression plasmid was transformed into the E. coli BL21(DE3) strain (Novagen). The cells were grown at 37°C to an optical density of 0.5 at 600 nm and incubated at 30°C for 30 min, and expression was induced with 1 mM isopropyl-1-thio-β-D-galactopyranoside. The cells were harvested 3 h after induction, suspended in 0.1 volume of 50 mM Tris-HCl, pH 8.0, with or without 0.1% Triton X-100, sonicated, and centrifuged at 17,000 × g for 20 min, and the soluble and insoluble fractions were analyzed by 12% SDS-PAGE.
Site-directed Mutagenesis - Histidines 180 and 260 in the At-P4H-1 were converted individually to glutamate (codon GAA) and alanine (GCT), Asp182 to alanine (GCT) and glutamate (GAA), Lys270 to arginine (AGG) and alanine (GCG), Ser272 to alanine (GCT), and Arg278 to alanine (GCG) and histidine (CAC). The mutagenesis reactions were performed in the pET15b vector containing the full-length At-P4H-1 cDNA using a QuikChange™ site-directed mutagenesis kit (Stratagene). The mutant cDNAs were amplified by PCR using the primers with flanking BamHI sites (above), and the products were digested with BamHI and cloned into BamHI-digested pACGP67-A. Recombinant baculoviruses were generated and used to infect insect cells as above.
Other Assays - P4H activity was assayed at 30°C by a method based on the hydroxylation-coupled decarboxylation of 2-oxo-[1-14C]glutarate (32). Poly(L-proline) was purchased from Sigma, whereas all the other synthetic peptides were from Innovagen. All the peptides except poly(L-proline) and those representing HIFα sequences were denatured by heating to 100°C for 10 min, followed by rapid cooling before addition to the enzyme reaction mixture. In some experiments the amount of 4-hydroxyproline formed was determined by a colorimetric method in samples hydrolyzed with 6 M HCl at 120°C overnight (33), and in others the partially hydroxylated (Pro-Pro-Gly)10 peptide was purified from the reaction mixture with HiTrap Q-Sepharose (Amersham Biosciences) and reverse phase HPLC, hydrolyzed by the manual gas-phase hydrolysis method, and analyzed in an Applied Biosystems 421A amino acid analyzer. N-terminal sequencing of the purified (Pro-Pro-Gly)10 peptide was performed in an Applied Biosystems 492 Procise™ protein sequencer. Typically, approximately one third of the amount of proline and 4-hydroxyproline present in each sequencing cycle was carried over to the next cycle. The values obtained were corrected for this carryover, but they were not corrected for a significant background level, as this could not be quantitated accurately. Km values were determined as described previously (34). The molecular weight of the recombinant At-P4H-1 was analyzed by gel filtration in a calibrated HiPrep Sephacryl S-100 HR column (Amersham Biosciences).
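The carryover correction mentioned above is not specified in detail in the text. The following is a minimal sketch under an explicit assumption: a fixed fraction (one third, as stated above) of the signal observed in one sequencing cycle reappears in the next cycle, so the corrected value is the observed value minus that fraction of the previous observed value. The function name and the example numbers are hypothetical and serve only to illustrate the arithmetic; no background correction is applied, in line with the text.

```python
def correct_carry_over(observed, fraction=1.0 / 3.0):
    """Subtract the assumed carryover from the preceding cycle.

    observed: signal per sequencing cycle (arbitrary units).
    fraction: assumed share of a cycle's signal that reappears in the
              next cycle (one third, per the text; the model is an assumption).
    """
    corrected = []
    previous = 0.0  # nothing is carried into the first cycle
    for value in observed:
        corrected.append(value - fraction * previous)
        previous = value
    return corrected


# Hypothetical signals for three consecutive cycles, for illustration only.
observed_example = [9.0, 3.0, 7.0]
corrected_example = correct_carry_over(observed_example)
print(corrected_example)
```

In this made-up example the signal in the second cycle is explained entirely by carryover from the first, so its corrected value is zero.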

RESULTS
The A. thaliana Genome Encodes Several P4H-like Polypeptides - A sequence homology search indicated that the A. thaliana genome contains six open reading frames encoding polypeptides of 280-332 residues (Fig. 1) that show an identity of 21-27% to the catalytically important C-terminal regions of the human P4H α(I) and α(II) subunits (6,28). The six polypeptides are 33-81% identical to each other (Fig. 1). The two histidines and one aspartate that bind the Fe2+ atom at the catalytic site (34,35) and the lysine that binds the C-5 carboxyl group of the 2-oxoglutarate in all collagen P4Hs (34) are all conserved in the A. thaliana sequences (Fig. 1). The fifth critical residue, a histidine that is probably involved in the binding of the C-1 carboxyl group of 2-oxoglutarate to the Fe2+ atom and the decarboxylation of this cosubstrate (34), is conserved in five of the sequences, but is replaced by an arginine in the AAC64297 polypeptide. This position is also occupied by an arginine in a Drosophila P4H α subunit (12) and in the PBCV-1 viral P4H (26), suggesting that all six A. thaliana polypeptides are P4Hs. However, like the PBCV-1 enzyme, these polypeptides show no similarity to the peptide substrate binding domain that is located between residues 140 and 240 in the animal P4H α subunits (36). A noncleavable signal peptide was predicted in the AAC64297 polypeptide and cleavable ones in the AAF88161, NP_197391, and AAF08583 polypeptides, whereas no signal peptide was present in AAB80790 and BAB02864 (Fig. 1). The sequence identity of the 283-amino acid polypeptide AAC64297 to the human α(I) and α(II) subunits was highest, 25 and 27%, respectively. The cDNA encoding this polypeptide, named At-P4H-1, was cloned and recombinantly expressed. At-P4H-1 has four cysteine residues, the first being conserved in all six A. thaliana P4H-like polypeptides and the fourth, in position +3 with respect to the second Fe2+-binding histidine, also being conserved in the human P4H α subunits. Two of the cysteines are not conserved in the six A. thaliana polypeptides and are present in a region that has no homologous counterpart in the PBCV-1 P4H or the human α subunits (Fig. 1). At-P4H-1 has no potential N-glycosylation sites, whereas the other A. thaliana P4H-like polypeptides have one to four such sites.
Expression of a Recombinant A. thaliana P4H-like Polypeptide in Insect Cells and E. coli - A cDNA encoding At-P4H-1 residues Ser23-Ser283 was synthesized by PCR, cloned into the baculovirus vector pACGP67-A in frame with the GP67 signal sequence, and used to generate a recombinant virus. The cells infected with this virus were harvested 72 h after infection, homogenized in a buffer containing 0.1% Triton X-100, and centrifuged. The remaining pellet was further solubilized in 1% SDS, and the samples were analyzed by SDS-PAGE and Coomassie Blue staining (Fig. 2). Very little of the recombinant 29-kDa polypeptide was extracted with 0.1% Triton X-100 (Fig. 2, lane 1), whereas most of it remained in the insoluble fraction and could be extracted with 1% SDS (Fig. 2, lane 4). Therefore, various means of extracting the polypeptide more efficiently were tested, including sonication, the use of buffers with high salt concentrations, various detergents, and low concentrations of urea (details not shown). A buffer containing 1% Triton X-100 slightly improved the solubility (Fig. 2, lane 2), whereas a solution consisting of 50% glycerol, 0.6 M NaCl, 1% Nonidet P-40, 0.1 M glycine, 100 μM DTT, and 0.06 M Tris, pH 7.8 (37), was found to be the best solubilizing method among those tested (Fig. 2, lane 3), although even this solubilized only approximately 10% of the enzyme.
A cDNA encoding At-P4H-1 residues Ser23-Ser283 was also cloned into the pET15b E. coli expression vector with an N-terminal histidine tag. The recombinant polypeptide expressed in E. coli remained insoluble, however, and accumulated into inclusion bodies (data not shown).
The Recombinant At-P4H-1 Is a P4H That Hydroxylates Poly(L-proline) - To study whether the recombinant A. thaliana polypeptide expressed in insect cells had any P4H activity, 50 μl of the soluble fraction of the cell homogenate was assayed by a method based on the hydroxylation-coupled decarboxylation of 2-oxo-[1-14C]glutarate (32). When 0.1 mg/ml poly(L-proline) (Mr 5,000) was used as the peptide substrate, a significant amount of P4H activity was observed even in the sample homogenized in the buffer containing 0.1% Triton X-100 (typically approximately 7000 cpm over various background values of approximately 200-300 cpm), although the recombinant enzyme could not be readily detected in the soluble fraction when analyzed by Coomassie Blue-stained SDS-PAGE (Fig. 2, lane 1). A further increase in the amount of activity was observed when the polypeptide was solubilized more efficiently, the activity levels ranging up to more than 30,000 cpm (an example is shown in Table II). Gel filtration experiments in a calibrated HiPrep Sephacryl S-100 HR column showed that enzyme activity was eluted in fractions that corresponded to a molecular weight of approximately 30,000 (details not shown). As the calculated molecular weight of the recombinant At-P4H-1 without the signal peptide is 29,252, the recombinant At-P4H-1 is a monomer.
As expected, the A. thaliana P4H required Fe2+, 2-oxoglutarate, O2, and ascorbate (details not shown). The Km for Fe2+ (Table I) was close to the values reported for partially purified P4Hs from the algae Chlamydomonas reinhardtii and Volvox carteri (21,22), but approximately 40-fold higher than that of the PBCV-1 enzyme (26). The Km for 2-oxoglutarate (Table I) was between those of the algal enzymes (21,22), but approximately 6-fold higher than those of the PBCV-1 (26) and human enzymes (6,34) and approximately the same as those of lysyl hydroxylase isoenzymes 1 and 3 (37,38). The Km for ascorbate was approximately the same (Table I) as those of the algal, PBCV-1, and human P4Hs (6,21,22,26,34). The Km for poly(L-proline), Mr 5,000, was 5-10-fold lower (Table I) than for poly(L-proline), Mr 7,000, with algal enzymes (21,22), and that for poly(L-proline), Mr 20,000, was 35-fold lower than that for poly(L-proline), Mr 31,000, with an algal enzyme (21), and 2500-fold lower than that for poly(L-proline), Mr 13,000, with the viral enzyme (26). Km values ranging from 4 to 40 μM and a value of 5 μM have previously been reported for poly(L-proline), Mr 6,000 and 30,000, respectively, in the case of P4Hs from higher plants (23, 39-41).

FIG. 1. Amino acids that are identical between two of the polypeptides are shown with black backgrounds. Gaps were introduced for maximal alignment. The three Fe2+-binding residues, two histidines and one aspartate, the lysine that binds the C-5 carboxyl group of 2-oxoglutarate, and the serine and arginine in positions +2 and +8 from the lysine, respectively, are indicated by asterisks.
Histidines 180 and 260, Aspartate 182, Lysine 270, and Arginine 278 are the Catalytically Critical Residues in At-P4H-1 - A sequence homology comparison (Fig. 1) indicated that the At-P4H-1 residues His180, Asp182, and His260 correspond to the Fe2+-binding residues in the human P4H α(I) subunit, the At-P4H-1 Lys270 corresponds to the lysine that binds the C-5 carboxyl group of 2-oxoglutarate, and the At-P4H-1 Arg278 corresponds to the fifth critical residue, a histidine or an arginine depending on the species, which is probably involved in both the binding of the C-1 carboxyl group of 2-oxoglutarate to the Fe2+ atom and the decarboxylation of this cosubstrate (12,26,34). To study the function of these residues in the At-P4H-1 polypeptide, His180, Asp182, and His260 were converted individually to alanine and glutamate, Lys270 to alanine and arginine, and Arg278 to alanine and histidine. The crystal structure of cephalosporin synthase (42) has shown that, in addition to forming a salt bridge with an arginine residue (that corresponds to the lysine in P4Hs), the C-5 carboxyl group of 2-oxoglutarate is hydrogen-bonded to a serine residue in position +2 with respect to the positively charged arginine. We therefore also studied the role of Ser272 in the catalytic activity of At-P4H-1 by converting it to alanine.
The mutant At-P4H-1 polypeptides were expressed in insect cells, and the cells were harvested, homogenized, and assayed for P4H activity as above. Mutation of the Fe2+-binding residues His180, Asp182, or His260 to alanine or glutamate completely inactivated At-P4H-1 (Table II). Mutation of Lys270 to alanine or arginine also inactivated the enzyme completely, whereas mutation of Ser272 to alanine reduced the enzyme activity by 83% (Table II). Conversion of the At-P4H-1 residue Arg278 to alanine completely inactivated the enzyme, whereas its replacement with a histidine reduced the activity to approximately 26% (Table II).
The A. thaliana Enzyme Effectively Hydroxylates Collagen-like Peptides Acting Preferentially on Prolines Preceding Glycine - Highly surprisingly, the A. thaliana enzyme was found to hydroxylate denatured (Pro-Pro-Gly)10 with a Km of approximately 60 μM (Table IV), this value being similar to the Km values of 20 and 100 μM determined for human type I and type II prolyl 4-hydroxylases, respectively (6,34). The Vmax of At-P4H-1 with (Pro-Pro-Gly)10 was close to that obtained with poly(L-proline) (Table IV). The Km values for (Pro-Pro-Gly)5 and (Ala-Pro-Gly)5 were approximately 120 and 100 μM, the Vmax values with these peptides being 65 and 50% of that obtained with (Pro-Pro-Gly)10.
To study the hydroxylation pattern of the (Pro-Pro-Gly)10 peptide, it was partially hydroxylated with the recombinant At-P4H-1, purified from the reaction mixture by anion exchange chromatography and HPLC, and subjected to N-terminal sequencing. The Y position prolines in the repeating X-Pro-Gly triplets were found to be preferentially hydroxylated (Fig. 3), and the hydroxylation pattern was similar to the pattern observed with vertebrate P4Hs in that the Y position proline in the 9th triplet from the N terminus was most readily hydroxylated (43,44). The sequencing results also indicated the presence of small amounts of 4-hydroxyproline in the X positions. The values shown in Fig. 3 have been corrected for the amounts of proline and 4-hydroxyproline carried over from the previous sequencing cycles but not for a significant background, as it could not be quantitated accurately. Therefore, the true degrees of hydroxylation of proline residues in the X positions are even smaller than those shown in Fig. 3. The data obtained in three additional experiments were essentially identical to those shown.
To study the hydroxylation of proline residues in X positions further, the peptides (Gly-Pro-4Hyp)5 and (Pro-Ala-Gly)5 were tested as substrates (Table IV). The Km value for (Pro-Ala-Gly)5 was found to be 280 μM, and the Vmax values obtained with it and (Gly-Pro-4Hyp)5 were 20% and less than 5% of that obtained with (Pro-Pro-Gly)10, respectively (Table IV). Because of the low reaction rate, the Km for (Gly-Pro-4Hyp)5 could not be determined. Measurement of the amount of 4-hydroxyproline formed in a 1-ml P4H reaction mixture containing 200 μM (Pro-Ala-Gly)5 by a colorimetric method showed this amount to be approximately 30% of that formed in the (Ala-Pro-Gly)5 peptide in the same experiment (details not shown). This percentage is lower than the relative Vmax of 40% for (Pro-Ala-Gly)5 compared with (Ala-Pro-Gly)5 (Table IV), evidently because of the combined effect of the higher Km and lower Vmax.
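The combined effect of Km and Vmax noted above can be illustrated with a short calculation. This is only a sketch under stated assumptions: simple Michaelis-Menten kinetics, both peptides present at 200 μM, and product formation proportional to the initial rate. The function name is hypothetical; the Km values (280 and 100 μM) and relative Vmax values (20 and 50% of that with (Pro-Pro-Gly)10) are those quoted in the text.

```python
def michaelis_menten_rate(vmax, km, substrate):
    """Initial rate from the Michaelis-Menten equation v = Vmax*[S]/(Km + [S])."""
    return vmax * substrate / (km + substrate)


SUBSTRATE_UM = 200.0  # assumed peptide concentration for both substrates (uM)

# Relative Vmax (% of that with (Pro-Pro-Gly)10) and Km (uM) from the text.
rate_pag = michaelis_menten_rate(vmax=20.0, km=280.0, substrate=SUBSTRATE_UM)
rate_apg = michaelis_menten_rate(vmax=50.0, km=100.0, substrate=SUBSTRATE_UM)

rate_ratio = rate_pag / rate_apg  # expected relative rate at 200 uM
vmax_ratio = 20.0 / 50.0          # relative Vmax alone

print(f"relative Vmax: {vmax_ratio:.0%}")
print(f"relative rate at 200 uM: {rate_ratio:.0%}")
```

Under these assumptions the predicted relative rate at 200 μM is 25%, below the relative Vmax of 40% and in the same range as the approximately 30% measured by the colorimetric method.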

The A. thaliana Enzyme Effectively Hydroxylates Peptides Representing Transcription Factor HIFα Sequences and Containing Only One Proline Residue - The recombinant At-P4H-1 was also found to effectively hydroxylate synthetic peptides representing the two hydroxylated sequences in human transcription factor HIF-1α (13-16). The Km values for the peptides representing the N-terminal (Asp-Ala-Leu-Thr-Leu-Leu-Ala-Pro-Ala-Ala-Gly-Asp-Thr-Ile-Ile-Ser-Leu-Phe-Gly) and C-terminal (Asp-Leu-Asp-Leu-Glu-Met-Leu-Ala-Pro-Tyr-Ile-Pro-Met-Asp-Asp-Asp-Phe-Gln-Leu) hydroxylation sites in HIFα were 100 and 50 μM, respectively (Table V). The Vmax with the peptide representing the N-terminal hydroxylation site was approximately 70% of that obtained with poly(L-proline), whereas that with the peptide representing the C-terminal hydroxylation site was approximately 20% (Table V).

DISCUSSION
The data reported here indicate that the A. thaliana genome probably encodes a family of P4Hs with at least six members. The high degree of sequence similarity and the strict conservation of the residues that bind the Fe2+ atom and the C-5 carboxyl group of 2-oxoglutarate suggest that all six are P4Hs. The animal P4Hs had for a long time been assumed to be of one type only, until a second isoenzyme was cloned and characterized from mouse and human tissues (5,6), and very recently a family of three additional cytoplasmic human P4Hs has been cloned and shown to be involved in the hydroxylation of the hypoxia-inducible transcription factor HIFα (15,16). Our searches of the completed genomes of C. elegans and D. melanogaster suggest that the former may have more than 5 P4Hs and the latter more than 10, although only 4 P4Hs have so far been cloned from the former (7, 9-11, 15) and 2 from the latter (12,16). P4Hs thus appear to constitute enzyme families in both plant and animal tissues.

FIG. 3. Analysis of the hydroxylation of the proline residues in (Pro-Pro-Gly)10 by At-P4H-1. The hydroxylation reaction was carried out with 100 μg/ml (Pro-Pro-Gly)10 as the substrate in the standard P4H reaction mixture under conditions that gave a high extent of hydroxylation of the substrate but not complete hydroxylation. The peptide was purified from the reaction mixture with HiTrap Q-Sepharose and HPLC and subjected to N-terminal sequencing. The columns indicate the degree of hydroxylation of the proline residues in the X and Y positions in the X-Y-Gly triplets. The values have been corrected for the amounts of proline and 4-hydroxyproline carried over from the previous sequencing cycles but not for a significant background, as it could not be quantitated accurately. P, proline; G, glycine.

unicellular and multicellular green algae have likewise indicated that these enzymes are monomers (21,22), whereas early studies of a P4H from Phaseolus vulgaris suggested that this enzyme may have two kinds of subunit (23). Subsequent work has demonstrated, however, that the ratio of the co-purifying polypeptide to the catalytic polypeptide varies in a range well below 1:1 (24), indicating that this P4H is likewise a monomer. It thus seems that plant P4Hs, like the animal P4Hs involved in the hydroxylation of HIFα (15,16), may be monomers, whereas the vertebrate P4Hs involved in the hydroxylation of collagens are α2β2 tetramers (3,4).
Site-directed mutagenesis showed that replacement of the conserved At-P4H-1 residues His180, Asp182, and His260, which correspond to the Fe2+-binding ligands in human P4H (34), by alanine or glutamate completely inactivated the enzyme, thus demonstrating their critical role in catalytic activity. The corresponding mutations of the Fe2+-binding residues also result in complete inactivation in human type I P4H, with the exception that replacement of Asp414 with glutamate retains approximately 15% of the activity (34). These results differ from those with aspartyl (asparaginyl) β-hydroxylase, in which replacement of the Fe2+-binding histidines with a negatively charged amino acid resulted in 10-20% residual activity (45). Mutation of At-P4H-1 Lys270, which corresponds to the Lys493 that ionically binds the C-5 carboxyl group of 2-oxoglutarate in human type I P4H (34), to alanine or arginine completely inactivated the enzyme. In comparison, replacement of Lys493 in human type I P4H with arginine (34), and the corresponding Arg700 in lysyl hydroxylase 1 with lysine (46), reduced the activities to approximately 15%, whereas replacement with alanine completely inactivated the enzymes. In the case of cephalosporin synthase, the C-5 carboxyl group of 2-oxoglutarate forms a salt bridge to an arginine and a hydrogen bond to a serine in position +2 with respect to the arginine (42). The critical role of the corresponding serine in At-P4H-1 was demonstrated here by 83% inactivation of the enzyme when Ser272 was converted to alanine. An additional catalytically important positively charged residue in animal P4Hs is located in position +8 with respect to the lysine that binds the C-5 carboxyl group of 2-oxoglutarate (34). This residue is probably involved in both the binding of the C-1 carboxyl group of 2-oxoglutarate and the decarboxylation of this cosubstrate (34). Replacement of this His501 in human type I P4H with a serine completely inactivates the enzyme, whereas substitution with arginine or lysine reduces the activity to approximately 10-15% (34). Likewise, replacement of the corresponding Arg278 in At-P4H-1 with alanine completely inactivated the enzyme, whereas mutation to histidine reduced the activity to approximately 26%. The corresponding mutations Arg490 → Ser and Arg490 → His inactivated a Drosophila P4H by 70 and 10%, respectively (12).
The most distinct difference in catalytic properties between plant and animal P4Hs is found in the hydroxylation of poly(L-proline). This polypeptide is an effective substrate for all plant P4Hs studied (21-25, 39-41), whereas some animal P4Hs recognize it as an effective competitive inhibitor and some as a weak one, but none of the animal P4Hs characterized so far has used it as a substrate (3,4). Data indicating that free proline, (Pro)2, and (Pro)3 are not hydroxylated by the P4H from Vinca rosea have been interpreted as indicating that plant P4Hs may require a poly(L-proline) type II helix conformation (39,47). The recombinant viral PBCV-1 P4H likewise used poly(L-proline) as an efficient substrate (26). In the present work recombinant At-P4H-1 efficiently hydroxylated poly(L-proline), the Km values for Mr 5,000 and 20,000 poly(L-prolines) being 2 and 0.2 μM, respectively, i.e. lower than those previously reported for any algal or higher plant P4Hs.
The recombinant At-P4H-1 also efficiently hydroxylated other proline-rich peptides, the Km values for peptides representing A. thaliana arabinogalactan protein and extensin sequences varying between 10 and 40 μM. Synthetic peptides representing proline-rich sequences coded by the PBCV-1 genome were likewise efficiently hydroxylated, with Km values varying from 2 to 90 μM. Highly interestingly, the recombinant At-P4H-1 also hydroxylated peptides representing the N-terminal and C-terminal hydroxylated sequences in the human HIFα, with Km values of 100 and 50 μM, respectively. Unlike the HIF P4Hs (15), At-P4H-1 hydroxylated the peptide representing the N-terminal hydroxylation site in HIFα at a higher rate, the Vmax obtained with this peptide being approximately 3.5-fold that obtained with the peptide representing the C-terminal hydroxylation site. As only one proline residue is present in each of the HIFα peptides, these data clearly indicate that a poly(L-proline) type II helix conformation is not required for hydroxylation by At-P4H-1. A similar conclusion on sequence-specific rather than poly(L-proline) II conformation-specific hydroxylation by plant P4Hs was recently reached in a comparison of sequences containing 4-hydroxyproline present in various plant proteins (48).
A highly surprising finding was that the recombinant A. thaliana P4H also effectively hydroxylated (Pro-Pro-Gly)10, with a Km similar to those reported for vertebrate collagen P4Hs (3-6, 34, 36) and a Vmax close to that obtained with poly(L-proline). Plant P4Hs have previously been reported either not to hydroxylate (Pro-Pro-Gly)10 at all or to hydroxylate it only at a very low rate (21, 22, 39-41). Although an early study indicated that a partially purified carrot P4H hydroxylates protocollagen, a protein consisting of nonhydroxylated procollagen chains (49), subsequent efforts have been unable to validate these results (25). The viral PBCV-1 P4H that hydroxylates poly(L-proline) was also found to hydroxylate (Pro-Pro-Gly)10, but with a Km of approximately 2.9 mM, i.e. approximately 50-fold higher than the Km of 60 μM measured here for At-P4H-1. Sequencing of a (Pro-Pro-Gly)10 peptide partially hydroxylated by the recombinant At-P4H-1 showed that the enzyme had acted preferentially on prolines preceding glycine. However, unlike the animal P4Hs, At-P4H-1 was not absolutely specific for hydroxylation of only these positions, as small amounts of 4-hydroxyproline were also found in the positions following glycine. In agreement with this, (Gly-Pro-4Hyp)5 was found to serve as a substrate, although the Vmax obtained with it was less than 5% of that obtained with (Pro-Pro-Gly)10. Preferential hydroxylation of proline in the Y position of X-Y-Gly triplets was also seen in a comparison of the hydroxylation of (Ala-Pro-Gly)5 and (Pro-Ala-Gly)5, in that the Km of the latter was approximately 3-fold that of the former and its Vmax approximately 40% of that of the former.
It may be noted that (Pro-Ala-Gly)n has also been reported to be hydroxylated by a vertebrate P4H in vitro, although at a rate that is only approximately 7% of that obtained with (Ala-Pro-Gly)n (50), and the tetrapeptide Pro-Pro-Ala-Pro is likewise known to act as a substrate for vertebrate P4Hs because of hydroxylation of the proline preceding alanine (51). Interestingly, the pattern of hydroxylation of the Y position prolines in (Pro-Pro-Gly)10 with At-P4H-1 was similar to the asymmetrical hydroxylation pattern produced by the vertebrate enzymes, the 9th triplet from the N-terminal end being most readily hydroxylated (43,44), whereas no asymmetric hydroxylation was seen with the viral PBCV-1 enzyme (26).

TABLE V
Peptide    Km (μM)    Vmax (% of that with poly(L-proline))
Poly(L-proline), Mr = 5,000-10,000    2    100
Asp-Ala-Leu-Thr-Leu-Leu-Ala-Pro-Ala-Ala-Gly-Asp-Thr-Ile-Ile-Ser-Leu-Phe-Gly (a)    100    70
Asp-Leu-Asp-Leu-Glu-Met-Leu-Ala-Pro-Tyr-Ile-Pro-Met-Asp-Asp-Asp-Phe-Gln-Leu (b)    50    20
a Sequence representing the N-terminal hydroxylation site in the human transcription factor HIFα (15).
b Sequence representing the C-terminal hydroxylation site in the human transcription factor HIFα (15).