Characterization of a Second Arabidopsis thaliana Prolyl 4-Hydroxylase with Distinct Substrate Specificity

4-Hydroxyproline is found in collagens, collagen-like proteins, elastin, and the hypoxia-inducible transcription factor in animals and in many hydroxyproline-rich glycoproteins in plants. We report here on the cloning and characterization of a second plant P4H (prolyl 4-hydroxylase), At-P4H-2, from Arabidopsis thaliana. It consists of 299 amino acids and shows 33% sequence identity to the first characterized isoenzyme, At-P4H-1. A characteristic feature of the At-P4H-2 polypeptide is a 49-amino-acid C-terminal toxin homology domain with 6 cysteines that is not found in At-P4H-1 but is present in a putative rice P4H homologue. At-P4H-2 differed distinctly from At-P4H-1 in its substrate specificity. Recombinant At-P4H-2 hydroxylated poly(L-proline) and extensin-like and arabinogalactan-like peptides effectively but with much higher Km values than At-P4H-1, suggesting different roles for the two At-P4Hs in the plant cell. Unlike At-P4H-1, At-P4H-2 hydroxylated collagen-like peptides only very inefficiently and did not hydroxylate hypoxia-inducible transcription factor α-like peptides at all. All the peptides efficiently hydroxylated by At-P4H-2 had at least 3 consecutive prolines, suggesting that these may represent a minimum requirement for efficient hydroxylation by this isoenzyme. N-terminal sequencing of an extensin-like peptide SPPPVYKSPPPPVKHYSPPPV indicated that At-P4H-2 preferentially hydroxylated the 3rd proline in the C-terminal PPP triplet. The Km values of At-P4H-2 for the reaction cosubstrates Fe2+, 2-oxoglutarate, and ascorbate were similar to those of At-P4H-1 with the exception that the Km for iron was about 3-fold lower. Pyridine-2,4-dicarboxylate and pyridine-2,5-dicarboxylate, well known competitive inhibitors of the vertebrate P4Hs with respect to 2-oxoglutarate, were also competitive inhibitors of At-P4H-2 but with Ki values 5–100-fold higher than those of human type I collagen P4H. It thus seems that there are some distinct differences in the structure of the 2-oxoglutarate-binding site between At-P4H-2 and the animal collagen P4Hs.

4-Hydroxyproline is found in animal proteins almost exclusively in collagens, elastin, and more than 20 additional proteins with collagen-like sequences (for reviews, see Refs. 1-3). The formation of 4-hydroxyproline in these proteins is catalyzed by the collagen prolyl 4-hydroxylases (C-P4Hs, EC 1.14.11.2) that reside in the endoplasmic reticulum and act on proline residues in -XPG- sequences (1,2). All vertebrate C-P4Hs are α2β2 tetramers in which the catalytic sites are located in the α subunits and the β subunits are identical to the enzyme and chaperone protein disulfide isomerase (1-3). Three isoforms of the α subunit have been identified and characterized from human and other sources and shown to combine with protein disulfide isomerase to form enzyme tetramers known as the type I, II, and III C-P4Hs (4-7). The function of 4-hydroxyproline residues in collagens is to stabilize their triple-helical structure at body temperature (1,2).
Another family of cytoplasmic and nuclear P4Hs that act on the α subunit of the hypoxia-inducible transcription factor (HIF) has been cloned and characterized from vertebrates, Caenorhabditis elegans, and Drosophila melanogaster (8-10). These HIF-P4Hs play a key role in the regulation of oxygen homeostasis by hydroxylating single proline residues in LXXLAP sequences, the 4-hydroxyproline residue formed acting as a signal for the proteasomal degradation of HIFα (8-10). The Km values of the HIF-P4Hs for O2 are slightly above its atmospheric concentration, making these enzymes effective oxygen sensors (11).
4-Hydroxyproline-rich glycoproteins (HRGPs) are found ubiquitously in the extracellular matrix of plants and green algae, accounting for as much as 10-20% of the dry weight of their cell walls (for reviews, see Refs. 12 and 13). HRGPs are implicated in all aspects of plant growth and development, from fertilization to apoptosis, and in responses to stress (12). The HRGP superfamily contains three major groups of proteins: proline-rich glycoproteins, which are highly periodic and lightly arabinosylated repetitive proteins; extensins, which are also periodic; and arabinogalactan proteins, which are the most highly glycosylated and the least periodic (12,13).
P4Hs have been partially characterized from many plants (for a review, see Ref. 14), but only one plant P4H has been cloned and characterized in detail, an Arabidopsis thaliana P4H termed At-P4H-1 (15). The recombinant enzyme is a 29-kDa monomer that efficiently hydroxylates poly(L-proline) and many synthetic peptides corresponding to proline-rich repeats in plant glycoproteins and other proteins (15). Surprisingly, it also effectively hydroxylated collagen-like and HIFα-like peptides (15). A monomeric P4H has also been cloned from the Paramecium bursaria Chlorella virus-1 (PBCV-1) and has likewise been found to hydroxylate poly(L-proline), other proline-rich peptides, and collagen-like peptides (16).
All P4Hs require Fe2+, 2-oxoglutarate, O2, and ascorbate, and their reaction involves stoichiometric decarboxylation of 2-oxoglutarate (1,2). The Fe2+ atom is bound to the catalytic site by 2 histidines and 1 aspartate, whereas the C-5 carboxyl group of the 2-oxoglutarate is bound to a lysine located in position +10 with respect to the 2nd iron-binding histidine (17). The C-P4Hs have a peptide substrate-binding domain that is distinct from the C-terminal catalytic domain (18,19), but it is currently unknown whether the other P4Hs also have a separate peptide substrate-binding domain or whether their substrates become bound directly to the catalytic domain.
We report here on the cloning and characterization of a second A. thaliana P4H isoenzyme, At-P4H-2. This is a 33-kDa monomer that differs distinctly from At-P4H-1 in its substrate specificity. It was found to hydroxylate poly(L-proline) and extensin-like and arabinogalactan-like peptides but with much higher Km values than At-P4H-1, suggesting a different role in the plant cell. Unlike At-P4H-1, it hydroxylated collagen-like peptides only very inefficiently and did not hydroxylate HIFα-like peptides at all. We also determined the Km values of At-P4H-2 for the reaction cosubstrates and Ki values for two 2-oxoglutarate analogue inhibitors.

Cloning and Expression of Recombinant A. thaliana P4H-2 in Insect Cells-PCR primers 5′-GCGGCGGCCGCCGATTGCTATAATGTCGATGTCTCGGCTCGG-3′ and 5′-GCGTCTAGAGAGGAGTTTAAAGTCTTTAACAAGCTTTACAGC-3′ were synthesized based on the sequence of the gene encoding the polypeptide NP566279.1 (GenBank™, named here At-P4H-2) and used to obtain a 947-bp PCR product coding for full-length At-P4H-2 cDNA from an A. thaliana Uni-ZAP XR cDNA Library (Stratagene). The PCR product obtained had added NotI and XbaI restriction sites at its 5′ and 3′ ends, respectively (GCGGCCGC and TCTAGA in the primer sequences), and it was cloned into a NotI-XbaI-digested baculovirus vector pVL1392 (Pharmingen). The sequence was verified on an ABI Prism 377 automated DNA sequencer (Applied Biosystems).
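The primer design described above can be checked with a short script. The following is a minimal sketch, not part of the original work: it assumes that the hyphens inside the printed primer sequences are line-break artifacts, and the names FORWARD, REVERSE, and translate are introduced here for illustration only. It locates the NotI and XbaI recognition sequences, counts the nucleotides between the NotI site and the first ATG, and translates the coding part of the forward primer with a codon table restricted to the codons that occur there.

```python
# Primer sequences as printed in the text (5' to 3'), with line-break hyphens
# removed (an assumption about the extraction, not a statement by the authors).
FORWARD = "GCGGCGGCCGCCGATTGCTATAATGTCGATGTCTCGGCTCGG"
REVERSE = "GCGTCTAGAGAGGAGTTTAAAGTCTTTAACAAGCTTTACAGC"

NOTI = "GCGGCCGC"  # NotI recognition sequence
XBAI = "TCTAGA"    # XbaI recognition sequence

# Partial standard codon table: only the codons present in the forward primer.
CODONS = {"ATG": "M", "TCG": "S", "TCT": "S", "CGG": "R", "CTC": "L"}


def translate(dna):
    """Translate complete codons; stop at an incomplete or unlisted codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODONS.get(dna[i:i + 3])
        if residue is None:
            break
        protein.append(residue)
    return "".join(protein)


noti_start = FORWARD.find(NOTI)          # 0-based index of the NotI site
xbai_start = REVERSE.find(XBAI)          # 0-based index of the XbaI site
atg_start = FORWARD.find("ATG")          # first start codon in the forward primer
utr_length = atg_start - (noti_start + len(NOTI))
n_terminal = translate(FORWARD[atg_start:])

if __name__ == "__main__":
    print("NotI site at index", noti_start)
    print("XbaI site at index", xbai_start)
    print("Nucleotides between NotI site and ATG:", utr_length)
    print("Encoded N-terminal residues:", n_terminal)
```

Under these assumptions the forward primer carries 11 nucleotides between the NotI site and the start codon, and the residues it encodes agree with the beginning of the N-terminal sequence MSMSRLGLLLF obtained by protein sequencing.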
The recombinant vector was cotransfected into Spodoptera frugiperda (Sf9) insect cells with a modified Autographa californica nuclear polyhedrosis virus DNA (BaculoGold, Pharmingen) by calcium phosphate transfection, and the recombinant virus was amplified (20). Monolayer cultures of Sf9 cells in TNM-FH medium (Sigma) supplemented with 10% fetal bovine serum (BioClear) or High Five cells in Sf-900 II SFM serum-free medium (Invitrogen) were infected with a recombinant virus coding for At-P4H-2 or human ERp57 (21) at a multiplicity of 5 and a cell density of 5 × 10^6/100-mm plate. The cells were harvested 72 h after infection, washed with a solution of 0.15 M NaCl and 0.02 M phosphate, pH 7.4, homogenized in a 0.1 M NaCl, 0.1 M glycine, 2 mM CaCl2, 10 µM dithiothreitol, 0.1% Triton X-100, and 0.01 M Tris buffer, pH 7.8, and centrifuged at 10,000 × g for 20 min. The insoluble pellet was further solubilized in two consecutive steps, first in the same buffer supplemented with 1% Triton X-100 and then in a 50% glycerol, 0.6 M NaCl, 1% Nonidet P-40, 0.1 M glycine, 10 µM dithiothreitol, and 0.01 M Tris buffer, pH 7.8. The remaining pellet was solubilized in 1% SDS, and aliquots of all fractions were analyzed by 10% SDS-PAGE under reducing conditions followed by Coomassie Blue staining. The presence of At-P4H-2 in the 0.1% Triton X-100-soluble fraction was verified by N-terminal sequencing. Triton X-100 and SDS-soluble samples were transferred from SDS-PAGE to a ProBlott™ membrane (Applied Biosystems) by electroblotting and stained with Coomassie Blue. The At-P4H-2 polypeptide visible in the SDS-soluble sample and a slice of the corresponding position of the Triton X-100-soluble sample were excised from the blot and analyzed in a Procise™ 492 sequencer (Applied Biosystems).
P4H Activity Assay-P4H activity was analyzed by a method based on the hydroxylation-coupled decarboxylation of 2-oxo[1-14C]glutarate (22). The reaction was performed in a final volume of 0.5 ml, which contained 10-50 µl of the 0.1% Triton X-100-soluble fraction of insect cells expressing recombinant At-P4H-2 (the enzyme source) or ERp57 (the negative control), differing amounts of various peptide substrates (Sigma, Innovagen), 0.025 µmol of FeSO4, 0.15 µmol of 2-oxo[1-14C]glutarate (150,000 dpm), 1 µmol of ascorbate, 30 µg of catalase (Sigma), 0.05 µmol of dithiothreitol, 1 mg of bovine serum albumin (Sigma), and 25 µmol of Tris-HCl buffer, adjusted to pH 7.8 at 25°C. The enzyme reaction was carried out at 37°C for 30 min. The amounts of total cellular protein in the Triton X-100 extracts were assayed by Roti-Quant (Roth) and adjusted to be equal in the activity assay reactions. The Km and Ki values of At-P4H-2 were determined as described previously (17,22) with poly(L-proline) as a substrate. The amount of 4-hydroxyproline formed in a poly(L-proline) substrate was determined by a colorimetric method in samples hydrolyzed with 6 M HCl at 120°C overnight (23).
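For orientation, the component amounts listed for the assay can be converted to final concentrations in the 0.5-ml reaction. The sketch below is illustrative only and reads the listed amounts as micromoles, which is an assumption about the units; the names AMOUNTS_UMOL and final_concentration_um are hypothetical.

```python
# Final concentrations in the 0.5-ml assay.
# Assumption: the amounts given in the text are micromoles (µmol).
REACTION_VOLUME_ML = 0.5

AMOUNTS_UMOL = {
    "FeSO4": 0.025,
    "2-oxo[1-14C]glutarate": 0.15,
    "ascorbate": 1.0,
    "dithiothreitol": 0.05,
    "Tris-HCl": 25.0,
}


def final_concentration_um(amount_umol, volume_ml=REACTION_VOLUME_ML):
    """Return the concentration in µM (µmol per ml equals mM; 1 mM = 1000 µM)."""
    return amount_umol / volume_ml * 1000.0


CONCENTRATIONS_UM = {
    name: final_concentration_um(amount) for name, amount in AMOUNTS_UMOL.items()
}

if __name__ == "__main__":
    for name, conc in CONCENTRATIONS_UM.items():
        print(f"{name}: {conc:g} µM")
```

Under this reading the reaction contains 50 µM FeSO4, 300 µM 2-oxoglutarate, 2 mM ascorbate, 100 µM dithiothreitol, and 50 mM Tris-HCl.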
Other Assays-The hydroxylation site specificity of At-P4H-2 in the peptide SPPPVYKSPPPPVKHYSPPPV was studied by performing the enzyme incubation in the presence of Complete EDTA-free protease inhibitor cocktail (Roche Diagnostics) for 7 h and purifying the partially 4-hydroxylated peptide from the reaction mixture by reverse phase HPLC. The molecular mass of the hydroxylated peptide was analyzed with a Voyager DE-STR matrix-assisted laser desorption/ionization-time of flight mass spectrometer (Applied Biosystems), and the peptide was then subjected to N-terminal sequencing.

Cloning of At-P4H-2 and Comparison of Its Amino Acid Sequence with Those of Other P4Hs-A sequence homology search of the A. thaliana genome indicated that it encodes at least six polypeptides of 280-332 residues that show 21-27% identity to the catalytically important C-terminal regions of the α subunits of the human type I and type II C-P4Hs (15). One of these, a 283-residue polypeptide with a GenBank™ accession number AAC64297, was cloned and characterized recently and named At-P4H-1 (15), being the first plant P4H to be cloned from any source. We have now cloned a second 299-amino-acid A. thaliana P4H based on the GenBank™ sequence NP566279.1 and have named it At-P4H-2, its gene being located in chromosome 3. This polypeptide corresponds to AAF08583 as identified in the first sequence homology search (15), the first 178 amino acids encoded by NP566279.1 being identical to those of AAF08583, whereas the subsequent AAF08583 sequence has a deletion corresponding to 21 amino acids. A third GenBank™ sequence, AAM67123.1, is identical to NP566279.1 except that it lacks the first two amino acids of the encoded polypeptide.
We used PCR to amplify a 947-nucleotide product from an A. thaliana cDNA library encoding 11 nucleotides of the 5′-untranslated sequence, the entire coding region, and 16 nucleotides of the 3′-untranslated sequence. This cDNA encodes a 299-amino-acid polypeptide identical to NP566279.1. The calculated molecular mass of At-P4H-2 is 33,019 Da, its isoelectric point is 5.93, and the polypeptide contains three potential N-glycosylation sites. The 3 amino acids that bind the iron atom (2 histidines and 1 aspartate) and the lysine that binds the C-5 carboxyl group of the 2-oxoglutarate (17) are conserved in its sequence (Fig. 1).
At-P4H-2 shows a 33% amino acid sequence identity to At-P4H-1 (Fig. 1), the identity of the C-terminal catalytic regions being distinctly higher. The identity of the 126-amino-acid stretches starting at residue 118 of At-P4H-2 and at residue 159 of At-P4H-1 is 51%, for example (Fig. 1). A characteristic feature of the At-P4H-2 polypeptide is a 49-amino-acid C-terminal extension with 6 cysteines that is not found in At-P4H-1 (Fig. 1). Interestingly, a closer homologue of At-P4H-2 is a 310-amino-acid polypeptide found in rice (AAO72377.1), which is 53% identical to At-P4H-2. These two polypeptides also share the C-terminal cysteine-rich extension (Fig. 1). The P. bursaria Chlorella virus-1 (PBCV-1) P4H (16) shows a 25% overall sequence identity to At-P4H-2, the C-terminal catalytic regions being more alike, the identity of a 126-amino-acid stretch starting at residue 118 of At-P4H-2 to that starting at residue 131 of the PBCV-1 P4H being 34% (Fig. 1). Interestingly, a 24-amino-acid stretch in the N-terminal region of At-P4H-2, starting at residue 52, is 67 and 54% identical to the corresponding regions in the rice and PBCV-1 P4Hs, respectively, whereas its identity to the corresponding region in At-P4H-1 is only 42% (Fig. 1). The sequence identity between At-P4H-2 and the 517-residue α subunit of the human type I C-P4H is 23% (data not shown), being thus very similar to the identity of 25% between At-P4H-1 and this α subunit (15).
Expression of Recombinant At-P4H-2 in Insect Cells-A baculovirus expressing At-P4H-2 was generated and used to infect Sf9 and H5 insect cells. The cells were harvested 72 h after infection, homogenized in a buffer containing 0.1% Triton X-100, and centrifuged. The pellet obtained was further solubilized consecutively in a buffer containing 1% Triton X-100, in a buffer containing 50% glycerol, 0.6 M NaCl, and 1% Nonidet P-40, and finally in 1% SDS. Analysis of the glycerol and SDS extracts by SDS-PAGE under reducing conditions indicated that both insect cell types expressed the recombinant At-P4H-2 as a polypeptide of about 35 kDa, as shown for Sf9 cells in Fig. 2 (lanes 3 and 4), whereas its solubility in the two Triton X-100-containing buffers was poor (Fig. 2, lanes 1 and 2). The 0.1% Triton X-100 and SDS-soluble samples were electroblotted, and the membrane was stained by Coomassie Blue. N-terminal sequencing of the band representing the SDS-soluble At-P4H-2 polypeptide gave the sequence MSMSRLGLLLF, corresponding to the N terminus of At-P4H-2. When a slice of the blot cut from the corresponding migration position of the 0.1% Triton X-100-soluble fraction was analyzed, the same sequence was identified, thus verifying the presence of At-P4H-2 in this fraction. High levels of At-P4H-2 activity were seen in the 0.1% Triton X-100-soluble fraction, the activity generated in Sf9 cells being twice that obtained in H5 cells (data not shown), and therefore the former cell line was chosen for further investigation. The At-P4H-2 activity in the 50% glycerol-soluble fraction was only 16% of that in the 0.1% Triton X-100-soluble fraction when expressed per 100 µg of protein in the fraction (data not shown), suggesting that a considerable percentage of the glycerol-soluble polypeptide was incorrectly folded. The 0.1% Triton X-100-soluble fraction was therefore used as the At-P4H-2 source in all subsequent experiments.
The Recombinant At-P4H-2 Effectively Hydroxylates Poly(L-proline)-P4H activity generated in the 0.1% Triton X-100-soluble fraction of Sf9 cells expressing either At-P4H-2 or human ERp57, an endoplasmic reticulum glycoprotein-specific thiol-disulfide oxidoreductase (24, 25) that was used here as a negative control, was analyzed by a method based on the hydroxylation-coupled decarboxylation of 2-oxo[1-14C]glutarate (22). A high amount of P4H activity was found in the soluble fraction of cells expressing At-P4H-2 when 200 µM poly(L-proline) was used as the peptide substrate, the amount of activity increasing with the molecular weight of the poly(L-proline) (Table I). The activity level found in the corresponding fraction of cells expressing ERp57 was less than 0.1% of that with At-P4H-2 (Table I).
The C-P4Hs are known to catalyze a very low rate of uncoupled decarboxylation of 2-oxoglutarate, i.e. decarboxylation without subsequent hydroxylation of a peptide substrate (1,17). To verify that the 2-oxoglutarate decarboxylation found with At-P4H-2 indeed represented hydroxylation-coupled decarboxylation, the amount of 4-hydroxyproline formed by the insect cell extract expressing At-P4H-2 with poly(L-proline) as a substrate was measured by a colorimetric method (23). Sf9 cells have small amounts of endogenous C-P4H activity and produce small amounts of endogenous collagens (26). Small amounts of 4-hydroxyproline were therefore found in the sample from cells expressing ERp57, evidently derived from the 4-hydroxyproline present in the cell extract used as the enzyme source (Table II). The amount of 4-hydroxyproline generated by At-P4H-2 was nevertheless about 12 times that obtained with this negative control (Table II).
At-P4H-2 Effectively Hydroxylates Peptides with Sequences Corresponding to Those in Arabinogalactan Proteins and Extensins-A synthetic peptide with a sequence (ATPPPV)3, corresponding to that in arabinogalactan proteins, and two peptides with sequences SPPPVSPPPVSPPPPV and SPPPVYKSPPPPVKHYSPPPV, corresponding to those in extensins (15), were studied as substrates using the assay based on the hydroxylation-coupled decarboxylation of 2-oxoglutarate as described above. All three peptides were good substrates, the level of enzyme activity ranging from 5370 to 13,790 dpm/100 µg of cell extract protein, the longer extensin-like peptide being a better substrate than the shorter one (Table I). These peptides were nevertheless less effective substrates than the three poly(L-proline) peptides with different Mr values, particularly as the activities shown for the latter were obtained with lower substrate concentrations (Table I). The three plant peptides also served as substrates for At-P4H-1, but in this case, the longer extensin-like peptide was the least satisfactory substrate (15). A surprising finding was that the two extensin-like peptides were also hydroxylated to a minor extent by the endogenous P4H present in insect cells, whereas no such hydroxylation was seen with the arabinogalactan protein-like peptide (Table I). No reports are available on the substrate specificity of insect cell P4H, but the present data suggest that it differs distinctly from those of the vertebrate C-P4Hs (1, 2).
At-P4H-2 Hydroxylates Very Ineffectively Collagen and HIFα-like Peptides and Some Other Peptides Reported to Contain 4-Hydroxyproline-(PPG)10, a collagen-like peptide that is used as a standard substrate in the assays of C-P4H activity (1,2,22) and has also been reported to act as a good substrate for At-P4H-1 (15), was a very ineffective substrate for At-P4H-2, the level of activity obtained even with a 1 mM concentration being less than 3% of that obtained with a 200 µM concentration of poly(L-proline), Mr 32,000 (Table I). The level of activity obtained with (PPG)10 was nevertheless much higher than that obtained with the ERp57-expressing negative control, indicating that this peptide did become hydroxylated by At-P4H-2, although very ineffectively. (PPG)5 was hydroxylated to a lower extent than the five-triplet longer (PPG)10, whereas (PAG)5 gave essentially no activity, and the small amount of activity obtained with (APG)5 was at the level of the negative control, suggesting that it may have been due to a very low rate of hydroxylation by the endogenous insect cell enzyme (Table I).
Bradykinin, a peptide involved in the regulation of blood pressure, contains a single -PPG- triplet that can be hydroxylated by the C-P4Hs in vitro (1). Correspondingly, human plasma, urine, and ascitic fluid contain small amounts of hydroxyproline-lysyl-bradykinin with a single 4-hydroxyproline residue (1). We therefore tested whether bradykinin would serve as a substrate for At-P4H-2 but found no significant hydroxylation (Table I).
The prion protein involved in the pathogenesis of transmissible spongiform encephalopathies contains a 4-hydroxyproline residue in its N-terminal domain that may have a poly(L-proline) type II helix (27). A 17-amino-acid peptide corresponding to this site was found to serve as a substrate for At-P4H-2, and a low level of activity was also found with the ERp57-expressing negative control, suggesting that the endogenous insect cell P4H was able to hydroxylate this peptide at a low rate (Table I).

TABLE II. Formation of 4-hydroxyproline by At-P4H-2 with poly(L-proline), Mr 32,000, as a substrate
At-P4H-1 has been found to hydroxylate synthetic peptides corresponding to the two hydroxylation sites in human HIF-1α (15), which was a highly surprising finding as the sequences of these peptides are very different from those of the proline-rich plant proteins and as the HIF system has not been characterized in plants. At-P4H-2, however, showed no activity with any of the four peptides representing the C-terminal and N-terminal hydroxylation sites in HIF-1α and HIF-2α (Table I).
It has been suggested that human RNA polymerase II may become bound to the von Hippel-Lindau protein by a 4-hydroxyproline-mediated mechanism similar to that of HIF-α, which would suggest that an LXXLAP sequence present in this polymerase may be hydroxylated (28). We therefore tested whether At-P4H-2 would act on peptides of different lengths corresponding to this sequence, but no hydroxylation was found (as shown in Table I for the longest peptide studied).
The Km Values of At-P4H-2 for Peptide Substrates Are Markedly Higher Than Those of At-P4H-1-Km values were determined for poly(L-proline) peptides of two molecular weights and five other synthetic peptides that were found to serve as substrates for At-P4H-2. In agreement with the data in Table I, the lowest Km values were found for the Mr 30,000-40,000 and 5000-10,000 poly(L-proline) peptides, being 13 and 30 µM, respectively (Table III). These values are distinctly higher than those for At-P4H-1 as its Km for poly(L-proline), Mr 10,000-20,000, was as low as 0.2 µM (Table III). The two extensin-like peptides, SPPPVYKSPPPPVKHYSPPPV and SPPPPVSPPPVSPPPPV, had the lowest Km values among the other synthetic peptides studied, 380 and 660 µM, respectively, whereas the arabinogalactan-like peptide had a Km of more than 1 mM (Table III). All these values are about 10-60 times higher than those of At-P4H-1 (Table III). The Km values for (PPG)10 and the prion peptide were even higher, nearly 3 mM, the former Km being about 50 times higher than that of At-P4H-1 (Table III). The Km values of At-P4H-2 for poly(L-proline) were much lower than those of the P. bursaria Chlorella virus-1 P4H, whereas the Km values of these two P4Hs for (PPG)10 were essentially identical (Table III).
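The practical meaning of these Km differences can be illustrated with the Michaelis-Menten relation v/Vmax = [S]/(Km + [S]). The following is a minimal sketch under stated assumptions: simple Michaelis-Menten kinetics is assumed, the substrate concentration of 10 µM is a hypothetical value chosen for illustration, and the two Km values were measured with poly(L-proline) preparations of different Mr, so the comparison is only approximate. Vmax values are not considered.

```python
def fractional_rate(substrate_um, km_um):
    """Michaelis-Menten rate as a fraction of Vmax: v/Vmax = [S] / (Km + [S])."""
    return substrate_um / (km_um + substrate_um)


# Km values for poly(L-proline) quoted in the text (µM).
KM_POLYPROLINE_UM = {
    "At-P4H-1": 0.2,   # Mr 10,000-20,000
    "At-P4H-2": 13.0,  # Mr 30,000-40,000
}

# Hypothetical substrate concentration, chosen only for illustration.
SUBSTRATE_UM = 10.0

SATURATION = {
    enzyme: fractional_rate(SUBSTRATE_UM, km)
    for enzyme, km in KM_POLYPROLINE_UM.items()
}

if __name__ == "__main__":
    for enzyme, fraction in SATURATION.items():
        print(f"{enzyme}: {fraction:.2f} of Vmax at {SUBSTRATE_UM:g} µM substrate")
```

At this illustrative concentration At-P4H-1 would operate close to its maximal rate, whereas At-P4H-2 would reach less than half of its own Vmax.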
At-P4H-2 Preferentially Hydroxylates in the Extensin-like Peptide SPPPVYKSPPPPVKHYSPPPV the Third Proline in the C-terminal PPP Triplet-To study the hydroxylation pattern of the most efficient synthetic peptide substrate, the longer extensin-like peptide SPPPVYKSPPPPVKHYSPPPV, this peptide was partially hydroxylated with the recombinant At-P4H-2. The peptide was then purified from the enzyme reaction mixture, and the molecular mass of the peptide fraction was analyzed by mass spectrometry. The majority of the purified peptide had a molecular mass identical to that of the non-hydroxylated peptide, whereas about 20% had a higher molecular mass. About 85% of this fraction had a molecular mass 16 Da higher and the rest had a molecular mass that was 32 Da higher than that of the non-hydroxylated peptide, suggesting the presence of one and two 4-hydroxyproline residues, respectively (data not shown). N-terminal sequencing of the peptide fraction showed that 4-hydroxyproline residues were present in the 9th and 20th positions of the peptide, i.e. the first residue of the central PPPP sequence and the last residue of the C-terminal PPP triplet (Fig. 3), 8.8% of the former and 41.3% of the latter being hydroxylated, whereas no 4-hydroxyproline was found in the N-terminal PPP triplet. No data are available on the hydroxylation pattern of At-P4H-1 with these peptides or any other plant peptide, but in (PPG)10 it hydroxylated both consecutive prolines, although it acted preferentially on the prolines preceding the glycine, this pattern being similar to that of the vertebrate C-P4Hs (1-3, 15).
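The assignment of positions 9 and 20 to the central PPPP sequence and the C-terminal PPP triplet can be reproduced directly from the peptide sequence. The sketch below is illustrative; the helper names proline_runs and locate are introduced here, and the nominal mass increment of 16 Da per hydroxylation (one added oxygen atom) is the value used in the text.

```python
import re

PEPTIDE = "SPPPVYKSPPPPVKHYSPPPV"
HYP_POSITIONS = [9, 20]      # 1-based positions found by N-terminal sequencing
HYDROXYL_SHIFT_DA = 16       # nominal mass added per 4-hydroxylation (one O atom)


def proline_runs(sequence):
    """Return (start, end) of every run of consecutive prolines, 1-based inclusive."""
    return [(m.start() + 1, m.end()) for m in re.finditer(r"P+", sequence)]


def locate(position, sequence=PEPTIDE):
    """Return (run number, position within run, run length) for a proline position."""
    for run_number, (start, end) in enumerate(proline_runs(sequence), start=1):
        if start <= position <= end:
            return run_number, position - start + 1, end - start + 1
    raise ValueError(f"position {position} is not a proline")


# Expected nominal mass shifts for peptides carrying one or two 4-hydroxyprolines.
MASS_SHIFTS_DA = {n: n * HYDROXYL_SHIFT_DA for n in (1, 2)}

if __name__ == "__main__":
    print("Proline runs:", proline_runs(PEPTIDE))
    for pos in HYP_POSITIONS:
        run, index, length = locate(pos)
        print(f"Position {pos}: proline {index} of {length} in run {run}")
    print("Mass shifts (Da):", MASS_SHIFTS_DA)
```

The peptide contains three proline runs, at positions 2-4, 9-12, and 18-20, so position 9 is the first proline of the second run and position 20 is the third proline of the last run, in agreement with the sequencing result.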
The Km Values of At-P4H-2 for Reaction Cosubstrates Are Similar to Those of At-P4H-1-The Km values of At-P4H-2 were determined for Fe2+, 2-oxoglutarate, and ascorbate (Table IV). The Km for iron was 5 µM, about one-third of that of At-P4H-1 but more than twice that of the type I C-P4H, whereas the viral P4H has a much lower value (Table IV). The Km for 2-oxoglutarate, 170 µM, is slightly higher than that of 130 µM reported for At-P4H-1 (15), both of these values being much higher than those of the type I C-P4H and the PBCV-1 P4H (Table IV). The Km for ascorbate was found to be identical for the two plant P4Hs, the human type I C-P4H, and the viral P4H (Table IV).
At-P4H-2 Is Inhibited by Two 2-Oxoglutarate Analogues-Pyridine-2,4-dicarboxylate and pyridine-2,5-dicarboxylate are well known competitive inhibitors of the vertebrate C-P4Hs with respect to 2-oxoglutarate (1). These two compounds were also competitive inhibitors of At-P4H-2, but their Ki values were 5-100 times those of the type I C-P4H (Table IV). Another distinct difference relative to type I C-P4H was that pyridine-2,4-dicarboxylate was a more effective At-P4H-2 inhibitor than pyridine-2,5-dicarboxylate, whereas the former was a less effective C-P4H-I inhibitor than the latter (Table IV).

DISCUSSION
We report here on the characterization of a second P4H isoenzyme from the flowering plant A. thaliana, representing only the second plant P4H to be cloned from any source. A monomer like the previously characterized At-P4H-1, At-P4H-2 consists of 299 amino acids. The other plant P4Hs that have been partially characterized also appear to be monomers (14,15), and the viral P4H cloned and characterized from the P. bursaria Chlorella virus-1 is likewise a monomer (16). The well characterized vertebrate C-P4Hs are α2β2 tetramers of about 240 kDa (1-3), whereas the vertebrate HIF-P4Hs again consist of only one type of subunit (8-10).
The amino acid sequence identity between At-P4H-2 and At-P4H-1 is 33%, the most striking difference being a 49-amino-acid C-terminal extension with 6 cysteines present in At-P4H-2. A putative rice homologue of At-P4H-2 also possesses this C-terminal extension, the overall sequence identity between these two polypeptides being 53%. This extension, which may represent a separate domain, is also found in several proteins of C. elegans, Caenorhabditis briggsae, Hydra vulgaris, and Podocoryne carnea with different functions, such as a matrix metalloproteinase, thyroid peroxidase, and acetylcholine receptor, and it is known as a toxin homology domain (29,30). It ends with the characteristic pattern CXXXCXXC so that the last 8 residues contain 3 of its 6 cysteines. The function of this domain in At-P4H-2 is unknown.
The C-P4Hs have a peptide substrate-binding domain that is distinct from the C-terminal catalytic domain and is located between residues 144 and 244 in the 517-residue human α(I) subunit (18,19). It is currently unknown whether the much shorter plant polypeptides also have a separate peptide-binding domain by which they interact with their substrates or whether the peptides become bound directly to the catalytic domain. Interestingly, the N-terminal region of At-P4H-2 contains a stretch of 24 amino acids in which the sequence identity to the rice and viral P4Hs is much higher than to At-P4H-1. If the plant P4Hs have a separate peptide-binding region, this sequence could explain some of the major differences found in the hydroxylation patterns of At-P4H-2 and the viral P4H relative to At-P4H-1.
P4Hs have been identified in species ranging from mammals to insects, nematodes, plants, and viruses and have been found to have multiple isoenzymes in almost all organisms studied (1-3, 14, 16). The vertebrate C-P4Hs hydroxylate only prolines in collagen-like XPG sequences, poly(L-proline) being a competitive inhibitor (1-3, 6, 7), whereas the three HIF-P4Hs act on LXXLAP sequences in HIFα (8,9). The functions of 4-hydroxyproline in collagens and HIFα are distinctly different as the 4-hydroxyproline residues are required for the stability of collagen molecules, whereas in HIFα, the 4-hydroxyproline acts as a signal for protein degradation (1-3, 31, 32). Plants and green algae have many different HRGPs that play important structural roles in their cell walls but are also involved in various aspects of plant growth, development, and differentiation and responses to stress (12,13). 4-Hydroxyproline residues serve as sites for O-glycosylation in plant HRGPs (12), whereas no reports are available on the glycosylation of 4-hydroxyproline residues in animal proteins. According to a contiguity hypothesis, the extent of arabinosylation of 4-hydroxyproline residues by small 4-6-residue glycosylated modules increases with 4-hydroxyproline contiguity, clustered non-contiguous 4-hydroxyproline residues being sites for the addition of larger arabinogalactan polysaccharides (33).
The first characterized A. thaliana isoenzyme, At-P4H-1, is a highly unusual enzyme among the various P4Hs as it hydroxylated substrates varying from poly(L-proline) and plant HRGP-like peptides to the collagen model peptide (PPG)10 and HIFα-like peptides (15). Its substrate specificity also contradicted previous suggestions that hydroxylation by the plant P4Hs may require a substrate in the poly(L-proline) type II helix conformation (34). Others have suggested that the hydroxylation of proline residues in HRGPs may depend on multiple, sequence-specific P4Hs rather than a single poly(L-proline) type II conformation-dependent enzyme (35). At-P4H-2 was found to differ distinctly from At-P4H-1 in its substrate specificity. All the peptides studied here were much better substrates, defined by a lower Km, for At-P4H-1 than for At-P4H-2, and At-P4H-2 did not hydroxylate the HIFα-like peptides at all. At-P4H-2 resembled the viral PBCV-1 P4H (16) in that they both hydroxylated (PPG)10 only very inefficiently, but the Km values of the viral P4H for poly(L-proline) were much higher than those of At-P4H-2. Unlike the C-P4Hs, At-P4H-2 failed to hydroxylate bradykinin at any significant rate, but it did hydroxylate a 17-residue peptide corresponding to the sequence around the hydroxylation site in the prion protein, although ineffectively and with a high Km (27). The prion peptide RYPGQGSPGGNRYPPQG studied here has altogether 4 proline residues, 2 of them in XPG sequences and 2 consecutive ones in a PPQ sequence. The 4-hydroxyproline identified in this peptide is located in the SPG triplet (27), but due to the low hydroxylation level, it was not possible to analyze which one(s) of the prolines became hydroxylated by At-P4H-2. At-P4H-1 preferentially hydroxylated the prolines preceding glycine in the (PPG)10 peptide, but the other prolines were also hydroxylated to a small extent (15).
The extensin-like and arabinogalactan-like peptides that acted as good substrates for At-P4H-2 all have at least 3 consecutive prolines, suggesting that this isoenzyme may require a stretch of at least 3 successive prolines for effective hydroxylation. (PPG)10, which has stretches of 2 consecutive prolines, was hydroxylated at a low rate, and it is possible that hydroxylation of the prion peptide also occurred in its PPQ triplet with 2 consecutive prolines, whereas (PAG)5, (APG)5, and the HIFα-like and RNA polymerase II-like peptides, with single proline residues, were not hydroxylated at all. Bradykinin, in which the two consecutive prolines are located in the sequence RPPGFSPFR very close to the N terminus of the peptide, was likewise not hydroxylated at all. N-terminal sequencing of the partially hydroxylated peptide SPPPVYKSPPPPVKHYSPPPV indicated that At-P4H-2 acted preferentially on the 3rd consecutive proline in the C-terminal PPP triplet, but a small amount of 4-hydroxylation was also found in the 1st proline of the central PPPP sequence. As this peptide was hydroxylated to a low extent also by the endogenous prolyl 4-hydroxylase(s) present in the insect cell extract (Table I), the possibility cannot be excluded that hydroxylation of the latter position was carried out by the insect cell enzyme(s) rather than At-P4H-2. Nevertheless, it is clear that the enzyme can also act on more than 3 consecutive prolines as poly(L-proline) was an effective substrate. The lack of hydroxylation of any proline in the N-terminal PPP triplet and of the last proline in the central PPPP sequence may reflect a preference, in short peptides, for prolines in the C-terminal region. The cell types and developmental stages expressing At-P4H-1 and At-P4H-2 in vivo are currently unknown. It is therefore likewise unknown whether At-P4H-2 has any major role in hydroxylating poly(L-proline)-like sequences in HRGPs in vivo in cases in which both At-P4H isoenzymes are expressed at the same time and in the same location, as At-P4H-1 has much lower Km values for such peptides. No substrates with lower Km values for At-P4H-2 than for At-P4H-1 have yet been identified.
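The proposed requirement for at least 3 consecutive prolines can be checked against the peptides whose sequences are given in the text. The sketch below is illustrative only: the classification of each peptide as an efficient substrate follows the results described here, the threshold MIN_RUN encodes the suggested rule rather than an established one, and the HIFα-like and RNA polymerase II-like peptides are omitted because their sequences are not given.

```python
import re


def longest_proline_run(sequence):
    """Length of the longest stretch of consecutive prolines (0 if none)."""
    return max((len(run) for run in re.findall(r"P+", sequence)), default=0)


# (sequence, efficient substrate for At-P4H-2 according to the text)
PEPTIDES = {
    "extensin-like": ("SPPPVYKSPPPPVKHYSPPPV", True),
    "arabinogalactan-like (ATPPPV)3": ("ATPPPV" * 3, True),
    "(PPG)10": ("PPG" * 10, False),
    "(PPG)5": ("PPG" * 5, False),
    "(PAG)5": ("PAG" * 5, False),
    "(APG)5": ("APG" * 5, False),
    "bradykinin": ("RPPGFSPFR", False),
    "prion peptide": ("RYPGQGSPGGNRYPPQG", False),
}

MIN_RUN = 3  # suggested minimum number of consecutive prolines

RESULTS = {
    name: (longest_proline_run(seq), efficient)
    for name, (seq, efficient) in PEPTIDES.items()
}

# The rule holds if exactly the efficient substrates reach the threshold.
RULE_HOLDS = all((run >= MIN_RUN) == efficient for run, efficient in RESULTS.values())

if __name__ == "__main__":
    for name, (run, efficient) in RESULTS.items():
        print(f"{name}: longest proline run {run}, efficient substrate: {efficient}")
    print("Rule consistent with all listed peptides:", RULE_HOLDS)
```

For this set of peptides the rule separates the efficient substrates from the others without exception, although the set is small and the rule remains a suggestion.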
The Km values of At-P4H-2 for the reaction cosubstrates iron, 2-oxoglutarate, and ascorbate were highly similar to those reported for At-P4H-1 (15). The most striking difference in these values between the vertebrate C-P4Hs and the two characterized A. thaliana enzymes is that the Km values of At-P4H-1 and At-P4H-2 for 2-oxoglutarate were 130 and 170 µM, respectively, being thus 6.5-8.5 times the value of 20 µM for the type I C-P4H (17). The Km values of the three HIF-P4Hs for 2-oxoglutarate, 55-60 µM, are higher than that of C-P4H-I but still only about one-third of the value of At-P4H-2 (11,17). It was therefore of interest to study whether the two well characterized competitive 2-oxoglutarate analogue inhibitors of the C-P4Hs, pyridine-2,4-dicarboxylate and pyridine-2,5-dicarboxylate (1), would also inhibit At-P4H-2. It transpired that pyridine-2,4-dicarboxylate was an effective inhibitor, with a Ki of 10 µM, whereas pyridine-2,5-dicarboxylate had a Ki of 80 µM, although it is a more effective inhibitor of the C-P4Hs than the former (1). It thus seems that there are some distinct differences in the structure of the 2-oxoglutarate-binding site between At-P4H-2 and the C-P4Hs.
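The quantitative statements in this paragraph can be reproduced with a few lines of arithmetic. The sketch below is a minimal illustration that assumes the standard relation for competitive inhibition, apparent Km = Km (1 + [I]/Ki); the inhibitor concentration of 50 µM is a hypothetical value, and the function and variable names are introduced here.

```python
# Values quoted in the text (µM).
KM_2OG_UM = {"At-P4H-1": 130.0, "At-P4H-2": 170.0, "type I C-P4H": 20.0}
KI_AT_P4H_2_UM = {
    "pyridine-2,4-dicarboxylate": 10.0,
    "pyridine-2,5-dicarboxylate": 80.0,
}


def fold_over(value, reference):
    """Ratio of a kinetic constant to a reference value."""
    return value / reference


def apparent_km(km_um, inhibitor_um, ki_um):
    """Apparent Km under competitive inhibition: Km * (1 + [I] / Ki)."""
    return km_um * (1.0 + inhibitor_um / ki_um)


FOLD_VS_C_P4H = {
    enzyme: fold_over(KM_2OG_UM[enzyme], KM_2OG_UM["type I C-P4H"])
    for enzyme in ("At-P4H-1", "At-P4H-2")
}

# Hypothetical inhibitor concentration, chosen only for illustration.
INHIBITOR_UM = 50.0

APPARENT_KM_UM = {
    inhibitor: apparent_km(KM_2OG_UM["At-P4H-2"], INHIBITOR_UM, ki)
    for inhibitor, ki in KI_AT_P4H_2_UM.items()
}

if __name__ == "__main__":
    print("Km for 2-oxoglutarate relative to type I C-P4H:", FOLD_VS_C_P4H)
    print(f"Apparent Km of At-P4H-2 at {INHIBITOR_UM:g} µM inhibitor:", APPARENT_KM_UM)
```

With these values the ratios are 6.5 and 8.5, as stated, and at the illustrative inhibitor concentration the 2,4-isomer raises the apparent Km for 2-oxoglutarate far more than the 2,5-isomer, consistent with its lower Ki.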