Characterization of a Novel Caenorhabditis elegans Prolyl 4-Hydroxylase with a Unique Substrate Specificity and Restricted Expression in the Pharynx and Excretory Duct*

Collagen prolyl 4-hydroxylases (C-P4Hs) have a critical role in collagen synthesis, since 4-hydroxyproline residues are necessary for folding of the triple-helical molecules. Vertebrate C-P4Hs are α2β2 tetramers in which the β subunit is identical to protein-disulfide isomerase (PDI). Three isoforms of the catalytic α subunit, PHY-1, PHY-2, and PHY-3, have been characterized from Caenorhabditis elegans, PHY-1 and PHY-2 being responsible for the hydroxylation of cuticle collagens, whereas PHY-3 is predicted to be involved in collagen synthesis in early embryos. We have characterized transcripts of two additional C. elegans α subunit-like genes, Y43F8B.4 and C14E2.4. Three transcripts were generated from Y43F8B.4, and a polypeptide encoded by one of them, named PHY-4.1, assembled into active (PHY-4.1)2/(PDI-2)2 tetramers and PHY-4.1/PDI-2 dimers when coexpressed with C. elegans PDI-2 in insect cells. The C14E2.4 transcript was found to have a frameshift leading to the absence of codons for two residues critical for P4H catalytic activity. Thus, C. elegans has altogether four functional C-P4H α subunits, PHY-1, PHY-2, PHY-3, and PHY-4.1. The tetramers and dimers containing recombinant PHY-4.1 had a distinct substrate specificity from the other C-P4Hs in that they hydroxylated poly(L-proline) and certain other proline-rich peptides, including ones that are expressed in the pharynx, in addition to collagen-like peptides. These data and the observed restricted expression of the phy-4.1 transcript and PHY-4.1 polypeptide in the pharyngeal gland cells and the excretory duct suggest that in addition to collagens, PHY-4.1 may hydroxylate additional proline-rich proteins in vivo.

Animal C-P4Hs and HIF-P4Hs have also been characterized from the fly Drosophila melanogaster and the nematode Caenorhabditis elegans. Both species probably have several C-P4H isoenzymes but only a single HIF-P4H (2,3,21,22,28). The D. melanogaster genome contains a large gene family of 20 members that encode C-P4H α-subunit-like polypeptides (29). One of these has been characterized as a recombinant protein and shown to assemble into an α2β2 tetramer with PDI and to hydroxylate collagen-like peptides (30). Expression of two other D. melanogaster C-P4H α-subunit-like genes has been shown to be restricted to the salivary glands and to be necessary for the maintenance of an open salivary gland lumen and proper secretory function (29,31). Since D. melanogaster has only three genes coding for collagens, all of these polypeptides being found in its basement membranes (3), and since salivary glands do not produce these collagens, it is highly likely that these α-subunit-like polypeptides hydroxylate some other, yet unknown, collagen-related proteins with proline-rich sequences (29,31).
In contrast to D. melanogaster, C. elegans has a very large collagen family consisting of cuticle collagens with close to 180 different polypeptides and basement membrane collagens with three different polypeptides (3). Three C-P4H α subunits, PHY-1, PHY-2, and PHY-3, have been characterized in C. elegans (32-37), two of which, PHY-1 and PHY-2, are expressed in the cuticle collagen-synthesizing hypodermal cells in a cyclical fashion that corresponds to the moulting cycle and the times of maximal cuticle collagen synthesis (34,35). Analyses of genetically mutant nematodes or nematodes subjected to RNA interference (RNAi) have shown that PHY-1 and PHY-2 are involved in the synthesis of cuticle collagens in C. elegans and its close homologue Caenorhabditis briggsae (33-36,38). C. elegans PHY-3 is expressed in the embryos, late larval stages, and adult nematodes, its expression in the larval and adult stages being restricted to the spermatheca, a specialized region of the gonad where fertilization of the oocytes occurs (37). Nematodes lacking PHY-3 have no obvious phenotypic abnormalities, but the 4-hydroxyproline content of their embryos is markedly reduced, suggesting that PHY-3 is involved in the synthesis of collagens in early embryos, most likely the collagens present in the egg shell (37). C. elegans has three PDI isoforms, PDI-1, PDI-2, and PDI-3, of which PDI-2 functions as the C-P4H β subunit (38,39).
The Caenorhabditis C-P4Hs assembled from the PHY-1 and PHY-2 polypeptides have unique molecular compositions by comparison with the vertebrate C-P4Hs and the one D. melanogaster C-P4H that has been characterized in detail. The main C-P4H form in C. elegans is a PHY-1/PHY-2/(PDI-2)2 mixed tetramer, but both C. elegans PHY polypeptides can also assemble into an active dimer with PDI-2, albeit much less effectively (36). The assembly properties of the highly homologous C. briggsae PHY-1 and PHY-2 are similar to those of the corresponding C. elegans polypeptides with the exception that in the absence of PHY-1, C. briggsae PHY-2 assembles into an active (PHY-2)2/(PDI-2)2 tetramer instead of a PHY-2/PDI-2 dimer (40). Combined disruption of phy-1 and phy-2 gene function leads to the absence of all cuticle C-P4H forms in the two Caenorhabditis species and results in an embryonic lethal phenotype (33,35,36,40). Disruption of the phy-2 gene alone produces no visible phenotype, since the enhanced assembly of the PHY-1/PDI-2 dimer can fully compensate for the lack of the mixed tetramer (33,35,36,40). Inactivation of the C. elegans and C. briggsae phy-1 genes alone leads to abnormal cuticle formation and dumpy and small phenotypes, respectively (33-36,40). Although enhanced assembly of the PHY-2/PDI-2 dimer and (PHY-2)2/(PDI-2)2 tetramer occurs in phy-1 null C. elegans and C. briggsae, respectively, these C-P4H forms cannot fully compensate for the lack of the mixed tetramer and the PHY-1/PDI-2 dimer (36,40). The recombinant C. elegans PHY-3 differs from PHY-1 and PHY-2 in that C-P4H activity is only generated when it is coexpressed with another C. elegans PDI isoform, PDI-1, but not with PDI-2 (37,38). Whether PHY-3 assembles into a complex with PDI-1 or whether PDI-1 just assists in its folding is as yet unknown, however (37).
In the present study we characterized transcripts, generated from the C. elegans genes Y43F8B.4 and C14E2.4, which were predicted to code for polypeptides with similarity to the catalytic α subunits of the human and C. elegans C-P4Hs. Three alternative transcripts were generated from the Y43F8B.4 gene, one of which, named phy-4.1, coded for a polypeptide that, when coexpressed in insect cells, assembled into an active C-P4H dimer and tetramer with C. elegans PDI-2. The spatial expression of the phy-4.1 transcript and PHY-4.1 polypeptide was found to be restricted to the pharyngeal gland cells and the excretory duct. Down-regulation of the phy-4.1 expression by RNAi did not lead to any obvious phenotypic abnormalities, however. A frameshift in the C14E2.4 transcript led to an omission of two of the four catalytically critical residues in the encoded polypeptide, which thus cannot possess P4H activity. These data indicate that C. elegans possesses four functional C-P4H α subunits, namely PHY-1, PHY-2, PHY-3, and PHY-4.1. The recombinant (PHY-4.1)2/(PDI-2)2 tetramer and PHY-4.1/PDI-2 dimer had a distinct substrate specificity from other C-P4Hs, in that they hydroxylated poly(L-proline) and certain other proline-rich peptides in addition to collagen-like peptides. These data together with the unique expression pattern suggest that in addition to collagens, PHY-4.1 may be involved in the hydroxylation of some other proline-rich proteins in vivo.

EXPERIMENTAL PROCEDURES
C. elegans Strains and cDNA Pools Used in the Cloning of phy-4-The wild-type Bristol (N2), daf-2(e1370), RNAi-sensitive rrf-3(NL2099), dpy-18(e364), and phy-2(ok177) strains were obtained from the Caenorhabditis Genetics Center and maintained at 20°C by standard methods (41). The phy-3(ok199) strain (37) was generated from nematodes carrying a deletion in phy-3 that were obtained from the C. elegans Genome Deletion Consortium. Heat shock treatment of N2 nematodes was performed at 37°C for 1 h, and dauer formation was induced in the daf-2(e1370) strain by maintaining it at 25°C for 1 week with a low food supply according to instructions in the C. elegans data base (available on the World Wide Web). Total RNA and mRNA were isolated from normal and heat-shocked N2 and dauer daf-2(e1370) mixed stage cultures with the TRIzol LS reagent (Invitrogen) and a Poly(A)quik mRNA isolation kit (Stratagene), respectively, according to the manufacturers' instructions. cDNA was produced from the isolated RNAs with a SMART™ RACE cDNA Kit (BD Biosciences), following the manufacturer's instructions.
Analysis of the Predicted Genes Y43F8B.4 and C14E2.4 and Their Transcripts-A sequence homology search of the C. elegans data base (available on the World Wide Web) indicated the existence of two putative genes, Y43F8B.4 and C14E2.4, encoding polypeptides with similarity to the catalytic C-terminal regions of the human and C. elegans C-P4H α subunits (Fig. 1). The accuracy of the predicted Y43F8B.4 transcripts was analyzed by 5′-RACE reactions with primers from exons 1, 2, and 3 and 3′-RACE reactions with primers from exons 3 and 4 and the cDNA pools generated above; and PCR with primer pairs from exons 1 and 3 and exons 3 and 5, using the C. elegans mixed stage cDNA library UNIZAP 937006 (Stratagene) as a template. The C14E2.4 transcripts were analyzed by 3′-RACE reactions of the cDNA pools using primers from exons 1, 3, 4, and 5 and by PCR of the cDNA library with primer pairs from exons 1 and 3 and exons 3 and 5. The RACE PCR experiments were performed with Pfu polymerase (Stratagene) as follows: predenaturation for 1 min at 95°C, followed by 25 cycles of denaturation for 30 s at 95°C, annealing for 1 min at 68°C, and extension for 3 min at 72°C. The PCR template from the UNIZAP 937006 cDNA library was prepared by incubating 2 μl of the cDNA library at a final volume of 200 μl in 1% Nonidet P-40, 100 μg/ml proteinase K, 1 mM EDTA, 10 mM Tris, pH 8.0, at 55°C for 45 min, followed by 10 min at 95°C and centrifugation at 12,000 rpm for 5 min. 10 μl of the template was used in a PCR with a final volume of 50 μl. The PCR amplifications were performed as follows: preincubation for 5 min at 94°C and 2 min at 72°C before the addition of the Pfu polymerase, followed by 30 cycles of denaturation for 1 min at 94°C, annealing for 2 min at 65°C, and extension for 3 min at 72°C. All RACE and PCR products were cloned into pUC18 using the SureClone ligation kit (GE Healthcare) and sequenced using an automated DNA sequencer (ABI Prism 377; Applied Biosystems).
Expression and Analysis of Recombinant PHY-4 Variants in Insect Cells-The cDNAs for the transcripts phy-4.1 and phy-4.2 (Fig. 1) without the sequences for the predicted transmembrane helix and signal peptide, respectively, were amplified by PCR and cloned into a BamHI-NotI-digested baculovirus expression vector pACGP67-A (Invitrogen) in frame with the baculovirus GP67 signal sequence. The phy-4.1 cDNA was amplified from the normal N2 mixed stage cDNA pool by PCR with the primers Phy4ABamHI5′FNFL (5′-gcgggatcccTTCAATTTTCTCACACCGTTC-3′) and Phy4ANotI3′Stop (5′-gcggccgcTTATGAGAAAAAGTGTTTTGTGTTGTA-3′) (the artificial restriction sites are set in lowercase type and underlined, and an additional cytosine following the restriction site was introduced to obtain in-frame cloning with the GP67 signal sequence) and the longer variant phy-4.2 with the primers Phy4ABamHI5′FNFL and Phy4BNotI3′Stop (5′-taaagcggccgcCTAGGTGATCGTTTTGGGTGGT-3′), the latter being located in exon 5 in the prediction Y43F8B.4. 30 PCR cycles with Pfu polymerase were performed as follows: denaturation for 1 min at 94°C, annealing for 2 min at 55°C, and extension for 3 min at 72°C, the final cycle including an additional extension for 10 min at 72°C. The sequences of the expression constructs were verified using an automated DNA sequencer. The recombinant baculovirus vectors were cotransfected into Spodoptera frugiperda insect cells (Sf9; Invitrogen) with a modified Autographa californica nuclear polyhedrosis virus DNA using the BaculoGold transfection kit (Pharmingen), and the resultant viral pools were collected and amplified (42).
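The primer design described above can be sanity-checked with a short script. The following is a minimal sketch, not part of the original protocol: it takes the three cloning primers as printed in the text (with line-break hyphens removed), confirms that each lowercase tail carries the stated BamHI (ggatcc) or NotI (gcggccgc) site, and checks whether the first three gene-specific bases of each reverse primer are the reverse complement of a stop codon. The function names are illustrative assumptions.

```python
# Primer sequences as printed in the Methods (line-break hyphens removed).
# Lowercase = artificial 5' tail; uppercase = gene-specific part.
PRIMERS = {
    "Phy4ABamHI5'FNFL": "gcgggatcccTTCAATTTTCTCACACCGTTC",
    "Phy4ANotI3'Stop": "gcggccgcTTATGAGAAAAAGTGTTTTGTGTTGTA",
    "Phy4BNotI3'Stop": "taaagcggccgcCTAGGTGATCGTTTTGGGTGGT",
}
RESTRICTION_SITES = {"BamHI": "ggatcc", "NotI": "gcggccgc"}
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def split_primer(primer):
    """Return (lowercase 5' tail, uppercase gene-specific part)."""
    tail_length = 0
    while tail_length < len(primer) and primer[tail_length].islower():
        tail_length += 1
    return primer[:tail_length], primer[tail_length:]


def reverse_complement(seq):
    """Reverse complement of a DNA sequence, returned in uppercase."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def describe(primer):
    """Report restriction sites in the tail and the sense-strand codon
    encoded by the first three gene-specific bases of the primer."""
    tail, body = split_primer(primer)
    sites = [name for name, site in RESTRICTION_SITES.items() if site in tail]
    sense_codon = reverse_complement(body[:3])
    return {
        "sites": sites,
        "sense_codon": sense_codon,
        "reverse_stop": sense_codon in STOP_CODONS,
    }


for name, primer in PRIMERS.items():
    print(name, describe(primer))
```

For a forward primer the "sense_codon" field has no biological meaning; it is only informative for the two reverse primers, where it shows the stop codon that terminates the expressed polypeptide.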
Insect cells (Sf9 or High Five; Invitrogen) were cultured in TNM-FH medium (Sigma) supplemented with 10% insect cell qualified fetal bovine serum (Invitrogen) as monolayers at 27°C. Cells seeded at a density of 5 × 10⁶ cells/100-mm plate were infected at a multiplicity of 5 with the viruses coding for the PHY-4.1 or PHY-4.2 polypeptide with or without a virus coding for C. elegans PDI-1 or PDI-2 (38). In control experiments, the cells were infected with the virus coding for C. elegans PDI-2 alone or coinfected with those coding for C. elegans PHY-1, PHY-2, and PDI-2 (32,36,38). The cells were harvested 72 h after infection, washed with a solution of 0.15 M NaCl and 0.02 M phosphate, pH 7.4 (phosphate-buffered saline, PBS), homogenized in a solution of 0.1 M glycine, 0.1 M NaCl, 10 μM dithiothreitol, 0.1% Triton X-100, and 0.01 M Tris, pH 7.8, and centrifuged at 10,000 × g for 20 min. The remaining pellets were further solubilized in 1% SDS. Aliquots of the samples were analyzed by 8% reducing SDS-PAGE and nondenaturing PAGE followed by Coomassie Blue staining or Western blotting with polyclonal antibodies against C. elegans PHY-4.1 (see below), PDI-1, and PDI-2 (36,38). N-Glycosidase F treatment was performed according to the manufacturer's instructions (Roche Applied Science). P4H activity in the Triton X-100-soluble fractions was assayed by a method based on the hydroxylation-coupled decarboxylation of 2-oxo-[1-14C]glutarate (43). The synthetic collagen-like peptide substrate (Pro-Pro-Gly)10 was purchased from the Peptide Institute, other synthetic peptides were from Innovagen, and poly(L-proline) was from Sigma. The peptides, except for poly(L-proline), were denatured by heating to 100°C for 10 min, followed by rapid cooling before the addition to the enzyme reaction mixture. Km values were determined as described previously (9).
The molecular weight of the P4H assembled from PHY-4.1 and PDI-2 was analyzed by gel filtration on a calibrated Superdex S-200 column (GE Healthcare). N-terminal sequencing was performed using AccQ·TAG chemistry (Waters) in a Procise protein sequencer (Applied Biosystems).
Generation of a phy-4 Reporter Gene Construct and Germ Line Transformation-A DNA fragment extending from −953 to +6 relative to the ATG start codon of phy-4.1 was amplified from the genomic clone Y43F8B with the primers Phy4PROM5XbaI (5′-cgcagatctAGGATACATTTTTAACGGTG-3′) and Phy4PROM3SmaI (5′-gcgcccgggGGGGGTCGAAAACTGGGTCAA-3′) (artificial restriction sites set in lowercase type and underlined). The fragment was digested with XbaI-SmaI and ligated into the promoterless nucleus-localized vector pPD96.04 (Addgene), allowing in-frame fusion of the phy-4 promoter sequence and green fluorescent protein/lacZ reporter genes.
Immunofluorescence Staining-A polyclonal rabbit antibody was generated against a synthetic PHY-4.1 peptide TLWVHEKKQEFSLP (Innovagen) and purified using a HiTrap Protein A HP column (GE Healthcare) followed by desalting in a PD-10 column (GE Healthcare) according to the manufacturer's instructions. Wild-type and heat-shocked nematodes were washed from the plates with ice-cold PBS. The washing step was repeated 3-10 times, and the animals were pipetted onto poly-L-lysine-coated slides, permeabilized by freeze-cracking (46), and fixed in methanol for 10 min followed by acetone for 10 min, both at −20°C. In an alternative method, the nematodes were fixed in 49% Bouin fixative (75% saturated picric acid, 25% formaldehyde, and 5% glacial acetic acid), 49% methanol, and 1.2% 2-mercaptoethanol for 30 min at room temperature, frozen in liquid N2, and quickly thawed under hot water (47). The slides were blocked in PBS containing 0.1% Tween 20 (PBST) and 5% dried skimmed milk and incubated with the PHY-4.1 antibody at 1:100 in PBST, followed by extensive washing and incubation in Alexa Fluor 594 goat anti-rabbit IgG (Molecular Probes, Inc., Eugene, OR) at 1:100 in PBST. Samples were washed and viewed under epifluorescence on an Olympus BX51 microscope. Images were captured with an Olympus Imaging Digital DP50 digital camera.
Microarray Analysis-Total RNA was extracted from normal and heat-shocked N2 worms using the RNeasy MIDI-kit (Qiagen) according to the manufacturer's instructions. The experimental procedures for GeneChip analysis were performed according to the Affymetrix GeneChip expression analysis technical manual. In essence, double-stranded DNA was synthesized using 10 μg of total RNA as a template by means of the Superscript Choice System (Invitrogen) and the T7-(dT)24 primer, and the DNA was purified using a GeneChip sample cleanup module (Qiagen). In vitro transcription was performed to produce biotin-labeled cRNA using a BioArray HighYield RNA transcription labeling kit (Enzo Diagnostics), following the manufacturer's instructions. Biotinylated cRNA was cleaned with a GeneChip sample cleanup module (Qiagen), fragmented to 35-200 nucleotides, and hybridized to the Affymetrix C. elegans genome array, which contains ∼22,500 C. elegans transcripts. After washing, the array was stained with streptavidin-phycoerythrin (Molecular Probes), and the signal was amplified with biotinylated anti-streptavidin (Vector Laboratories) and a second staining with streptavidin-phycoerythrin, followed by scanning on an HP GeneArray Scanner. The expression data were analyzed using the Affymetrix MicroArray Suite version 5.0. The signal intensities of all of the probe sets were scaled to the target value of 500.
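Scaling all probe-set signals to a target value amounts to multiplying every signal on an array by a single factor. The sketch below illustrates the idea under stated assumptions: it uses a trimmed mean as the reference (the 2% trim fraction is an assumption about the software default, not something stated in the text), and the function names and the example signals are hypothetical.

```python
def trimmed_mean(values, trim_fraction=0.02):
    """Mean after discarding the lowest and highest trim_fraction of values."""
    ordered = sorted(values)
    k = int(len(ordered) * trim_fraction)  # number dropped at each end
    kept = ordered[k:len(ordered) - k] if k else ordered
    return sum(kept) / len(kept)


def scale_to_target(signals, target=500.0, trim_fraction=0.02):
    """Scale signals so that their trimmed mean equals the target value.

    Returns the scaled signals and the scaling factor that was applied.
    """
    factor = target / trimmed_mean(signals, trim_fraction)
    return [s * factor for s in signals], factor


# Hypothetical probe-set signals from one array (arbitrary units).
raw = [100.0, 200.0, 300.0, 400.0]
scaled, factor = scale_to_target(raw)
print(factor, scaled)
```

Because one factor is applied per array, relative differences between probe sets on the same array are preserved; only comparisons between arrays are affected.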
RNA Interference-The effects of down-regulation of phy-4.1 were examined using standard RNAi injection protocols (35). A fragment covering the nucleotides 162-723 of the coding region of the phy-4.1 cDNA was amplified from C. elegans mixed stage cDNA by PCR using the primers 5′RNAi-Phy4 (5′-gcgggatccTCCGAGCCCTTCATCCTACAG-3′) and 3′RNAi-Phy4 (5′-ggcgcggccgcTTCTTCTCGAGGACCCATAAAGTC-3′) (artificial restriction sites set in lowercase type and underlined). The PCR product was cloned into the BamHI-NotI-digested vector pPD129:36 (48), and a double-stranded RNA was produced in vitro as described previously (35). Fifty young adults of the wild-type N2 strain were microinjected, allowed to recover overnight, and transferred singly to fresh plates, and their progenies were scored. Bacterially mediated RNAi was performed as described previously (48), by transferring several L4 animals of the wild-type N2, RNAi-sensitive NL2099, dpy-18(e364), phy-2(ok177), and phy-3(ok199) strains to feeding plates containing E. coli transformed with the pPD129:36-phy-4 construct. The plates were incubated for 2 days at 15, 20, and 25°C, after which the adults were transferred to fresh feeding plates and allowed to lay eggs for 24 h, and the progenies were scored.
Analysis of the Ultrastructure of the Pharynx and Pharyngeal Pumping-Bacterially mediated RNAi with the pPD129:36-phy-4 construct was performed on L1 larvae of the wild-type N2 strain. The L1 larvae were generated using the basic bleaching method (41). The bleached eggs were transferred to an unseeded plate and incubated at 20°C overnight. The hatched L1 larvae were transferred at 20°C to fresh feeding plates containing E. coli transformed with the pPD129:36-phy-4 construct. The pharyngeal pumping rate of the phy-4 RNAi and wild-type L1, L2, L3, L4, and adult nematodes was analyzed under an Olympus BX51 light microscope.
For high pressure freezing, the phy-4 RNAi and wild-type L2 nematodes were mixed into a thick paste made from dry bakers' yeast and 10% methanol. The paste was placed on flat specimen carriers (Leica Microsystems) and cryofixed in a Leica EM Pact high pressure freezer. The specimens were further freeze-substituted for Epon embedding in a Leica AFS freeze-substitution system. The specimens were infiltrated in 1% osmium tetroxide and 0.1% uranyl acetate in acetone at −90°C for 48 h and gradually warmed to 20°C in 5°C steps. The specimens were removed from the carriers, infiltrated with Epon LX112 (Ladd Research Industries) at room temperature, and polymerized at 60°C for 48 h. Thin sections were cut with a Leica Ultracut UCT ultramicrotome, followed by staining in uranyl acetate and lead citrate, and examined in a Philips CM100 transmission electron microscope. Images were captured by a CCD camera equipped with TCL-EM-Menu version 3 from Tietz Video and Image Processing Systems GmbH.

Analysis of Transcripts and Polypeptides Generated from the Predicted Genes Y43F8B.4 and C14E2.4-
A sequence homology search of the C. elegans genome indicated the presence of two novel predicted genes Y43F8B.4 and C14E2.4 coding for polypeptides that show sequence similarity to the C-terminal regions of the catalytic subunits of the human and C. elegans C-P4Hs (i.e. human α(I), α(II), and α(III), and C. elegans PHY-1, PHY-2, and PHY-3) (Fig. 1). The predicted Y43F8B.4 and C14E2.4 genes have seven and six exons, respectively (Fig. 1). The putative Y43F8B.4 gene was predicted to code for a 533-amino acid polypeptide that included two sets of the catalytically critical conserved residues, two histidines and one aspartate that bind the Fe2+ atom and a lysine that binds the C-5 carboxyl group of 2-oxoglutarate (8,9), coded by exons 3 and 4 and exons 6 and 7 (Fig. 1A). The C14E2.4 gene was predicted to code for a 429-amino acid polypeptide, the catalytically critical amino acids being coded by exons 4 and 5 (Fig. 1B). A search for protein motifs (available on the World Wide Web) in the predicted polypeptide coded by Y43F8B.4 indicated that residues 6-13 form a helical transmembrane domain, but there is only a very low probability of signal peptide cleavage between residues Pro20 and Phe21 (available on the World Wide Web), suggesting that the polypeptide may have a noncleavable signal peptide. A 20-residue signal peptide was predicted in the polypeptide coded by C14E2.4. We analyzed the transcripts generated from the Y43F8B.4 and C14E2.4 genes by PCR amplification of a mixed stage C. elegans cDNA library with specific primer pairs from various exons of both genes and by 5′- and/or 3′-RACE of cDNA pools generated from normal and heat-shocked mixed stage N2 and dauer daf-2(e1370) C. elegans strains.
Sequencing of the PCR and RACE products obtained with the Y43F8B.4-specific primers showed that no transcript containing all seven predicted exons was generated, but instead three transcripts, termed phy-4.1, phy-4.2, and phy-4.3, were identified (Fig. 1A). Transcript phy-4.1 contains the sequences of the predicted exons 1-4, but instead of exon 4 being spliced to exon 5, the transcript continues to intron 4, where an in-frame translation stop codon occurs 65 nucleotides downstream from the 3′-end of exon 4 (Fig. 1A). The phy-4.1 transcript thus codes for a 278-residue polypeptide (Fig. 2). The transcript phy-4.2 also contains the exon 1-3 sequences, exon 4 now being spliced into exon 5, but so that the last nucleotide of the predicted exon 4 is lost, which leads to an in-frame stop codon 78 nucleotides downstream in exon 5 (Fig. 1A). The phy-4.2 transcript codes for a 282-residue polypeptide (Fig. 2). The PHY-4.1 and PHY-4.2 polypeptides thus only differ from each other at their C-terminal ends starting from residue 257 (Fig. 2). The third cDNA product identified, termed phy-4.3, contains the last 83 nucleotides of the predicted exon 5, which is spliced into exon 6, but the latter is spliced into exon 7 in such a manner that the last two nucleotides of the predicted exon 6 and the first two of exon 7 are lost, which leads to a frameshift and an in-frame stop codon 73 nucleotides downstream in exon 7 (Fig. 1A). As a result of the frameshift, the codons for the second histidine binding the Fe2+ and the lysine binding the C-5 carboxyl group of 2-oxoglutarate are not present in the phy-4.3 transcript (Fig. 1A). These catalytically critical residues are thus lacking in the PHY-4.3 polypeptide, which therefore was not analyzed further.
Sequencing of the PCR and RACE products obtained with the C14E2.4-specific primers showed that a single transcript is generated from this gene that contains exons 1-3 as predicted, but 4 nucleotides in the data base sequence of the 3′ end of exon 4 are missing, which leads to a frameshift when spliced into exon 5, omission of codons for two catalytically critical residues, and a premature stop codon (Fig. 1B). Since the encoded polypeptide cannot function as a P4H, it was not studied further.
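The effect of losing one or more nucleotides at a splice junction, as described for phy-4.2, phy-4.3, and the C14E2.4 transcript, can be illustrated with a toy example. The sketch below is purely illustrative: the two "exons" are short invented sequences, not the real gene sequences, and the function names are assumptions. It shows how dropping a single nucleotide from the end of the upstream exon shifts the reading frame and brings a stop codon into frame earlier than in the correctly spliced product.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def splice(exon_a, exon_b, lost_from_a=0, lost_from_b=0):
    """Join two exons, optionally dropping nucleotides at the junction."""
    return exon_a[:len(exon_a) - lost_from_a] + exon_b[lost_from_b:]


def first_in_frame_stop(seq, frame=0):
    """Return the 0-based position of the first in-frame stop codon, or None."""
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


# Invented toy exons (NOT the Y43F8B.4 or C14E2.4 sequences).
exon_a = "ATGGCTGAA"      # ATG GCT GAA
exon_b = "CTGACCGGTTAA"   # CTG ACC GGT TAA

normal = splice(exon_a, exon_b)
shifted = splice(exon_a, exon_b, lost_from_a=1)

print(normal, first_in_frame_stop(normal))    # stop at the end of the reading frame
print(shifted, first_in_frame_stop(shifted))  # premature stop after the frameshift
```

In the toy case the frameshifted product terminates after three codons instead of six, which mirrors how the transcripts described above lose the codons for downstream residues.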
The PHY-4.1 amino acid sequence starting from Leu18 after the N-terminal transmembrane helix is 23% identical to residues 214-523 of the 543-amino acid C. elegans PHY-1, 25% identical to the corresponding residues of PHY-2, and 27% identical to the 295-residue PHY-3 (Fig. 2). The corresponding identities of the PHY-4.2 polypeptide with the C. elegans PHY-1, PHY-2, and PHY-3 are 21, 22, and 26%, respectively (Fig. 2). The sequence conservation is highest within the C-terminal regions where the catalytically critical residues are located, the PHY-4.1 and PHY-4.2 amino acids 138-256 being 35-37% identical to the corresponding regions of PHY-1, PHY-2, and PHY-3 (Fig. 2). The two histidines and one aspartate that bind the Fe2+ atom and the lysine that binds the C-5 carboxyl group of 2-oxoglutarate (8,9) are all conserved (Fig. 2). The human C-P4H α subunits and the C. elegans PHY-1 and PHY-2 have four cysteine residues in conserved positions that have been shown by site-directed mutagenesis of the human α(I) subunit to be involved in the formation of an essential intrachain disulfide bond (8,49). The PHY-4.1 polypeptide has cysteine residues in the corresponding positions, whereas the fourth one is missing in PHY-4.2 (Fig. 2). The PHY-4.1 and PHY-4.2 polypeptides contain one potential attachment site for N-linked oligosaccharide units, which is located in the same position as one of the five potential glycosylation sites of PHY-3 (37) (Fig. 2).
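Percent identity values such as those quoted above are computed from a pairwise alignment. The following minimal sketch shows one common convention, counting identical residues over the aligned positions where neither sequence has a gap. The text does not state which convention or alignment program was used, so both the convention and the toy sequences below are assumptions for illustration only.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap.

    Both inputs must be rows of the same pairwise alignment (equal length).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped aligned positions to compare")
    return 100.0 * matches / compared


# Invented toy alignment (NOT the PHY sequences): 5 identities in 7 ungapped columns.
print(round(percent_identity("HKDAL-GW", "HRDALTGF"), 1))
```

Other conventions divide by the length of the shorter sequence or by the full alignment length including gaps, so identity values are only comparable when the same denominator is used throughout.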
To study whether the potential N-glycosylation site present in PHY-4.1 and PHY-4.2 is utilized in insect cells, recombinant PHY-4.1 was digested with N-glycosidase F (Fig. 3A, lane 3). Two forms were seen in the nondigested sample, whereas only the lower band remained after the digestion, indicating that the nondigested sample contained both monoglycosylated and nonglycosylated forms of the PHY-4.1 polypeptide (Fig. 3A,  lanes 2 and 3). N-terminal sequencing showed that the GP67 signal peptide was correctly cleaved from the PHY-4.1 polypeptide (data not shown). Taken together, these two results indicate that the recombinant PHY-4.1 polypeptide was correctly translocated into the lumen of the endoplasmic reticulum in the insect cells. Since the N terminus of the recombinant PHY-4.2 polypeptide is identical to that of PHY-4.1, it can be expected to have the same subcellular location.
To study the association of PHY-4.1 and PHY-4.2 with PDI, insect cells were infected with recombinant viruses coding for PHY-4.1 or PHY-4.2 with or without those coding for C. elegans PDI-1 or PDI-2. In control experiments, insect cells were infected with the virus coding for PDI-2 alone or coinfected with those coding for PHY-1, PHY-2, and PDI-2. Triton X-100-soluble extracts of the cell homogenates were analyzed by nondenaturing PAGE followed by Western blotting with antibodies against PHY-4.1, PDI-1, and PDI-2 (Fig. 3, B and C). When PHY-4.1 was coexpressed with PDI-2, two bands with mobilities intermediate between those of the PHY-1/PHY-2/(PDI-2)2 tetramer and the free PDI-2 polypeptide were stained by the PHY-4.1 and PDI-2 antibodies, indicating that these two polypeptides had become associated into soluble complexes (Fig. 3, B and C, lanes 3). In contrast, PHY-4.2 did not associate with PDI-2 (Fig. 3C, lane 5), and neither PHY-4.1 nor PHY-4.2 associated with PDI-1 (data not shown). To study the molecular composition of the soluble complexes formed by PHY-4.1 and PDI-2, Triton X-100 extracts from cells coexpressing PHY-4.1 and PDI-2 were applied to a calibrated Superdex-200 gel filtration column, and the fractions were analyzed by nondenaturing PAGE followed by Western blotting. The two bands stained by both PHY-4.1 and PDI-2 antibodies eluted in positions corresponding to molecular weights of ∼180,000 and 90,000 (data not shown), indicating that PHY-4.1 and PDI-2 had assembled into a (PHY-4.1)2/(PDI-2)2 tetramer and a PHY-4.1/PDI-2 dimer. The nondenaturing PAGE analysis indicated that the assembly of the PHY-4.1/PDI-2 dimer was much more efficient than that of the (PHY-4.1)2/(PDI-2)2 tetramer (Fig. 3, B and C, lanes 3).
Fig. 3 legend (partial): The cells were harvested and homogenized as above, and the Triton X-100-soluble fractions were analyzed by 8% nondenaturing PAGE followed by Western blotting using antibodies against PHY-4.1 (B) and PDI-2 (C). T indicates a tetramer and D a dimer.
P4H activity in the Triton X-100-soluble extracts from cells coexpressing PHY-4.1 and PDI-2 was analyzed using an assay based on the hydroxylation-coupled decarboxylation of 2-oxo-[1-14C]glutarate with a collagen-like (Pro-Pro-Gly)10 peptide as a substrate. Definite C-P4H activity was generated in these extracts but only when relatively high (Pro-Pro-Gly)10 concentrations were used (data not shown). However, even when (Pro-Pro-Gly)10 was used at a concentration of up to 4 mM, the amount of C-P4H activity generated in these samples was typically 10-30-fold lower than that in extracts from cells expressing the PHY-1/PHY-2/(PDI-2)2 tetramer with an 80 μM concentration of (Pro-Pro-Gly)10 as a substrate (data not shown). No C-P4H activity was generated in the samples from cells expressing PHY-4.1 alone or coexpressing PHY-4.1 and PDI-1, PHY-4.2 and PDI-2, or PHY-4.2 and PDI-1 (data not shown).

C. elegans P4H Tetramers and Dimers Containing PHY-4.1 as the Catalytic Subunit Have a Unique Substrate Specificity-Since the Triton X-100-soluble extracts from insect cells expressing the PHY-4.1/PDI-2 dimers and (PHY-4.1)2/(PDI-2)2 tetramers seemed to hydroxylate the collagen-like (Pro-Pro-Gly)10 peptide relatively poorly, we studied whether they would hydroxylate other proline-rich sequences (Table 1). Recombinant P4Hs from the eukaryotic algal virus Paramecium bursaria Chlorella virus-1 (PBCV-1), the monocellular green alga Chlamydomonas reinhardtii, and the plant Arabidopsis thaliana have been shown to hydroxylate various proline-rich peptides, including poly(L-proline), whereas they mostly hydroxylate collagen-like peptides only very inefficiently (50-53). In contrast, the vertebrate and C. elegans C-P4Hs characterized so far have a strict specificity for -X-Pro-Gly- sequences, and they do not hydroxylate poly(L-proline), which, in the case of the vertebrate C-P4H-I, acts as a powerful competitive inhibitor with respect to the collagen substrate (10, 14-16). The peptides were studied at 200-400 µM concentrations as substrates for soluble extracts of insect cells coexpressing C. elegans PHY-4.1 and PDI-2, whereas extracts of insect cells expressing PDI-2 alone served as a negative control. About double the amount of P4H activity was found in the extracts of cells coexpressing PHY-4.1 and PDI-2 when the peptide (Pro-Glu-Pro-Pro-Ala)5 was used as a substrate instead of (Pro-Pro-Gly)10 (Table 1). Proteins containing Pro-Glu-Pro-Pro-Ala repeats are encoded in the PBCV-1 genome, and the (Pro-Glu-Pro-Pro-Ala)5 peptide is efficiently hydroxylated by the PBCV-1 P4H and the A. thaliana P4H isoenzyme 1 (50, 51). Extracts of cells coexpressing PHY-4.1 and PDI-2 also hydroxylated poly(L-proline), but the amount of P4H activity generated was 4-6-fold lower than that obtained with (Pro-Pro-Gly)10 (Table 1). We also tested various other synthetic peptides representing collagen-like sequences (i.e. (Pro-Ala-Gly)5 and (Ala-Pro-Gly)5

a The values are means ± S.D. from 5-9 independent experiments. Triton X-100-soluble extracts from insect cells expressing C. elegans PDI-2 alone were used as controls and gave <40 dpm/100 µg of extract protein with all of the peptides tested.
b Sequence encoded by the PBCV-1 genome.

Km values were determined for those peptides that were found to serve as substrates for the soluble extracts that contained a mixture of the (PHY-4.1)2/(PDI-2)2 tetramers and PHY-4.1/PDI-2 dimers, as described above (Table 2). The Km for (Pro-Pro-Gly)10 was >2500 µM, more than 100-fold higher than that of the PHY-1/PHY-2/(PDI-2)2 tetramer but similar to that of the PBCV-1 P4H (Table 2). The Km for (Pro-Glu-Pro-Pro-Ala)5, 250 µM, was at least 10-fold lower than that for (Pro-Pro-Gly)10 and 4-fold lower than the Km of PBCV-1 P4H for the same peptide (Table 2). The Km values for the Mr 5000-10,000 and 30,000-40,000 poly(L-proline) peptides were 220 and 110 µM, respectively, about 2-fold lower than those of the PBCV-1 P4H (Table 2).
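To illustrate what these Km values imply at the peptide concentrations used in the activity assays, the following minimal Python sketch computes the fraction of maximal velocity expected at a given substrate concentration. It assumes simple Michaelis-Menten kinetics with only the peptide concentration varied, which is an assumption made here for illustration. The function name and the use of 2500 µM as a lower bound for (Pro-Pro-Gly)10 are likewise illustrative choices and not part of the reported analysis.

```python
def fraction_of_vmax(substrate_um, km_um):
    """Michaelis-Menten saturation: v/Vmax = [S] / (Km + [S])."""
    if substrate_um < 0 or km_um <= 0:
        raise ValueError("concentration must be >= 0 and Km must be > 0")
    return substrate_um / (km_um + substrate_um)


# Km values (in micromolar) quoted in the text. ">2500" is entered as its
# lower bound, so the saturation computed for (Pro-Pro-Gly)10 is an upper limit.
KM_UM = {
    "(Pro-Pro-Gly)10 (Km lower bound)": 2500.0,
    "(Pro-Glu-Pro-Pro-Ala)5": 250.0,
    "poly(L-proline), Mr 5000-10,000": 220.0,
    "poly(L-proline), Mr 30,000-40,000": 110.0,
}

if __name__ == "__main__":
    # Assay concentrations of 200 and 400 micromolar, as stated in the text.
    for peptide, km in KM_UM.items():
        low = fraction_of_vmax(200.0, km)
        high = fraction_of_vmax(400.0, km)
        print(f"{peptide}: {low:.2f}-{high:.2f} of Vmax at 200-400 uM")
```

Under these assumptions, (Pro-Pro-Gly)10 would reach at most about 14% of its maximal velocity at 400 µM, whereas (Pro-Glu-Pro-Pro-Ala)5 would exceed 60%, so activities measured at a fixed concentration reflect both Km and maximal velocity.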

Km Values of C. elegans P4H Tetramers and Dimers Containing PHY-4.1 as the Catalytic Subunit for Cosubstrates-Km values were determined for the P4H reaction cosubstrates Fe2+, 2-oxoglutarate, and ascorbate (Table 3). The Km for Fe2+ was 10 µM, 5-fold higher than the values of the C. elegans PHY-1/PHY-2/(PDI-2)2 tetramer and the human C-P4H-I and one-third of that of the C. reinhardtii P4H-1 (Table 3). The Km value for 2-oxoglutarate, 30 µM, was about 2.5-fold lower than that reported for C. elegans PHY-1/PHY-2/(PDI-2)2 and similar to that of human C-P4H-I, whereas the C. reinhardtii P4H-1 had an approximately 8-fold higher value (Table 3). The Km for ascorbate was unusually low, 40 µM, about 8-fold lower than those of the C. elegans PHY-1/PHY-2/(PDI-2)2 and the human C-P4H-I and similar to that of the C. reinhardtii P4H-1 (Table 3).
Spatial Expression of phy-4.1 Is Restricted to the Pharyngeal Gland Cells and Excretory Duct-A DNA fragment spanning the region from -953 to +6 relative to the translation initiation codon of phy-4.1 was ligated in frame to a lacZ reporter gene, and the construct was microinjected into the germ line with a marker plasmid containing a rol-6(su1006) gene. Three independent lines were generated, and a large number of individual nematodes from each line representing all of the different life cycle stages were examined by staining for β-galactosidase activity. Expression of phy-4.1::lacZ was detected in the nuclei of three pharyngeal gland cells, g1AL, g1AR, and g1P, in all of the larval stages, adults, and late embryos (54) (Fig. 4A). These gland cells occupy the space in the terminal bulb of the pharynx, and the g1 cells extend three cuticle-lined ducts anteriorly from the terminal bulb (54). Two of these ducts pass through the isthmus before emptying into the pharyngeal lumen near the anterior bulb, and one extends much further and empties near the anterior limit of the pharynx (54).
Expression of the PHY-4.1 polypeptide in the pharynx was confirmed by immunofluorescence staining with a purified polyclonal antibody against PHY-4.1 (Fig. 4B). Staining was seen in the pharynx lumen and at the boundary between the pharynx and the gut (Fig. 4B). In addition, staining for the PHY-4.1 polypeptide was found in the excretory duct of the excretory system (Fig. 4B). This cuticle-lined tube lies next to the terminal bulb of the pharynx and is involved in the secretion of metabolites from the excretory system (55).

C. elegans P4H Tetramers and Dimers Containing PHY-4.1 as the Catalytic Subunit Hydroxylate Proline-rich Sequences Expressed in the Pharynx-Based on the restricted expression pattern of the PHY-4.1 polypeptide, we searched for potential new substrates for the C. elegans P4H tetramers and dimers that contain PHY-4.1 as the catalytic subunit by screening the C. elegans database (available on the World Wide Web) for proline-rich sequences that were expressed in the pharynx or excretory duct. The search identified two genes, D1007.7 and frl-1 (Y48G9A.4), that are expressed in various regions of the nematode, including the pharynx or pharyngeal gland cells, and code for polypeptides with proline-rich regions. Based on the amino acid sequences, two synthetic peptides, Pro-Ile-Pro7-Leu-Pro-Gln-Asn-Leu-Ser-Gly-Ala-Pro8 and Ala-Pro4-Gly-Ile-Pro-Gly-Tyr-Pro2-Ala-Pro4-Gly-Val-Gly-Pro4-Gln-Gly, representing the polypeptides coded by the frl-1 and D1007.7 genes, respectively, were ordered and tested as substrates for soluble extracts of insect cells coexpressing C. elegans PHY-4.1 and PDI-2. Both peptides were hydroxylated very efficiently at 200 µM concentrations, the activities being 4-6 times that obtained with the control peptide (Pro-Glu-Pro-Pro-Ala)5 at 400 µM concentration (Table 4). The higher relative activities obtained were not due to lower Km values for the Pro-Ile-Pro7-Leu-Pro-Gln-Asn-Leu-Ser-Gly-Ala-Pro8 and Ala-Pro4-Gly-Ile-Pro-Gly-Tyr-Pro2-Ala-Pro4-Gly-Val-Gly-Pro4-Gln-Gly peptides, since the values were 670 and 340 µM, respectively, and thus about 3- and 1.5-fold that for the (Pro-Glu-Pro-Pro-Ala)5 peptide. The C. elegans P4H tetramers and dimers that contain PHY-4.1 as the catalytic subunit thus hydroxylate the novel substrates with a much higher maximal reaction velocity than the (Pro-Glu-Pro-Pro-Ala)5 peptide despite their lower affinity.
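The inference about maximal velocity can be checked with a short calculation. The Python sketch below assumes simple Michaelis-Menten kinetics and assumes that the 4-6-fold activity range may apply to either peptide, since the per-peptide activities of Table 4 are not restated in the text. The function name and labels are hypothetical and serve only this illustration.

```python
def vmax_ratio(activity_ratio, s_new_um, km_new_um, s_ref_um, km_ref_um):
    """Vmax(new)/Vmax(ref) implied by an observed activity ratio v_new/v_ref.

    From v = Vmax*[S]/(Km + [S]) it follows that Vmax = v*(Km + [S])/[S],
    evaluated separately for the new peptide and the reference peptide.
    """
    scale_new = (km_new_um + s_new_um) / s_new_um
    scale_ref = (km_ref_um + s_ref_um) / s_ref_um
    return activity_ratio * scale_new / scale_ref


# Reference peptide (Pro-Glu-Pro-Pro-Ala)5: assayed at 400 uM, Km 250 uM.
REF = {"s_ref_um": 400.0, "km_ref_um": 250.0}
# Km values (uM) quoted in the text for the two pharynx-derived peptides.
PEPTIDE_KM_UM = {"frl-1 peptide": 670.0, "D1007.7 peptide": 340.0}

if __name__ == "__main__":
    for name, km in PEPTIDE_KM_UM.items():
        # Both peptides were assayed at 200 uM; bracket the 4-6-fold range.
        low = vmax_ratio(4.0, 200.0, km, **REF)
        high = vmax_ratio(6.0, 200.0, km, **REF)
        print(f"{name}: Vmax about {low:.1f}-{high:.1f} times the control")
```

Under these assumptions the implied maximal velocities are roughly 7-16 times that of the control peptide, which agrees with the conclusion that the higher activities arise from a higher maximal velocity and not from a higher affinity.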
Analysis of the Effect of Heat Shock on the Expression Level of phy-4.1-It had seemed during the analysis of the phy-4 transcripts that phy-4.1 cDNA was more abundant in the cDNA pools generated from the heat-shocked nematodes than from those maintained at the normal culture temperature. To study the effect of heat shock on the expression level of the phy-4 gene, a microarray hybridization experiment was performed on an Affymetrix C. elegans genome array, which contains ~22,500 C. elegans genes, with probes isolated from N2 nematodes cultured at normal temperature or subjected to heat shock. Previous studies have identified 28 and 32 genes of 11,917 and 17,661 studied, respectively, to be up-regulated by about 2-110-fold by heat shock, several of these coding for members of the heat shock protein (HSP) families 70 and 16 (56, 57). In accordance with the previous data (56, 57), the genes C12C8.1, F44E5.4, and T27E4.8 coding for the HSP-70 and HSP-16 proteins had the highest induction levels, 24-120-fold in our analysis (data not shown). The expression level of the Y43F8B.4 gene coding for PHY-4.1 was not markedly induced by heat shock (1.13-fold) (data not shown). Likewise, the expression levels of the genes coding for PHY-1, PHY-2, PHY-3, and PDI-2 did not change significantly upon heat shock treatment.
phy-4.1 RNAi Does Not Lead to Any Obvious Phenotypic Abnormalities-RNAi of the phy-4.1 transcript was performed by injection and feeding, using an RNAi construct that also targets the phy-4.2 transcript, in wild-type, RNAi-sensitive NL2099, dpy-18(e364) (also known as phy-1), phy-2(ok177), and phy-3(ok199) backgrounds. RNAi of phy-4.1 did not lead to any obvious defects in the overall morphology, fertility, or behavior of the wild-type and RNAi-sensitive strains. No additional morphological effects were observed for the phy-4.1 RNAi in the dpy-18(e364), phy-2(ok177), and phy-3(ok199) mutant backgrounds; the progeny remained dumpy in the dpy-18(e364) strain, because of its phy-1 mutation, and wild type in the other two mutant strains.
The ultrastructure of the pharynx of phy-4.1 RNAi and wild-type L2 nematodes was analyzed by electron microscopy. No obvious structural differences in the pharynx area or in the three g1 gland cells were seen in the phy-4.1 RNAi and wild-type nematodes (data not shown). The phy-4.1 RNAi also did not affect the pharyngeal pumping rate of L1, L2, L3, L4, and adult nematodes (data not shown).

DISCUSSION
The C. elegans genome contains over 170 cuticle collagen genes and three basement membrane collagen genes (3). The synthesis of collagens with stable triple helices requires the hydroxylation of proline residues in the -X-Pro-Gly- sequences of their polypeptide chains. This hydroxylation is catalyzed by C-P4Hs, enzymes consisting of catalytic α subunits and β subunits that are identical to PDI. Three C. elegans α subunit isoforms, PHY-1, PHY-2, and PHY-3, and their genes have been characterized previously (32-37). In the present study, we show that the C. elegans genome contains one additional gene, Y43F8B.4, that codes for a functional P4H α subunit, named PHY-4.1. The fifth C. elegans gene, C14E2.4, that has been predicted to code for an α subunit-like polypeptide, turned out not to be a true α subunit gene, since the reading frame required for the four catalytically critical residues was disrupted.
The molecular composition of the vertebrate C-P4Hs (14-16) and the single D. melanogaster C-P4H characterized in detail so far (30) is an α2β2 tetramer. Coexpression experiments with recombinant human α(I), α(II), and PDI/β subunits in insect cells have indicated that mixed tetramers containing two different α subunit isoforms are not formed in vertebrates (15). The assembly properties of nematode PHY polypeptides have turned out to be much more versatile, however. The C. elegans PHY-1 and PHY-2 polypeptides assemble into mixed tetramers and dimers with PDI-2 (36). Additional assembly properties were uncovered here with the recombinant C. elegans PHY-4.1 polypeptide, which formed both (PHY-4.1)2/(PDI-2)2 tetramers and PHY-4.1/PDI-2 dimers. Probably an extreme end of the range of nematode P4H assembly versatility is represented by the PHY-1 polypeptide of the filarial nematode Brugia malayi, which does not require PDI as the β subunit but assembles into an active (PHY-1)4 homotetramer (58).
The substrate specificity of the two C. elegans P4H forms that contain PHY-4.1 as the catalytic subunit indicates that they form a separate class of P4Hs in C. elegans that is distinct from the C-P4Hs and HIF-P4Hs. The P4H tetramers and dimers containing PHY-4.1 are capable of hydroxylating other proline-rich sequences, including poly(L-proline), in addition to collagen-like sequences. Since (Pro-Pro-Gly)10 was hydroxylated less effectively than (Pro-Glu-Pro-Pro-Ala)5 by the (PHY-4.1)2/(PDI-2)2 tetramer and PHY-4.1/PDI-2 dimer, it is highly likely that their main in vivo substrates are not collagens but some other proline-rich proteins. Two putative new substrates are the proline-rich sequences encoded by the C. elegans D1007.7 and frl-1 genes. These genes are expressed in many organs of the nematode, including the pharynx or pharyngeal gland cells (59). D1007.7 codes for a putative RNA-binding protein, whereas frl-1 encodes a homologue of human leukocyte formins (available on the World Wide Web). As the functions of these genes are as yet unknown, it remains to be established whether the polypeptides they encode are hydroxylated in vivo.
The spatial expression of PHY-4.1 differed distinctly from those of PHY-1, PHY-2, and PHY-3. Expression of PHY-4.1 was seen in the pharyngeal glands, the lumen of the pharynx, and the excretory duct, whereas PHY-1 and PHY-2 are expressed in the hypodermal cells that synthesize cuticle collagens (35, 36), and PHY-3 is expressed in the spermatheca (37). The distinct expression pattern of PHY-4.1 and its unusual substrate specificity suggest that it is not involved in the hydroxylation of cuticle collagens or those present in egg shells. This premise is supported by RNAi of phy-4.1, which did not produce any synthetic morphological effects over those observed in the phy-1, phy-2, and phy-3 mutant nematodes. Interestingly, it has been shown recently that two D. melanogaster C-P4H α subunit-like polypeptides, SG1 and SG2, have restricted expression in the salivary gland and are necessary for the maintenance of an open and uniformly sized salivary gland lumen with proper secretory potential (29, 31). Since the D. melanogaster collagen genes are not expressed in the salivary gland, it was suggested that P4Hs containing SG1 and SG2 as their catalytic subunits hydroxylate proline residues in other secreted or transmembrane proteins (31). The hydroxylation properties of the P4H tetramers and dimers containing PHY-4.1 and the specific localization of PHY-4.1 suggest that it may have a similar role in C. elegans.
Our RNAi experiments showed, however, that phy-4.1 is not essential for the proper structure and function of the pharynx.
Our current results complete the characterization of C. elegans genes encoding catalytic subunits of C-P4Hs and further emphasize specialized functions of individual family members. Surprisingly, inactivation of any of these genes either singly or in combination does not result in the body wall muscle defects that are characteristic of the abnormal function of C. elegans basement membranes (60,61). In contrast, inactivation of the let-268 locus coding for the single C. elegans lysyl hydroxylase results in the absence of the type IV collagen homologue from the basement membranes and its retention within the synthesizing cells (61). The type IV collagen homologue is required for mechanical support of the body wall muscles as the body wall muscle cells become separated from the underlying hypodermis upon muscle contraction, and therefore its absence in the let-268 mutant embryos leads to arrest at the 2-fold stage, shortly after elongation commences (61). The current and previous data imply that the hydroxylation of lysines in the C. elegans type IV collagen homologue may be more critical for its proper function than the hydroxylation of proline residues.