Egg shell collagen formation in Caenorhabditis elegans involves a novel prolyl 4-hydroxylase expressed in spermatheca and embryos and possessing many unique properties.

The collagen prolyl 4-hydroxylases (EC ) play a critical role in the synthesis of all collagens. The enzymes from all vertebrate species studied are alpha(2)beta(2) tetramers, in which the beta subunit is identical to protein disulfide isomerase (PDI). Two isoforms of the catalytic alpha subunit, PHY-1 and PHY-2, have previously been characterized from Caenorhabditis elegans. We report here on the cloning and characterization of a third C. elegans alpha subunit isoform, PHY-3. It is much shorter than the previously characterized vertebrate and C. elegans alpha subunits and shows 23-30% amino acid sequence identity to PHY-1 and PHY-2 within the catalytic C-terminal region. Recombinant PHY-3 coexpressed in insect cells with a C. elegans PDI isoform that does not associate with PHY-1 was found to be an active prolyl 4-hydroxylase. The phy-3 gene consists of five exons, and its expression pattern differs distinctly from the hypodermally expressed phy-1 and phy-2 in that it is expressed in embryos, late larval stages, and adult nematodes, expression in the latter being restricted to the spermatheca. Nematodes homozygous for a phy-3 deletion are phenotypically of the wild type and fertile, but the 4-hydroxyproline content of phy-3(-/-) early embryos was reduced by about 90%. PHY-3 is thus likely to be involved in the synthesis of collagens in early embryos, probably of those in the egg shell.

The prolyl 4-hydroxylases (P4Hs),¹ enzymes residing within the lumen of the endoplasmic reticulum, catalyze the formation of 4-hydroxyproline in collagens and more than 15 other proteins (1) by the hydroxylation of proline in X-Pro-Gly repeats (2,3). P4Hs have a central role in the synthesis of all collagens, as 4-hydroxyproline residues are essential for the folding of the collagen triple helix. In addition, a family of three cytoplasmic P4Hs plays a critical role in the regulation of the hypoxia-inducible transcription factor HIFα (4,5).
The P4Hs require Fe²⁺, 2-oxoglutarate, O₂, and ascorbate (2,3). The vertebrate enzymes, and also a P4H from Drosophila melanogaster (6), are α₂β₂ tetramers in which the β subunits are identical to protein disulfide isomerase (PDI) (2,3). At least two isoforms of the catalytic α subunit are found in human (7) and mouse (8) tissues, the [α(I)]₂β₂ tetramer being the main form in many cell types, whereas [α(II)]₂β₂ is found especially in chondrocytes and vascular endothelial cells (9,10). The properties of the two isoenzymes are very similar, but differences are found between them in the binding of peptide substrates and peptide inhibitors (7,11).
Our sequence homology search of the C. elegans genome (17) indicated the presence of three additional, less well conserved P4H α subunit-like genes. We report here on the cloning and characterization of one of these, termed phy-3. Nematodes carrying a homozygous deletion of the phy-3 gene were morphologically normal at both the adult and larval stages, but their early embryos had egg shells with a markedly reduced 4-hydroxyproline content. Our data thus indicate that PHY-3 is involved in the hydroxylation of proline residues in the early embryo, most probably in the collagens of the egg shell (18).
The full-length phy-3 cDNA was cloned by amplifying two fragments from the C. elegans mixed-stage cDNA library UNIZAP 937006 (Stratagene). The first fragment, corresponding to the T20B3.7 open reading frame (Fig. 2C, PCR1), was amplified with the primers 5′-ATTCCAGATTCCATCTGCATCACCTACG-3′ and 5′-AACATTCAAATTGTTCTACAAATATCAATTGGG-3′, while the second, corresponding to a sequence beginning from the start codon in the upstream 208-bp exon and extending to a BamHI site in the fourth exon of T20B3.7 (Fig. 2C, PCR2), was amplified with 5′-GAAAACCATCATGATTTCTGTCACTTTCCG-3′ and 5′-ATTGTAGGCACGGATCCAAATAGTCGCGATC-3′. The fragments were cloned into the SmaI site of pUC18, fragment 1 was digested with BamHI, and fragment 2 with BamHI and EcoRI, and coligated into pBluescript (Stratagene) to generate pBS-phy-3. The DNA sequences were determined using an automated sequencer (ABI Prism 377, Applied Biosystems).
Semiquantitative Reverse Transcriptase PCR-Reverse transcriptase PCR was performed at different developmental stages of C. elegans (21). The abundance of the phy-3 transcript was measured in relation to the constitutively expressed ama-1 gene (21), PCR products corresponding to phy-3 and ama-1 being amplified simultaneously. The phy-3 cDNA was amplified with primers 5′-GGCACCGTATTCCCAAGTATTGGTTCCAC-3′ and 5′-GGGCACTGGCGACTGTGTCTGCATTTCC-3′, while ama-1 was amplified with the primers described previously (21). The PCR products were electrophoresed on 2% agarose gels, Southern blotted, and hybridized under stringent conditions with ³²P-labeled PCR products corresponding to the phy-3 and ama-1 genes. The hybridized bands were excised and counted by scintillation.
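The normalization of the phy-3 signal to the ama-1 signal can be illustrated with a minimal Python sketch. The function name and all count values below are hypothetical placeholders chosen for illustration; they are not data from this study, and the sketch assumes that the scintillation counts of the two bands from the same reaction are simply divided.

```python
def relative_abundance(target_counts, reference_counts):
    """Return the target/reference count ratio for each developmental stage.

    Both arguments map a stage label to scintillation counts of the
    excised band (hypothetical values; background is assumed to be
    already subtracted).
    """
    ratios = {}
    for stage, target in target_counts.items():
        reference = reference_counts[stage]
        if reference <= 0:
            raise ValueError(f"no reference signal for stage {stage!r}")
        ratios[stage] = target / reference
    return ratios


# Hypothetical counts, for illustration only.
phy3_counts = {"L1": 0.0, "L4": 500.0, "adult": 1200.0}
ama1_counts = {"L1": 1000.0, "L4": 1000.0, "adult": 800.0}

ratios = relative_abundance(phy3_counts, ama1_counts)
for stage, ratio in ratios.items():
    print(f"{stage}: phy-3/ama-1 = {ratio:.2f}")
```

Dividing by the reference signal from the same reaction corrects for differences in the amount of input cDNA between samples, which is why the two products are amplified simultaneously.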
Generation of a Reporter Gene Construct and Germ Line Transformation-A DNA fragment extending from −1480 to +6 relative to the ATG codon of phy-3 was amplified from the genomic clone T20B3 (nucleotides 20,101-21,586, GenBank™ accession number Z81593) with the primers 5′-GCGCTCTAGAACTGAACAAGTGAACATTCATCAT-3′ and 5′-GCGCTCTAGAGAAATCATGATGGTTTTCTAAAAATAAATATG-3′. The fragment was digested with XbaI and ligated into the promoterless nucleus-localized vector pPD96.04 (obtained from A. Fire, J. Ahnn, G. Seydoux, and S.-Q. Xu), allowing in-frame fusion of the phy-3 promoter sequence and GFP/lacZ reporter genes.
Transgenic strains were generated by microinjecting the reporter plasmid (20 μg/ml) together with the marker plasmid pRF-4 rol-6(su1006) (22) (100 μg/ml) into the syncytial gonad of wild-type nematodes (23). Two independent lines were maintained and examined for reporter gene expression as described (24), except that slides were stained at 22°C with a solution containing 0.03% X-gal or at 37°C with a solution containing 0.3% X-gal (sensitive staining). The nematodes were viewed and photographed under Nomarski optics using a Zeiss Axioskop 2 microscope.
Immunofluorescence Staining-Wild-type and phy-3(-/-) nematodes were washed from the plates with ice-cold PBS. The washing step was repeated 5-10 times, and the animals were pipetted onto poly-L-lysine-coated slides, permeabilized by freeze-cracking, and fixed as described (25). The slides were incubated with the polyclonal antibody 2681, generated in rabbits against the synthetic PHY-3 peptide CPSLSNRFRPEMQTQSPVPN (Sigma Genosys), followed by Alexa Fluor 546-conjugated goat anti-rabbit antibody (Molecular Probes), washed with PBS, and examined under an epifluorescence microscope.
Isolation of phy-3(-/-) Deletion Mutants-Deletion mutants of phy-3 were generated by Dr. Gary Moulder at Oklahoma Medical Research Foundation as part of the C. elegans knock-out consortium, using treatment with trimethylpsoralen and UV light. The progeny of the mutagenized animals were cultured, and genomic DNA was isolated from several populations and screened by nested PCR to identify a population carrying deletions in phy-3. The primer pairs used corresponded to bp 24,818-24,837 and 27,719-27,700 in T20B3 for the first round of PCR and to bp 24,870-24,889 and 27,590-27,571 for the second round. Populations carrying a deletion were subdivided until homozygotes were obtained, and these were outcrossed to wild-type N2 nematodes six times to purify the genetic background. Outcrossed phy-3(-/-) animals carrying the homozygous deletion (T20B3.7 ok199 TP7) were examined by microscopy. The end points of the deletion were determined by sequencing PCR products spanning the deleted region.
Amino Acid Analysis of the Early Embryos and Whole Nematodes-Synchronous cultures of wild-type and phy-3(-/-) L1 larvae were prepared by starving, then inoculated onto 90-mm NGM agar plates containing a lawn of Escherichia coli OP50 (19) and grown on several plates to adulthood. They were washed from the plates with ice-cold M9 buffer and allowed to settle on ice for 4 min. The washing step was repeated 5-10 times, and an aliquot of the pellet was analyzed microscopically to ensure that no developing embryos were present. The adult nematodes were resuspended in M9 buffer, bleached with an equal volume of bleach solution (2 volumes of 4 M NaOH and 3 volumes of 0.3% NaOCl), and incubated at 37°C for 10 min with agitation. The resulting worm debris was passed through a narrow gauge needle 40-50 times, washed with a large volume of distilled water, and centrifuged at 3500 rpm for 5 min at 4°C, this step being repeated three to six times. An aliquot of the egg pellet obtained was analyzed by light microscopy to confirm that the embryos were at the early developmental stage. This pellet was resuspended in 400 μl of ST buffer (0.125 mM Tris, pH 6.8, 1% SDS), heated to 100°C for 2 min, and incubated at 22°C overnight. The sample was then centrifuged and the resulting supernatant hydrolyzed. The pellet was resuspended in 400 μl of ST buffer containing 5% β-mercaptoethanol, the above procedure was repeated, and the supernatant was hydrolyzed. SDS was removed from the protein solution by potassium salts (26).
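The working concentrations implied by the bleach recipe above can be checked with a short calculation. The sketch below is illustrative only: it assumes that volumes are additive and that the nematode suspension in M9 buffer contributes neither NaOH nor NaOCl. The function and variable names are hypothetical.

```python
def diluted(stock_concentration, solute_volumes, total_volumes):
    """Concentration after a stock is diluted into a larger total volume."""
    return stock_concentration * solute_volumes / total_volumes


# Bleach solution: 2 volumes of 4 M NaOH plus 3 volumes of 0.3% NaOCl.
naoh_in_bleach_molar = diluted(4.0, 2, 5)      # about 1.6 M
naocl_in_bleach_percent = diluted(0.3, 3, 5)   # about 0.18%

# The bleach solution is then mixed with an equal volume of worm suspension.
naoh_final_molar = diluted(naoh_in_bleach_molar, 1, 2)        # about 0.8 M
naocl_final_percent = diluted(naocl_in_bleach_percent, 1, 2)  # about 0.09%

print(f"NaOH during bleaching:  {naoh_final_molar:.2f} M")
print(f"NaOCl during bleaching: {naocl_final_percent:.2f} %")
```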
Large populations of mixed-stage wild-type and phy-3(-/-) nematodes were washed in M9 buffer, treated with the bleach solution (above) for 5 min at 37°C, and washed several times with distilled water. The nematodes were then either sonicated in distilled water on ice, after which the sonicate was hydrolyzed, or passed several times through a narrow gauge needle and centrifuged, after which the pellet was resuspended in ST buffer containing 5% β-mercaptoethanol, boiled for 2 min, incubated at room temperature for 10 h, and hydrolyzed.
All samples were hydrolyzed in 6 M HCl for 24 h at 110°C. Amino acid analyses were performed in an Applied Biosystems 421 amino acid analyzer.
Sf9 cells were cultured as monolayers (27) and infected with viruses coding for PHY-3, PHY-1, or the human α(I) subunit alone or together with viruses coding for C. elegans PDI-1, PDI-2, or human PDI. The recombinant proteins were analyzed by 12% SDS-PAGE (27) and assayed for P4H activity by a method based on the decarboxylation of 2-oxo-[1-¹⁴C]glutarate (29). N-Glycosidase F treatment was performed according to the instructions of the manufacturer (Roche Molecular Biochemicals), and Western analysis was performed with the antibody 2681.

Amino Acid Sequence of PHY-3 and Its Comparison with Those of Other P4H α Subunits-A sequence homology search indicated that the C. elegans genome contains an open reading frame T20B3.7 consisting of four exons that encodes a 239-amino acid polypeptide showing sequence similarity to the conserved C-terminal region of the vertebrate P4H α(I) and α(II) subunits (7,8,30) and C. elegans PHY-1 and PHY-2 (12,14,15) (Figs. 1 and 2B). However, this polypeptide did not contain a signal peptide, and analysis of the genome suggested the presence of an additional 208-bp exon 4151 bp upstream from T20B3.7, coding for 69 amino acids, including a signal peptide (Fig. 2A). A cDNA containing the exon sequences from the start codon in this upstream exon to an internal BamHI site in the last exon of T20B3.7 was obtained by PCR from a mixed-stage cDNA library (Fig. 2C, PCR2), and sequencing of this product showed that in addition to exon 1, the predicted T20B3.7 gene lacked 29 bp from the 5′ end of exon 2 (Fig. 2B). A full-length phy-3 cDNA was subsequently also obtained by PCR from mixed-stage cDNA. The cDNA encodes a 318-amino acid polypeptide (Fig. 1), the most likely cleavage site of the signal peptide being located between Ser²³ and Gln²⁴ (20). The processed PHY-3 is thus 295 amino acids long, much shorter than the vertebrate P4H α subunits and C. elegans PHY-1 and PHY-2, which range in size from 514 to 542 residues.
The sequence of the processed PHY-3 is 17% identical to residues 256-542 in PHY-1 (12) and 18-20% identical to the corresponding residues in PHY-2 (14,15) and the human α(I) and α(II) subunits (8,30) (Fig. 1). The sequence conservation is highest within the C-terminal regions, the PHY-3 amino acids 150-295 being 23-30% identical to the corresponding residues of PHY-1 and PHY-2 and human α(I) and α(II). The two histidines and one aspartate that bind the Fe²⁺ atom and the lysine that binds the C5 carboxyl group of 2-oxoglutarate (27) are all conserved (marked by * in Fig. 1).
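Percentage identity values of this kind are obtained by counting identical residues over the aligned positions of two sequences. The following Python sketch shows one common way of doing this; it is not necessarily the procedure used here, since the text does not state how gaps were treated. The sketch assumes that gapped columns are excluded from the denominator, and the two short aligned sequences are invented for illustration and are not PHY sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for residue_a, residue_b in zip(aligned_a, aligned_b):
        if residue_a == gap or residue_b == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if residue_a == residue_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Invented toy alignment: 7 ungapped columns, 6 of them identical.
toy_identity = percent_identity("HKDG-AWLT", "HRDGSAW-T")
print(f"{toy_identity:.1f}% identity")
```

Other conventions, such as dividing by the length of the shorter sequence or by the full alignment length, give somewhat different values for the same alignment, so the convention should be stated when identities are compared.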
Physical Structure of the phy-3 Gene-The sequence of phy-3 is found on cosmid T20B3 (GenBank™ accession number Z81593), which maps to chromosome V (17). The nearest genes are T20B3.2 and T20B3.13 on the opposite strand of the phy-3 coding region. As the nearest genes located on the same strand as phy-3 are at a distance of several kilobases, it seems unlikely that phy-3 belongs to a gene cluster or operon. The phy-3 coding sequence is organized into five exons (Fig. 2) (17).
Temporal Expression of phy-3 Is Restricted to Embryos and Late Larval and Adult Stages-Semiquantitative reverse transcriptase PCR was performed with mRNA samples extracted from highly synchronous post-embryonic C. elegans cultures (21). The relative abundance of the phy-3 mRNA was determined with respect to a constitutively expressed gene ama-1, which codes for the large subunit of RNA polymerase II (21). Expression of phy-3 transcripts in these samples was detected only in the late larval stages (L3 and L4) and in the adult nematodes (Fig. 3). Additional PCR experiments with a cDNA pool prepared from embryos gave a strong signal, indicating the presence of the phy-3 mRNA also in embryos (details not shown).

Spatial Expression of phy-3 in Late Larvae and Adult Nematodes Is Exclusive to the Spermatheca-A putative promoter fragment, 1480 bp upstream of the translation initiation codon, was ligated in-frame to a lacZ reporter gene. The construct, phy-3::lacZ, was microinjected into the germ line with a marker plasmid containing a rol-6(su1006) gene (22,23). A large number of nematodes from two independent lines, selected on the basis of their roller phenotype, were stained for β-galactosidase activity.
Expression of phy-3::lacZ was consistently detected in the spermatheca of L4 larvae and adult nematodes (Fig. 4, A and B), this specialized region of the gonad being the site of oocyte fertilization (18). Some additional staining of gonadal cell nuclei was observed when a sensitive staining method was used (Fig. 4A), whereas no β-galactosidase expression was observed in the hypodermal cells. Expression of the PHY-3 polypeptide in the spermatheca was confirmed by immunofluorescence staining with a polyclonal antibody against PHY-3 (Fig. 4, C and D).
Homozygous Deletion of phy-3 Leads to a Marked Reduction in the 4-Hydroxyproline Content-Homozygous deletion mutants in the phy-3 locus were backcrossed six times to remove unrelated mutations. This mutant, phy-3(-/-) ok199, contained a 1241-bp deletion that corresponds to position 25,891-27,132 in T20B3 (Fig. 2D) and removes exons 2, 3, and 4 and part of exon 5. The homozygous nematodes were phenotypically of the wild type, and no defects were found in their gross morphology, fertility, or behavior. As phy-3 was expressed exclusively in embryos and in the spermatheca of L4 larvae and adult nematodes, we carefully examined the early embryos. Those of the phy-3(-/-) strain were morphologically of the wild type when viewed with Nomarski optics (data not shown). The 4-hydroxyproline content of the phy-3(-/-) early embryos was reduced by about 90% (p < 0.0005) relative to their wild-type counterparts, and a small decrease (p < 0.05) was also seen in the proline content (Table I). In contrast, the 4-hydroxyproline content of the whole phy-3(-/-) nematodes was not decreased (data not shown).
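The deletion coordinates can be related to the nested PCR screen by a simple calculation of the expected product sizes. The sketch below is illustrative and rests on two assumptions that are not stated in the text: that each product extends from the 5′ end of the forward primer to the 5′ end of the reverse primer with inclusive T20B3 coordinates, and that the deletion removes exactly the stated 1241 bp. The function names are hypothetical.

```python
# Deletion end points and size as stated for ok199 (T20B3 coordinates).
DELETION_START, DELETION_END = 25_891, 27_132
DELETION_BP = 1241


def amplicon_length(forward_5prime, reverse_5prime):
    """Length of a PCR product spanning the two primer 5' ends, inclusive."""
    return reverse_5prime - forward_5prime + 1


def expected_products(forward_5prime, reverse_5prime):
    """Return (wild-type, deletion-allele) product lengths in bp."""
    wild_type = amplicon_length(forward_5prime, reverse_5prime)
    spans_deletion = (forward_5prime <= DELETION_START
                      and DELETION_END <= reverse_5prime)
    mutant = wild_type - DELETION_BP if spans_deletion else wild_type
    return wild_type, mutant


# Primer 5' ends taken from the primer coordinates given for the screen.
outer = expected_products(24_818, 27_719)  # first round
inner = expected_products(24_870, 27_590)  # second round

print("first round  (wild type, deletion):", outer)
print("second round (wild type, deletion):", inner)
```

Under these assumptions both primer pairs flank the deletion, so the deletion allele gives a product 1241 bp shorter than the wild-type product in each round, which is what allows populations carrying the deletion to be recognized by product size.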
Expression of Recombinant PHY-3 in Insect Cells-Recombinant PHY-3 was produced in insect cells, and the cell lysate was analyzed by 12% SDS-PAGE followed by Coomassie staining and Western blotting (Fig. 5). In agreement with data previously reported for PHY-1 (12) and the human P4H α subunits (7,28), the majority of PHY-3 formed insoluble aggregates, and its efficient extraction required 1% SDS (Fig. 5). To study whether any of the five potential N-glycosylation sites present in PHY-3 (Fig. 1) are utilized in insect cells, samples were digested with N-glycosidase F. Several forms of PHY-3 were seen in SDS-PAGE of the nondigested sample, whereas only one major band and two minor bands, probably representing degradation products, were present after the treatment (Fig. 5).
To study the association of PHY-3 with various PDI isoforms, insect cells were coinfected with viruses coding for PHY-3 and human PDI (28) or C. elegans PDI-1 or PDI-2 (13), and Triton X-100 extracts of cell homogenates were analyzed for P4H activity with an assay based on the hydroxylation-coupled decarboxylation of 2-oxo-[1-¹⁴C]glutarate (29). The recombinant PHY-3 yielded P4H activity only when coexpressed with C. elegans PDI-1 (Table II). This activity level was about 22-27% of that in extracts from cells expressing the C. elegans PHY-1/human PDI dimer or the human type I P4H (Table II). The amount of PHY-3 polypeptide in the soluble fraction was very small, however, even when coexpressed with PDI-1, and therefore the specific activity of PHY-3 may not be significantly lower than that of the C. elegans PHY-1/human PDI dimer or the human type I P4H tetramer. As reported previously (13), neither the C. elegans PHY-1 polypeptide nor the human α(I) subunit assembled into an active P4H when coexpressed with the C. elegans PDI-1 (Table II).
FIG. 3. Temporal expression pattern of phy-3. Semiquantitative reverse transcriptase PCR was used to study the ratio of phy-3 expression to that of the constitutive ama-1 gene (y axis). Values were obtained from mRNA samples isolated from synchronous C. elegans cultures sampled at hours indicated after L1 arrest (x axis). L1-L4, larval stages 1 to 4.
DISCUSSION
The C. elegans phy-3 gene was found to encode a P4H α subunit with many unique properties. The vast majority of the collagens in C. elegans are found in the cuticle and consist predominantly of short polypeptides coded by a gene family of more than 150 members (31). The previously characterized PHY-1 and PHY-2 are produced exclusively by hypodermal cells and are expressed at all developmental stages (14).
In contrast, expression of phy-3 was detected only in embryos and in the late larval and adult nematodes, and expression in the latter was restricted to the spermatheca, indicating that PHY-3 is not involved in the synthesis of cuticle collagens. No such highly restricted expression pattern has been reported for any other P4H α subunit from any species.
The basement membrane collagens in C. elegans include type IV, a heterotrimer of α1(IV) and α2(IV) chains, and type XVIII, an [α1(XVIII)]₃ homotrimer (32-34). The basement membranes are distributed around the major organs, especially the muscles and gonad, a distribution that includes the L4/adult spermatheca (33). The α1(IV) and α2(IV) chains are mainly synthesized in the body wall muscle cells, but in L4 and adult nematodes they are also synthesized in the spermatheca (33). Mutations in either the α1(IV) or the α2(IV) gene (emb-9 and let-2, respectively) are embryonic lethal (34). The other main collagen hydroxylase, lysyl hydroxylase, is encoded in C. elegans by a single gene (let-268) (35,36) that is likewise expressed in body wall muscle cells (36). Its expression coincides with that of type IV collagen, and its mutations lead to retention of type IV collagen within cells and result in embryonic lethality (36). Our data strongly suggest that PHY-3 is not involved in the synthesis of type IV collagen, as we did not observe any phy-3 expression in the body wall muscle cells, and as the temporal expression of this gene is very different from that of the type IV genes. Furthermore, if phy-3 were involved in type IV collagen synthesis, elimination of its function could be expected to lead to embryonic lethality. We cannot, however, exclude the possibility that PHY-3 expressed in the spermatheca may have contributed to proline hydroxylation of the type IV collagen synthesized in this organ.
Elimination of phy-1 expression led to a dumpy phenotype, whereas elimination of phy-2 action gave no phenotype, but the phy-1;phy-2 double mutant was embryonic lethal (14,15). The present data indicate that elimination of phy-3 function led to no obvious phenotypic abnormalities, but the 4-hydroxyproline content was markedly reduced in the early embryos, probably in the collagens of the egg shell (18,37). Interestingly, the lack of 4-hydroxyproline in the phy-3(-/-) early embryos had no detectable effect on the fertility and viability of the nematodes, at least not in the controlled, non-stressed laboratory environment.
The proline content of the phy-3(-/-) early embryos was not increased, but rather slightly decreased. As 4-hydroxyproline residues are essential for the stability of the collagen triple helix (2,3), it is probable that the markedly 4-hydroxyproline-deficient collagen chains synthesized in the phy-3(-/-) embryos either formed no triple-helical molecules at all or formed molecules with unstable triple helices. Both possibilities would lead to a rapid degradation of the 4-hydroxyproline-deficient protein, and thus there should be no accumulation of a protein with a corresponding increase in the proline content. The deficiency of collagen in the phy-3(-/-) embryos may also have caused structural changes in the egg shells that may have led to a secondary loss of some additional egg shell proteins either in vivo or during isolation of the eggs, and this may have contributed to the decrease in the proline content. Our findings differ from those for the cuticle collagens in the phy-1 and phy-2 mutants, in which the deficiency in 4-hydroxyproline led to a corresponding increase in the proline content (14,15). As the phy-1 and phy-2 mutations led only to a partial deficiency of 4-hydroxyproline, its remaining content was probably sufficient to stabilize the collagen triple helix to a considerable extent.
The size of the processed PHY-3 polypeptide, 295 amino acids, is markedly different from the more than 510 residues of PHY-1 and PHY-2 (12,14,15) and the vertebrate P4H α subunits (2,3). An even shorter α subunit, of 210 residues, has been characterized from the Paramecium bursaria Chlorella virus-1 (PBCV-1), however (38), and a 261-amino acid P4H α subunit has very recently been cloned and characterized from Arabidopsis thaliana.² No short collagen P4H α subunit form has so far been identified in vertebrates. PHY-1 and the vertebrate α subunits form an active P4H with PDI (7,8,12,28), whereas the viral (38) and A. thaliana² α subunits are catalytically active monomers. Coexpression of recombinant PHY-3 in insect cells with C. elegans PDI-1 produced a relatively small but distinct amount of P4H activity in all experiments (n > 10), whereas coexpression of PHY-1 with PDI-1 produces no activity (13). Due to the small amounts of the soluble PHY-3 polypeptide produced and the tendency of the solubilized protein to aggregate, it was not possible to determine whether PDI-1 formed a tetramer or dimer with PHY-3 or whether PHY-3, like the PBCV-1 and A. thaliana P4Hs, is an active monomer, with PDI-1 acting in the insect cell experiments only to assist in its correct folding. Currently we also do not know whether PDI-1 is required for the synthesis of an active PHY-3 enzyme in the nematode or whether this function is replaced by some other chaperone in vivo. It is interesting to note that the pdi-1 gene is coexpressed in an operon with a second class of protein-folding catalyst, namely the proline cis-trans-isomerase cyp-9, and it has been hypothesized that their gene products may co-operate in a common protein folding or chaperoning event (39).