Mammalian Gene PEG10 Expresses Two Reading Frames by High Efficiency –1 Frameshifting in Embryonic-associated Tissues

Paternally expressed gene 10 (PEG10) is a mammalian gene that is essential for embryonic development in mice. The gene contains two overlapping open reading frames (ORF1 and ORF2) and is derived from a retroelement that acquired a cellular function. It is not known if both reading frames are required for PEG10 function. Synthesis of ORF2 would be possible only if programmed –1 frameshifting occurred during ORF1 translation. In this study the frameshifting activity of PEG10 was analyzed in vivo, and a potential role for ORF2 was investigated. Phylogenetic analysis demonstrated that PEG10 is highly conserved in therian mammals, with all species retaining the elements necessary for frameshifting as well as functional motifs in each ORF. The frameshift site of PEG10 was highly active in cultured cells and produced the ORF1-2 protein. In mice, endogenous ORF1 and an ORF1-2 frameshift protein were detected in the developing placenta and amniotic membrane from 9.5 days post-coitus through to term with a very high frameshift efficiency (>60%). Mutagenesis of the active site motif of a putative protease within ORF2 showed that this enzyme is active and participates in post-translational processing of PEG10 ORF1-2. Both PEG10 proteins were also detected in first trimester human placenta. By contrast, neither protein expression nor frameshifting was detected in adult mouse tissues. These studies imply that the ORF1-2 protein, synthesized utilizing the most efficient –1 frameshift mechanism yet documented in vivo, will have an essential function that is intrinsic to the importance of PEG10 in mammals.

A fundamental property of translation is to maintain the reading frame of the mRNA. A rare exception is programmed frameshifting in which cis elements in the mRNA force the ribosome to slip one nucleotide forwards (+1) or backwards (–1) on the mRNA at a defined frequency (1). The mechanism was first described in bacteria as a +1 shift (2), and although extremely rare in cellular genes, it has been identified in all three kingdoms (3,4). In –1 frameshifting, two overlapping open reading frames (ORFs) can be translated from one mRNA to give a fusion protein, thereby adding sequence and functional domains from ORF2 to ORF1. This is common in retrotransposons and retroviruses, which utilize frameshifting to synthesize a precise ratio of structural Gag proteins to enzymatic Pol proteins (5,6). The mRNA elements facilitating backwards (–1) frameshifting include a heptanucleotide sequence, X XXY YYZ (X = U, G, or A; Y = U or A; Z = U, C, or A) in the 0 frame, and a nearby 3′ stimulatory pseudoknot or stem-loop. Translational pausing facilitated by these elements creates tension on the mRNA leading to tRNA deformation and release from the codons of the 0 frame followed by re-pairing to the sequence-compatible –1 frame (7).
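As an illustration of the consensus just described, the following minimal sketch tests whether a heptanucleotide fits the X XXY YYZ pattern. The function name `is_slippery` and the regular expression are assumptions introduced here for illustration; they encode only the base choices given in the text and say nothing about the downstream stimulatory structure.

```python
import re

# Consensus from the text: X XXY YYZ, with X = U, G, or A; Y = U or A;
# Z = U, C, or A. The three X positions must be identical, as must the
# three Y positions. (Illustrative helper, not part of the original study.)
SLIPPERY = re.compile(r"([UGA])\1\1([UA])\2\2[UCA]")


def is_slippery(heptamer: str) -> bool:
    """Return True if a 7-nt sequence fits the X XXY YYZ consensus."""
    # Accept DNA or RNA input, spaces between codons, and either case.
    rna = heptamer.upper().replace(" ", "").replace("T", "U")
    return len(rna) == 7 and SLIPPERY.fullmatch(rna) is not None


if __name__ == "__main__":
    print(is_slippery("G GGA AAC"))  # the PEG10 heptanucleotide
    print(is_slippery("GGGAGAC"))    # a hypothetical disrupted variant
```

The PEG10 sequence G GGA AAC fits the pattern with X = G, Y = A, and Z = C, whereas a substitution inside the Y triplet breaks it.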
Until recently no examples of mammalian genes that could undergo backward (–1) frameshifting were documented. Paternally expressed gene 10 (PEG10), which contains two overlapping reading frames, is a strong candidate for such a gene. It contains the gag-pol-like structure common in retroviruses and retroelements and shows greatest homology to the Ty3/Gypsy long terminal repeat (LTR) retrotransposon Sushiichi from fugu fish (8,9). However, PEG10 has undergone molecular domestication from a retroelement into a functional gene accompanied by the loss of several features including the LTRs as well as the integrase and most of the reverse transcriptase domain of the Pol protein (10). The retained overlapping reading frames and frameshift sequence element imply PEG10 expression could involve a –1 shift to produce a functionally important ORF1-2 fusion protein (11). The transcript of human PEG10 and its mouse homologue (also known as Edr) is highly expressed in ovary, testis, placenta, and other extraembryonic tissue as well as at lower levels in brain and developing mouse embryos (8, 11-13). Although mRNAs are found in these and other tissues, PEG10 protein expression in vivo is yet to be documented apart from the identification of PEG10 ORF1 in mid-term human placenta (14). Expression of mouse PEG10 by in vitro translation (11) and human PEG10 in cell culture (15) has shown that PEG10 contains all the functional elements for expression and frameshifts with 15-30% efficiency in these settings. The phenotype of a mouse PEG10 deletion knock-out was embryonic lethal by 10.5 days post-coitus (dpc) due to defects in placental formation (16). This suggests that a functional PEG10 protein is expressed during development but does not address whether both reading frames are important.
PEG10 also shows significant oncogenic function in both solid tumors and B cell acute and chronic lymphocytic leukemia through its apparent ability to prevent apoptosis and promote cell proliferation (17,18).
It is still unknown whether PEG10 frameshifts in vivo and whether ORF2 has a function. This study answers these questions, with PEG10 as the first example of an endogenous mammalian gene utilizing –1 frameshifting within specific tissues. ORF2 encodes an active protease and is, therefore, likely to be important for the function of PEG10.
Vector Construction and Cloning of PEG10-Human and mouse PEG10 constructs were created and cloned into the vector pcDNA 3.1(+) (Invitrogen). Human PEG10 was supplied as the cDNA KIAA1051 (AB028974) in the pBluescript II SK(+) vector (12) by the Kazusa DNA Research Institute. The mouse PEG10 sequence was a gift from Dr. Ian Brierley (22) and is the same sequence utilized in Shigemoto et al. (11). This clone contains nucleotides (nt) 350-3501 of a mouse PEG10 cDNA cloned into pSP64T and lacks a portion of the repeat-rich insertion. Sequencing analysis in this study revealed it was missing 114 nt of the insertion due to in-frame deletions of 90 and 24 nt. In addition, the clone lacks the initiation codon, which was at nt 266-268 in the original cDNA clone. For this study, to maximize the amount of mouse PEG10 sequence, a new start site was introduced where protein homology between mouse and human begins at nt 491 on the GenBank™ mouse consensus sequence (NM130877) creating a mouse PEG10 construct without codons 2-40.
PEG10 ORF1, PEG10 ORF2, and the full-length PEG10 coding sequence were constructed by PCR from mouse and human PEG10 placed within vectors. A two-step amplification protocol added an N-terminal FLAG tag and C-terminal His tag to all human sequences, and an N-terminal FLAG tag to mouse sequences. An initiation codon was added to the PEG10 ORF2 constructs. PCR primer sequences are shown in supplemental Table 1. PCR products were cloned into pcDNA 3.1(+) after restriction digestion with HindIII and XbaI and ligation with T4 DNA ligase (Roche Applied Science) before transformation into ultracompetent DH5α Escherichia coli cells using a heat-shock method (23). Clones were sequenced as appropriate. As noted in Shigemoto et al. (11), mouse PEG10 ORF2 is highly prone to mutation and, indeed, the ORF2 clone contained a mutation corresponding to nucleotide 876 (numbering from the full-length clone).
Site-directed Mutagenesis-Two PEG10 frameshift mutants (fs mutants 1 and 2) and a protease "active site motif" mutant were created by two-step PCR (24). The mutagenic primers used are listed in supplemental Table 1. The PEG10 fs mutant 1 contained an insertion (A) two nucleotides downstream of the end of the heptanucleotide slippery sequence at position 1441 (NM_015068). The PEG10 fs mutant 2 contained a substitution (A to G) in the heptanucleotide sequence at position 1436. The conserved catalytic aspartic acid in human PEG10 protease was changed to alanine.
Step one used a mutagenic forward primer and either the reverse PEG10 ORF2 XbaI primer (fs mutants) or the reverse pcDNA 3.1(+) vector primer (protease mutant). The ~1-kb purified product was used as a reverse "megaprimer" in step two in combination with the PEG10 ORF1/ORF2 HindIII forward primer (fs mutants) or the forward pcDNA 3.1(+) vector primer (protease mutant).
Cell Cultures and Cell Transfection-Human hepatoma G2 (HepG2) and African green monkey (Cercopithecus aethiops) kidney (COS-7) cells sourced from the American Type Culture Collection (ATCC) (Manassas, VA) were grown at 37°C and in 5% CO2 in Dulbecco's modified Eagle's medium (Invitrogen) supplemented with 10% fetal bovine serum (Invitrogen). Passaged COS-7 cells were seeded (1.5 × 10^5 cells in 2 ml) into 6-well plates and grown for ~24 h before transfection (FuGENE 6, Roche Applied Science) using a 1:3 ratio of DNA:FuGENE 6 with 500 ng to 2 μg of DNA. Cells were cultured for 48 h then washed in 1× PBS before lysis in 400 μl of 1× passive lysis buffer (Promega). Protein concentrations were measured using the Bradford assay (25) with the appropriate dye reagent (Bio-Rad).
Mouse Tissue Collection-The mice studied were strain C57BL/6. All organs and tissues were dissected out immediately after death, washed in ice-cold 1× PBS, and frozen on dry ice. Pregnant females were studied at dpc 8.5, 9.5, 10.5, 12.5, 15.5, and 21.5. Uterine horns were removed from earlier time points, whereas single products of conception (poc) were removed individually from later time points and washed in ice-cold 1× PBS before the poc (whole embryo/fetus, placenta, and amniotic membrane) were dissected out in 1× PBS.
Human Tissue Collection-Human placental tissue was obtained from first trimester terminations between the fourth and tenth week. Embryonic gestational ages were determined before termination by ultrasonographic measurement and using the Carnegie Stages of embryo development (26) where possible. Sections through the entire thickness of developing placenta were selected, washed in 1× PBS, and snap-frozen on dry ice.
Protein Extraction-Protein was extracted from frozen tissue in ice-cold protease protective extraction buffer (50 mM Tris-HCl, pH 8.0, 50 mM NaCl, 1% Triton X-100 (BDH), 10 mM EGTA, 1 mM EDTA, 2 mM dithiothreitol, and complete mini EDTA-free protease inhibitor (Roche Applied Science)). Tissue was rapidly homogenized on ice using micro-pestles, cells were disrupted for 10 min in a sonication bath (Soniclean), and membranes were separated by centrifugation at 13,000 rpm for 15 min at 4°C in a microcentrifuge. The supernatant was collected and centrifuged at 13,000 rpm for 20 min at 4°C. Protein concentrations were measured using the Bradford assay (25).
Antibody Production-Polyclonal antibodies were raised in rabbits against each reading frame of the human PEG10 protein. Human PEG10 ORF1 and PEG10 ORF2 were cloned individually into pQE-80L (Qiagen) and the recombinant His-tagged protein expressed in E. coli. His tag-purified ORF1 and ORF2 were injected into rabbits. Generation of the PEG10-specific antibodies was determined by comparing PEG10 recognition between preimmune sera and sera obtained after inoculation and by testing antibody reactivity on untransfected and PEG10-transfected COS-7 cell lysates. Antibody sensitivity was determined by testing the antibody reactivity to a dilution series of purified PEG10 ORF1 or ORF2 protein.
Immunoblotting-Cell culture and tissue lysates were prepared in protein denaturing buffer and heated to 95°C for 5 min. Proteins were separated by SDS-PAGE on polyacrylamide gels (7-12.5% (w/v)). Proteins were transferred to Protran nitrocellulose membranes (Whatman) and blocked in 5% (w/v) milk powder in 1× TBST (40 mM Tris-HCl, pH 7.6, 150 mM NaCl, 0.5% (v/v) Tween 20 (USB Corp.)) overnight. Then membranes were hybridized with PEG10 ORF1 or ORF2 antibody or a mouse monoclonal antibody against FLAG (Sigma) or rabbit anti-actin (Sigma). Horseradish peroxidase-conjugated anti-mouse or anti-rabbit IgG (Sigma) was used as a secondary antibody, and detection was performed using the enhanced chemiluminescence Western-blotting detection reagents (GE Healthcare).
Mass Spectrometry-Protein samples to be examined were separated by one-dimensional SDS-PAGE. One lane of sample was subjected to Western blot analysis to identify the position of target proteins. Thin gel slices (~2 mm wide) of the respective molecular weight range were then excised from the gel and subjected to in-gel digestion as previously described (27). Proteins were fragmented by overnight tryptic digestion using a 1:10 ratio of μg of trypsin to μg of substrate protein. Tryptic peptides were fractionated by reverse-phase liquid chromatography on a nanoflow C18 column (PepMap C18, Dionex Co.) using an UltiMate™ 3000 LC system (Dionex) and eluted with a 1.6-72% acetonitrile gradient over 60 min. Eluted fractions were automatically mixed with matrix (3 mg of α-cyano-4-hydroxycinnamic acid in 75% (v/v) acetonitrile, 0.1% (v/v) trifluoroacetic acid) and spotted onto the MALDI plate (Opti-TOF LC/MALDI Insert, Applied Biosystems) using a Probot Micro-fraction Collector (Dionex). Mass spectrometry (MS) was performed using a 4800 MALDI tandem time of flight analyzer (Applied Biosystems). MS data were acquired on each sample and subjected to a peptide mass fingerprint analysis against a short sequence database containing the amino acid sequences derived from PEG10 ORF1 and PEG10 ORF1-2 using the MASCOT search engine (Matrix Science). MASCOT search parameters allowed carboxyamidomethyl cysteine as a variable modification, two missed trypsin cleavage sites, a precursor tolerance of 50 ppm, and a default charge state of 1+. Collision-induced dissociation mass spectra were then acquired on the 5 strongest precursor ions per sample spot including a list of 50 preferred ions that matched PEG10-derived peptides in the peptide mass fingerprint analysis. These potential PEG10 peptide ions were selected for collision-induced dissociation MS analysis if they were detected with a signal to noise threshold of 40.
MS/MS data were analyzed with the MASCOT search engine, which searched the SwissProt database with PEG10 ORF1 and PEG10 ORF1-2 protein sequences integrated into it. MASCOT search parameters were as given for the peptide mass fingerprint analysis and with a mass tolerance of 0.3 Da for fragment ions. Peptides identified by MASCOT as having a significant individual ion score at a threshold of p < 0.05 were considered significant peptide identifications.
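To make the two tolerance settings concrete, the sketch below converts a parts-per-million precursor tolerance into an absolute m/z window and contrasts it with a fixed-Da fragment tolerance. The function names are hypothetical and the m/z values in the usage lines are arbitrary illustrations, not measurements from this study; the code does not reproduce MASCOT's internal matching.

```python
from typing import Tuple


def ppm_window(mz: float, tol_ppm: float) -> Tuple[float, float]:
    """Absolute m/z window for a relative tolerance given in ppm."""
    delta = mz * tol_ppm / 1e6
    return (mz - delta, mz + delta)


def within_ppm(observed: float, theoretical: float, tol_ppm: float = 50.0) -> bool:
    """True if the observed precursor lies within tol_ppm of the theoretical m/z."""
    return abs(observed - theoretical) / theoretical * 1e6 <= tol_ppm


def within_da(observed: float, theoretical: float, tol_da: float = 0.3) -> bool:
    """True if a fragment ion lies within a fixed tolerance in Da."""
    return abs(observed - theoretical) <= tol_da


if __name__ == "__main__":
    # Arbitrary example values: a 50 ppm window scales with m/z,
    # whereas the 0.3 Da fragment tolerance does not.
    print(ppm_window(1000.0, 50.0))
    print(ppm_window(2000.0, 50.0))
    print(within_da(500.2, 500.0))
```

A ppm tolerance therefore widens in absolute terms for heavier precursors, which is why the precursor and fragment tolerances are specified in different units.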

RESULTS
Bioinformatic Analysis of PEG10-To investigate the phylogenetic distribution and conservation of PEG10, BLASTN searches were performed on the GenBank™ databases using the full-length ~6.2-kilobase cDNA of human PEG10 (8) as a query. The search focused on sequences containing homology to both reading frames, allowing inspection of the frameshift site as well as avoiding the selection of distantly related retrotransposons with partial homologies. Genes orthologous to full-length PEG10 were identified in 11 other mammalian species (see "Experimental Procedures" for accession numbers). Most of these orthologs have been listed by previous studies (28-30). Those studies focused on the analysis of the Ty3-Gypsy or Mart family of retrotransposons. The present study undertook a more in-depth investigation of the phylogenetic conservation within the PEG10/Mart2 orthologs. An ortholog of PEG10 has recently been identified in the tammar wallaby (Macropus eugenii), extending the existence of PEG10 orthologs from the eutherians into the metatherians (31). The provided sequence was included in the analysis. A possible opossum homologue (29) has not been annotated as yet and, therefore, was not included.
Within all of the identified PEG10 orthologs two overlapping ORFs were predicted (Fig. 1A) with a very high level of conservation over the entire length of both reading frames and within regions of the 3′-untranslated region. Most importantly, there was extremely high conservation of the sequence elements required for functional frameshifting. The G GGA AAC consistent with the consensus heptanucleotide slippery sequence X XXY YYZ was completely conserved in all species, and the sequence of a downstream pseudoknot (22) was completely conserved except for one point mutation in the rodent sequence that strengthens the pseudoknot structure (22) and two mutations in the wallaby sequence at the bottom of stem two that would be expected to impair stem two formation (Fig. 1B). This conservation strongly suggests that all species containing PEG10 have retained the ability to express both reading frames of PEG10 via a frameshift mechanism. The conclusion is further supported by the analysis of conserved motifs found in both reading frames. Structures typical for LTR retrotransposons such as the major homology region, the CX2CX4HX4C zinc finger, and the DSG aspartic protease active site motif have been identified previously in both human and mouse PEG10 (9,11). On closer investigation of the 12 protein orthologs, a highly conserved major homology region and zinc-finger motif were identified in ORF1 (Fig. 1C). Analysis of PEG10 ORF2 revealed that all PEG10 orthologs lack motifs required for retrotransposition including the reverse transcriptase, RNase H, or integrase. In contrast, they show complete conservation of the DSG protease active site motif (Fig. 1, A and D). In addition, a striking polyproline stretch of 12-16 amino acids in length was identified at the C terminus of ORF2 in all eutherian species, whereas it was missing in the wallaby PEG10, which terminates immediately upstream of the stretch (Fig. 1D).
Proline-rich motifs are found in signaling proteins and are commonly phosphorylated at serine and threonine residues within or immediately adjacent to the motif (32). A conserved SYST (SYSA in cow and rat, SCSA in mouse) motif was found immediately downstream of the proline stretch (Fig. 1D). The conservation of the protease despite the loss of the remaining protein domains commonly found in Pol proteins of retrotransposons implies the second ORF is functionally important and may have retained its protease activity. This would allow post-translational proteolytic processing of the full-length fusion product.
Within the group of eutherian PEG10 orthologs, mouse and rat PEG10 are the most divergent from the others and contain small and large insertions within both reading frames. A large insertion within the second reading frame of the mouse gene was also previously noted (29). In addition, the translation initiation site appears to have been translocated ~120 nt upstream of the human start codon. These changes may alter the protein function in rodents.
The human PEG10 ORF1 start codon, which is preceded by a strong Kozak sequence (33), is conserved within the primate species (human, chimpanzee, baboon, marmoset), whereas a CTG is found at this position in laurasiatheria (bat, cat, dog, cow, shrew, pig) (34). Translation initiation from CTG codons is possible and has been reported for a small number of vertebrate genes (35). Translation of pig PEG10 has been reported to start ~240 nt upstream of the human ATG (36), but this putative start codon has not been conserved within the remaining laurasiatherian species. By contrast, the PEG10 termination site and codon have been conserved for both reading frames between all species. The TAG stop codon of ORF1 is located within the predicted pseudoknot structure (Fig. 1B), whereas the ORF2 stop codon has been conserved after a proline-rich stretch of amino acids at the end of ORF2 in all eutherian species. In contrast, the wallaby PEG10 terminates directly before the start of the proline-rich stretch (see Fig. 1B).
This analysis of PEG10 orthologs identified a strikingly high conservation of the gene within metatherian and eutherian mammals. Together with the reported absence of PEG10 in the platypus (prototheria) (31), this suggests that a Ty3-Gypsy retrotransposon integrated into the genome of a therian ancestor after the split from the prototherians and subsequently lost its mobility. The conservation of the frameshift elements as well as both ORFs strongly suggests that the remaining protein acquired a functional importance and supports a prediction that as well as the short ORF1 protein, a full-length fusion protein is generated.
PEG10 Frameshifts in Cell Culture and Expresses an Active Protease-To investigate PEG10 expression for evidence of frameshifting and the presence of an active protease, polyclonal antibodies were raised that could recognize either the PEG10 ORF1 or the PEG10 ORF2 proteins. The sensitivity of the antibodies to detect antigen was determined using a dilution series of purified human PEG10 ORF1 or ORF2 proteins and demonstrated that the antibodies could recognize as little as 0.1 ng of the ORF1 and 0.1-1 ng of the ORF2 human proteins (Fig. 2A). Expression from human PEG10 constructs (PEG10 ORF1, PEG10 ORF2, and full-length PEG10) transfected into COS-7 cells was then examined. All constructs also had a FLAG tag attached to the N terminus to allow an independent determination of PEG10 expression.
Human PEG10 underwent –1 frameshifting to produce an ~50-kDa ORF1 protein and an ~100-kDa ORF1-2 frameshift fusion protein with a 22% frameshift efficiency determined from the PEG10 construct as identified by the ORF1 antibody (Fig. 2B, lane 3). The 100-kDa protein was also identified by the PEG10 ORF2 antibody verifying that this protein contains peptide sequences from the two different reading frames (Fig. 2C, lane 3). A number of additional antibody-reactive protein products were also detected (discussed in detail below).
Both the observed PEG10 ORF1 and ORF2 relative masses of ~50 kDa were larger than those predicted by ProtParam with the ORF1 protein >10 kDa and ORF2 protein 6-7 kDa larger (Fig. 2, B and C, lanes 2). This difference, noted previously for ORF1 (15), suggests the presence of post-translational modification (note that the His and FLAG tags add less than 2 kDa to each ORF).
To confirm that the ORF1-2 fusion protein was produced by ribosomal frameshifting at the identified heptanucleotide sequence, two PEG10 frameshift mutants (fs mutants) were constructed as shown in Fig. 3A. PEG10 fs mutant 1 contained a single base insertion that placed ORF1 and ORF2 in the same (0) frame (and consequently the stop codon of ORF1 in the +1 frame). PEG10 fs mutant 2 had an A to G change in the heptanucleotide sequence that was expected to decrease frameshifting by disrupting the integrity of the X XXY YYZ motif. As expected, the dominant PEG10 protein expressed from the 0-frame mutant and detected by the ORF1 antibody was the 100-kDa protein (Fig. 3B, lane 4). It was also detected with the ORF2 antibody (Fig. 3C, lane 4). There was also an interesting pattern of other immunoreactive proteins of lower relative mass discussed in detail below (Fig. 3, B and C, lanes 3 and 4). Disrupting putative frameshifting in fs mutant 2 drastically decreased the presence of the 100-kDa fusion protein with only PEG10 ORF1 being efficiently expressed (Fig. 3, B and C, lanes 5). Translation of the 100-kDa PEG10 ORF1-2 fusion protein therefore requires the frameshift motif to facilitate –1 frameshifting during translation of PEG10 mRNA.
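The frame bookkeeping behind fs mutant 1 can be checked with modular arithmetic. The sketch below is a toy model: the coordinates are invented for illustration and are not PEG10 positions, and the helper names are assumptions. It shows only that a single-base insertion upstream moves a codon from the –1 frame into the 0 frame and moves a 0-frame stop codon into the +1 frame.

```python
def relative_frame(codon_start: int, orf1_start: int = 0) -> int:
    """Frame of a codon relative to ORF1: 0, +1, or -1."""
    remainder = (codon_start - orf1_start) % 3
    return {0: 0, 1: 1, 2: -1}[remainder]


def after_insertion(position: int, insert_at: int, inserted: int = 1) -> int:
    """Coordinate of a site after `inserted` bases are added at `insert_at`."""
    return position + inserted if position >= insert_at else position


# Hypothetical toy coordinates (0-based), NOT real PEG10 positions.
INSERT_AT = 20    # insertion site just downstream of the slippery site
ORF2_CODON = 29   # start of an ORF2 codon, in the -1 frame before insertion
ORF1_STOP = 36    # start of the ORF1 stop codon, in the 0 frame before insertion

if __name__ == "__main__":
    for name, pos in (("ORF2 codon", ORF2_CODON), ("ORF1 stop", ORF1_STOP)):
        before = relative_frame(pos)
        after = relative_frame(after_insertion(pos, INSERT_AT))
        print(f"{name}: frame {before:+d} -> {after:+d}")
```

Under these assumptions the ORF2 codon ends up in the 0 frame and the ORF1 stop codon in the +1 frame, matching the description of the 0-frame mutant.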
What might be a possible functional role for the ORF2 protein? There has been no functional investigation of ORF2 previously. Our bioinformatic analysis showed that a conserved aspartic protease motif in ORF2 is present in all PEG10 orthologs. Because viral-type aspartic proteases, like that of HIV-1, catalyze the cleavage of a fusion protein into multiple protein products (37), we investigated whether the PEG10 protease was carrying out a similar function. Expression of PEG10 in cultured cells demonstrated that, apart from one protein product of ~43 kDa apparently formed from PEG10 ORF1, extra PEG10 protein products were likely to have derived from the PEG10 ORF1-2 fusion protein (Fig. 3, B and C, lanes 3) that were not present when either PEG10 ORF1 or ORF2 was expressed alone. Three fragments of ~40, ~37, and ~25 kDa were detected specifically with an ORF1 antibody (Fig. 3B, lanes 3 and 4). The ORF2 antibody also detected three products of ~52, ~55, and ~75 kDa intermediate in size between PEG10 ORF1 and ORF1-2 (Fig. 3C, lanes 3 and 4). Of the derived fragments, the ~37-kDa band was recognized by the FLAG antibody (targeting the N terminus) consistent with a previously reported product with an N-terminal tag/antibody recognition (15) (supplemental Fig. 1). This suggests that the 37-kDa product contains an intact N terminus.
To test whether the protease sequence within ORF2 is active and responsible for the putative cleavage products, the conserved catalytic aspartate (of the DSG) motif was mutated to an alanine, a mutation expected to disrupt catalytic activity. Expression of the PEG10 protease mutant showed that the production of all protein fragments normally derived from the fusion protein was abolished, and there was an increased amount of ORF1-2 fusion protein (Fig. 3, B and C, lanes 6). The disappearance of all extra bands identified by the ORF2 antibody also demonstrated that they did not result from internal initiation during translation. The frameshift efficiency with the disabled protease mutant dramatically increased (~65% compared with ~20% for the original construct). Therefore, the frameshift efficiency was calculated to take into account all PEG10-derived protein fragments (from six separate transfections analyzed with ORF1 antibody and five transfections analyzed with the ORF2 antibody). The PEG10 frameshift efficiency was then consistently very high at ~60% in multiple expressions (that of HIV-1 is 5-10%). These results also demonstrated that PEG10 ORF2 contains an active protease that, similar to the HIV-1 Pol, is able to process full-length fusion protein. This provides the first evidence for functional activity in ORF2. How the cleaved fragments contribute to PEG10 function is of ongoing interest.
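The effect of counting cleavage fragments can be illustrated with a small calculation. The sketch assumes that efficiency is the share of frameshift-derived signal in the total PEG10 signal on a blot probed with one antibody; the text does not state the exact formula used, so this is an assumption, as is the function name. The band intensities are invented arbitrary units chosen only to show the arithmetic and are not data from this study.

```python
from typing import Sequence


def frameshift_efficiency(non_frameshift: float,
                          frameshift_bands: Sequence[float]) -> float:
    """Percentage of total signal that derives from frameshifted product.

    non_frameshift: intensity of the ORF1-only (non-frameshift) band.
    frameshift_bands: intensities of the full-length fusion band plus any
        bands attributed to fragments derived from it.
    """
    shifted = sum(frameshift_bands)
    total = non_frameshift + shifted
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * shifted / total


if __name__ == "__main__":
    # Invented intensities (arbitrary units), for illustration only.
    orf1_only = 30.0
    fusion = 10.0
    fragments = [5.0, 5.0]  # hypothetical fusion-derived cleavage products

    print(frameshift_efficiency(orf1_only, [fusion]))             # fusion band only
    print(frameshift_efficiency(orf1_only, [fusion] + fragments)) # fragments included
```

With these made-up numbers, ignoring the fragments gives a lower apparent efficiency than including them, which is the direction of the correction described above.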
Endogenous Expression of PEG10 in Mammals-Are PEG10 proteins expressed in vivo in mammalian tissues with active frameshifting, as has been demonstrated with transfected constructs in our cell culture model? Although previous studies have failed to uncover PEG10 protein expression, the mouse PEG10 knock-out was embryonic-lethal (16). We used the mouse as a model to search for endogenous protein expression in normal tissues. First, the mouse PEG10 coding sequence was cloned from the vector previously used (11). Both the full-length mouse PEG10 gene and the separate ORF1 and ORF2 reading frames were cloned and expressed in COS-7 cells. As shown in Fig. 3D the mouse ORF1 protein was ~50 kDa (lane 2) with an ~140-kDa ORF1-2 fusion protein (lane 3) that could be detected by both the ORF1 (lower panel) and ORF2 antibodies (upper panel). The antibodies raised against the human proteins recognized the mouse PEG10 proteins efficiently but with a 12- and 6-fold decrease in sensitivity, respectively, compared with human PEG10 (results not shown). A control for endogenous PEG10 expression is shown with protein extracts of human hepatoma (HepG2) cells (Fig. 3D, lane 4). This indicates that the difference in size between the mouse and human fusion proteins reflects insertions in the mouse gene (lanes 3 and 4).
A number of mouse tissues and organs were tested for the presence of PEG10 protein. Several studies have shown PEG10 transcripts in mouse adult tissues (11,28,29). A screen of adult mouse tissues from both male and female mice and with both ORF1 (Fig. 4A, lower panel) and ORF2 (upper panel) antibodies provided no evidence of PEG10 expression or frameshifting (the expected mouse ORF1-2 product should be of higher relative mass than that of the control HepG2 human protein shown in lane 1, as illustrated in Fig. 4B, lanes 2 and 3). Of interest was a band of ~48 kDa found in heart extracts and a band of ~55 kDa in spleen extracts that reacted with the ORF1 antibody (Fig. 4A, lanes 7 and 9, lower panel), although there was no evidence of a PEG10 frameshift protein. We concluded that the proteins were not PEG10-related for three reasons: (i) no PEG10 ORF1 peptides were detected in a mass spectrometry analysis of the 48-kDa fraction from heart, (ii) the protein band found in spleen extracts could only be detected with one of the two antibodies originally prepared against ORF1 and, therefore, was not analyzed further, and (iii) the heart extract was unable to ablate ORF1 antibody recognition of PEG10 when used to pre-absorb the active antibody species.
PEG10 transcripts have also been found in embryonic and extra-embryonic tissue (11,13,16,29). We examined these tissues at 10.5 dpc; placentas were strongly positive for PEG10 and expressed both the ~47-kDa PEG10 ORF1 and the ~150-kDa ORF1-2 fusion protein with an extremely high frameshift efficiency of 66% (Fig. 4B, lane 2), similar to that found for human PEG10 in cultured cells when proteolytic cleavage products were precluded. The size of the fusion protein (150 kDa) also demonstrated that the large repeat-rich insertion in mouse ORF2 was present in the translated protein. Amniotic membrane was similarly positive for PEG10 with highly efficient frameshifting (lane 3), whereas whole embryos from the same stage (dpc 10.5) showed a low level of PEG10 expression, but frameshifting was still detected. Probing these tissues using an ORF2 antibody (Fig. 4B, upper panel) and a second ORF1 antibody (data not shown) confirmed these results, although the fusion product was just detectable with anti-ORF2. A slight increase in the size of PEG10 ORF1 is present in the amniotic sample, suggesting that PEG10 may be uniquely post-translationally modified in this membrane.
Identification of Endogenous Placental Expression of PEG10 ORF1-2 by Mass Spectrometry-To provide direct evidence for the endogenous expression of PEG10 via a –1 frameshift mechanism, the 150-kDa protein fraction from a mouse placental sample was analyzed by tandem mass spectrometry. A MASCOT search of the MS/MS data revealed an unambiguous identification of the PEG10 ORF1-2 fusion protein with 19% sequence coverage and significant matches (p < 0.05) to 13 PEG10 peptides from both the ORF1 and ORF2 reading frames. Fig. 4C shows a MS/MS spectrum from a PEG10 ORF1 peptide (top) and ORF2 peptide (bottom) that are highly significant for the identification of PEG10. The best match shown for the ORF2 peptide had an E value of 5.1e-016. A search of a derived SwissProt database containing PEG10 demonstrated that this spectrum matched a protein sequence unique to mouse PEG10 ORF2. These results provide convincing evidence that PEG10 carries out –1 frameshifting during translation to produce a ~150-kDa frameshift protein.

PEG10 Frameshifts in Mouse Development from Midgestation-To characterize the developmental expression profile of PEG10 in mouse placenta, and to determine whether expression always includes high efficiency frameshifting to produce the fusion protein, time-courses of PEG10 expression were carried out throughout gestation. As shown in Fig. 5A, expression was first detectable at dpc 9.5 and appeared to increase as gestation progressed, peaking around dpc 12.5-15.5, when the mature placenta is present. The activation of PEG10 expression at dpc 9.5 is consistent with earlier findings (16) in which knock-out of the full PEG10 gene caused embryonic lethality due to placental malformation between dpc 9.5 and 10.5.
Expression of PEG10 in the placenta showed high efficiency frameshifting to produce the 150-kDa fusion protein throughout the time-course. From dpc 9.5 when PEG10 expression is first detectable, the frameshift protein is dominant, strongly implying importance for the overall function of the gene. The frameshift efficiency as detected by the ORF1 antibody shows an apparent decrease from 68% at dpc 9.5 to 43% by dpc 21.5 (Fig. 5B).
Similar to PEG10 expression in cell culture, there are additional PEG10 products derived from the frameshift protein. For example, a 105-kDa protein in mid to late gestation was recognized by anti-ORF2. By dpc 21.5 it was present in greater amounts than the full-length PEG10 (Fig. 5A, lanes 5 and 6; quantitated in Fig. 5C). Mass spectrometry identified the 105-kDa protein from dpc 15.5 placenta as a PEG10 product consisting primarily of PEG10 ORF2 but containing peptides from both reading frames (data not shown). The identification of a peptide less than 40 amino acids from the C terminus of the ~1000-amino acid protein precluded the possibility that the protein was a premature translation termination product. The known functionality of the ORF2 protease (Fig. 3, B and C) and delayed build-up of the 105-kDa PEG10 protein (Fig. 5C) compared with the frameshift protein is consistent with it being derived from the full-length fusion protein. We have named this product the PEG10 C-terminal fragment (PEG10-CTF). The accumulation of this smaller 105-kDa PEG10 frameshift product accounted for the observed apparent decrease (asterisk) in frameshift efficiency (Fig. 5B). Calculating the frameshift efficiency after including both PEG10 and PEG10-CTF frameshift products showed that the efficiency decreased at most only slightly with increased gestation (regression analysis indicated no significant decrease) (Fig. 5B). This is a strong indicator that the formation of the secondary protein is responsible for the apparent decrease in frameshifting. In contrast, taking only full-length PEG10 into account, a significant decrease in frameshifting was shown by regression analysis.
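The effect of including PEG10-CTF in the calculation can be illustrated with a minimal Python sketch. It assumes, as a simplification not stated in the text, that frameshift efficiency is the shifted fraction of all PEG10 products and that the band intensities have already been normalized to comparable molar units. The function name and the example intensities are hypothetical and are not data from this study.

```python
def frameshift_efficiency(orf1, fusion, ctf=0.0):
    """Estimate frameshift efficiency from normalized band intensities.

    orf1   -- intensity of the non-shifted ORF1 product
    fusion -- intensity of the full-length ORF1-2 frameshift product
    ctf    -- intensity of frameshift-derived cleavage product (e.g. PEG10-CTF)

    Assumption: efficiency = shifted products / all products.
    """
    if min(orf1, fusion, ctf) < 0:
        raise ValueError("band intensities must be non-negative")
    shifted = fusion + ctf
    total = orf1 + shifted
    if total <= 0:
        raise ValueError("band intensities must sum to a positive value")
    return shifted / total


# Hypothetical intensities (arbitrary units), chosen only for illustration.
early = frameshift_efficiency(orf1=32.0, fusion=68.0)
late_full_length_only = frameshift_efficiency(orf1=57.0, fusion=43.0)
late_with_ctf = frameshift_efficiency(orf1=57.0, fusion=43.0, ctf=60.0)

print(f"early gestation:            {early:.2f}")
print(f"late, full-length only:     {late_full_length_only:.2f}")
print(f"late, including CTF:        {late_with_ctf:.2f}")
```

With these illustrative values, counting only the full-length fusion protein gives an apparent drop from 0.68 to 0.43, whereas adding the cleavage product back to the shifted pool gives about 0.64, which is the kind of correction described above.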
A time-course of PEG10 expression (detected by anti-ORF1) in amniotic membrane displays a similar profile to that of placenta with low expression at dpc 9.5 but increased and continued expression throughout gestation (Fig. 5D). The frameshift fusion protein was the dominant band as in placenta. Surprisingly, PEG10 ORF1 showed increased mass at dpc 10.5 compared with the other time points as noted in Fig. 4B (lane 2), and this finding was reproduced with a second mouse litter. At other gestational time points the slightly smaller ~47-kDa band was identical in size to that in placenta. If the observed mass increase is due to post-translational modification, this modification appears to be transient and may have functional significance.
Expression of PEG10 in First Trimester Human Placenta-PEG10 is also expressed in human placenta. To determine whether the PEG10 expression profiles between mouse and human are similar, first trimester human placentas between the 4th and 10th week of development post-ovulation were examined for PEG10 protein. Western blot analysis with anti-ORF1 revealed that PEG10 was expressed at all first trimester time points tested (Fig. 6). Similar to the PEG10 expression profile found in HepG2 cells, three ~50-kDa ORF1 proteins and a ~100-kDa frameshift protein were detected in human placenta. Week 5 in human development (postovulatory days 29-35) corresponds to days 10 and 11 of mouse development (38), demonstrating that the PEG10 gene is active at similar times in mouse and human placental development. PEG10 is the first example of a human gene that utilizes a –1 frameshift mechanism for its expression in normal development. The PEG10 gene will likely play an integral role in the development of human placenta.

DISCUSSION
The findings of the present study have demonstrated that PEG10 has evolved from a retroelement into a gene that is required during prenatal development and has maintained its ability to carry out –1 frameshifting to express two proteins from the same mRNA. This is the first demonstration of –1 frameshifting for gene expression in animal tissues and is the highest frameshift efficiency reported. Our demonstration of a functional protease within the second reading frame is strongly indicative that the PEG10 frameshift product plays an essential role in the function of PEG10 during development.
Bioinformatic analysis showed that PEG10 has retained some characteristics of a retrotransposon, with ORF2 retaining a putative viral-type aspartic protease sequence. However, this reading frame has diverged significantly from a typical retroelement Pol protein with the loss of the sequences necessary for reverse transcriptase and integrase activity. It has acquired potential new functions in the C-terminal region through the polyproline domain and associated phosphorylation motif. The conservation of uncorrupted ORFs and the evolution of new functional motifs in ORF2 are highly suggestive that PEG10 ORF2 expression is important for the overall function of PEG10 and imply that ORF2 may be under regulatory control via phosphorylation/dephosphorylation states.
Considering the importance of PEG10 to the development of the chorioallantoic placenta found in eutherian mammals (16), it was surprising that an uncorrupted PEG10 gene is also maintained in a marsupial, the tammar wallaby. The tammar wallaby contains a noninvasive choriovitelline placenta where most nutrient absorption is through the yolk sac (39). Expression of PEG10 mRNA has been detected in tammar wallaby yolk sac and in embryos (31). However, in mice PEG10 expression is not detectable until dpc 9.5, a time when the mouse placenta starts to transition from a choriovitelline to a chorioallantoic placenta. This suggests that PEG10 function is adaptable for the different groups. The polymorphisms present in the tammar wallaby PEG10 pseudoknot are likely to decrease pseudoknot effectiveness; therefore, any frameshifting in the wallaby gene is likely to be significantly less efficient than that for placental mammals. A possible scheme for the evolution of PEG10 in mammals based on the multiple sequence alignments and the presence of a PEG10-derived sequence in marsupials is shown in Fig. 7.
Expression of native and mutant PEG10 constructs in cell culture demonstrated that to express the ORF1-2 fusion protein, PEG10 does indeed need an active –1 programmed ribosomal frameshift element. PEG10 products derived from the fusion protein were abolished by mutation of the catalytic aspartate residue in the PEG10 ORF2 protease, demonstrating that the ORF2 protease was still catalytically active in PEG10. Both ORF1 and ORF2 antibodies detected three separate protein fragments, suggesting the presence of multiple proteolytic cleavage sites in the PEG10 protein. There is a possible aspartic protease cleavage site at the junction between ORF1 and ORF2 (15), and the PEG10 cleavage products of ~51 and ~40 kDa may arise from cleavage at this site. This would result in cleavage of PEG10 "Gag" from "Gag-Pol," consistent with what is observed during the processing of HIV-1 and LTR retrotransposons (37). Aspartate proteases require two catalytic aspartate residues for activity, and so the PEG10 protease must dimerize like other viral-type aspartic proteases that contain only one of the two catalytic aspartate residues (37). Future experiments will investigate whether PEG10 utilizes a dimer interface similar to HIV-1 protease, the reference viral-type aspartic protease. The D470A substitution may reduce the dimerization efficiency, but comparison to HIV-1 protease structures suggests otherwise. Several other regions confer stability at the dimer interface (40).
Investigation of PEG10 translation in mouse and human tissues did not separate the synthesis of ORF1 and the ORF1-2 fusion protein in any tissue as might have been anticipated. Rather, an extraordinarily high frameshift efficiency to produce the ORF1-2 fusion protein was always associated with PEG10 expression. The efficiency observed in mouse placenta at 66% is severalfold higher than that reported for most retroviruses and cellular genes (for review, see Refs. 7 and 41). This assumes that the ORF1 and ORF1-2 fusion products each have a similar half-life. An apparent high frameshift efficiency would be measured if the half-life of ORF1 was significantly shorter, but our studies with the individual ORFs in cultured cells do not support this. A recent study in yeast demonstrated with a reporter plasmid that potential frameshift elements could also support high efficiency frameshifting (42).
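The dependence of the measured efficiency on protein half-life can be made concrete with a short Python sketch. It assumes a simple steady-state model, which is our illustration rather than an analysis performed in this study: each product accumulates in proportion to its synthesis rate multiplied by its half-life. The function name and the example values are hypothetical.

```python
def apparent_efficiency(true_eff, half_life_orf1, half_life_fusion):
    """Apparent frameshift efficiency seen at steady state.

    Assumption: steady-state abundance is proportional to
    (synthesis rate) x (half-life) for each product, so the common
    ln(2) factor cancels in the ratio.
    """
    if not 0.0 <= true_eff <= 1.0:
        raise ValueError("true_eff must lie between 0 and 1")
    if half_life_orf1 <= 0 or half_life_fusion <= 0:
        raise ValueError("half-lives must be positive")
    fusion_level = true_eff * half_life_fusion        # shifted ribosomes
    orf1_level = (1.0 - true_eff) * half_life_orf1    # non-shifted ribosomes
    return fusion_level / (fusion_level + orf1_level)


# Hypothetical values for illustration only (half-lives in arbitrary units).
same_stability = apparent_efficiency(0.66, half_life_orf1=2.0, half_life_fusion=2.0)
unstable_orf1 = apparent_efficiency(0.20, half_life_orf1=1.0, half_life_fusion=4.0)

print(f"equal half-lives:      {same_stability:.2f}")
print(f"4-fold shorter ORF1:   {unstable_orf1:.2f}")
```

Under this model, equal half-lives return the true efficiency unchanged, whereas a true efficiency of 20% combined with a 4-fold shorter ORF1 half-life would appear as 50%, which is why the similar-half-life assumption matters for the 66% estimate.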
As in cell culture, PEG10 expression in vivo in placenta involved post-translational processing to produce PEG10-CTF. Although the protein C-terminal region was present, the closest peptide to the N terminus identified was an ORF1 peptide localized between the major homology region and zinc-finger motifs, suggesting that the capsid-like domain may not be in the PEG10-CTF.
Screening mouse tissues with PEG10 antibodies showed that no protein expression could be detected in adult tissues within the limits of our immunodetection despite evidence that PEG10 mRNA is expressed in some adult organs. This apparent paradox could be explained by a tight regulation of PEG10 translation by inhibitory proteins or microRNAs. Such negative translational regulation must be absent during development when PEG10 expression was observed, particularly in extraembryonic tissues.
Mouse placenta showed translation of PEG10 protein from dpc 9.5, a time when the developing placenta is undergoing chorioallantoic fusion and is about to begin chorionic folding and branching to form the labyrinth and differentiation of cells in the ectoplacental cone into the structural spongiotrophoblast layer that supports it (43). This was in good agreement with the results of the mouse PEG10 gene knock-out where the placenta failed to develop a three-layered structure preventing the flow of nutrients to the fetus (16). This labyrinth layer, which regulates the exchange of molecules between mother and fetus, and the spongiotrophoblast layer were missing in the PEG10 knock-out mouse. As both of these structures form from around dpc 9.5, this study provides evidence that PEG10 is likely to be involved in their formation. Comparison of the gestational ages when mouse PEG10 is expressed with the results from human first trimester placenta revealed similarities between the time of induction of expression in placental development. Because human PEG10 was already present by week 5, the time of the induction of PEG10 expression is unknown.
The genomes of humans and other mammals have incorporated a significant number of genes derived from retroelements and retroviruses by a process called molecular domestication (44). The evolution of the placenta is one process that has utilized the functionalities of retroelements and retroviruses. Along with PEG10, the retroviral envelope-derived protein syncytin plays an important role in human placental development by promoting cell fusion to form the human nutrient exchange layer (45).
Are there likely to be other domesticated genes that carry out –1 frameshifting? The human paraneoplastic antigen gene Ma3 has a functional frameshift element, shown to be active between two reporter genes at ~20% efficiency in vitro and ~18% in cell culture (46). The Ma3 ORF1 contains homology to retroviral Gag proteins in rodents and other primates, but the poorly conserved putative ORF2 protein does not contain a Pol-like sequence and shows no homology to any proteins in the data base (46). The potential importance of the ORF2 protein function to Ma3 is therefore unknown.
The identification of mammalian genes that utilize –1 frameshifting gives some caution to attempts to create frameshift-disrupting therapeutics for viruses such as HIV-1. So far, our results are encouraging if PEG10 expression is indeed limited to embryonic-associated tissues. Conversely, the finding that the PEG10 fusion protein is functional suggests it might be a potential drug target for several human cancers. An oncogenic function for PEG10 has been suggested in a number of cancers (14,17,18), especially hepatocellular carcinoma. A drug designed to target PEG10 frameshifting or its protease activity may be able to interrupt PEG10 oncogenic function, perhaps by causing a pause during translation of the frameshift site that is long enough to effect so-called "no-go"-mediated decay of the mRNA (47).