The Human OCT-4 Isoforms Differ in Their Ability to Confer Self-renewal

OCT-4 transcription factors play an important role in maintaining the pluripotent state of embryonic stem cells and may prevent expression of genes activated during differentiation. Human OCT-4 isoform mRNAs encode proteins that have identical POU DNA binding domains and C-terminal domains but differ in their N-terminal domains. We report here the cloning and characterization of the human OCT-4B isoform. Human OCT-4B cDNA encodes a 265-amino acid protein with a predicted molecular mass of 30 kDa. Embryonic stem (ES) cell-based complementation assays using ZHBTc4 ES cells showed that unlike human OCT-4A, OCT-4B cannot sustain ES cell self-renewal. In addition, OCT-4B does not bind to a probe carrying the OCT-4 consensus binding sequence, and we demonstrate that two separate regions of its N-terminal domain are responsible for inhibiting DNA binding. We also demonstrate that OCT-4B is mainly localized to the cytoplasm. Overexpression of OCT-4B did not activate transcription from OCT-4-dependent promoters, although OCT-4A did as reported previously. Furthermore, transcriptional activation by human OCT-4A was not inhibited by co-expression of OCT-4B. Taken together, these data suggest that the DNA binding, transactivation, and abilities to confer self-renewal of the human OCT-4 isoforms differ.

The oct-4 gene, also referred to as oct-3, encodes a nuclear protein that belongs to a family of transcription factors containing the POU DNA binding domain (1-6). It is normally found in the pluripotent stem cells of pregastrulation embryos, including oocytes, early cleavage-stage embryos, and the inner cell mass of the blastocyst (1,3,7,8). Its expression is down-regulated during differentiation, and knock-out of oct-4 causes early lethality in mice because of the absence of an inner cell mass (9). These results suggest that OCT-4 plays a pivotal role in mammalian development (10) and in the self-renewal of embryonic stem cells (11). During human development, OCT-4 is expressed at least until the blastocyst stage, where it regulates gene expression (12).
OCT-4 is a transcriptional regulator of genes involved in maintaining the undifferentiated pluripotent state and may also prevent expression of genes activated during differentiation (13). It activates transcription via octamer motifs located proximal or distal to transcriptional start sites (14). The POU domain of OCT-4 is a conserved DNA binding domain that binds as a monomer to the octamer sequence motif 5'-ATGCAAAT-3' (15). This cis-acting element is important in controlling the activity of many promoters and enhancers of housekeeping and cell type-specific genes (16). OCT-4-binding sites have been found in various genes, including fgf-4 (fibroblast growth factor-4), pdgfαr (platelet-derived growth factor-α receptor), osteopontin, and Nanog (17-21). In addition, genes expressed in the trophectoderm but not in the embryo prior to blastocyst formation, such as IFN-τ (τ-interferon) and the α and β subunits of chorionic gonadotropin, may be targets for silencing by OCT-4 (22-24). This suggests that OCT-4 functions as a master switch during differentiation by regulating cells that have pluripotent potential or can develop such potential (25,26).
Transcriptional regulation by OCT-4 is complex. In embryonic stem cells, the octamer sequence motif is active irrespective of its distance from the site of transcriptional initiation (2,28). However, in differentiated cells, OCT-4 can transactivate only when the octamer motif is in a proximal position (1,13,29); to be active from distal sites, it requires stem cell-specific bridging factors that link it to the transcription initiation site (29). A number of factors such as Sox2, high mobility group, E7, and E1A are known to influence the ability of OCT-4 to act as activator or repressor (15,29-32). Recently, physical association of OCT-4 with Ewing's sarcoma protein was documented, suggesting that Ewing's sarcoma protein may also play a role in regulating OCT-4 (35).
Although only a single form of OCT-4 mRNA has been identified in embryonic mouse tissues, two forms, i.e. OCT-4A and OCT-4B, generated by alternative splicing, were identified in the RT-PCR products from adult human pancreatic islets (36). Compared with human OCT-4A, little is known about the properties of OCT-4B. To identify the biochemical functions of the human OCT-4B isoform, we performed RT-PCR and sequenced OCT-4 cDNAs from human ES cells. This revealed an alternatively spliced variant of OCT-4 mRNA in which exon 1a is replaced by exon 1b. The DNA binding and C-terminal domains of OCT-4B are identical to the corresponding domains of OCT-4A, but it lacks the sequences necessary for transactivation. Moreover, it does not bind DNA and mainly localizes to the cytoplasm. We also found that, unlike OCT-4A (35), it cannot stimulate transcription from OCT-4-dependent promoters, nor does it antagonize the induction of gene expression by OCT-4A. In addition, ectopic expression of human OCT-4B in ZHBTc4 ES cells, unlike that of human OCT-4A, was not sufficient to maintain stem cell self-renewal, and the cells displayed differentiated phenotypes. These data imply that the DNA binding and transactivation properties of the human OCT-4 isoforms and their abilities to confer self-renewal differ. Thus, the different polypeptides encoded by the human OCT-4 gene may have different targets as well as different roles as regulators of human ES cells.

EXPERIMENTAL PROCEDURES
Materials and General Methods-Restriction endonucleases, calf intestinal alkaline phosphatase, the Klenow fragment of DNA polymerase I, and T4 DNA ligase were purchased from New England Biolabs. Pfu Turbo polymerase was purchased from Stratagene, and [γ-32P]ATP (3000 Ci/mmol) was obtained from PerkinElmer Life Sciences. Preparation of plasmid DNA, restriction enzyme digestion, agarose gel electrophoresis of DNA, DNA ligation, bacterial transformations, and SDS-PAGE of proteins were carried out by standard methods (37). Subclones generated from PCR products were sequenced by the chain termination method with double-stranded DNA templates to ensure the absence of mutations.
Cell Culture-Human ES cells (Miz-hES1, SNU-hESC3, and Cha-hESC3) were grown as described previously (38,39). Briefly, they were cultured in Dulbecco's modified Eagle's medium/F12 medium with 20% knock-out serum replacement, 1 mM L-glutamine, 1% nonessential amino acids, 0.1 mM β-mercaptoethanol, and 4 ng/ml basic fibroblast growth factor (Invitrogen) on mitomycin C-treated mouse embryonic fibroblast feeders at 37°C and 5% CO2. After 5 days of culture, colonies were detached mechanically from the feeder cells with a micropipette, and individual colonies were mechanically divided into four or five pieces. These ES cell clumps were then separately plated on fresh feeder cell layers. HEK293T or NIH3T3 cells were maintained in Dulbecco's modified Eagle's medium with 10% heat-inactivated fetal calf serum (Invitrogen), penicillin, and streptomycin.
RT-PCRs for Oct-4 downstream target genes and differentiation marker genes were performed with gene-specific primer sets as described previously (9,11,30,41). Alkaline phosphatase was stained with an AP staining kit (Sigma).
Quantitative Real Time PCR-Quantitative real time PCR was performed with an Applied Biosystems 7500 Fast real time PCR system (Applied Biosystems) and SYBR Green Master Mix (Applied Biosystems), as described previously (43). As a control, the level of GAPDH mRNA was determined in the real time PCR assay of each RNA sample and was used to correct for experimental variation. The following primer sequences were used: hOCT-4A forward primer was 5'-CTCCTGGAGGGCCAGGAATC-3', and hOCT-4A reverse primer was 5'-CCACATCGGCCTGTGTATAT-3'. The hOCT-4B forward primer was 5'-ATGCATGAGTCAGTGAACAG-3', and the hOCT-4B reverse primer was 5'-CCACATCGGCCTGTGTATAT-3'. The GAPDH forward primer was 5'-GAAGGTGAAGGTCGGAGTC-3', and the GAPDH reverse primer was 5'-GAAGATGGTGATGGGATTTC-3'. The relative expression levels of the human OCT-4 isoforms were quantified by normalizing to endogenous GAPDH using the ΔCT method.
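The ΔCT normalization described above can be illustrated with a minimal sketch. It assumes that the template doubles every cycle (100% amplification efficiency), and the Ct values and function names are hypothetical, chosen only to show the arithmetic; they are not measurements from this study.

```python
def delta_ct(ct_target: float, ct_reference: float) -> float:
    """Ct of the target transcript minus Ct of the reference gene (here GAPDH)."""
    return ct_target - ct_reference


def relative_level(ct_target: float, ct_reference: float) -> float:
    """Target abundance relative to the reference gene.

    Assumes perfect doubling per cycle, so one cycle equals a 2-fold difference.
    """
    return 2.0 ** (-delta_ct(ct_target, ct_reference))


# Hypothetical Ct values, for illustration only.
ct_gapdh = 18.0
ct_oct4a = 22.0
ct_oct4b = 25.0

level_a = relative_level(ct_oct4a, ct_gapdh)   # 2 ** -4
level_b = relative_level(ct_oct4b, ct_gapdh)   # 2 ** -7
ratio_a_to_b = level_a / level_b               # 2 ** 3

print(f"OCT-4A / OCT-4B = {ratio_a_to_b:.1f}")  # prints "OCT-4A / OCT-4B = 8.0"
```

Under this assumption, a difference of three cycles between the GAPDH-normalized Ct values of two transcripts corresponds to an 8-fold difference in abundance.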
ES Cell-based Complementation Assay-ES cell-based complementation assays were performed with ZHBTc4 ES cells as described previously (44). Fluorescence was detected with a fluorescence microscope (Olympus, IX51) equipped with a Cool-SNAP digital camera (Olympus).
Expression and Purification of GST Fusion Proteins-GST-OCT-4 proteins were generated in Escherichia coli as described previously (45). After binding to glutathione-Sepharose, the proteins were washed and eluted with reduced glutathione (Sigma). Protein concentrations were determined by the method of Bradford (Bio-Rad). The purity and size of the eluted proteins were evaluated by Coomassie staining of SDS-polyacrylamide gels.
Western Blot Analysis-Western blot analysis was performed using an anti-OCT-4 antibody (C-20; Santa Cruz Biotechnology), and reactive bands were detected by chemiluminescence using Western Lightning (PerkinElmer Life Sciences).
Transfection and Reporter Assays-Cells were transiently transfected by electroporation with the Gene Pulser II RF module system (Bio-Rad), as instructed by the manufacturer. Luciferase assays were performed with the dual-luciferase assay system (Promega). Renilla luciferase activity was used to normalize transfection efficiencies.
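The normalization used in the reporter assays can be sketched as follows. This is an illustrative calculation, not the authors' analysis script: the function names and luminometer readings are hypothetical, and fold induction is taken relative to the mean normalized activity of an empty-vector control.

```python
from statistics import mean, stdev


def normalized_activity(firefly: float, renilla: float) -> float:
    """Firefly signal divided by the co-transfected Renilla signal."""
    if renilla <= 0:
        raise ValueError("Renilla reading must be positive")
    return firefly / renilla


def fold_induction(sample, control):
    """Return (mean, standard deviation) of fold induction.

    sample and control are lists of (firefly, renilla) readings from
    independent transfections; the control is the empty expression vector.
    """
    control_mean = mean([normalized_activity(f, r) for f, r in control])
    folds = [normalized_activity(f, r) / control_mean for f, r in sample]
    return mean(folds), stdev(folds)


# Hypothetical readings from three independent transfections.
empty_vector = [(100.0, 50.0), (120.0, 60.0), (90.0, 45.0)]
activator = [(1000.0, 50.0), (1100.0, 50.0), (900.0, 50.0)]

mean_fold, sd_fold = fold_induction(activator, empty_vector)
print(f"fold induction = {mean_fold:.1f} +/- {sd_fold:.1f}")
```

Dividing by the Renilla signal first removes differences in transfection efficiency between wells, so the remaining variation reflects the activity of the firefly reporter.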
Subcellular Localization Experiments-Full-length human OCT-4A or OCT-4B cDNAs were subcloned into the BamHI/EcoRI sites of pBabePuro. NIH3T3 cells were plated on glass coverslips and infected with high titer retrovirus stocks produced by transient transfection of Phoenix cells (47). Immunocytochemical analyses were performed as described previously (45). Briefly, the infected cells were washed in phosphate-buffered saline (PBS) and fixed for 10 min at -20°C with acetone/methanol (1:1, v/v). To detect human OCT-4A or OCT-4B, we used anti-OCT-4 antibody (C-20; Santa Cruz Biotechnology) and horseradish peroxidase-conjugated secondary antibody (Santa Cruz Biotechnology), and fluorescence was detected with a confocal laser scanning microscope (LSM5 Pascal; Carl Zeiss Co., Ltd.).

Identification of OCT-4 Isoforms in Human Embryonic Stem Cells-Transcripts of the human OCT-4 isoforms (OCT-4A and OCT-4B) were first observed as RT-PCR products during a search for transcription factors containing POU domains in adult human pancreatic islets (36). We examined OCT-4 isoforms in human embryonic stem cells by RT-PCR using total human ES cell (Miz-hES1) RNA with specific oligonucleotide primers flanking the coding region corresponding to the N-terminal domain of OCT-4 proteins. As shown in Fig. 1A, OCT-4A mRNA was generated from exon 1a (labeled E1a), the 3'-half of exon 1b (E1b), and exons 2-4 (E2, E3, and E4) using an internal splicing acceptor site in exon 1b. OCT-4B mRNA contains exons 1b to 4 (Fig. 1A). PCR products of the expected size of 532 and 247 bp derived from human OCT-4A (Fig. 1B, lane 1) and OCT-4B (lane 2) mRNA, respectively, were observed using human ES cell RNA. These RT-PCR products were sequenced and confirmed to represent nucleotides 1-532 and 1-247 of the reported human OCT-4A or OCT-4B cDNA sequences, respectively (data not shown). These results point to the existence of two isoforms of human OCT-4 differing in their N-terminal domain, as reported previously (36). Using quantitative real time PCR, we were also able to quantify levels of mRNA for the human OCT-4 isoforms. In human ES cells (Miz-hES1) human OCT-4A mRNA was 8-fold more abundant than human OCT-4B (Fig. 1C, bars 1 and 2). Similar results were obtained in other human ES cell lines (SNU-hESC3 ( (11). It can be propagated as a mouse ES cell line in the absence of tetracycline, in which condition the Oct-4 transgene is active, but not in the presence of tetracycline, which represses the transgene (44). In this system, the tetracycline analogue, doxycycline, can be used to prevent expression of OCT-4. Addition of doxycycline to the growth medium of ZHBTc4 ES cells resulted in rapid repression of OCT-4 expression, as determined by Western blotting of total cell extracts (Fig. 2A).
To test whether the stem cell phenotype can be rescued by transfection of the human OCT-4 isoforms, OCT-4A and OCT-4B cDNAs under the control of the constitutive CAG expression unit were each transfected into ZHBTc4 ES cells. To identify the transfected ES cells, we constructed plasmids expressing green fluorescent protein fusions of the human OCT-4 isoforms. The structures of the expression vectors used are shown schematically in Fig. 2B. Consistent with previous reports (11,44), the mouse OCT-4-transfected ZHBTc4 ES cells differentiated in the absence of doxycycline because of superthreshold production of OCT-4 (Fig. 2C, panels b and f). In addition, growth of the transfectants in the presence of doxycycline rescued their self-renewal ability and stem cell phenotype (Fig. 2C, panels j and n). As a control, ZHBTc4 ES cells transfected with EGFP and cultured in the presence of doxycycline underwent differentiation (Fig. 2C, panels i and m).
To investigate the behavior of the OCT-4 gene products of different species, we introduced an expression construct for human OCT-4A into the ZHBTc4 ES cells and performed the ES cell-based complementation assay. The ZHBTc4 ES cells cultured in the absence of doxycycline again differentiated because of superthreshold production of OCT-4 (Fig. 2C, panels c and g), and their self-renewal ability was rescued in the presence of doxycycline, indicating that human OCT-4A protein is active in mouse ZHBTc4 ES cells (Fig. 2C, panels k and o). However, contrary to expectation, when human OCT-4B was introduced into ZHBTc4 ES cells, it failed to rescue stem cell renewal in the presence of doxycycline (Fig. 2C, panels l and p). These data indicate that the abilities of human OCT-4A and OCT-4B isoforms to confer self-renewal on ES cells differ.
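The logic of the complementation assay can be summarized in a small model. This is a simplified sketch, not part of the experimental work: it counts sources of functional OCT-4 (the doxycycline-repressible mouse transgene plus the transfected construct) and assumes that exactly one source supports self-renewal. The function and table names are hypothetical, and the outcome for hOCT-4B without doxycycline is a prediction of the model rather than a result stated in the text.

```python
# Whether each transfected construct supplies functional OCT-4 in ZHBTc4 cells,
# based on the outcomes described above.
SUPPLIES_FUNCTIONAL_OCT4 = {
    "EGFP": False,
    "mOCT-4": True,
    "hOCT-4A": True,
    "hOCT-4B": False,
}


def expected_outcome(construct: str, doxycycline: bool) -> str:
    """Predict the ES cell phenotype from the number of functional OCT-4 sources."""
    # Tet-off system: doxycycline represses the mouse Oct-4 transgene.
    transgene_on = not doxycycline
    sources = int(transgene_on) + int(SUPPLIES_FUNCTIONAL_OCT4[construct])
    if sources == 0:
        return "differentiation (no functional OCT-4)"
    if sources == 1:
        return "self-renewal"
    return "differentiation (superthreshold OCT-4)"


for construct in SUPPLIES_FUNCTIONAL_OCT4:
    for dox in (False, True):
        print(f"{construct:8s} dox={dox!s:5s} -> {expected_outcome(construct, dox)}")
```

In this model a construct rescues self-renewal only if it substitutes for the repressed transgene, which is the criterion that human OCT-4A meets and human OCT-4B does not.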
Different Abilities of Human OCT-4 Isoforms to Maintain the Undifferentiated State-The flat morphology of doxycycline-treated ZHBTc4 ES cells expressing human OCT-4B (Fig. 2C) suggested failure to maintain the undifferentiated state of the ES cells. To verify this hypothesis, we examined known molecular markers of undifferentiated ES cells. To stain for alkaline phosphatase activity (a marker of pluripotent cells of embryonic origin) (48), the pCAG-IP/EGFP, pCAG-IP/mOCT-4-EGFP, pCAG-IP/hOCT-4A-EGFP, and pCAG-IP/hOCT-4B-EGFP constructs (Fig. 2B) were linearized with PvuI, and 50 μg of each linearized plasmid DNA was transfected into ZHBTc4 ES cells (1 × 10⁷) using a Gene Pulser II RF module system. At 48 h post-electroporation, puromycin was added to the medium at a final concentration of 1 μg/ml to select transfected ZHBTc4 ES clones. After selection of pCAG-IP/EGFP-, pCAG-IP/mOCT-4-EGFP-, pCAG-IP/hOCT-4A-EGFP-, or pCAG-IP/hOCT-4B-EGFP-transfected ZHBTc4 cells, 4 × 10³ transfectants were seeded onto 35-mm dishes containing mitomycin C-treated mouse embryonic fibroblast feeder layers. After 2 days of culture, the cells were transferred to medium containing 1 μg/ml doxycycline to repress the tetracycline-repressible mouse Oct-4 transgene. After another 4 days, colonies resistant to puromycin were tested for OCT-4-EGFP protein expression by examining green fluorescence and staining the cells for alkaline phosphatase. ZHBTc4 ES cells expressing human OCT-4A contained alkaline phosphatase activity (Fig. 3A, panels e and f) comparable with that in ZHBTc4 ES cells expressing mouse OCT-4 (Fig. 3A, panels c and d). However, the clones expressing human OCT-4B (Fig. 3A, panels g and h) or EGFP alone (Fig. 3A, panels a and b) had lost this characteristic, pointing to failure to maintain the undifferentiated state.
We also evaluated the expression of markers of differentiated cells. As shown in Fig. 3B, expression of Cdx-2 mRNA, which is implicated in trophoblast differentiation (52), was detected in ZHBTc4 ES cells expressing human OCT-4B or vector but was not present in ZHBTc4 ES cells expressing mouse OCT-4 or human OCT-4A. Fgf-5 mRNA, a marker of primitive ectoderm (53), also appeared in ZHBTc4 ES cells expressing human OCT-4B, indicating formation of the ectoderm lineage. These properties all suggest that human OCT-4A-expressing ES cells expanded normally and remained in an undifferentiated status, whereas human OCT-4B-expressing ES cells did not.
FIGURE 3. Characterization of ZHBTc4 ES cells expressing human OCT-4 isoforms. A, expression of alkaline phosphatase. Alkaline phosphatase activity was assessed in ZHBTc4 ES cells expressing human OCT-4 isoforms. The expression vectors used in this experiment were the same as those in Fig. 2B. B, expression of OCT-4 downstream target genes and lineage-specific markers. RT-PCR analyses of Fgf-4, Rex-1, Sox-2, Nanog, Cdx-2, and Fgf-5 mRNAs were performed in ZHBTc4 ES cells expressing vector, mouse OCT-4, or human OCT-4 isoforms. Hprt was used as a control to quantify the RT-PCR results. Following amplification, an aliquot of each product was analyzed by staining the gel with ethidium bromide. The ES cell lines from which the input RNAs used in the RTs are derived are shown above the panel.
DNA Binding Properties of Human OCT-4 Isoforms-Because the results presented above indicate that human OCT-4B is not able to induce stem cell self-renewal, we examined its biochemical characteristics. To see whether it can bind to an authentic OCT-4 DNA recognition sequence, we studied the DNA binding properties of the human OCT-4 isoforms using EMSAs. An oligonucleotide containing the consensus OCT-4 DNA-binding sequence was synthesized and used as target in the binding reactions. The structures of the OCT-4 isoform GST fusions used in this study are shown schematically in Fig. 4A. They were expressed in E. coli, purified, and coupled to glutathione-Sepharose beads. The affinity-purified OCT-4 isoforms were fractionated by SDS-PAGE, transferred to a polyvinylidene difluoride membrane, and immunoblotted with an anti-OCT-4 antibody (C-20; Santa Cruz Biotechnology) to quantify the amount of protein in each sample. As shown in Fig. 4B, the purified fractions contained equal amounts of protein.
EMSAs were performed with the concentration of the OCT-4 probe kept constant and the amount of input protein varied. GST-OCT-4A bound to the DNA (Fig. 4C, lanes 5-7), and binding was specific, as it was displaced by an excess of cold oligonucleotide containing the OCT-4-binding site but not by an excess of cold oligonucleotide containing a mutant OCT-4-binding sequence not recognized by OCT-4 (data not shown). However, the OCT-4B isoform hardly bound at all (Fig. 4C, lanes 8-10). These results indicate either that the N-terminal domain of OCT-4A is required for efficient DNA binding or that the 40 amino acids present in the N-terminal domain of OCT-4B inhibit binding.
The N-terminal Domain of the OCT-4B Isoform Inhibits DNA Binding-To distinguish between these possibilities, we produced a number of mutants with deletions of the N or C termini of OCT-4B, and we evaluated their DNA binding properties. The structure of the OCT-4B deletion and domain-swapping mutants is shown schematically in Fig. 5A. The superscripts "A" and "B" indicate whether a particular domain was derived from OCT-4A or OCT-4B. As a result of alternative splicing, the OCT-4B POU domain (total 154 amino acids) lacks two amino acids of the N-terminal sequence of the OCT-4A POU domain (total 156 amino acids) (Fig. 1D) (36). EMSAs with the POU domains of OCT-4A and OCT-4B on their own revealed that both bound the probe (Fig. 5B, lanes 5-7 and 8-10). Moreover, the addition of the C-terminal domain of OCT-4B to the POU DNA binding domain of OCT-4B (OCT-4B (PᴮC)) produced a recombinant protein still capable of forming a protein-DNA complex (Fig. 5C, lanes 8-10), whereas the addition of the N-terminal domain of OCT-4B (OCT-4B (NᴮPᴮ)) essentially abolished DNA binding (Fig. 5C, lanes 5-7). These results show that the N-terminal domain of OCT-4B inhibits POU DNA binding.
To further test the contribution of the N-terminal domain of OCT-4B to DNA binding, we generated a series of chimeric proteins in which the order of the domains remained constant, i.e. NTD-POU-CTD, but domains were swapped between OCT-4A and OCT-4B. As shown in Fig. 5D, OCT-4A (NᴮPᴬC) was unable to bind the OCT-4 probe, whereas OCT-4B (NᴬPᴮC) was able to bind it. This confirms that the N-terminal domain of OCT-4B contains some sequence or sequences that inhibit DNA binding and that this inhibition is active on POUᴬ as well as POUᴮ.
Two Separate Regions in OCT-4B NTD Are Responsible for Inhibiting OCT-4B DNA Binding-Fusion of the N-terminal domain of OCT-4B to the POU and C-terminal domains of OCT-4A revealed inhibition of DNA binding by the 40-amino acid sequence of OCT-4B (Fig. 5D). To define the minimum inhibitory domain, we tested partial deletions of the N terminus in which either amino acids 21-40 or 1-20 were deleted. The structure of the OCT-4B deletion mutants used is shown in Fig.  6A. These truncation mutants were expressed as GST fusion proteins in E. coli and purified to near homogeneity in roughly equal yield (Fig. 6B). As shown in Fig. 6C, neither partial deletion mutant was able to bind DNA. These results demonstrate that the N-terminal domain of OCT-4B possesses at least two independently acting sequences that inhibit POU DNA binding.
OCT-4A Is Nuclear whereas OCT-4B Is Cytoplasmic-We also determined the intracellular locations of OCT-4A and OCT-4B. For this purpose, we infected NIH3T3 cells with a retroviral expression construct, pBabePuro, containing the entire OCT-4 coding region. A pBabePuro retroviral vector lacking OCT-4 cDNA sequences was used as negative control. The cells were infected with either empty pBabePuro expression vector (data not shown), pBabePuro-OCT-4A (Fig. 7A,  panels a and b), or pBabePuro-OCT-4B (Fig. 7A, panels c and d) and processed for immunofluorescence. We previously showed that mouse OCT-4 protein is localized to the nucleus (35). In accordance with this previous result, human OCT-4A was clearly localized to the nucleus (Fig. 7A, panel  a). However, contrary to our expectation, we found that most of the human OCT-4B protein was cytoplasmic (Fig. 7A, panel  c). We also transfected COS-7 cells with pcDNA3-OCT-4B or pEGFP-OCT-4B with similar results (data not shown).
To confirm these results in cells possessing stem cell properties, the subcellular distribution of the human OCT-4 isoforms was also determined in ZHBTc4 ES cells (Fig. 7B). pCAG-IP/EGFP, pCAG-IP/hOCT-4A-EGFP, and pCAG-IP/hOCT-4B-EGFP constructs that express EGFP, OCT-4A-EGFP (GFP fusion human OCT-4A), and OCT-4B-EGFP (GFP fusion human OCT-4B), respectively (Fig. 2B), were transfected into ZHBTc4 ES cells, and the subcellular locations of these proteins were detected by fluorescence microscopy. In the ZHBTc4 ES cells, EGFP-tagged OCT-4A was clearly localized to the nucleus (Fig. 7B, panels c and d), whereas most of the EGFP-tagged OCT-4B was cytoplasmic (Fig. 7B, panels e and f). EGFP alone was found in both the nucleus and cytoplasm of ES cells (Fig. 7B, panels a and b). Taken together, these data suggest that OCT-4A is a nuclear protein whereas OCT-4B is a cytoplasmic protein.
OCT-4B Is Unable to Activate Transcription of OCT-4-responsive Genes-OCT-4B lacks the N-terminal sequence that mediates the transcriptional activity of OCT-4 (13). In addition, its N-terminal domain inhibits DNA binding (Figs. 5 and 6), and it is cytoplasmic in location (Fig. 7). All these properties suggest that it may not be a transcriptional activator. To address this question, we performed transient transfection assays. 293T cells were transfected with a luciferase reporter plasmid with 10 consensus OCT-4-binding sites in its promoter. Cotransfected OCT-4A increased the transcription of this reporter 50-fold (Fig. 8A, black bars), whereas OCT-4B had no effect (Fig. 8A, white bars). Similarly, co-expression of OCT-4A stimulated gene expression from the FOR (farnesoid X receptor-like orphan receptor) promoter containing two OCT-4-binding sites, whereas co-expression of OCT-4B did not (Fig. 8B).
To determine whether OCT-4B can inhibit transactivation by OCT-4A, we co-transfected the expression vectors for OCT-4B and OCT-4A with an OCT-4-dependent reporter gene. In the absence of OCT-4B, OCT-4A increased reporter gene activity severalfold, as expected (Fig. 8C, bar 2), and increasing input levels of OCT-4B had no inhibitory effect (Fig.  8C, bars 3 and 4). Hence, we may conclude that overexpression of OCT-4B does not interfere with transactivation by OCT-4A.

DISCUSSION
Numbers of stem cells, and their decisions to differentiate, must be tightly controlled during embryonic development and in the adult animal to avoid premature senescence or tumor formation. Embryonic and adult stem cells share the properties of self-renewal and multiple developmental potential, suggesting the presence of common cellular machinery. Thus, greater understanding of the molecular determinants responsible for these properties is desirable. Accordingly, there is growing interest in the functional characterization of OCT-4 in embryonic stem cells.
The results reported here begin to characterize the human OCT-4 isoforms. Two isoforms of human OCT-4 are generated by alternative splicing (Fig. 1A). Oct-4 encodes a POU transcription factor that is expressed by all pluripotent cells during embryogenesis and is also abundantly expressed by ES, embryonic germ, and embryonic carcinoma cell lines (2,3,7,57,58). Differentiation of pluripotent cells to somatic lineages occurs at the blastocyst stage and during gastrulation, coincident with down-regulation of OCT-4. The unique oct-4 expression pattern in the mouse embryo has also led to the hypothesis of a pluripotent cycle (27). Consistent with this speculation, oct-4-null embryos die at implantation because of a failure to form the inner cell mass (9). These results demonstrate that oct-4 is required to prevent somatic differentiation of the inner cell mass and is important for maintaining the undifferentiated state during embryonic development.
Despite the fact that OCT-4A function is critical for maintaining the pluripotency of embryonic stem cells and promoting tumorigenesis in human tissues, little is known of the biological properties of the OCT-4B isoform. First, we investigated the existence of the two OCT-4 isoforms in human ES cells by RT-PCR, because although two human OCT-4 isoforms have been reported to be expressed in adult human pancreatic islets (36), it was not known whether they were also expressed in human ES cells. Two OCT-4 isoforms were indeed detected in the human ES cells. In addition, human OCT-4A was more abundant in human ES cells than OCT-4B (Fig. 1).
OCT-4 is structurally and functionally divided into three domains (see Fig. 1D). The N-terminal 133 amino acid residues of OCT-4A encompass a transcriptional activation region that is active in various cultured cell types. Amino acid residues 134-289 form the central POU domain of the protein that binds to DNA in a sequence-specific fashion. The third domain between amino acid residues 290 and 360 also controls the transactivation function of OCT-4, but its activity is cell type-specific (13). As shown in Fig. 1D, human OCT-4A and OCT-4B mRNA encode proteins that share POU DNA binding and C-terminal domains but differ in sequence at their N termini. Because the N-terminal domain of OCT-4 functions as a transactivation domain, we measured the transactivation potential of OCT-4B. OCT-4A functioned as a transcriptional activator as reported previously (33), whereas OCT-4B did not appear to do so (Fig. 8).
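The domain boundaries quoted in the text can be checked for internal consistency with a short calculation. The sketch below uses only figures stated in this paper (OCT-4A residues 1-133, 134-289, and 290-360; a 40-residue OCT-4B N-terminal domain; an OCT-4B POU domain two residues shorter than that of OCT-4A; a shared C-terminal domain); the variable names are illustrative.

```python
# OCT-4A domain boundaries (inclusive residue numbers) as given in the text.
OCT4A_DOMAINS = {"NTD": (1, 133), "POU": (134, 289), "CTD": (290, 360)}


def span_length(span):
    """Number of residues in an inclusive (start, end) span."""
    start, end = span
    return end - start + 1


oct4a_lengths = {name: span_length(span) for name, span in OCT4A_DOMAINS.items()}

# OCT-4B: 40-residue NTD, POU domain lacking two residues, identical CTD.
oct4b_lengths = {
    "NTD": 40,
    "POU": oct4a_lengths["POU"] - 2,
    "CTD": oct4a_lengths["CTD"],
}

print(oct4a_lengths, sum(oct4a_lengths.values()))
print(oct4b_lengths, sum(oct4b_lengths.values()))
```

The totals come to 360 residues for OCT-4A and 265 residues for OCT-4B, in agreement with the 265-amino acid protein reported for OCT-4B.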
We also demonstrated that the N-terminal domain of OCT-4B inhibits the sequence-specific DNA binding of OCT-4 protein via the central POU domain (Figs. 4 and 5). Moreover, regions containing amino acids 1-20 and 21-40 both inhibited POU sequence-specific DNA binding, indicating that both regions independently maintain OCT-4B in the latent state with little or no affinity for its target sequences (Fig. 6).
FIGURE 8. A, 293T cells were co-transfected with expression vectors encoding the OCT-4A (black bars) or OCT-4B (white bars) isoforms, the pOCT-4(10x)TATAluc reporter vector, and the Renilla luciferase normalizing vector. Firefly luciferase activity was normalized with Renilla luciferase activity to correct for transfection efficiencies. Each transfection was performed at least three times independently, and the mean value is plotted with the standard deviation (vertical bars). Fold induction is relative to the empty expression vector. B, transcriptional activation by the human OCT-4 isoforms using pFORluc reporter vector. 293T cells were co-transfected with expression vectors encoding the OCT-4A (black bars) or OCT-4B (white bars) isoforms, the pFORluc reporter vector, and the Renilla luciferase normalizing vector. Firefly luciferase activity was normalized with Renilla luciferase activity to correct for transfection efficiencies. Each transfection was performed at least three times independently, and the mean value is plotted with the standard deviation (vertical bars). Fold induction is relative to the empty expression vector. C, human OCT-4B isoform does not interfere with transcriptional activation by OCT-4A. The pOCT-4(10x)TATAluc reporter was co-transfected into 293T cells with 0.25 μg of human OCT-4A and the indicated amounts of OCT-4B isoform. The average fold induction of transcription (normalized firefly luciferase activity) and standard error of three independent experiments are presented.
Classical NLS sequences contain regions rich in basic amino acids and generally conform to one of three motifs (34). The first type of NLS consists of a continuous stretch of four basic amino acids (lysine or arginine) or three basic amino acids together with a histidine or proline. The second type of NLS starts with a proline and is followed within three residues by an amino acid sequence containing three out of four basic residues. The third type of NLS, known as a bipartite motif, consists of two basic amino acids, a 10-amino acid spacer, and a 5-amino acid sequence containing at least three basic residues. It has been reported recently that OCT-4A harbors a conserved nuclear localization signal, RKRKR, in its POU DNA binding domain (40). Therefore, one consequence of fusing the NTDᴮ to the POU DNA binding domain should be nuclear targeting. Surprisingly, however, OCT-4B mostly localized to the cytoplasm, whereas OCT-4A localized to the nucleus, as expected (Fig. 7). These data suggest two possible models. One is that the NTD of OCT-4B protein contains a nuclear export signal that is recognized by transport receptors that carry it across the nuclear envelope to the cytoplasm. The other is that the NLS within the POU domain is blocked or buried by the OCT-4B NTD. Additional experiments will be required to distinguish between these possibilities.
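The three classical NLS motif types described above can be expressed as a simple sequence scan. The following is a minimal sketch and not the method used in this study: the function names are hypothetical, and the second motif is interpreted here as a four-residue basic segment that begins at most three positions after the proline. Positions are reported 1-based.

```python
BASIC = "KR"  # lysine and arginine


def count_basic(segment: str) -> int:
    return sum(residue in BASIC for residue in segment)


def find_type1(seq: str):
    """Four-residue windows with four K/R, or three K/R plus one H or P."""
    hits = []
    for i in range(len(seq) - 3):
        window = seq[i:i + 4]
        n_basic = count_basic(window)
        n_hp = sum(residue in "HP" for residue in window)
        if n_basic == 4 or (n_basic == 3 and n_hp == 1):
            hits.append((i + 1, window))
    return hits


def find_type2(seq: str):
    """A proline followed within three residues by a four-residue segment
    containing at least three K/R."""
    hits = []
    for i, residue in enumerate(seq):
        if residue != "P":
            continue
        for start in range(i + 1, i + 4):
            segment = seq[start:start + 4]
            if len(segment) == 4 and count_basic(segment) >= 3:
                hits.append((i + 1, seq[i:start + 4]))
                break
    return hits


def find_bipartite(seq: str):
    """Two K/R, a 10-residue spacer, then five residues with at least three K/R."""
    hits = []
    for i in range(len(seq) - 16):
        if seq[i] in BASIC and seq[i + 1] in BASIC:
            if count_basic(seq[i + 12:i + 17]) >= 3:
                hits.append((i + 1, seq[i:i + 17]))
    return hits


# The RKRKR signal reported for the OCT-4A POU domain matches the first type.
print(find_type1("RKRKR"))
```

Applied to the pentapeptide RKRKR alone, the scan reports two overlapping four-residue matches of the first type, RKRK and KRKR.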
There are several oct-4-like genes in the mouse and human genomes. The human OCT-4 gene has been mapped to the region of the major histocompatibility complex on chromosome 6 and spans about 7 kb (36). The cognate mouse Oct-4 gene is on chromosome 17 in a region that is syntenic with the major histocompatibility complex region of human chromosome 6 (1,49). The pseudogene-like sequence for human OCT-4 has been mapped to human chromosome 8 and is classified as a retroposon (36). Such pseudogenes arise from the action of reverse transcriptases; cellular mRNAs can be copied into DNA by this enzyme, and the resulting DNA fragments can be reintegrated into the genome at a low rate. These pseudogenes have in common that they lack introns and have an additional poly(A) sequence (54,55). Interestingly, the sequence of the oct-4 pseudogene homologous to oct-4 mRNA has 97.5% nucleotide sequence identity with the sequence of oct-4 cDNA. However, although human OCT-4 consists of five exons, the oct-4 pseudogene lacks introns. In addition, it has a poly(A) tract at its 3'-end and flanking direct repeats (5'-GAAAAGTAACATAATT-3') at both ends of the gene, indicating that it is a retroposon (36). A comparison of the sequences of human OCT-4 and of the OCT-4A and OCT-4B cDNA clones clearly indicates that OCT-4A and OCT-4B mRNAs are derived from the oct-4 gene by alternative splicing, not from the oct-4 pseudogene (36).
It is not easy to evaluate the role of human OCT-4B in embryonic stem cells. The N-terminal domain of human OCT-4A, unlike that of human OCT-4B, is rich in glycine and proline residues and appears to have a transcriptional activation domain. An online search with the OCT-4B NTD sequence failed to reveal any functional motifs or homology with other proteins. However, analysis of its protein sequence revealed several potential serine, threonine, and tyrosine phosphorylation sites. It is possible that extranuclear OCT-4B is modified by several signaling molecules. Consistent with this speculation, it has been reported that OCT-4 exists as a phosphoprotein in embryonic carcinoma cells (13). Thus, it will be important to assess whether human cytoplasmic OCT-4B is involved in signal transduction.
In a broader context, our findings suggest that the OCT-4B isoform may play a different role than the unique function of the OCT-4A protein in self-renewal. Consistent with this speculation, expression of human OCT-4B alone is not sufficient to maintain stem cell self-renewal (Fig. 2C) and undifferentiated state (Fig. 3). These failures can be explained by the loss of transactivation ability (Fig. 8), which can in turn be accounted for by its localization in the cytoplasm (Fig. 7) as well as its inability to bind to the OCT-4 consensus motif (Fig. 4). Because OCT-4 is expressed in human testicular germ cell tumors and its expression transforms and endows tumorigenicity in nude mice (42,56), it would also be interesting to determine whether OCT-4B can collaborate with OCT-4A in transforming nontumorigenic cells.