A Conserved Transcription Motif Suggesting Functional Parallels between Caenorhabditis elegans SKN-1 and Cap‘n’Collar-related Basic Leucine Zipper Proteins*

In Caenorhabditis elegans, the predicted transcription factor SKN-1 is required for embryonic endodermal and mesodermal specification and for maintaining differentiated intestinal cells post-embryonically. The SKN-1 DNA-binding region is related to the Cap‘n’Collar (CNC) family of basic leucine zipper proteins, but uniquely, SKN-1 binds DNA as a monomer. CNC proteins are absent in C. elegans, however, and their involvement in the endoderm and mesoderm suggests some functional parallels to SKN-1. Using a cell culture assay, we show that SKN-1 induces transcription and contains three potent activation domains. The functional core of one domain is a short motif, the DIDLID element, which is highly conserved in a subgroup of vertebrate CNC proteins. The DIDLID element is important for SKN-1-driven transcription, suggesting a likely significance in other CNC proteins. SKN-1 binds to and activates transcription through the p300/cAMP-responsive element-binding protein-binding protein (CBP) coactivator, supporting the genetic prediction that SKN-1 recruits the C. elegans p300/CBP ortholog, CBP-1. The DIDLID element appears to act independently of p300/CBP, however, suggesting a distinct conserved target. The evolutionary preservation of the DIDLID transcriptional element supports the model that SKN-1 and some CNC proteins interact with analogous cofactors and may have preserved some similar functions despite having divergent DNA-binding domains.

During development, establishment of cell fates frequently involves conserved regulatory pathways. In the early Caenorhabditis elegans embryo, endodermal (intestinal) fates are specified by GATA family transcription factors (END-1, END-3, ELT-2) that are related to those that mediate endoderm development in Drosophila (Serpent) and vertebrates (GATA-4/5/6) (1-3). This program is triggered by maternally expressed SKN-1, a predicted transcription factor (4) that also specifies mesodermal lineages (pharynx and some body wall muscle) (5). The presence of consensus SKN-1-binding sites adjacent to the end-1 gene along with the timing of its expression suggests that SKN-1 may activate end-1 directly (2). SKN-1 is also required in the endoderm post-embryonically, to prevent differentiated intestinal cells from undergoing severe atrophy (5).
Apparent SKN-1 orthologs have not been identified outside of nematodes, but in its DNA-binding region, SKN-1 is related to a subgroup of basic leucine zipper transcription factors (4, 5), the CNC proteins. A basic DNA-binding region at the SKN-1 COOH terminus is particularly similar to those of CNC proteins, but SKN-1 lacks a zipper dimerization domain (Fig. 1, A and C) and, uniquely, binds DNA as a monomer (4). Its DNA binding requires the adjacent α-helical "CNC region" (Fig. 1, A and C) (4, 6-8), which is otherwise found only in CNC proteins. An adjacent SKN-1 element that is lacking in CNC proteins, the NH2-terminal arm (Fig. 1, A and C), contributes additional binding affinity and specificity (6, 9). These similarities and differences pose the question of whether SKN-1- and CNC-related proteins simply bind DNA through related but divergent mechanisms or might share a closer functional relationship.
Like SKN-1, many CNC proteins are involved in the development or function of endodermal or mesodermal cells. Drosophila CNC is required for specification of pharyngeal segments (10), and vertebrate p45 NF-E2 is involved in hematopoiesis (11). In mice, different knockouts of the Nrf1 (NF-E2 related factor-1; also LCRF-1/TC11) gene either cause a fetal liver microenvironment defect (12) or appear to block cell-to-cell induction of the mesoderm, a function usually ascribed to endodermal cells (13). Both NRF1 and NRF2 directly induce expression of detoxification enzymes (14-17), a pathway that is markedly stimulated in the liver and intestine (14). Supporting the idea that some CNC protein functions might parallel those of SKN-1, neither they nor their dimerization partners, the Maf basic leucine zipper proteins (18, 19), appear to be encoded in the complete C. elegans genome (data not shown).
Genetic evidence suggests that SKN-1 may interact functionally with CBP-1, the C. elegans ortholog of the p300/CBP transcription coactivators (20). p300/CBP proteins are metazoan histone acetyltransferases that are involved in developmental and inducible gene expression and that are recruited by numerous activators (21), including the CNC protein p45 NF-E2 (22). C. elegans CBP-1 is required for specification of all nonneuronal embryonic developmental lineages (20). Endodermal differentiation can be restored in embryos that lack either CBP-1 or SKN-1, however, by inhibition of histone deacetylases (20). This suggests that in the endoderm, SKN-1 might recruit the CBP-1 histone acetyltransferase activity directly. It remains possible, nevertheless, that lack of histone deacetylases simply restores expression of downstream genes independently of possible histone acetyltransferase recruitment by SKN-1.
* This work was supported by National Institutes of Health Grants GM50900 (to T. K. B.), GM58012 (to Y. S.), and DK09416 (to A. K. W.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18
We have addressed these issues by investigating how SKN-1 regulates transcription. C. elegans cell lines have not been developed, but given the conservation of the eukaryotic mRNA transcription machinery (23), SKN-1 would be predicted to be active in mammalian cells. Using transfection assays, we show here that SKN-1 is a powerful activator of transcription that contains three transactivation domains. The functional core of one domain consists of a short sequence (the DIDLID element) (Fig. 1B) that is specific to SKN-1-related proteins and to the NRF group of CNC proteins. The p300/CBP proteins appear to be direct cofactors of SKN-1, but not to be critical for the activity of the DIDLID element. The conservation and transcriptional function of the DIDLID element suggest that SKN-1 and the NRF CNC proteins share a common transcriptional protein target in addition to p300/CBP and, during evolution, may have maintained some parallel functions in endodermal cells.

EXPERIMENTAL PROCEDURES
Plasmid Constructions-For analysis of full-length SKN-1, a FLAG epitope was added at its NH2 terminus. A SpeI site was first created by polymerase chain reaction, which added a linker of LV prior to the initial methionine, and then the FLAG epitope (DYKNDDDKDP) was added as an R1/SpeI fragment prior to cloning into the CMV-based vector CS2 (24). To delete the DIDLID motif (ΔDID), an SKN-1-(113-391) fragment was generated by polymerase chain reaction and then substituted for residues 100-391. Activation domain B was disrupted (ΔAD(B)) by deleting a restriction fragment encoding residues 154-205. The double deletion (ΔDID,AD(B)) was made by swapping appropriate restriction fragments. Each deleted region was sequenced. Point mutations were constructed with the QuikChange kit (Stratagene) and confirmed by sequencing. Gal4-SKN-1 fusions were constructed by creating BamHI/XbaI sites at the ends of SKN-1 sequences using the polymerase chain reaction. These were linked to the COOH terminus of the Gal4 DNA-binding domain within pSG424 (25), creating an intervening linker of ISRA. Junctions were sequenced for accuracy. The polymerase chain reaction was performed using Pfu polymerase (Stratagene). The SKN-1/TK-CAT reporter was constructed by blunting and multimerizing the SK1 oligonucleotide (4), which corresponds to the preferred SKN-1-binding site. A fragment with four SK1 sites in the same orientation was then ligated into a blunted SalI site in the TK-CAT reporter (26).
Transfections and Analysis-HeLa cells were used in all transfections and maintained in Dulbecco's modified Eagle's medium plus 10% fetal calf serum. Cells were transfected transiently by calcium phosphate for 18 h at 3% CO2 and then placed in 5% CO2 and harvested 24-26 h later. Transfections for CAT assays were performed in six-well plates and brought to 10 μg of total DNA with Bluescript. Transfections for Western blotting and electrophoretic mobility shift assays were performed in 100-mm plates using 20 μg of total DNA. CAT assays were performed as described (25). An internal control was omitted because some SKN-1 constructs influenced overall gene expression, but not transfection efficiency (data not shown), presumably through squelching. Samples were normalized for total protein concentration. Six independent values were generated for each data point, of which a representative is shown in each figure. Each fell in the linear range for CAT and represented an average of three experiments unless otherwise indicated, with error bars showing S.D. values. Cotransfection of a β-galactosidase vector revealed that in these assays, E1A expression did not cause detectable apoptosis (data not shown).
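The normalization and averaging described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the use of a reporter-alone control as the baseline for fold activation, and every numeric value are hypothetical and are not taken from the experiments reported here.

```python
from statistics import mean, stdev


def normalized_activity(cat_signal, total_protein_ug):
    """Return CAT signal per microgram of total lysate protein."""
    if total_protein_ug <= 0:
        raise ValueError("total protein must be positive")
    return cat_signal / total_protein_ug


def summarize(experiments, baseline):
    """Return (mean fold activation, sample S.D.) over independent experiments.

    experiments: (cat_signal, total_protein_ug) pairs for one construct
    baseline: normalized activity of the control (assumed: reporter alone)
    """
    folds = [normalized_activity(s, p) / baseline for s, p in experiments]
    return mean(folds), stdev(folds)


# Hypothetical values for illustration only (arbitrary signal units).
reporter_alone = normalized_activity(2.0, 10.0)
skn1_runs = [(40.0, 10.0), (55.0, 11.0), (48.0, 12.0)]
avg, sd = summarize(skn1_runs, reporter_alone)
print(f"fold activation: {avg:.1f} +/- {sd:.1f} (S.D., n={len(skn1_runs)})")
```

Dividing by total protein stands in for the omitted internal control: it corrects for differences in lysate yield but, unlike a cotransfected reference reporter, does not correct for transfection efficiency.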
Protein Analysis-Lysates for Western and electrophoretic mobility shift assay analyses were prepared as described (27), and SKN-1 was expressed by in vitro translation using Promega kits. For Western blotting, proteins were separated on a 7.5% gel and transferred to nitrocellulose membranes (Schleicher & Schüll), which were probed with an anti-SKN-1 monoclonal antibody (a gift of J. Priess) and visualized by enhanced chemiluminescence (Amersham Pharmacia Biotech). Electrophoretic mobility shift assays were carried out as described (9) with an extract equivalent of 10% of a 100-mm plate/lane. Binding of SKN-1 to a 32P-labeled SK1 oligonucleotide (4) was assayed and was competed either by excess unlabeled SK1 DNA or by a mutant site (MSK1) to which SKN-1 does not bind specifically (4). Glutathione S-transferase (GST)-p300 proteins were expressed as described (28) and then coupled to glutathione-agarose beads (Sigma) and incubated for 2 h with 35S-labeled SKN-1 in 25 mM Hepes, 250 mM NaCl, 25 mM EDTA, 0.1% Nonidet P-40, and protease inhibitor mixture as described (28). Beads were washed in phosphate-buffered saline plus 0.1% Nonidet P-40 five times, and then proteins were eluted in sample buffer and electrophoresed on an SDS-polyacrylamide gel.

RESULTS
When assayed by transfection in human (HeLa) cells, SKN-1 strongly activated a reporter containing four SKN-1-binding sites (SKN-1/TK-CAT) (Fig. 2), but not the corresponding control reporter lacking those sites (data not shown). SKN-1/TK-CAT was not activated by the SKN-1 DNA-binding domain alone (SKN-1-(449-533)) or by SKN-1 residues 1-391, which encompass its similarity to a C. elegans SKN-1-related gene (SRG-1) (5) outside of the DNA-binding domain (data not shown) (Fig. 2). When fused to the yeast Gal4 DNA-binding domain, however, residues 1-391 (Gal4-SKN-(1-391)) strongly activated a Gal4-based reporter (E1B-CAT) (Fig. 3A) (28). They did not activate a control reporter lacking Gal4 sites (data not shown), suggesting that this SKN-1 region can bind and recruit transcription complexes. Residues 1-448 similarly constituted a powerful activator that was comparable to the viral protein VP16 (Fig. 3A).

A Short Transactivation
Outside of their DNA-binding domains, SKN-1, NRF1, and NRF2 are highly related within a short NH2-terminal motif, which we refer to as "DIDLID" after part of its sequence (Fig. 1, A and B). This motif is also present in SRG-1 and an SKN-1 homolog in the nematode Caenorhabditis briggsae (Fig. 1B), but among CNC proteins, it appears to be restricted to NRF1 and NRF2 orthologs (data not shown). The DIDLID element appears to be present only in SKN-1- and CNC-related proteins, and SKN-1 and NRF1 are more closely related within it (92%) (Fig. 1, B and C) than in their DNA-binding basic regions (86%), suggesting that it has a specific and conserved function. The NRF2 DIDLID element is located adjacent to a domain that is lacking in SKN-1 and that can retain NRF2 in the cytoplasm, but the DIDLID element is not involved in this interaction (29). The DIDLID element includes alternating charged and hydrophobic residues (Fig. 1B), but is not predicted to form an amphipathic α-helix. It is reminiscent, however, of short helical protein-protein interaction modules that are involved in transcription, such as the LXXLL motif in nuclear receptor co-activators (30, 31) and the LDFS motif in E2A proteins (32), suggesting that the DIDLID element might also have a transcriptional function.
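A comparison of relatedness within two regions, like the one above, can be illustrated with a simple calculation over pre-aligned sequences. The sketch below assumes that relatedness is scored as percent identity over ungapped aligned positions, which the text does not specify (it could equally reflect similarity). The function name is hypothetical, and the two sequences are made-up placeholders, not the SKN-1 or NRF1 sequences of Fig. 1B.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned, ungapped positions that carry identical residues.

    Both inputs must already be aligned to the same length; '-' marks a gap.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Placeholder 12-residue fragments (hypothetical, for illustration only).
motif_x = "AQDIDLIDVLWR"
motif_y = "AQDIDLIDILWR"
print(round(percent_identity(motif_x, motif_y), 1))  # 11 of 12 positions match
```

For a motif this short, a single substitution shifts the score by several percentage points, so such values are best read alongside the alignment itself.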

FIG. 2. Activation of transcription by SKN-1.
Full-length SKN-1 (0.5 μg) and the indicated mutants were assayed by transient transfection for activation of the SKN-1/TK-CAT reporter (2.0 μg). SKN-1 sequence domains are indicated as described in the legend to Fig. 1.

In contrast, alteration of the conserved tryptophan to either Ala (W108A) or Arg (W108R) decreased transcription markedly (Fig. 4A), indicating that interactions involving the DIDLID element are specific.
Contributions of p300/CBP Proteins to SKN-1 Function-The model that C. elegans CBP-1 cooperates directly with SKN-1 predicts that recruitment of p300/CBP would be important for its activation of transcription in human cells. This is particularly likely because CBP-1 is related to p300 throughout its length, especially within its predicted functional domains (20). Supporting this idea, in vitro translated SKN-1 bound specifically to GST fusion proteins that contain either the NH2- or COOH-terminal region of human p300 (Fig. 5A), each of which interacts with numerous transcription activators (21). In contrast, SKN-1 did not bind to the p300 center (Fig. 5A), which contains the histone acetyltransferase domain and generally does not bind directly to activators (21). SKN-1-dependent transcription was increased by expression of p300 in increasing amounts (up to 7-fold) (Fig. 5B) and was decreased by ~80% by expression of the adenovirus E1A 12S protein (Fig. 5C), which binds and inhibits both p300/CBP and PCAF (21, 33-35). SKN-1-dependent transcription was also inhibited by an E1A mutant that does not bind the retinoblastoma (Rb) protein (pM47AI24), but not by one that does not bind p300/CBP (RG2) (Fig. 5C).
These findings raise the question of whether p300/CBP proteins are required by the DIDLID element or other SKN-1 activation domains. Coexpression of p300 significantly enhanced the activity of either domain B or domain C, but not that of domain A (Fig. 5D), suggesting that p300/CBP protein levels are not limiting for function of the core DIDLID element (Fig. 1A). Each SKN-1 activation domain, but not VP16, was inhibited by both E1A and the E1A Rbmut protein (Fig. 5E), which does not bind the Rb protein. None were repressed by the E1A ΔCR1 mutant (Fig. 5E), which lacks conserved region 1 and does not bind either p300/CBP or PCAF. In contrast, the SKN-1 activation domains differed in the extent to which they were inhibited by E1A mutants that are impaired for binding to either p300/CBP proteins (p300mut) or PCAF (E55) individually. Neither of these last two E1A mutants repressed domain B, but each retained a partial effect on domains A and C (Fig. 5E). They also differed from each other in their effects on domain A, which was repressed by ~50% by E1A p300mut (Fig. 5E), supporting the idea that the DIDLID element targets a factor that is distinct from p300/CBP.

DISCUSSION
The presence of a DNA-binding domain in SKN-1 (4) and its localization to nuclei (36) suggested previously that SKN-1 is likely to regulate transcription. We have shown here that SKN-1 is a potent activator of transcription when it binds its cognate site (Fig. 2). SKN-1 interacts with two regions of human p300 in vitro (Fig. 5A), and p300/CBP proteins contribute to its activity (Fig. 5, B and C). Given the extensive similarity between p300/CBP proteins and their C. elegans ortholog CBP-1, these data support the model that SKN-1 may recruit CBP-1 directly to promoters as a cofactor in vivo (20). Overexpression of p300 potentiated activation by SKN-1 domains B and C (Fig. 5D), suggesting that it can be recruited directly or indirectly by them. Domain B was not inhibited by E1A mutants in which binding to either p300/CBP (p300mut) or PCAF (E55) in particular was impaired (Fig. 5E), implicating each of these histone acetyltransferases in its activity. Domain C was partially inhibited by these E1A mutants (Fig. 5E), however, suggesting either that it may require both histone acetyltransferases simultaneously or that it might act on an independent E1A target. The activity of domain A, which contains the DIDLID element (Fig. 1A), was not enhanced significantly by p300 expression (Fig. 5D) and was inhibited by E1A p300mut (Fig. 5E), suggesting that it has a target that is distinct from p300/CBP.
The conservation of the DIDLID element across evolution (Fig. 1, B and C) supports the model that SKN-1 and the CNC proteins evolved from a common precursor (4). This conservation is particularly striking because the DIDLID motif is separate from the DNA-binding domain and was maintained despite divergences in how these proteins bind DNA (Fig. 1C).

Fig. 2). Sequence domains are indicated as described in the legend to Fig. 1. B, DNA binding by SKN-1 mutants. Extracts from cells transfected with the indicated SKN-1 derivatives were tested by electrophoretic mobility shift assay along with in vitro translated (IVT) SKN-1 for binding to a labeled SKN-1-binding site (4). Only bound complexes are shown, with a nonspecific species indicated by NS. In the indicated samples, a 500-fold excess of unlabeled SKN-1-binding site (S) or a nonspecific sequence (M) was added to the assay mixture. C, expression of SKN-1 derivatives, assayed by Western blotting of the six protein samples analyzed in B. By this assay, the DIDLID point mutants analyzed in A were also expressed comparably to SKN-1 (not shown).

Point mutants in the DIDLID element dramatically decreased SKN-1-driven transcription in mammalian cells (Fig. 4A), supporting the idea that it mediates a highly specific interaction that is common to these proteins. In apparent contrast to our findings, an NRF2 fragment that contained the DIDLID element appeared to lack transcriptional activity in Gal4 fusion assays (29). This particular NRF2 fragment also included the inhibitory domain that can retain NRF2 in the cytoplasm (29), however, suggesting that in this context, the DIDLID element might have been masked or not present in the nucleus.
Our findings suggest that SKN-1 and CNC proteins may have preserved some parallel functions in the endoderm and mesoderm. Also supporting this idea, in all of these proteins, hydrophobic residues on the CNC region surface are conserved that are not predicted to influence folding or DNA binding, but instead form a pocket, suggesting a common protein-protein interaction (8). The CNC protein most analogous to SKN-1 may be NRF1, which contains DIDLID (Fig. 1, B and C) and appears to be involved in endodermal and mesodermal differentiation and regulation of detoxification genes (13,15,16). These similarities suggest a particularly intriguing possibility, that the little understood requirement for SKN-1 to maintain the viability of differentiated intestinal cells (5) might involve functions that parallel the role of the NRF1 and NRF2 proteins in antioxidant responses.