Regulatory Factor Interactions and Somatic Silencing of the Germ Cell-specific ALF Gene

Germ cell-specific genes are active in oocytes and spermatocytes but are silent in all other cell types. To understand the basis for this seemingly simple pattern of regulation, we characterized factors that recognize the promoter-proximal region of the germ cell-specific TFIIAα/β-like factor (ALF) gene. Two of the protein-DNA complexes formed with liver extracts (C4 and C5) are due to the zinc finger proteins Sp1 and Sp3, respectively, whereas another complex (C6) is due to the transcription factor RFX1. Two additional complexes (C1 and C3) are due to the multivalent zinc finger protein CTCF, a factor that plays a role in gene silencing and chromatin insulation. An investigation of CTCF binding revealed a recognition site of only 17 bp that overlaps with the Sp1/Sp3 site. This site is predictive of other genomic CTCF sites and can be aligned to create a functional consensus. Studies on the activity of the ALF promoter in somatic 293 cells revealed mutations that result in increased reporter activity. In addition, RNAi-mediated down-regulation of CTCF is associated with activation of the endogenous ALF gene, and both CTCF and Sp3 repress the promoter in transient transfection assays. Overall, the results suggest a role for several factors, including the multivalent zinc finger chromatin insulator protein CTCF, in mediating somatic repression of the ALF gene. Release of such repression, perhaps in conjunction with other members of the CTCF, RFX, and Sp1 families of transcription factors, could be an important aspect of germ cell gene activation.

Germ cells express a unique set of core promoter-associated transcription factors that are not found in somatic cells. These include members of the TBP family as well as TBP-associated factors, such as TFIIA and TAFs (1-3). One of these germ cell variants, TFIIAα/β-like factor (ALF) (TFIIAτ), encodes a counterpart of the large subunit of the general transcription factor TFIIA (4-6). ALF interacts with the small TFIIAγ subunit to form a complex that can stabilize binding of TBP-like factors to promoter DNA, thereby facilitating formation of a transcription preinitiation complex in germ cells (7,8). The loss of ALF expression is associated with male infertility in humans (9).
An intriguing feature of ALF and other germ cell-specific genes is that the sequences sufficient to exert correct control of expression are very short, on the order of 100 base pairs. For instance, a region of the ALF gene between −133 and +21 relative to the transcription start site is able to recapitulate accurate tissue-specific expression and silencing of a linked reporter in transgenic mice (10). The most notable sequences in this region include a TTCAAA element, a GC-rich region with a number of CpG dinucleotides, and an upstream TC-rich region. Although broadly similar features can be observed in several other germ cell-specific promoters (10), individual sequences of germ cell promoters are not precisely homologous, and the identification of factors responsible for their regulation remains an important goal. Such factors are expected to control the two main behaviors of germ cell-specific genes: selective activation during germ cell differentiation and somatic silencing.
Here we have examined DNA-protein complexes that form on the ALF promoter using mouse liver- and testis-derived nuclear extracts. We show that these complexes are due to the zinc finger proteins Sp1, Sp3, and CTCF and to the winged helix transcription factor RFX1. Interestingly, sites for Sp1/Sp3 and CTCF overlap, making this a unique subclass of elements that has dual specificity for both types of factors. The identification of these particular proteins is also of interest because they are all members of larger families of factors that are regulated in complex patterns during gametogenesis (11-18). The role of CTCF is especially interesting, since its effect on gene silencing is mediated at the level of chromatin organization (19,20). Its identification as a germ cell promoter binding factor, as described in this report, suggests a possible role in protecting germ cell genes from inappropriate activation by adjacent enhancers active in somatic tissues. Functional data are presented that support the notion that the ALF promoter is repressed in somatic cells, and we speculate that release of repression might be an important aspect of how such genes are ultimately turned on. Overall, the results provide insights into the on-off regulation of germ cell gene promoters.
Labeling of DNAs was performed with T4 polynucleotide kinase (New England Biolabs) and [γ-³²P]ATP (PerkinElmer Life Sciences) at 37°C for 30 min. Labeled probes were purified using a NICK column (Amersham Biosciences). DNA was methylated with SssI and S-adenosylmethionine at 37°C for 3 h. Reactions were inactivated at 65°C for 20 min. To confirm that methylation was complete, DNAs were digested with the methylation-sensitive and methylation-insensitive restriction enzymes HpaII and MspI, respectively.
Preparation of Tissue Extracts and Recombinant Proteins-Mouse testis and liver were used to prepare nuclear extracts (NE) as described below (22). In brief, razor-minced tissues were homogenized in 20 ml of HB buffer (10 mM HEPES-KOH (pH 7.6), 15 mM KCl, 0.15 mM spermine, 0.5 mM spermidine, 1 mM EDTA, 2.4 M sucrose, 1% low fat milk, 0.5 mM dithiothreitol, 0.5 mM phenylmethylsulfonyl fluoride, and 1% protease inhibitor) and loaded into an SW28 tube on top of a 10-ml HB cushion. After centrifugation at 24,000 rpm for 60 min at 4°C, the supernatant was removed, and the nuclear pellet was rinsed with water. The pellet was then resuspended with NL buffer (10 mM HEPES-NaOH (pH 7.6), 100 mM KCl, 0.1 mM EDTA, 10% glycerol, 3 mM MgCl₂, 1 mM dithiothreitol, 0.1 mM phenylmethylsulfonyl fluoride, and 1% proteinase inhibitor) and homogenized with a hand-held glass homogenizer. The DNA concentration was measured and diluted to 0.5 mg/ml with NL buffer. One-tenth volume of 4 M (NH₄)₂SO₄ was added, mixed by inversion, and incubated on ice for 30 min. The sample was then centrifuged at 35,000 rpm for 60 min, and the supernatant was transferred to a new tube. Solid (NH₄)₂SO₄ (0.3 g) was added to each ml of supernatant, incubated on ice for 60 min, and centrifuged at 40,000 rpm for 20 min. The pellet was resuspended in 25 mM HEPES (pH 7.6), 0.1 mM EDTA, 40 mM KCl, 10% glycerol, and 1 mM dithiothreitol.
RNA Interference-NIH3T3 cells were trypsinized and distributed into 6-well plates without antibiotics at 2 × 10⁵ cells/well. Cells were transfected with CTCF-specific (Dharmacon) or negative control siRNAs (Ambion) using Lipofectamine 2000 (Invitrogen). The concentration of siRNA that gave the least cell deterioration was 0.58 μg/well, with each well containing 2 ml of Opti-MEM I reduced serum medium (Invitrogen). Cells were cultured for 8 h and then switched to media containing 10% fetal bovine serum and 1% antibiotic solution.
Transfections-293 cells were grown to 50-80% confluence on 6-well plates and transfected with 1 μg of DNA using the FuGENE 6 reagent (Roche Applied Science). pGL3 Control and pGL3 Enhancer served as positive and negative controls, respectively. Whole cell extracts were prepared after 48 h, and luciferase activity was assayed using the luciferase assay system (Promega) and a Turner TD-20e luminometer.
For cotransfection experiments with pmALFpro200 and individual expression vectors, 293T cells were plated at 4 × 10⁵ cells into 6-well plates. After overnight incubation, cells were transfected with expression constructs together with 1 μg of mALF-pro200-Luc construct using the FuGENE 6 transfection reagent (Roche Applied Science). The pCMV Sport-β-gal vector (100 ng; ATCC) was used to monitor transfection efficiency, and the pGL3 Control and pGL3 Enhancer vectors were used as positive and negative controls. After 48 h, luciferase and β-galactosidase assays were performed using the luciferase assay system and the β-galactosidase assay kit (Invitrogen). Luciferase activity was normalized to β-galactosidase activity. All transfections were performed between three and six times each.
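The normalization step described above can be sketched as follows. This is only an illustration of dividing each luciferase reading by the β-galactosidase reading from the same well and then expressing a sample relative to a reference; the function names and the example readings are hypothetical and are not data from this study.

```python
from statistics import mean


def normalized_activity(luciferase, beta_gal):
    """Return per-well luciferase/beta-galactosidase ratios.

    Both lists must hold readings from the same wells in the same order.
    """
    if len(luciferase) != len(beta_gal):
        raise ValueError("each luciferase reading needs a paired beta-gal reading")
    return [luc / gal for luc, gal in zip(luciferase, beta_gal)]


def fold_change(sample_ratios, reference_ratios):
    """Mean normalized activity of a sample relative to a reference construct."""
    return mean(sample_ratios) / mean(reference_ratios)


# Hypothetical replicate readings (arbitrary units), for illustration only.
reference = normalized_activity([1000.0, 1200.0, 1100.0], [2.0, 2.4, 2.2])
sample = normalized_activity([450.0, 500.0, 550.0], [1.8, 2.0, 2.2])
print(round(fold_change(sample, reference), 2))
```

Normalizing per well before averaging means that a well with poor transfection efficiency lowers both readings together and does not distort the ratio.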

RESULTS
Protein-DNA Interactions on the ALF Promoter-A fragment of the mouse ALF promoter from −117 to −5 relative to the major transcription start site (Fig. 1A) was used as a probe in bandshift reactions that contained nuclear extracts from either mouse liver or testis. The probe spanned a region of the promoter previously shown to drive correct regulation in transgenic mice (10). Six major complexes could be resolved using liver extracts (Fig. 1B, lane 1) and were denoted as C1-C6. Three of these complexes were also abundant in testis extracts (C3, C4, and C5) (lane 2). Additional rapidly migrating complexes were less intense or variably present and were not given separate designations. The results depicted are representative of results obtained with several nuclear extract preparations.
Zinc Finger Transcription Factors Sp1 and Sp3 Recognize the ALF Promoter-The ALF promoter sequence is relatively GC-rich and contains a GGGCGG element located in reverse orientation relative to the direction of transcription at positions −84 to −79 (Fig. 1A). To test whether members of the Sp1 family of transcription factors might recognize this element and generate some of the observed complexes, we evaluated the ability of an Sp1 site to compete for binding. The results show that both the C4 and C5 complexes seen in liver NE disappeared when this competitor was present (Fig. 1C, lanes 1 and 2), and similar results were observed with the C4 and C5 complexes in testis extracts (lanes 3 and 4). To verify which factors are involved, Sp1- and Sp3-specific antibodies were included in the bandshift reactions. The addition of an anti-Sp1 antibody selectively diminished formation of the C4 complex in both liver (Fig. 1D, lanes 1 and 2) and testis (lanes 3 and 4), consistent with the idea that it contained Sp1. Likewise, the addition of an antibody specific to Sp3 selectively diminished formation of the C5 complex in both types of extracts (Fig. 1D, lanes 6-9). These results demonstrate that the ALF promoter is recognized by the zinc finger transcription factors Sp1 and Sp3.
FIGURE 1. The zinc finger transcription factors Sp1 and Sp3 recognize mouse ALF promoter DNA. A, the sequence of the mouse ALF promoter between −133 and +25 is shown. The shaded regions highlight the TC-rich area, the TTCAAA sequence, and sites for factors identified in this report. CG dinucleotides are indicated in red. The arrows show known transcription start sites, and the dashed line shows the region of the ALF promoter used in initial DNA-protein interaction assays. B, bandshift analysis using ALF promoter DNA with liver (lane 1) and testis (lane 2) extracts reveals six major complexes in liver (C1-C6) and three major complexes in testis (C3-C5). C, liver (lanes 1 and 2) and testis nuclear extracts (lanes 3 and 4) were incubated with the ALF promoter in the absence (lanes 1 and 3) or presence (lanes 2 and 4) of an Sp1 binding site competitor. Complexes C4 and C5 were sensitive to the addition of competitor, whereas complexes C1, C2, C3, and C6 were unaffected. D, liver and testis nuclear extracts were incubated with an ALF probe in the absence (lanes 1 and 3) or presence of an Sp1-specific antibody (lanes 2 and 4). The results show that complex C4 is specifically eliminated, whereas the other complexes are unaffected. Liver extracts were also incubated with an ALF probe in the absence (lanes 6 and 8) or presence of an Sp3-specific antibody (lanes 7 and 9). The results show that complex C5 is specifically eliminated, whereas other complexes are unaffected. Lanes 5 and 10 are controls showing that the antibodies themselves do not have intrinsic binding activity.
The Insulator/Silencer Protein CTCF Recognizes the ALF Promoter-Some zinc finger transcription factors, especially those with multiple fingers, have relatively broad site specificity for GC-rich sequences. An example of such a factor is the chromatin insulator and silencer protein CTCF (11,13). Because it seemed plausible that insulator-like activities might be important for the regulation of germ cell genes in somatic cells, we asked whether CTCF would interact with ALF promoter DNA.
To address this question we first synthesized full-length CTCF protein and a zinc finger DNA binding subdomain (CTCF-Zn) using a coupled transcription-translation system. Control experiments showed that programmed lysates generated strong shifts when incubated with a control CTCF site from the chicken β-globin gene (19) (data not shown). When lysates programmed with the full-length CTCF were used in reactions with a labeled ALF promoter, two complexes were generated (Fig. 2A, lane 1). Competition experiments with unlabeled oligonucleotides corresponding to ALF, a consensus cFII site, and an Sp1 site were consistent with the idea that the CTCF-programmed reticulocyte lysate was specifically recognizing an element within the ALF promoter (lanes 2-4).
Lysates programmed with the smaller CTCF-Zn fragment generated a strong, more rapidly migrating complex whose properties in competition experiments matched those of its full-length counterpart (lanes 5-8). In addition, a human ALF promoter probe was also recognized by the CTCF-Zn protein, albeit more weakly than the mouse probe (Fig. 2B, lanes 1-4).
Endogenous CTCF Protein in Liver and Testis Extracts Binds to the ALF Promoter-To directly show that CTCF was responsible for complex formation, we performed supershift reactions using commercially available CTCF-specific antibodies raised against either the N terminus (CTCF N-Ab) or the C terminus (CTCF C-Ab) of CTCF. The N-terminus-specific antibody eliminated formation of both complexes seen with CTCF-programmed lysates, generating an antibody-dependent supershift (Fig. 2C, lanes 1 and 2). The C-terminus-specific antibody eliminated the upper complex but not the lower complex (lanes 5 and 6), indicating that the lower of the two complexes lacks a complete C terminus. Neither antibody had an effect on the complexes formed with CTCF-Zn (Fig. 2C, lanes 3, 4, 7, and 8).
We next asked whether complexes formed on the ALF promoter using liver and testis extracts were actually due to endogenous CTCF. This seemed likely, since the large and small complexes observed with recombinant CTCF comigrated with extract-derived complexes C1 and C3 (Fig. 2D, lanes 1 and 2). When the N-terminus-specific CTCF antibody was added to the reaction, the C3 and C1 complexes were supershifted, but no other complex was affected (compare lanes 2 and 3). Similarly, when these antibodies were added to reactions that contained testis extracts, the C3 complex was supershifted (Fig. 2D, lanes 4 and 5). In addition, the C-terminus-specific antibody was also able to abolish formation of the C3 complex in testis extracts (lanes 7 and 8). Since the antibodies used in these experiments do not recognize the male-specific CTCF-like factor BORIS (12), the C3 complex observed in whole testis extracts is due to CTCF. We did not observe complexes that were shifted by both the Sp1/Sp3 and CTCF antibodies, suggesting that the probe was not bound by more than one factor. Overall, these results lead to the interesting conclusion that the insulator/silencer factor CTCF interacts with an element within the germ cell-specific ALF gene.
FIGURE 2 (legend fragment). ... 7 and 8). D, the C1 and C3 complexes observed in liver extracts comigrate with those formed with in vitro translated recombinant CTCF (lanes 1 and 2). The addition of the N-terminal specific antibody selectively supershifts the C1 and C3 complexes but does not affect any other complex (lane 3). The addition of this antibody to a reaction containing testis extract (which contains very little C1 complex) results in a supershift of the C3 complex (lanes 4 and 5). The addition of the C-terminus-specific antibody to reactions that contain testis extracts abolishes the C3 complex (lanes 7 and 8) but does not affect complexes C4 or C5.
Localization of Regulatory Factor Binding Sites-To locate the positions at which Sp1, Sp3, and CTCF were bound, we prepared a series of oligonucleotides that contained mutations in the GC box (mut-Sp1), mutations along the length of the sequence (mut 5, mut 3-1, mut 3-2, and mut 3-3), or mutations in selected C or G residues (mut CG to AG, mut CG to CT, mut C to A, and mut G to T) (Fig. 3A). These oligonucleotides, together with a wild type control, were then used as competitors for complexes formed with liver NE (Fig. 3B).
The results make several important points about regulatory factor interactions along the ALF promoter. First, the mut Sp1 competitor, which contains a mutated GC box, is the only oligonucleotide that could not compete for the C4/C5 complexes (Fig. 3B, lane 7). Thus, we conclude that this site is critical for the binding of both Sp1 and Sp3. Second, the mut 5 and mut 3-2 oligonucleotides compete for CTCF binding (lanes 3 and 5), whereas mut Sp1 and mut 3-1 do not compete (lanes 4 and 7). These results thus define a 17-nucleotide CTCF recognition motif that overlaps with the Sp1/Sp3 GC box site at the 5′-side (−84) and extends to position −68 at the 3′-side. The competition results also give a clue to the location of the C6 complex. In particular, the inability of mut 3-2, mut 3-3, mut CG to AG, and mut G to T oligonucleotides to compete for this complex (lanes 5, 6, 8, and 11) places the binding site for this factor just downstream of CTCF. The competition properties of C2 and smaller complexes were either indeterminate or similar to CTCF.
The Winged Helix Transcription Factor RFX1 Binds to the Mouse ALF Promoter-Inspection of sequences where complex C6 was formed indicated possible binding sites for C/EBP or RFX factors. Supershift experiments using anti-C/EBP antibodies failed to produce any change (data not shown). On the other hand, a well characterized RFX site from the polyoma virus enhancer was able to eliminate the C6 complex without affecting any of the other complexes (Fig. 3C, lanes 1-3). In addition, an RFX1-specific antibody supershifted the C6 complex, indicating that the C6 complex is due to RFX1. The presence of this complex in testis was very weak, and we do not know if it is due to RFX1 or to other members of the RFX family.
FIGURE 3 (legend fragment). ... (lanes 2-11). The results reveal overlapping binding sites for Sp1/Sp3 and CTCF and identify an additional downstream binding site for complex C6. An analysis of the differential competition patterns is described under "Results." C, reactions containing liver NE and the ALF promoter (lane 1) were incubated with a polyoma virus EP RFX site oligonucleotide (lane 2) or with an oligonucleotide that contained mALF promoter sequences downstream of mutation 3-1. Both oligonucleotides competed for C6 binding but did not affect any of the other complexes. In addition, the presence of an RFX1-specific antibody resulted in the selective supershift of the C6 complex (compare lanes 4 and 5). An antibody-alone control is shown in lane 6. D, an alignment of the mouse ALF RFX site together with RFX sites from several other genes shows strong similarity to the RFX consensus.
Alignment of the RFX site from the ALF promoter with other RFX sites shows a match to a consensus (25) (GTNRCCNNRGYAAC) at seven of eight positions for which a particular nucleotide is specified and at all three positions where a purine (R) or a pyrimidine (Y) is specified (Fig. 3D).
A Minimal CTCF Site and Prediction of Additional Genomic Sites-The results in Fig. 3B showed that mutations on either side of a 17-bp region had no effect on CTCF binding. This was surprising, since the binding site is thought to be relatively long (13,24). To investigate this point, we synthesized two additional sets of oligonucleotides for use as binding site competitors. Members of one set contained the 17-base pair core but differed in the length of their flanking sequences (Fig. 4A). The second set was composed of natural genomic DNA sequences from mice, humans, cows, dogs, and pigs that had been retrieved in database homology searches using the 17-base pair sequence as the query. These sequences contain between 15 and 17 base pairs that are identical to the site in the ALF promoter but possess unique flanking sequences (Fig. 4B).
When oligonucleotides from the first set were added to bandshift reactions containing the ALF promoter, we observed a flanking region-dependent competition for CTCF binding (Fig. 4C). In particular, WT29, with 6 bp of flanking DNA, and WT17, with no flanking DNA, were unable to compete for binding (lanes 4 and 5), whereas the WT and WT40 competitors could compete. WT17 was the only oligonucleotide that could not compete for Sp1/Sp3 binding (lane 5). Overall, these results indicate that although the 17-bp core provides recognition specificity, additional flanking DNA is necessary for stable interactions to occur.
We then asked whether the core site would predict CTCF sites in genomic DNA and whether the surrounding sequences would have an effect on the strength of binding. The experiment was performed under conditions in which the competitor was not in vast excess, so that any differences due to flanking DNA would be observed. The results show that all of these natural genomic sequences could compete for binding (Fig. 4D, lanes 2-9). Because there is a GC box within the CTCF site, all of the oligonucleotides were also able to compete for Sp1/Sp3 binding. However, none were able to compete for the C6 complex, since their flanking regions did not contain an RFX site. Control reactions showed that the wild type mouse ALF self-competitor competed with similar efficiency as the test set (lane 10), whereas a mut GC oligonucleotide was unable to compete for either Sp1/Sp3 or CTCF binding (lane 11).
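The idea of using a short core to predict candidate sites in genomic DNA, allowing 15 to 17 identical positions out of 17, can be illustrated with a simple sliding-window scan of both strands. This is a minimal sketch and not the database homology search used in this study; the function names are hypothetical, and the 17-mer below is a placeholder chosen for illustration, not the actual ALF core sequence.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def scan_core(sequence, core, min_matches):
    """Find ungapped windows matching `core` at >= min_matches positions.

    Returns sorted (start, strand, matches) tuples; `start` is a 0-based
    index into `sequence`, and strand "-" means the window matches the
    reverse complement of the core.
    """
    sequence, core = sequence.upper(), core.upper()
    hits = []
    for strand, query in (("+", core), ("-", reverse_complement(core))):
        for start in range(len(sequence) - len(query) + 1):
            window = sequence[start:start + len(query)]
            matches = sum(a == b for a, b in zip(window, query))
            if matches >= min_matches:
                hits.append((start, strand, matches))
    return sorted(hits)


# Placeholder 17-mer (an assumption for illustration only).
PLACEHOLDER_CORE = "GCCGCTAGGGGGCGGGC"
candidate = "ACCGCTAGGGGGCGGGA"  # differs from the placeholder at both ends
print(scan_core(candidate, PLACEHOLDER_CORE, min_matches=15))
```

A scan of this kind identifies candidate sequences only; as the competition experiments show, binding strength also depends on the flanking DNA.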
Alignment of CTCF Sites-The ability of CTCF to recognize seemingly dissimilar binding sites, in part through the use of different combinations of zinc fingers, makes identification of CTCF sites a challenging proposition (13). Still, alignments of such sites may be helpful in identifying common sequence features. The alignment shown in Fig. 5A was optimized visually by placing sequences around the distinct stretches of G and C nucleotides in register as well as possible. To help simplify the alignment, the decision was made to allow no gaps, and the use of "top" or "bottom" strands was based on which fit the alignment best. If the threshold is set at 50% identity at each position within the shaded area (Fig. 5A), the consensus is 5′-CCCGNGGCCGCTAGGGGGCGGGC-3′. If the stringency is increased to 60% identity at each position, the consensus is 5′-CCNNNGCCGCTAGGGGGNGG-3′. If the stringency is increased to 70%, the consensus is 5′-GCNGCTNGGGGG-3′.
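The thresholding procedure described above (calling a base at a position only when it is present in at least a given fraction of the aligned sequences, and N otherwise) can be sketched as follows. The toy alignment is hypothetical and is not the alignment of Fig. 5A; treating the threshold as "at least" rather than "more than" is also an assumption, since the text does not specify how ties at the cutoff were handled.

```python
from collections import Counter


def threshold_consensus(aligned, threshold_percent):
    """Consensus of a gapless alignment at a per-position identity threshold.

    A base is reported when it occurs in at least `threshold_percent` of
    the sequences at that position; otherwise "N" is reported.
    """
    if not aligned:
        raise ValueError("alignment is empty")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("gapless alignment requires equal-length sequences")
    consensus = []
    for column in zip(*aligned):
        base, count = Counter(column).most_common(1)[0]
        # Integer comparison avoids floating-point issues at the cutoff.
        passes = count * 100 >= threshold_percent * len(aligned)
        consensus.append(base if passes else "N")
    return "".join(consensus)


# Hypothetical five-sequence alignment, for illustration only.
toy = ["GCCG", "GCCA", "GCCA", "GATA", "GCGA"]
for cutoff in (50, 70, 90):
    print(cutoff, threshold_consensus(toy, cutoff))
```

Raising the cutoff converts more positions to N, which is why the higher-stringency consensus sequences in the text specify fewer bases.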
An important test of the alignment is whether it can generate a functional CTCF site. To test this point, we synthesized oligonucleotide competitor pools that matched the 50, 60, and 70% stringency thresholds (Fig. 5B). To eliminate possible effects of flanking DNA, we synthesized these oligonucleotides with either the natural mouse ALF flanking DNA or with random AGTC residues. As shown in Fig. 5C, all of the sequences were able to compete for formation of the Sp1/Sp3 and CTCF complexes to varying degrees. At the 70% threshold, however, both types of oligonucleotides became less effective as competitors for CTCF binding as well as for Sp1/Sp3 (lane 4). As expected, only oligonucleotides with the ALF flanking regions were able to compete for the C6 (RFX) complex (lanes 2-4). The data show that a functional binding site had been identified by this analysis and that the determinants for binding lie within a relatively small region. It is clear, however, that many sequence variations are consistent with binding, and further studies will be required to address this specificity issue.
FIGURE 4. A 17-bp CTCF core sequence predicts CTCF interactions. A, oligonucleotides that contained the 17-bp CTCF core binding site defined by the competition analysis of Fig. 3 were synthesized with different lengths of flanking sequences. B, database identification of naturally occurring sequences from mice, cows, humans, pigs, and dogs that match the 17-bp core sequence. All of the sequences contain a GC box as part of the 17-bp core. C, competition analysis of complexes formed with liver NE shows that flanking sequences are necessary for the 17-bp core CTCF site to function effectively as a CTCF binding site competitor (lanes 2-5). D, naturally occurring sequences that contain the 15-17 bp that exactly match the ALF CTCF core but differ in their flanking sequences are efficient competitors for both Sp1/Sp3 (C4/C5) and CTCF (C1/C3) binding but not for RFX binding (C6) (lanes 2-9). A mouse ALF self-competitor control competes for the same complexes but also abolishes the RFX (C6) complex (lane 10), whereas a mutant Sp1 site competitor control only competes for the RFX (C6) complex (lane 11).
DNA Methylation Status Affects CTCF Binding to the ALF Promoter-Because methylation status is associated with ALF expression (26), we asked whether any of the complexes formed on the ALF promoter would be affected by methylation. To address this point, we prepared methylated and unmethylated ALF probes using SssI methylase. The reaction was monitored with the restriction enzyme HpaII and its methylation-insensitive isoschizomer MspI. Digestion of the unmethylated probe generated two fragments regardless of which enzyme was used, whereas the SssI-methylated probe could only be digested with MspI (Fig. 6A, lanes 1-6).
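The logic of this control can be sketched in a few lines. HpaII and MspI both recognize CCGG and cut after the first C; HpaII is blocked when the internal CpG cytosine is methylated, whereas MspI is not, and SssI methylates cytosines in CpG dinucleotides. The function name and the short probe below are hypothetical; the real probe sequence is not reproduced here, and the model assumes complete CpG methylation.

```python
import re


def digest(seq, enzyme, cpg_methylated):
    """Predict fragments from cutting at C^CGG sites.

    `cpg_methylated` models a fully SssI-methylated probe. HpaII is
    treated as blocked by CpG methylation; MspI is treated as insensitive.
    """
    if enzyme not in ("HpaII", "MspI"):
        raise ValueError("only HpaII and MspI are modeled")
    seq = seq.upper()
    if enzyme == "HpaII" and cpg_methylated:
        return [seq]  # no cutting on the methylated probe
    # Lookahead finds every CCGG, including overlapping occurrences.
    cuts = [m.start() + 1 for m in re.finditer(r"(?=CCGG)", seq)]
    fragments, previous = [], 0
    for cut in cuts:
        fragments.append(seq[previous:cut])
        previous = cut
    fragments.append(seq[previous:])
    return fragments


probe = "AAACCGGTTT"  # hypothetical probe with a single CCGG site
for enzyme in ("HpaII", "MspI"):
    for methylated in (False, True):
        print(enzyme, methylated, digest(probe, enzyme, methylated))
```

With one CCGG site, the sketch reproduces the pattern described above: two fragments in every case except HpaII on the methylated probe, which stays intact.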
The results of interaction assays showed that a control ALF probe that had been used in earlier experiments and a probe that had gone through a mock methylation reaction but which was unmethylated were both able to bind CTCF and CTCF-Zn polypeptides (Fig. 6B, lanes 1, 2, 4, and 5). In contrast, a methylated ALF probe was unable to form complexes with either of the recombinant proteins (lanes 3 and 6). In addition, both the endogenous C1 and C3 complexes were absent in reactions that contained liver NE and the methylated ALF probe (Fig. 6C,  lanes 1 and 2). Other complexes, most notably C4 and C5, were not affected. The results show that CTCF would be specific for an unmethylated or undermethylated promoter, whereas Sp1 and Sp3 would bind independent of methylation status.
FIGURE 5 (legend fragment). ... Nucleotides with the greatest similarity in 10 of 12 sequences are shaded. B, each position in the shaded region was tabulated according to whether it was conserved in 50, 60, or 70% of the aligned sequences (described under "Results"), and artificial consensus sequences were then synthesized with flanking regions derived from ALF or a random AGTC repeat. C, competition analysis shows that each pool varies in its ability to compete for liver-derived ALF promoter complexes. Sequences with 50 and 60% stringency compete equally well for C4/C5 and C1/C3 (lanes 2, 3, 5, and 6). Sequences set at the higher 70% stringency have disrupted the GC box and do not compete as well for the C4/C5 complexes (lanes 4 and 7). In addition, the 70% sequences show that the ALF flanking sequences compete somewhat less effectively for binding to CTCF compared with the random sequences (lanes 4 and 7). Only oligonucleotides with natural flanking sequences and not those with the AGTC sequences are able to compete for the C6 complex (compare lanes 2-4 with lanes 5-7).
FIGURE 6 (legend fragment). ... 1, 2, 4, and 5) but do not bind a methylated promoter (lanes 3 and 6). C, CTCF-dependent C1 and C3 complexes are not observed with the methylated promoter probe, whereas complexes C2, C4, and C5 are unaffected (lanes 1 and 2).
Functional Analysis Suggests That the ALF Promoter Is Repressed in Somatic Cells-The identification of the silencing factor CTCF and the fact that the endogenous ALF gene is normally off in somatic cells suggested the possibility that the promoter might be maintained in a repressed state. To investigate this question, we prepared a wild-type ALF promoter construct spanning nucleotides −105 to +8 linked to a downstream luciferase reporter in the pGL-3 Enhancer vector. This construct was used as a parent to prepare a series of 10-bp mutations spaced at 4-bp intervals along the length of the promoter, beginning at position −99 and ending at position −6 (Fig. 7A).
The strategy for mt1, mt4, mt5, and mt6 was that A and G were changed to T, whereas T and C were changed to G. For mt2, mt3, and mt7 A and G were changed to C, and T and C were changed to A.
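The two substitution schemes can be written as simple translation tables. This is a minimal sketch: the function names and the test sequence are hypothetical, and the window layout assumes that the seven 10-bp mutations are separated by 4-bp unmutated gaps. That layout is an inference, although it does reproduce the stated endpoints of −99 and −6.

```python
# Scheme used for mt1, mt4, mt5, and mt6: A and G -> T; T and C -> G.
SCHEME_1 = str.maketrans("AGTC", "TTGG")
# Scheme used for mt2, mt3, and mt7: A and G -> C; T and C -> A.
SCHEME_2 = str.maketrans("AGTC", "CCAA")

SCHEME_BY_MUTANT = {
    "mt1": SCHEME_1, "mt4": SCHEME_1, "mt5": SCHEME_1, "mt6": SCHEME_1,
    "mt2": SCHEME_2, "mt3": SCHEME_2, "mt7": SCHEME_2,
}


def mutate_window(promoter, start, mutant, length=10):
    """Apply a mutant's substitution scheme to promoter[start:start+length].

    `start` is a 0-based string index, not a promoter coordinate.
    """
    window = promoter[start:start + length].upper()
    mutated = window.translate(SCHEME_BY_MUTANT[mutant])
    return promoter[:start] + mutated + promoter[start + length:]


def mutation_windows(first=-99, count=7, length=10, gap=4):
    """Promoter coordinates (5' end, 3' end) of each window (assumed layout)."""
    step = length + gap
    return [(first + i * step, first + i * step + length - 1) for i in range(count)]


print(mutate_window("ACGTACGTACGGGG", 0, "mt1"))
print(mutation_windows())
```

Both schemes change every base in the window, with purines becoming one fixed base and pyrimidines another, so no position retains its wild-type identity.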
Six-fold replicate transfection assays were performed with each construct in somatic 293 cells. Interestingly, the mt3, mt4, and mt5 constructs all showed an increase in activity relative to the wild-type promoter (Fig. 7B). The fact that three different 10-bp mutations spanned a region of nearly 40 nucleotides argued against the idea that a particular mutation had resulted in the accidental creation of an activator element. Instead, the results seemed to indicate that mutations along this region resulted in a partial release from repression. The position of the mt3 mutations coincides with one side of the CTCF binding site, the position of mt4 falls within the predicted site for RFX, and mt5 removes the conserved TTCAAA element. The mt2 mutation resulted in a minor reduction in activity, whereas a construct in which all but one CpG dinucleotide had been mutated to GT showed very low activity. Mutations mt6 and mt7 show only modest reductions in activity, consistent with the fact that these are downstream of the identified transcription start sites. The results support the idea that elements within the ALF promoter may serve to maintain low activity in somatic cells.
We next wanted to test whether some of the factors we had identified had repressive effects on ALF promoter activity in transient cotransfection assays using 293T cells (Fig. 7C). Experiments were performed using constant amounts of pmALF200, a pGL3-based luciferase reporter driven by 200 bp from the mALF promoter. The results showed several interesting trends. First, increasing amounts of pFCMV-CTCF resulted in decreased ALF reporter expression. In contrast, cotransfection with pFCMV-BORIS, which encodes the germ cell-specific counterpart of CTCF (12), resulted in a modest dose-dependent increase in expression. These results are consistent with the notion that CTCF acts to down-regulate expression and with recent reports indicating that CTCF and BORIS may have opposing effects on genome-wide patterns of gene expression in tumor cell lines (27,28). Additional experiments with pFCMV-Sp1 and pFCMV-Sp3 showed that whereas Sp1 did not affect the reporter, Sp3 resulted in a substantial repressive effect even at low concentrations. This is in agreement with reports of Sp3-dependent repression (29). The data demonstrate that two of the factors identified have the ability to repress ALF expression in somatic cells.
FIGURE 7 (legend fragment). ... 1-4) and was associated with activation of the endogenous ALF gene (lanes 9-12). Lane 13 shows the position of the ALF reverse transcription-PCR product using control RNA from testis. Actin RNA levels were unaffected by treatment (lanes 5-8), and reactions using RNA that had not been treated with reverse transcription were negative (lanes 14-17). The bottom set of panels using a negative control siRNA (−siRNA) resulted in no change in CTCF RNA levels (lanes 1-4) and no activation of ALF (lanes 9-12). The Western blot at the right shows that CTCF protein levels are reduced in cultures treated with CTCF-specific siRNAs (lanes 1 and 2). NS, a nonspecific band.
If CTCF is a bona fide regulator of the ALF gene, we would predict that a decline in the levels of endogenous CTCF might result in ALF gene activation. To test this possibility, CTCF siRNAs were transfected into NIH3T3 cells, and the level of the endogenous ALF mRNA was measured at various days thereafter. Although it was somewhat difficult to balance siRNA transfection without having adverse effects on the cells, we were able to observe that a decrease in the level of CTCF mRNA (Fig. 7D, lanes 1-4) was associated with the appearance of ALF expression (lanes 9-12). Western analysis showed that treatment was associated with a decline in CTCF protein. ALF expression rose to maximal levels by the second day after transfection (lane 11) and was not observed in untreated control cells or with control RNA samples that had not been treated with reverse transcriptase (lanes 14-17). In addition, an actin control used to normalize the amount of RNA used was unaffected by treatment (lanes 5-8). Collectively, functional assays and binding data indicate that CTCF and Sp3 are repressive factors that could play a role in the somatic cell-specific down-regulation of the ALF gene.

DISCUSSION
The aim of this report was to characterize mechanisms important for the proper expression of germ cell-specific genes. This is an interesting issue, because such genes are regulated in a strict on-off manner and because their promoters are typically very short. Germ cell transcription is also peculiar, because it often involves a switch from somatic to germ cell sites of initiation (30) and, in some cases, is controlled by regulatory elements within retrotransposons (31). Further complexity is suggested by the observation that there are subgroups of germ cell genes whose temporal activation correlates with progression through spermatogenesis (32). Although experiments in transgenic mice show that germ cell promoter modules provide sufficient cues for activation and silencing, many basic problems, including the identification of the relevant regulatory factors, remain to be understood.
Some of the results reported here are summarized in Fig. 8A. One conclusion is that the mouse ALF promoter is recognized by members of the Sp1 family of zinc finger transcription factors. Sp1 not only has a wide role in transcriptional regulation of housekeeping genes but is also reported to control germ cell-specific genes such as PIASx, H1t, PGK2, LDH-C, cyclin A1, and PDHA-2 (33-39). Moreover, expression of the Sp1 gene is characterized by several unique features in germ cells, including up-regulation during the pachytene stage of meiotic prophase I and the generation of alternate mRNA transcripts (15,16). Additional members of this family include Sp2, Sp3, and others (14), but the roles of these as gametogenic factors are less well established. Our studies suggest that Sp3 may act as a somatic repressor of ALF expression, a result consistent with other reports of Sp3-dependent repression (29).
Binding data also indicate that members of the winged helix RFX family of transcription factors can recognize the ALF promoter. The particular species identified in this report is the somatic factor RFX1, although other members of this family, including RFX2 and RFX4, are testis-specific (17,18). RFX sites are present in the germ cell-specific H1t gene, and RFX2 activates H1t expression in GC-2spd germinal cells (25,40). In this report, we did not further evaluate the role of this factor for ALF expression.
The most surprising finding here was that the ALF promoter is recognized by the chromatin insulator and transcriptional silencer CTCF. This factor was first identified as a protein that bound within a DNase I hypersensitive region of the chicken β-globin gene and exhibited enhancer blocking activity (19). We observed two endogenous CTCF-dependent complexes with an ALF promoter probe, C1 and C3, one of which is due to a truncated form of CTCF that lacks a complete C terminus (C1). Although we do not know if this species has a distinct physiological role, its abundance relative to the full-length C3 complex is notably reduced in extracts of whole testis. An unusual feature of CTCF is that its binding site specificity is broad or adaptable through the use of alternate combinations of zinc fingers (13,24). Our studies on the binding of CTCF to the ALF promoter revealed several important points. First, the mouse ALF CTCF site is 17 bp long and could potentially be even smaller. Second, this site has dual specificity for both Sp1/Sp3 and CTCF. Third, this site can be used directly or by alignment to predict other CTCF sites in genomic DNA. Finally, additional flanking DNA is required to provide a stable platform for binding.
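The idea of using a short site "directly or by alignment" to predict other sites can be illustrated with a minimal sketch. It is not the procedure used in this study: the scoring scheme (mean per-column base frequency), the 0.8 threshold, the function names, and all sequences below are assumptions made for illustration. In particular, the 17-bp strings are hypothetical placeholders, not the ALF site or any site reported here.

```python
from collections import Counter


def position_counts(sites):
    """Count bases at each column of equal-length, pre-aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("aligned sites must have equal length")
    return [Counter(s[i] for s in sites) for i in range(length)]


def consensus(counts):
    """Most frequent base per column (ties broken alphabetically)."""
    return "".join(min(c, key=lambda b: (-c[b], b)) for c in counts)


def scan(sequence, counts, min_fraction=0.8):
    """Return (offset, window, score) for windows scoring >= min_fraction.

    Score = mean, over columns, of the fraction of aligned sites that
    carry the window's base at that column (1.0 = matches every site).
    """
    n_sites = sum(counts[0].values())
    width = len(counts)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        matched = sum(counts[i][base] for i, base in enumerate(window))
        score = matched / (n_sites * width)
        if score >= min_fraction:
            hits.append((start, window, round(score, 3)))
    return hits


# Hypothetical 17-bp placeholder sites; NOT sequences from this report.
ALIGNED_SITES = [
    "CCGCCCCCTGGTGGCCA",
    "CCGCCCCCTGGTGGCTA",
    "CCACCCCCTGGTGGCCA",
]
COUNTS = position_counts(ALIGNED_SITES)
CONSENSUS = consensus(COUNTS)

# Hypothetical "genomic" fragment with one embedded placeholder site.
GENOMIC = "TTTT" + "CCGCCCCCTGGTGGCCA" + "AAAA"
HITS = scan(GENOMIC, COUNTS)

if __name__ == "__main__":
    print("consensus:", CONSENSUS)
    for offset, window, score in HITS:
        print(offset, window, score)
```

Scanning with a single known site corresponds to passing a one-element list to `position_counts`. A realistic search would also scan the reverse strand and would need a threshold calibrated against experimentally verified sites.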
The sensitivity of CTCF to methylation status raises the question of whether this factor would bind the endogenous promoter. Inspection of earlier data shows that of 13 ALF promoter-derived PCR products, only two were methylated at any of the three CpGs located within the CTCF site (26). The remaining 11 sequences were not modified at any of the sites, although most were modified at one or more sites located upstream or downstream. Thus, the reported methylation pattern of CpG dinucleotides in and around the first exon of the ALF gene in somatic cells appears not to preclude CTCF binding. Of course, it is possible that binding is normally transient, for instance working in a "touch-and-go" manner to establish a repressive state at specific times in development or when the DNA becomes undermethylated. In addition, the fact that CTCF, Sp1, and Sp3 bind to overlapping sites and have differential sensitivity to methylation status suggests that the relevant interaction(s) and functional consequences would depend on DNA interaction affinities, the extent of promoter methylation, and relative factor concentrations in a particular tissue.
Several recent studies have shown that CTCF levels are reduced and BORIS levels are increased in some cancer cell lines and that this is associated with the derepression of some testis-specific genes (27,28). This observation is consistent with results presented here showing a direct interaction between CTCF and a germ cell promoter and with the observation that CTCF and BORIS had opposing effects on the expression of an ALF promoter-reporter construct (Fig. 7C). CTCF is normally down-regulated in male germ cells and replaced with BORIS (12). We speculate that this exchange could result in a chromatin state that is permissive for interactions with the basal transcription machinery and site-specific regulators, resulting in gene activation. Once activated, ALF, together with other germ cell transcription factors, is likely to mediate downstream effects on gene expression (Fig. 8A). The observation that CTCF is up-regulated later in spermatogenesis (12) suggests a possible role in the reestablishment of a silent, somatic state.
The demonstration that there are direct interactions between a germ cell-specific promoter and an insulator such as CTCF provides a potential mechanism by which such genes might protect themselves from inappropriate activation by enhancers of adjacent genes (Fig. 8B). Silencer elements and repressive factors have been suggested for other germ cell-specific genes, including PGK-2, H1t, SP-10, and LDHC (41-44). Although in many cases the factors responsible have not been identified, several studies have shown that E2F6 is important to maintain the somatic silencing of germ cell genes such as α-tubulin, Tex12, SMC1β, and STAG3 (45,46), whereas repression of the SP-10 gene involves TDP-43 (47). In addition to the role of specific factors, higher levels of DNA methylation are sometimes associated with silent germ cell genes (48). Overall, the results point to the idea that release of somatic silencing may be critical for the activation of germ cell genes.
We had demonstrated earlier that TBP was capable of recognizing the TATA-like TTCAAA element (26), and in the current study we identify four additional transcription factors as candidate regulators of the ALF gene. Three of these factors bind to a recognition element that has dual specificity for members of the Sp1 and CTCF families. The effects of these factors on ALF promoter activity support the idea that repression and derepression are important aspects of germ cell gene regulation. In future studies, we hope to address the physiological role of each factor and the role of chromatin packaging in germ and somatic tissues.