A Repressor with Similarities to Prokaryotic and Eukaryotic DNA Helicases Controls the Assembly of the CAAT Box Binding Complex at a Photosynthesis Gene Promoter*

A single nucleotide exchange in a promoter region located immediately upstream of the CAAT box of the spinach photosynthesis gene AtpC (gene product is subunit γ of the chloroplast ATP synthase) prevents the formation of a secondary structure and causes an unregulated, constitutive high level of expression (Kusnetsov, V., Landsberger, M., Oelmüller, R. (1999) J. Biol. Chem. 274, 36009–36014). We have isolated cDNAs for ATPC-2, a new polypeptide with homologies to pro- and eukaryotic helicases, which specifically binds to this promoter region. Binding of ATPC-2 competes strongly with that of a CAAT box binding factor (CBF), consistent with the idea that both complexes cannot be formed simultaneously for steric reasons. In gel mobility shift assays, high binding activities of ATPC-2 and low binding activities of CBF were observed with nuclear extracts from tissue with low AtpC expression levels, and the opposite was observed with extracts from tissues with high AtpC expression levels. Binding of ATPC-2 to the mutant sequence, which directs a constitutively high level of expression in vivo and prevents the formation of a secondary structure in vitro, is significantly weaker than binding to the wild-type sequence. Again, the opposite results were obtained for the CBF. Thus, we conclude that the assembly of the CBF·DNA complex stimulates transcription of AtpC and that CBF binding is prevented if ATPC-2 is bound to the promoter region. The novel mechanism of gene regulation and the role of the helicase-like protein ATPC-2 as a potential transcriptional repressor are discussed in relation to its modular structure.

The plastid ATP synthase of higher plants and algae consists of nine different subunits, and three of them are encoded by the nuclear genes AtpC, AtpD, and AtpG (the gene products are the subunits γ and δ of the CF1 moiety and CFo II) (2). As with other nuclear genes for plastid proteins, expression of the spinach ATP synthase genes and of chimeric promoter::reporter gene fusions in transgenic tobacco is strongly regulated by light, phytohormones (in particular cytokinins), or the stage of the plastids in the cells in which these genes are expressed (cf. Refs. 3–7). However, unusually for photosynthesis genes, the essential cis-elements determining the regulated expression of AtpC are positioned in close vicinity to the respective transcription start sites (4), and crucial nucleotides for the regulated expression appear to be located immediately upstream of the CAAT box (4). This region also forms a complex with a CAAT box binding factor, CBF (1). However, in contrast to many CBFs from metazoa (8–10), the binding activity in in vitro assays is regulated, and the binding activities correspond to the promoter activities in vivo (1). Surprisingly, a single nucleotide exchange uncoupled the promoter activity from regulatory pathways and resulted in a high constitutive expression. This indicates that low promoter activities (e.g. in darkness or in photobleached tissue) are caused by an inhibitory effect that is no longer active under conditions of high promoter activity, e.g. in light or after cytokinin treatment. Here we describe the isolation of a cDNA that codes for a protein with sequence-specific binding characteristics to this promoter region. We demonstrate that the binding activity of this protein is high in tissue with low AtpC expression levels and low in tissue with high expression levels, that this behavior is opposite to the binding features described previously for CBF (1), and that the recombinant protein competes with CBF for binding. The data are consistent with a model in which this protein functions as a repressor by preventing the assembly of CBF at the CAAT box and thus repressing AtpC transcription.

EXPERIMENTAL PROCEDURES
Plant Growth. Tobacco seeds were surface-sterilized, planted on 1/2 Murashige and Skoog medium supplemented with 2% sucrose in the presence or absence of Norflurazon (10⁻⁶ M), and kept in a cold room in darkness for 2 days to synchronize germination. Seedlings were either kept in darkness for 10 days at 22°C or transferred to white light for 40 h prior to harvesting, in the absence or presence of Norflurazon. Alternatively, etiolated seedlings were transferred to cytokinin-containing plates (N⁶-benzylaminopurine, 10⁻⁵ M) in darkness 40 h prior to harvest.
Double-stranded Oligonucleotides, Nuclear Extracts from Tobacco, Southwestern Hybridization. Oligonucleotides for wild-type and mutant sequences were annealed (11) and cloned into the SmaI site of pBSC+ (Stratagene, San Diego) prior to sequencing. Plasmids were chosen in which the 5′-end of the promoter fragment was oriented toward the BamHI site of pBSC+. Thus all promoter fragments had the following 5′ and 3′ extensions: 5′-GATCCCCC[Insert]GGGCTGCA-3′ and 3′-GGGG[Insert]CCCG-5′. After digestion of the plasmids with PstI and BamHI, the BamHI site was filled in with all four radiolabeled nucleotides and the Klenow enzyme, and the fragments were gel-purified on agarose gels.
For the binding studies in the presence of CBF-C (1), nuclear protein extracts were isolated from six-week-old tobacco or Arabidopsis plants grown in the greenhouse. For physiological studies, seedlings were grown as described above, and the nuclear protein fractions were prepared as described (12).
* This work was supported by the German Research Foundation and the "Fond der Chemischen Industrie." The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Approximately 190,000 phages of an Arabidopsis thaliana cDNA expression library were screened with the radiolabeled AtpC promoter region (−68 to −39; see below). Two positive phages were plaque-purified, and pBSC+ was excised according to the manufacturer's instructions (Stratagene) prior to sequencing.
"Southwestern" hybridization was performed at 4°C. The filters were first blocked with blocking buffer by gentle shaking for 3 h (SW buffer, 1% bovine serum albumin, 1.5 mg/ml denatured salmon sperm DNA; SW buffer = 25 mM HEPES-NaOH, pH 7.6, 50 mM NaCl, 5 mM MgCl₂, 1 mM EDTA, 5% v/v glycerol, 1 mM dithiothreitol). After three washing steps (10 min each) with HS buffer (SW buffer with 1 mg/ml denatured salmon sperm DNA), the filters were incubated overnight with SW buffer in the presence of the radioactive DNA fragments. After washing with SW buffer, the signals were visualized on a PhosphorImager (Storm 820, Molecular Dynamics, Krefeld, Germany).
Overexpression and Purification of GST Fusion Proteins, Filter Binding Assays. The isolated cDNA as well as the cDNA encoding CBF-C (1) were N-terminally fused to GST according to the manufacturer's instructions (pGEX vector system from Amersham Pharmacia Biotech). Plasmids encoding GST fusion proteins were transformed into Escherichia coli strain BL21(DE3)pLysS. Cultures were grown overnight at 37°C in the presence of 40 µg/ml chloramphenicol and 100 µg/ml ampicillin, diluted 100-fold, and grown at 30°C to an optical density of 0.6 at 600 nm. Protein expression was induced with isopropyl-β-D-thiogalactopyranoside at a final concentration of 0.5 mM at 27°C. Cells were harvested by centrifugation (5,000 × g for 10 min at 4°C) 4 h after induction, resuspended in 1× phosphate-buffered saline, and sonicated. After centrifugation (12,000 × g for 15 min at 4°C), aliquots of the supernatant were applied directly onto freshly prepared glutathione-Sepharose 4B columns and further processed according to the manufacturer's instructions (Amersham Pharmacia Biotech). The purified GST and GST fusion proteins were separated by SDS-polyacrylamide gel electrophoresis (12.5%) and transferred onto nitrocellulose. Southwestern analysis was performed as described previously (1) with the radiolabeled DNA fragments. For gel mobility shift assays, equal amounts of the purified proteins were used.
Gel mobility shift assays were performed essentially as described (1, 12). CBF-C was incubated with nuclear protein extracts (1 µg/µl) in binding buffer (25 mM HEPES/KOH, pH 7.6, 50 mM KCl, 5 mM MgCl₂, 5% glycerol) for 30 min at 37°C. Finally, the DNA was added and the mixture incubated at 37°C for 2 h prior to gel electrophoresis. For competition studies, the purified ATPC-2·GST fusion protein, the CBF-C·GST fusion protein, or GST alone was added to the preformed radiolabeled protein·DNA complex and treated as described in the figure legends. In all of the studies the CBF-C fusion was preincubated with nuclear extract as described above, whereas the ATPC-2 fusion was added directly.

RESULTS
The AtpC Promoter Segment −69 to −39 Specifically Interacts with ATPC-2. A double-stranded oligonucleotide from the −69 to −39 AtpC region (5′-TTTACCTCCAAAATTCAATGGCCAAAATCT-3′) harboring the expression-relevant AAAAT and the CAAT motifs was used initially to screen an expression library from Arabidopsis. Positive signals were obtained only when the screen was performed in the presence of 1 mg/ml nuclear extract. The four isolated cDNAs encoded CBF-C, the subunit C of a new CAAT box binding complex isolated from plants (1). Therefore, a second screen was initiated with an oligonucleotide in which the CAAT sequence was replaced by GTTA, an oligonucleotide that fails to bind to CBF-C (1). Two positive phages with significantly lower binding strength were obtained. Both of them also bind to the wild-type sequence harboring the CAAT motif but not to a mutant sequence with a single nucleotide exchange (A → G) in the AAAAT sequence (data not shown; see below). DNA sequence analysis revealed that both phages carried identical DNA insertions that code for the C-terminal part of a novel protein named ATPC-2. After the complete nucleotide sequence of this gene became available in the data banks, we amplified the complete cDNA from our library by polymerase chain reaction and confirmed that the protein is encoded by a genomic sequence without introns (Fig. 1). A homologous cDNA was also isolated from our tobacco cDNA library; however, it is not full-length (data not shown).
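The positions of the two motifs within the screening oligonucleotide can be checked with a few lines of code. The following is a minimal sketch, not part of the original analysis: the sequence is the one quoted above (with the line-break hyphen removed), while the names `OLIGO` and `find_motif` are hypothetical helpers introduced only for illustration. Positions are 0-based offsets within the oligonucleotide, not promoter coordinates.

```python
# Screening oligonucleotide as quoted in the text (5' to 3').
OLIGO = "TTTACCTCCAAAATTCAATGGCCAAAATCT"


def find_motif(seq, motif):
    """Return all 0-based start positions of motif in seq (overlaps allowed)."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


# Illustrative lookups for the two motifs named in the text.
aaaat_sites = find_motif(OLIGO, "AAAAT")
caat_sites = find_motif(OLIGO, "CAAT")

print("length:", len(OLIGO))
print("AAAAT starts at:", aaaat_sites)
print("CAAT starts at:", caat_sites)
```

Note that a plain string search finds every occurrence of AAAAT in the oligonucleotide; the occurrence relevant to the text is the one lying directly upstream of the CAAT motif (the AAAATTCAAT core).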
Data bank searches revealed that the deduced amino acid sequence exhibits homologies to DNA helicases from pro- and eukaryotic organisms (Fig. 2). Surprisingly, the strongest homologies were observed to the immunoglobulin S Mu-binding protein 2 from mouse, a DNA helicase from human, and a putative DNA-binding protein from yeast (GenBank™ accession numbers P40694, L24544, and Z98951, respectively), whereas an Arabidopsis protein (GenBank™ accession number AB026643) with an unknown function exhibits less sequence similarity. Interestingly, the isolated protein also exhibits striking similarities to various bacterial proteins (amino acid sequence identity > 30%); however, the function of these proteins has not yet been determined.
Motif searches did not reveal any obvious functions of ATPC-2 except that the protein harbors 29 (partially overlapping) putative phosphorylation sites for signal transduction components (Fig. 1). Closer inspection revealed that the similarities to the mouse, human, and yeast proteins include the five well characterized N-terminal motifs known from DNA helicases (boxes I, Ia, II, III, and IV, Fig. 2). ATPC-2 as well as eight homologues with unknown function that are present in the data banks end after box IV (Fig. 2). Furthermore, the C-terminal parts of the latter protein group contain a segment of ~60 amino acids exhibiting between 65 and 85% amino acid sequence identity to each other (data not shown). This suggests that the Arabidopsis polypeptide belongs to a novel class of proteins present in various species and that this class of proteins exhibits striking similarities to the N termini of some DNA helicases (Fig. 2; cf. "Discussion"). In addition, a phylogenetic tree indicates that the eukaryotic proteins are closely related to each other, whereas the bacterial proteins appear to be more distantly related (data not shown).
Binding of ATPC-2 to the AtpC Promoter in Vitro. To test the DNA sequence requirement for the formation of the ATPC-2·DNA complex, a series of mutant sequences for which the transcriptional activity is known from chimeric promoter::uidA gene fusions in transgenic tobacco (1) was tested for its ability to bind to ATPC-2 and CBF-C. Both proteins were overexpressed in E. coli as GST fusion proteins, isolated first by glutathione-Sepharose 4B chromatography and then by SDS-gel electrophoresis (Fig. 3A). The isolated proteins were separated on a gel before transfer to nitrocellulose membranes. Filter binding assays (Fig. 3B) in the absence (left panels) or presence (right panels) of nuclear extracts demonstrate that the wild-type sequence (AAAATTCAAT, top panels) binds to ATPC-2 in both instances. As previously described (1), binding of CBF-C requires the addition of nuclear extracts (Fig. 3B, right top panel). The mutant sequence with an A → G exchange (AAGATTCAAT) fails to bind to ATPC-2, whereas CBF-C is still capable of binding in the presence of nuclear extracts (middle panels). In contrast, the CAAT mutant sequence (AAAATTGTTA) fails to bind to CBF-C, whereas it does bind to ATPC-2 (bottom panels). This was confirmed by gel mobility shift assays. Fig. 4 demonstrates that the binding activity is not affected by the CAAT box mutation (compare lanes 2 and 3 with lanes 4 and 5) and that the binding activity of ATPC-2 to the oligonucleotide with the A → G exchange is completely lost (lanes 6 and 7). A shorter oligonucleotide, which contains the AAAATT sequence but lacks the CAAT box sequence, also fails to bind to ATPC-2 (lanes 8 and 9); this indicates that ATPC-2 binding to the AtpC promoter is sequence-specific and requires additional nucleotides located 3′ to the AAAATT motif. Furthermore, we performed competition experiments between purified ATPC-2 and CBF with the wild-type oligonucleotide sequence (Fig. 5). 
A 10-fold molar excess of the ATPC-2·GST fusion protein cannot displace CBF-C·GST from the oligonucleotide, whereas the same molar excess of CBF-C·GST is sufficient for self-competition (Fig. 5, top panel). The same is true for the opposite experiment, in which DNA preloaded with ATPC-2·GST is used for competition studies; again the binding activity cannot be competed with a 10-fold molar excess of CBF-C·GST, whereas ATPC-2·GST is effective (Fig. 5, bottom panel). This indicates that both binding activities are tight and that either ATPC-2 or CBF can bind to this promoter region. The assembly of both complexes simultaneously is not possible for steric reasons (cf. "Discussion").
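The relationship between the sequence variants and the filter binding results can be summarized compactly. The sketch below is illustrative only: the three core sequences and the binding outcomes are taken from the text describing Fig. 3B, whereas the variable names, the labels, and the `substitutions` helper are assumptions made for this example. Positions are 1-based within the 10-nucleotide core, not promoter coordinates.

```python
# Core sequences quoted in the text (wild type and the two mutants).
WILD_TYPE = "AAAATTCAAT"
VARIANTS = {
    "A->G mutant": "AAGATTCAAT",
    "CAAT mutant": "AAAATTGTTA",
}

# Filter binding outcomes as reported in the text for Fig. 3B:
# (binds ATPC-2, binds CBF-C in the presence of nuclear extract).
REPORTED_BINDING = {
    "wild type": (True, True),
    "A->G mutant": (False, True),
    "CAAT mutant": (True, False),
}


def substitutions(reference, variant):
    """List (1-based position, reference base, variant base) per mismatch."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        (i + 1, ref_base, var_base)
        for i, (ref_base, var_base) in enumerate(zip(reference, variant))
        if ref_base != var_base
    ]


# Map each variant to the positions at which it differs from the wild type.
diffs = {name: substitutions(WILD_TYPE, seq) for name, seq in VARIANTS.items()}

for name, changes in diffs.items():
    binds_atpc2, binds_cbf_c = REPORTED_BINDING[name]
    print(name, changes, "ATPC-2:", binds_atpc2, "CBF-C:", binds_cbf_c)
```

Laid out this way, the single exchange at position 3 coincides with the loss of ATPC-2 binding, and the four exchanges at positions 7 to 10 coincide with the loss of CBF-C binding, which is the pattern the text describes.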
A correlation of the promoter activity in vivo (1, 3) and its capability to bind to ATPC-2 is presented in Fig. 6. Nuclear extracts from light-grown seedlings showed little binding activity to the wild-type oligonucleotide, whereas extracts from dark-grown seedlings showed substantially higher retardation signals. The same correlation between a low binding activity and a high promoter activity was observed with nuclear extracts from etiolated seedlings treated with cytokinin (high promoter activity (1)) and from light-grown seedlings treated with the herbicide Norflurazon to block chloroplast development (low promoter activity (1)). This is exactly opposite to the result that we obtained previously for CBF binding activity with the same oligonucleotides (1). Furthermore, low retardation signals were always observed when the A → G mutant sequence was used for the retardation assay regardless of the physiological treatment of the seedlings; this is consistent with the idea that ATPC-2 prevents the assembly of the CBF at the AtpC promoter and thus functions as a transcriptional repressor.
FIG. 2. Schematic presentation of the modular organization of helicases belonging to superfamily I (top row) and ATPC-2 (bottom row). The boxes represent the conserved helicase motifs, and letters inside the boxes are the consensus amino acid sequences of each motif. The labels above the boxes are the names assigned to the motifs. +, hydrophobic residue (Ile, Leu, Val, Met, Phe, Tyr, and Trp); o, charged or polar residue (Ser, Thr, Asp, Glu, Asn, Gln, Lys, and Arg); x, residue not restricted to hydrophobic or hydrophilic. The lowercase letters in the ATPC-2 boxes show residues that are not consistent with the consensus amino acid sequences. The relative positions of the motifs and spacing between motifs are arbitrary (cf. Ref. 30). The conserved region between boxes IV and V of the helicase superfamily I is indicated.
... 3, 5, 7, and 9) and GST alone (lanes 2, 4, 6, and 8) to the wild-type (lanes 2 and 3) and mutant (lanes 4–9) AtpC promoter sequences. Equal amounts (5 ng) of the following oligonucleotides cloned into the SmaI site of pBSC+ and excised with BamHI (5′-end) and PstI (3′-end) were used: lanes 1–3, wild-type sequence AAAATTCAAT; lanes 4 and 5, AAAATTGTTA; lanes 6 and 7, AAGATTCAAT; lanes 8 and 9, AAAATTC. Lane 1, without protein.
All of the experiments presented so far were performed with ATPC-2 and/or CBF-C from Arabidopsis and with nuclear extracts from tobacco. To test whether the AtpC promoter segment is regulated in a similar fashion in an Arabidopsis background and to confirm that the ATPC-2 repressor exhibits a comparable function in both species, the experiments shown in Figs. 5 and 6 were repeated with nuclear protein fractions from Arabidopsis. Essentially the same results were obtained, except that the retardation signals were significantly weaker (data not shown), which might be caused by contamination of the nuclear protein fraction with extranuclear proteins.

DISCUSSION
In many eukaryotic class II promoters, CCAAT motifs are often found between 50 and 100 nucleotides upstream of the transcription start site (13–16). These motifs are recognized by CCAAT-binding proteins. The CAAT box sequences in metazoa, which are believed to influence the frequency of transcriptional initiation, can be the target site for regulation, and the assembly of CBF at the CAAT motif occurs in response to cell-internal or -external signals (8, 17–28). Here we demonstrate that the binding activity of the CBF to the CAAT motif in the AtpC promoter is controlled by ATPC-2. Whenever the promoter is active, we observed a low binding activity of ATPC-2 and a high binding activity of CBF and vice versa (Figs. 5 and 6).
ATPC-2 exhibits striking similarities to the N-terminal half of DNA helicases belonging to superfamily I (29). These helicases couple the binding and hydrolysis of nucleotides, mostly ATP, to conformational changes that in turn alter the affinity of the enzymes for different forms of DNA. These helicases have a modular structure, and sequence comparisons of more than 100 members from viral, prokaryotic, and eukaryotic organisms have revealed seven conserved amino acid motifs, the boxes I, Ia, and II–VI. Motif I contains a hydrophobic stretch known as a nucleoside triphosphate-binding domain, and motif II is responsible for Mg²⁺-ATP binding. Motif III is homologous to the conserved sequence of a viral DNA polymerase, and the function of the remaining motifs is unknown. Sequence analyses indicate that the greatest variation in the primary structure is found in the regions between motifs Ia and II and between IV and V, as well as at the N and C termini (29, 30). Fig. 2, however, demonstrates that 9 of the 12 polypeptides with the greatest similarities to ATPC-2 are significantly shorter than the well characterized helicases and lack motifs V and VI. In addition, these proteins, including ATPC-2, exhibit the highest degree of similarity in the C-terminal region following motif IV, a segment that is not characteristic of all DNA helicases but is found in some of them, as shown in Fig. 2. This suggests that the shorter polypeptides, including ATPC-2, belong to a novel class of proteins that are either ancestors of some helicases or derive from helicases that have lost their C-terminal segment with boxes V and VI. Although the shorter proteins are found in various pro- and eukaryotic species, no functional analysis has yet been performed on any of them. Because of their striking similarities and identical modular structure, it is likely that they perform similar functions.
Although all helicases of superfamily I share common biochemical properties and their biochemical reactions are quite similar, multiple mechanisms of helicase-catalyzed nucleic acid unwinding may have evolved to support specific roles in various biological processes, such as DNA replication, DNA repair, cell cycle control, or transcription. They recognize different DNA motifs and DNA ends as well as single-stranded and double-stranded DNA segments (29, 31–37). ATPC-2, a truncated helicase capable of binding DNA (Figs. 4–6) in a sequence-specific manner, discriminates between AAAATT and AAGATT. We have demonstrated previously that the former sequence forms a secondary structure in vitro, such as a partially unwound DNA region, whereas the latter does not (1). Thus, one might speculate that the sequence-specific recognition has a structural basis and that this structure provides an ideal recognition site for a polypeptide with helicase features. Although the function of the missing helicase boxes V and VI is not clear, it is unlikely that a truncated helicase is capable of unwinding DNA. Therefore, a simple model could describe the role of ATPC-2 in AtpC gene regulation. The AAAAT sequence forms a stable protein·DNA complex and thus prevents the assembly of CBF, resulting in diminished AtpC transcription. Both factors cannot bind simultaneously because of their overlapping binding regions (1). This concept is supported by crystal structural data from the yeast CBF, which protects a DNA region of at least 20 nucleotides (8). Activators of AtpC transcription, such as light or cytokinin, prevent repressor binding and thus allow for the assembly of CBF; this is consistent with the observation that the A → G mutation, which prevents binding of the repressor but allows the assembly of CBF (Fig. 3), results in a constitutive promoter activity in vivo (1). 
This novel mechanism of gene regulation might add a new facet to the observations that etiolation is caused by processes that actively prevent the development observed in light (38).