DNA Binding Properties of TAF1 Isoforms with Two AT-hooks

TATA-binding protein-associated factor 1 (TAF1) is an essential component of the general transcription factor IID (TFIID), which nucleates assembly of the preinitiation complex for transcription by RNA polymerase II. TATA-binding protein and TAF1·TAF2 heterodimers are the only components of TFIID shown to bind specific DNA sequences (the TATA box and initiator, respectively), raising the question of how TFIID localizes to gene promoters that lack binding sites for these proteins. Here we demonstrate that Drosophila TAF1 protein isoforms TAF1-2 and TAF1-4 directly bind DNA independently of TAF2. DNA binding by TAF1 isoforms is mediated by cooperative interactions of two identical AT-hook motifs, one of which is encoded by an alternatively spliced exon. Electrophoretic mobility shift assays revealed that TAF1-2 bound the minor groove of adenine-thymine-rich DNA with a preference for the sequence AAT. Alanine-scanning mutagenesis of the alternatively spliced AT-hook indicated that Lys and Arg residues made essential DNA contacts, whereas Gly and Pro residues within the Arg-Gly-Arg-Pro core sequence were less important for DNA binding, suggesting that AT-hooks are more divergent than previously predicted. TAF1-2 bound with variable affinity to the transcription start site of several Drosophila genes, and binding to the hsp70 promoter was reduced by mutation of a single base pair at the transcription start site. Collectively, these data indicate that AT-hooks serve to anchor TAF1 isoforms to the minor groove of adenine-thymine-rich Drosophila gene promoters and suggest a model in which regulated expression of TAF1 isoforms by alternative splicing contributes to gene-specific transcription.

Binding of core promoters by the general transcription factor TFIID correlates with transcription activity for most RNA polymerase II genes (1-3). In metazoans, core promoters for RNA polymerase II genes comprise ~70 bp surrounding the transcription start site (TSS) and are made up of ~8-bp elements (4,5). Core promoter elements include the TATA box located 26-31 bp upstream of the TSS, the initiator element located at the TSS, and the downstream promoter element located 28-33 bp downstream of the TSS. Multiple subunits of TFIID engage in DNA binding at the core promoter. TATA-binding protein binds the TATA box, and TATA-binding protein-associated factors (TAFs) bind downstream elements (6,7). Several lines of evidence point to the importance of TAF1-core promoter DNA interactions. Studies in yeast and mammalian cells indicate that core promoters, not activator-binding sites, render genes TAF1-dependent (8-10); cross-linking studies in Drosophila place TAF1 in close proximity to the TSS of the hsp70 gene (11); studies with purified human TFIID reveal TAF1 as the major TAF species that can be cross-linked to the downstream core element of several promoters (12); and TAF1·TAF2 heterodimers preferentially bind initiator sequences in vitro (8,13,14). However, TAF1 domains required for core promoter interactions have not been identified, and direct DNA binding by TAF1 on its own has not been observed.
TFIID is a family of complexes with different TAF components (15,16). In Drosophila and humans, germline homologs of general TAFs are necessary for transcription during germ cell differentiation (17-21). In addition, changes in the TAF composition of TFIID result from alternative splicing of general TAF pre-mRNAs. For example, the human TAF6 pre-mRNA is alternatively spliced in response to apoptotic signals, and the encoded TAF6 isoform, TAF6δ, modulates the transcription of pro-apoptotic genes (22). Thus, regulated expression of TAF isoforms is an important determinant of gene-specific transcription.
In Drosophila, alternative splicing of the TAF1 pre-mRNA generates four TAF1 mRNA isoforms that differ by the inclusion of two small exons, 12a and 13a. The functional importance of TAF1 alternative splicing is suggested by the fact that TAF1 isoform levels differ between Drosophila tissues and are regulated in response to genotoxic stress. TAF1-1 and TAF1-3 encode one predicted AT-hook motif, whereas TAF1-2 and TAF1-4 encode two predicted AT-hooks. The DNA-binding AT-hook motif contains a central core of Arg-Gly-Arg-Pro residues flanked by Lys and Arg residues (24). AT-hooks were first described in the high mobility group non-histone chromosomal protein HMGA (also known as HMG-I(Y)) and have been shown to bind DNA through contacts with the minor groove of adenine-thymine (A-T) tracts (25,26). The AT-hook motif is found in one or more copies in a large number of transcription factors and components of chromatin remodeling complexes (24,27,28). AT-hooks may serve as accessory DNA-binding domains that anchor chromatin-associated proteins to particular DNA sequences or participate in cooperative DNA binding with other proteins. Here we demonstrate direct DNA binding by TAF1-2 and TAF1-4, the two TAF1 proteins that contain two AT-hooks, at the TSS of several Drosophila genes. We find that Lys and Arg residues of the AT-hook are critical for DNA binding, whereas Ala substitutions are tolerated at highly conserved Gly and Pro residues. TAF1 binds the minor groove of DNA with a preference for A-T tracts, especially an AAT sequence element. Direct DNA binding by TAF1 isoforms whose expression is regulated by developmental and stress signals has implications for the mechanism of gene-specific transcription initiation.
Purification of Recombinant TAF1-BL21 Rosetta cells (Novagen) were transformed with pET21a-TAF1 expression plasmids and grown to mid-log phase, and protein expression was induced by addition of isopropyl-β-D-thiogalactopyranoside to a concentration of 150 µM. Cell lysis was carried out using a cell lytic kit (Sigma). His-tagged TAF1 fusion proteins were purified on nickel-nitrilotriacetic acid columns (Novagen). Eluted proteins were concentrated using Amicon columns (Millipore), and the concentration was checked by Bradford assay (Bio-Rad), equalized by dilution, and dialyzed against Buffer A (50 mM Tris, pH 7.5, 0.5 mM EDTA, 100 mM NaCl, 10% glycerol, 1 mM dithiothreitol) overnight. TAF1-2 protein used in the DNase I footprinting assay was purified as above except that protein eluted from the nickel-nitrilotriacetic acid column was further purified by gel filtration chromatography on a Sephacryl S300 column (Amersham Biosciences) pre-equilibrated with Buffer A containing 500 mM NaCl. Fractions containing TAF1-2 were concentrated and dialyzed against Buffer A.
Electrophoretic Mobility Shift Assay-For EMSAs, complementary oligonucleotides were annealed according to the manufacturer's protocol (Integrated DNA Technology). dsDNA probes were 5′ end-labeled with T4 polynucleotide kinase and [γ-32P]ATP and analyzed on 20% polyacrylamide gels to verify >95% homogeneity prior to use. Sequences of probe plus strands (oriented 5′ to 3′) are indicated in Fig. 5A. Protein·DNA complexes were formed by incubation for 30 min at 25°C in a volume of 20 µl under the following conditions: 10 mM Tris, pH 7.5, 100 mM NaCl, 0.5 mM EDTA, 1 mM dithiothreitol, 100 µg of bovine serum albumin/ml, 5% glycerol, 50 ng of poly(dG-dC)/µl, and 40 fmol of probe. The reactions were fractionated on 6.5% (29:1 acrylamide:bisacrylamide) native gels that included 0.5× TBE (Tris-borate-EDTA). Gels (18 cm × 20 cm × 1 mm) were electrophoresed at 4°C for 2.5 h at 10 V/cm before exposure to x-ray film or a phosphorimaging plate. Quantitation was performed with ImageQuant software (Molecular Dynamics).
Fig. 1 legend (fragment): Below are schematic diagrams of the cDNAs used to express TAF1 protein isoforms. An alignment is shown of amino acids encoded by the 3′ end of exon 12 and exon 12a from Drosophila species that evolutionarily diverged over 40 million years ago (30). The numbering refers to CG17603-PC. The species are arranged based on evolutionary distance from Drosophila melanogaster. AT-hook motifs are in bold type.
DNase I Footprinting Assay-The 326-bp Sdic DNA probe used for footprinting contained 265 bp of the Sdic promoter (corresponding to nucleotides 11281-11545 of AF070687) flanked by vector sequence. Binding reactions contained 32P-end-labeled probe, 20 mM Tris, pH 7.5, 75 mM NaCl, 0.5 mM EDTA, 0.5 mM dithiothreitol, 50 ng of poly(dG-dC), 500 ng of bovine serum albumin, and the indicated concentrations of TAF1-2. The reactions were incubated on ice for 30 min, supplemented with MgCl2 to 5 mM, CaCl2 to 2.5 mM, and DNase I to 0.16 units (Promega), and incubated at 25°C for 45 s. The reactions were terminated by addition of an equal volume of stop solution containing 200 mM NaCl, 30 mM EDTA, 1% SDS, and 100 µg/ml yeast RNA. Following phenol extraction and ethanol precipitation, the samples were resuspended in formamide loading buffer and fractionated by electrophoresis on an 8% denaturing polyacrylamide gel. For sequencing markers, Sdic DNA was chemically cleaved by Maxam-Gilbert A > C and G + A reactions (29). After drying, the gels were visualized by phosphorimaging (Molecular Dynamics).

TAF1 Alternative Exon 12a Encodes an AT-hook Motif-Comparison of conceptually translated Drosophila TAF1 exon 12a to protein motif databases revealed a 9-amino acid DNA-binding motif called an AT-hook (Fig. 1). Hereafter, we refer to this motif as AT-hook 12a. An identical sequence is encoded by exon 12, and we refer to this motif as AT-hook 12. Thus, TAF1-1 and TAF1-3 proteins contain one AT-hook, whereas TAF1-2 and TAF1-4 proteins contain two AT-hooks separated by a 14-amino acid spacer. The functional importance of TAF1 AT-hooks is suggested by their sequence conservation between Drosophila species (30). TAF1 AT-hooks are identical in sequence to the first AT-hook in human HMGA1 and are classified by Huth et al. (26) as low affinity type II AT-hooks because they have a central Gly-Arg-Pro peptide with flanking Lys and Arg residues but lack additional carboxyl-terminal residues that contact the sugar-phosphate backbone of DNA. Here we have focused on determining functional differences between TAF1 proteins that contain one versus two AT-hooks and characterizing the DNA binding properties of TAF1 AT-hooks.
TAF1 Isoforms with Two AT-hooks Directly Bind DNA-To determine the extent to which predicted AT-hooks 12 and 12a bind DNA, we performed EMSAs using a 34-bp fragment of the human interferon-β gene enhancer (INF-β) as a probe and His-tagged recombinant proteins encoded by exons 12-14 of the four TAF1 isoforms (Figs. 1, 2A, and 5A) (31-33). INF-β, which contains PRDIII-1, PRDII, and NRDI enhancer elements, was chosen as a probe because it is bound by HMGA AT-hooks in vivo and in vitro (26,31-37). TAF1-2 and TAF1-4, the isoforms that contain two AT-hooks, altered the mobility of the INF-β probe, indicating that they bind INF-β DNA (Fig. 2B). In contrast, TAF1-1 and TAF1-3, the isoforms that contain one AT-hook, did not affect the mobility of the INF-β probe, indicating that they do not bind INF-β DNA. These findings suggest that AT-hook 12 is not sufficient for DNA binding, whereas AT-hook 12a in combination with AT-hook 12 is competent for DNA binding. Therefore, alternative splicing of the TAF1 pre-mRNA is a mechanism to control the DNA binding activity of TAF1 proteins.
To determine whether binding of TAF1-2 to INF-β DNA depends on AT-hook 12a or both AT-hooks, we assayed DNA binding of TAF1-2 isoforms whose AT-hook motifs were mutated. TAF1-2 isoforms (mutant AT-hook 12 (mutAT1), mutant AT-hook 12a (mutAT2), and both mutant AT-hooks (mutAT1/2)) were produced in which the AT-hooks were mutated from KRGRGRPRK to KRGDGRPRK (Fig. 2, A and C). In the HMGA NMR structure, the corresponding basic Arg residue makes extensive contact with DNA bases within the minor groove, implying that substitution of this residue with an acidic Asp would disrupt the DNA binding ability of the AT-hook (26). EMSAs showed that DNA binding by mutAT1 or mutAT2 was reduced more than 15-fold relative to TAF1-2, and DNA binding by mutAT1/2 was reduced below detectable levels (Fig. 2C). These results indicate that both AT-hooks contribute to binding of INF-β DNA by TAF1-2 and presumably TAF1-4. Therefore, inclusion of exon 12a confers a DNA binding activity upon TAF1-2 and TAF1-4 that is not present in TAF1-1 or TAF1-3. Furthermore, the DNA binding affinity of the AT-hooks together was greater than the sum of their individual affinities, suggesting that TAF1 AT-hooks bind DNA in a cooperative manner.
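The cooperativity argument can be made concrete with a little arithmetic. The sketch below assumes, purely for illustration, that relative EMSA binding would simply add if the two AT-hooks acted independently, and it treats the reported 15-fold reduction as an exact upper bound on each single-hook activity. The function name and this additive model are assumptions for illustration, not the analysis used in the study.

```python
def additive_expectation(single_hook_activities):
    """Sum relative binding activities, each given as a fraction of wild type."""
    return sum(single_hook_activities)


# Each single-hook mutant (mutAT1, mutAT2) retained less than 1/15 of
# wild-type binding, so 1/15 serves as an upper bound for each (assumption:
# the bound is used as if it were the measured value).
upper_bound = 1.0 / 15.0
expected_if_additive = additive_expectation([upper_bound, upper_bound])

# Wild-type TAF1-2 binding is defined as 1.0; the ratio is therefore a lower
# bound on how far the intact protein exceeds the additive expectation.
excess_over_additive = 1.0 / expected_if_additive

print(f"additive expectation: at most {expected_if_additive:.3f} of wild type")
print(f"wild type exceeds it by at least {excess_over_additive:.1f}-fold")
```

Under these assumptions, the two hooks together bind several-fold better than the sum of the single-hook activities, which is the sense in which the binding is described as cooperative.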
To establish the apparent equilibrium dissociation constant (KD(app)) for TAF1-2, we measured the fraction of probe bound in reactions with different concentrations of TAF1-2. Based on maximal binding of the INF-β probe by TAF1-2 at ~85%, we estimate the KD(app) to be 1 µM (data not shown). This value falls within the nM to mM range of KD(app) for other AT-hook proteins and is comparable with the KD(app) of 4-10 µM determined for the HMGA high affinity AT-hook DBD3 bound to a PRDII dodecamer (26,38).
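One common way to estimate an apparent dissociation constant from EMSA titration data is to fit a single-site binding curve with a plateau below 100%. The sketch below is a minimal illustration under stated assumptions: it uses a simple hyperbolic model, a double-reciprocal linear fit (which weights noisy low-concentration points heavily), and synthetic data points generated from the model. The concentrations, the function names, and the fitting method are all hypothetical; the source reports the titration only as data not shown.

```python
def fraction_bound(protein_um, kd_um, bmax=1.0):
    """Single-site binding: fraction of probe shifted at a protein concentration (µM)."""
    return bmax * protein_um / (kd_um + protein_um)


def fit_double_reciprocal(concs_um, fractions):
    """Estimate (KD, Bmax) by least squares on 1/f = 1/Bmax + (KD/Bmax) * (1/[P])."""
    xs = [1.0 / c for c in concs_um]
    ys = [1.0 / f for f in fractions]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    bmax = 1.0 / intercept      # plateau of the binding curve
    kd = slope * bmax           # concentration giving half of the plateau
    return kd, bmax


# Synthetic titration (hypothetical values): KD = 1 µM, plateau at 85% bound.
concs = [0.125, 0.25, 0.5, 1.0, 2.0, 4.0]
observed = [fraction_bound(c, kd_um=1.0, bmax=0.85) for c in concs]

kd_est, bmax_est = fit_double_reciprocal(concs, observed)
print(f"KD(app) ~ {kd_est:.2f} µM, maximal binding ~ {bmax_est:.0%}")
```

With real gel quantitation, a nonlinear least-squares fit would usually be preferred over the double-reciprocal transform; the linear form is used here only to keep the example dependency-free.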
Mutational Analysis of TAF1-2 AT-hook 12a-To determine which amino acids contribute to DNA binding by AT-hook 12a, we performed alanine-scanning mutagenesis of the 33 amino acids encoded by exon 12a. TAF1 provides an ideal model for characterizing an AT-hook, because residues necessary for DNA binding are contained within the polypeptide encoded by alternatively spliced exon 12a. EMSAs were performed with the INF-β probe and each of the 31 mutant proteins resulting from all possible single Ala substitutions (2 of 33 residues are Ala in the wild type protein) (Fig. 3). Experiments performed using different methods of comparison with wild type TAF1-2 DNA binding resulted in qualitatively identical results (Fig. 3C).
In experiments such as that shown in Fig. 3A, EMSAs were performed using TAF1-2 mutant proteins at 0.5 µM, the concentration required for half-maximal binding by wild type TAF1-2 protein. In experiments such as those shown in Fig. 3B, EMSAs were performed over a range of concentrations, starting with proteins of equal concentration. Evaluation of the fraction of probe shifted by mutant proteins relative to wild type protein revealed that substitution of Lys1813, Arg1814, Arg1816, Arg1818, Arg1820, or Lys1821 with Ala decreased binding more than 8-fold, indicating that these residues are involved in critical DNA contacts. This result is consistent with limited mutational analysis of HMGA, which shows that Arg residues within DBD1 and DBD2 are critical for binding dsDNA (36). Substitution of Gly1817 or Pro1819 with Ala decreased binding less than 2-fold, suggesting that these residues do not make important interactions with DNA and may instead be important for appropriate spacing of adjacent Arg side chains. The remaining 23 mutant TAF1-2 proteins that contained individual Ala substitutions at positions Val1806-Val1812, Gly1815, or Asp1822-Glu1838 possessed >90% of the DNA binding activity of wild type TAF1-2, indicating that these positions do not strongly contribute to binding by TAF1-2. These data show that the functional limits of AT-hook 12a are confined to a 9-amino acid region and raise questions about the functional importance of Gly and Pro residues, within the Arg-Gly-Arg-Pro core sequence, which are invariant in the current definition of an AT-hook (24).
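The design of an alanine scan can be illustrated by enumerating the single-substitution variants of a peptide. The sketch below does this only for the 9-residue AT-hook 12a core (KRGRGRPRK, taken here to span residues 1813-1821, consistent with the residue numbers given above), because the full 33-residue exon 12a sequence is not reproduced in the text. The function name is hypothetical, and the skipping of positions that are already Ala mirrors the 31-of-33 count described for the full scan.

```python
def alanine_scan(peptide, first_residue_number):
    """Return (name, sequence) pairs for every single Ala substitution.

    Positions that are already Ala are skipped, since substituting them
    would reproduce the wild-type sequence.
    """
    variants = []
    for offset, residue in enumerate(peptide):
        if residue == "A":
            continue
        name = f"{residue}{first_residue_number + offset}A"
        mutant = peptide[:offset] + "A" + peptide[offset + 1:]
        variants.append((name, mutant))
    return variants


# AT-hook 12a core; the starting residue number is inferred from the text.
AT_HOOK_12A = "KRGRGRPRK"
for name, sequence in alanine_scan(AT_HOOK_12A, 1813):
    print(name, sequence)
```

Each generated name follows the usual wild-type residue, position, substituted residue convention, so the Gly and Pro substitutions discussed above appear as G1817A and P1819A.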
TAF1-2 Binds dsDNA in the Minor Groove-HMGA AT-hooks bind the minor groove of DNA with a preference for A-T tracts. To determine whether TAF1 AT-hooks exhibit similar specificity, we performed EMSAs with TAF1-2 and single-stranded or double-stranded INF-β probes. As shown in Fig. 4A, TAF1-2 altered the mobility of the double-stranded INF-β probe but neither the top nor bottom single-stranded INF-β probes. As expected, mutAT1/2 failed to bind all three probes. Therefore, DNA binding by the TAF1-2 AT-hooks is specific for dsDNA and is not the result of simple electrostatic interaction between a solvent-exposed basic patch and the acidic DNA backbone of single-stranded DNA.
To determine the specificity of TAF1-2 for structural features of dsDNA, we performed EMSAs with TAF1-2 and the INF-β probe in the presence of increasing concentrations of unlabeled dsDNA competitors: poly(dG-dC), poly(dA-dT), or poly(dI-dC). Poly(dI-dC) effectively competes with A-T-rich sequences for binding by HMGA, presumably because poly(dI-dC) DNA adopts a structure with a narrow minor groove more similar to poly(dA-dT) than poly(dG-dC) (37). As shown in Fig. 4B, poly(dA-dT) and poly(dI-dC) competed with INF-β for binding to TAF1-2 at 4 µg/20 µl, but poly(dG-dC) did not. These observations are consistent with a model in which TAF1-2 AT-hooks recognize the structural characteristics of the minor groove of dsDNA with a preference for the narrow minor groove of A-T-rich DNA.
TAF1-2 Binds to the TSS Sequence of Several Promoters-Because TAF1 is a component of TFIID and is expected to make physiologically relevant interactions with core promoter DNA, we assayed the ability of TAF1-2 to bind DNA sequences derived from core promoters of several Drosophila genes. In EMSAs, TAF1-2 protein bound 40-bp probes centered on the Sdic (sperm-specific dynein intermediate chain) TSS, the hsp70 (heat-shock-protein 70bc) TSS, the β2t (β2-tubulin) TSS, and the dj (don juan) TSS, but not the Cdic (cytoplasmic dynein intermediate chain) TSS (Fig. 5). Sdic, β2t, and dj are highly expressed in Drosophila testes, whereas Cdic is ubiquitously expressed (39-41). Because TAF1-2 is the most abundant TAF1 isoform in testes, the observation that TAF1-2 can bind core promoter sequence of the testis-specific Sdic, β2t, and dj genes in vitro raises the possibility that TAF1-2 may contribute to the transcriptional activation of these genes in vivo.
TAF1-2 Binds DNA That Contains an AAT Sequence-In the HMGA NMR structure, AT-hooks intimately contact the INF-β PRDII sequence AAT (26). To test whether TAF1-2 also binds AAT, we performed EMSA with TAF1-2 and the INF-β probe or the INF-β mut probe, in which the AAT sequence was changed to AGT (Fig. 5A). This experiment revealed that TAF1-2 weakly bound the INF-β mut probe, suggesting that TAF1-2 AT-hooks directly interact with the AAT sequence (Fig. 5B). The hsp70 TSS probe, to which TAF1-2 strongly bound, contained two AAT sequences (Fig. 5A). Mutation of one of these sequences to AGT, in the hsp70 mut probe, substantially reduced binding by TAF1-2 (Fig. 5B). Attempts to determine a minimal DNA sequence bound by TAF1-2 revealed that binding decreased with the length of the DNA probe, even with probes containing AAT sequences, suggesting that high affinity binding requires additional contacts (data not shown). Taken together these experiments define AAT as a preferred site for binding by TAF1-2 AT-hooks.
Fig. 4 legend (fragment): ... 1, 2, 4, and 5) or double-stranded INF-β probe (lanes 3 and 6) and TAF1-2 (lanes 1-3) or mutAT1/2 (lanes 4-6) are shown. A and B refer to the sense and antisense strands, respectively. Positions of the shifted TAF1·DNA complex and free DNA are indicated on the right. B, EMSAs using radiolabeled INF-β probe and TAF1-2 are shown. Shifted TAF1-2·DNA complex was competed with increasing amounts of poly(dG-dC) (dG-dC, lanes 2-5), poly(dI-dC) (dI-dC, lanes 6-9), or poly(dA-dT) (dA-dT, lanes 10-13). Dashes denote no protein, and triangles denote increasing amounts of indicated competitor DNA in 2-fold increments. Positions of the shifted TAF1·DNA complex and free DNA are indicated on the right.
OCTOBER 6, 2006 • VOLUME 281 • NUMBER 40
To further refine our understanding of TAF1 DNA binding specificity, we used DNase I footprinting to identify TAF1-2 interaction sites on the Drosophila Sdic promoter (40). This analysis revealed five regions of protection by TAF1-2 on the 265-bp Sdic probe, which is centered at the TSS (Fig. 6). One protected site was located just upstream of the Sdic TSS and within the region predicted to be bound by TFIID. Each of the five protected regions contained >66% A-T bp, and taken together the five regions contained 75% A-T bp. Four of the five protected regions contained the AAT sequence element. These data indicate TAF1-2 AT-hooks bind DNA with a preference for the narrow minor groove of A-T-rich sequences and especially the sequence AAT.
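The two sequence features used to describe the protected regions, A-T content and the presence of an AAT element, are straightforward to compute. The sketch below is a minimal illustration; the example sequence is hypothetical (the protected Sdic sequences are not reproduced in the text), the function names are invented for this example, and scanning both strands for AAT is an assumption about how the element might be scored.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def at_fraction(seq):
    """Fraction of positions that are A or T (equivalently, A-T base pairs)."""
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


def reverse_complement(seq):
    """Sequence of the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_motif(seq, motif="AAT"):
    """Return sorted (position, strand) hits for the motif on either strand.

    Positions are 0-based on the given strand; a '-' hit means the motif is
    read on the complementary strand. A palindromic motif would be reported
    on both strands at the same position.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical protected region, for illustration only.
example_region = "GCAATTTAGC"
print(f"A-T content: {at_fraction(example_region):.0%}")
print("AAT hits:", find_motif(example_region))
```

Applied to each footprinted region, the same two functions would reproduce the kind of summary given above (percent A-T per region and whether an AAT is present).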

DISCUSSION
This work provides evidence that Drosophila TAF1 directly interacts with DNA through AT-hook motifs. Drosophila appears to be the only organism in which TAF1 contains AT-hooks, which may account for the different DNA binding capabilities of Drosophila relative to mammalian TAF1s (7,8,13).⁴ TAF1-2 AT-hooks bound, with variable affinity, TSS sequences from several Drosophila genes, and mutation of the initiating A-T bp in the hsp70 TSS probe drastically reduced binding (Fig. 5B). Thus, we speculate that evolutionary accretion of Drosophila TAF1 AT-hooks may have occurred to recognize A-T-rich promoters. Aerts et al. (42) found differences in the base composition surrounding TSSs in Drosophila versus vertebrates. Both Drosophila and vertebrates have a greater A-T than G-C content in intergenic regions. But although Drosophila have increased A-T content at the TSS, relative to intergenic regions, vertebrates have a greater G-C than A-T content at the TSS. Furthermore, Drosophila genes can be divided into classes with small and large increases in A-T content at the TSS, which suggests a functional meaning to the A-T content of Drosophila promoters, possibly analogous to CpG islands in vertebrates (43). Hence, TAF1 may add to a growing list of proteins that have acquired AT-hooks in a particular organism or species to regulate DNA-dependent events through A-T-rich elements (44). For example, in Schizosaccharomyces pombe, the amino terminus of the Orc4p subunit of the origin recognition complex contains nine AT-hook motifs, which bind DNA replication origins that are extremely rich in A-T bp (45). AT-hooks in Orc4p and extremely A-T-rich replication origins do not occur in other eukaryotic organisms, including other yeast species such as Saccharomyces cerevisiae. In addition, in plants there are several examples of AT-hook-containing transcription factors that activate transcription through binding A-T-rich enhancer elements (46,47). These observations imply that other species-specific AT-hook proteins regulate DNA-dependent events controlled by A-T-rich elements.
⁴ C. E. Metcalf and D. A. Wassarman, unpublished observation.
This work has also extended our understanding of AT-hook-DNA interactions. TAF1 isoforms that contain one versus two AT-hooks greatly differ in their capacity to bind DNA. Neither TAF1 AT-hook 12 nor 12a alone has appreciable DNA binding activity, but together the AT-hooks bind DNA with low micromolar affinity. This finding highlights the importance of intramolecular cooperativity of AT-hooks for DNA binding, as shown for HMGA family proteins (26,31,34).
Like HMGA family proteins, TAF1-2 binds the minor groove of A-T-rich dsDNA (26). DNA binding by TAF1-2 was competed by poly(dA-dT) and poly(dI-dC), which have a similar narrow minor groove, but not poly(dG-dC), and TAF1-2 failed to bind single-stranded DNA (Fig. 4). Thus, although TAF1 and HMGA AT-hooks have different sequences and spacing, they appear to bind DNA in a similar manner.
Systematic mutational analysis of the AT-hook region encoded by exon 12a supports the definition of the AT-hook as a 9-amino acid Lys/Arg-rich motif but brings into question the importance for DNA binding of Gly and Pro residues within the Arg-Gly-Arg-Pro core sequence (24). Although 25 of 33 amino acids encoded by exon 12a are conserved among Drosophila species, only substitution of the six conserved Lys or Arg amino acids within the predicted AT-hook motif drastically reduced DNA binding (Fig. 3). Similarly, protein hydroxyl radical footprinting analysis of HMGA showed that DBD1, which is identical in sequence to TAF1 AT-hooks, but not surrounding residues, contacts INF-β DNA (35). In contrast, residues surrounding DBD2 and DBD3 substantially contribute to DNA binding. Thus, conserved residues surrounding TAF1 AT-hook 12a are likely to serve functions independent of DNA binding.
The strong requirement for Lys and Arg residues flanking the Arg-Gly-Arg-Pro core was unexpected given the frequency of Gly and Pro residues at these positions in predicted AT-hook motifs (24). Also unexpected was the relatively minor requirement for Gly and Pro residues within the Arg-Gly-Arg-Pro core sequence. Proteins containing Ala substitutions at these positions retained substantial DNA binding activity. In the NMR structure of HMGA DBD2 and DBD3 bound to a fragment of the INF-β enhancer, the two AT-hooks made similar contacts with DNA (26). The Arg-Gly-Arg portion of the Arg-Gly-Arg-Pro core sequence formed a concave surface that inserted into the minor groove of A-T tracts, the Pro directed the peptide backbone away from the minor groove, and flanking Arg and Lys residues made extensive hydrophobic and polar contacts with the DNA. Our findings indicate that the methyl side group of Ala, within the sequence Arg-Ala-Arg, is tolerated for interactions with the minor groove and that reorientation of flanking sequences by the Pro residue is not necessary for DNA binding.
These data suggest that the AT-hook motif, and more specifically the Gly-Arg-Pro, is more divergent than that described by Aravind and Landsman (24). In support of this proposal, DNA binding by Lens epithelium-derived growth factor 75 (LEDGF/p75) requires three AT-hooks (AARRGRKRK, SPKRGRPAA, and PKPRGRPKM, where the underlined sequence corresponds to the Gly-Arg-Pro sequence), the first of which contains a Lys instead of a Pro within the putative invariant Gly-Arg-Pro sequence (48). Furthermore, our searches of the Drosophila database have identified several proteins with potential AT-hooks that do not contain the Gly-Arg-Pro sequence.⁴ For example, CG2995, which encodes a chromatin-modifying enzyme, contains five potential AT-hooks within a 168-amino acid region (KRGRGRARK, EKKPGRKRK, VKRMRRSVR, PKKRGRKAK, and PKRSKRRIK). Thus, the AT-hook may be a more prevalent DNA-binding motif than previously predicted.
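The comparison of candidate AT-hooks can be sketched as a simple tabulation of two features discussed here: whether a 9-mer contains the canonical Gly-Arg-Pro tripeptide, and how many Lys or Arg residues it carries. The code below is only an illustration of that bookkeeping, using the peptide sequences quoted in the text; the labels, the function names, and the choice of these two features are assumptions, and the code makes no claim about which peptides actually bind DNA.

```python
# 9-mers quoted in the text. The hyphen in the printed form of VKRM-RRSVR is
# treated as a line-break artifact (assumption).
CANDIDATES = {
    "TAF1 AT-hook 12/12a": "KRGRGRPRK",
    "LEDGF/p75 hook 1": "AARRGRKRK",
    "LEDGF/p75 hook 2": "SPKRGRPAA",
    "LEDGF/p75 hook 3": "PKPRGRPKM",
    "CG2995 candidate 1": "KRGRGRARK",
    "CG2995 candidate 2": "EKKPGRKRK",
    "CG2995 candidate 3": "VKRMRRSVR",
    "CG2995 candidate 4": "PKKRGRKAK",
    "CG2995 candidate 5": "PKRSKRRIK",
}


def has_grp_core(peptide):
    """True if the peptide contains the canonical Gly-Arg-Pro tripeptide."""
    return "GRP" in peptide


def count_basic(peptide):
    """Number of Lys (K) and Arg (R) residues in the peptide."""
    return sum(residue in "KR" for residue in peptide)


for label, peptide in CANDIDATES.items():
    core = "GRP" if has_grp_core(peptide) else "no GRP"
    print(f"{label:22s} {peptide}  {core:7s} basic residues: {count_basic(peptide)}")
```

A tabulation like this makes it easy to see which candidates would be missed by a search that requires an invariant Gly-Arg-Pro, which is the point raised above about the motif definition.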
As expected based on experiments with other AT-hook proteins, EMSA and footprinting studies revealed that TAF1-2 AT-hooks bind DNA sequences containing A-T tracts (23,26,31,32,34,49). The trinucleotide sequence AAT may be a preferred binding site for TAF1-2 because four of the five regions protected by TAF1-2 in the footprinting analysis contained an AAT and mutation of AAT to AGT within the hsp70 TSS or INF-β probes severely reduced binding by TAF1-2 (Figs. 5 and 6). These data are consistent with the prediction, based on NMR analysis of HMGA, that AA(T/A)T is an optimal AT-hook-binding site and with other studies showing that AT-hooks bind an A-T tract of 3-5 bp (26,49).
In summary, experiments presented here suggest that modulation of the TAF1-DNA interaction, by regulated expression of TAF1 isoforms containing one versus two AT-hooks, contributes to gene-specific transcription. Further characterization of TAF1 should shed light on the relationship between TFIID-promoter affinity and the specificity and efficiency of transcription initiation. Additionally, detailed analysis of the determinants of the TAF1-DNA interaction and comparison with the determinants of the HMGA-DNA interaction suggests the existence of unrecognized AT-hook-containing proteins, elevating the importance of understanding the molecular consequences of AT-hook-DNA interactions in vivo.