Originally published In Press as doi:10.1074/jbc.M004294200 on June 23, 2000

J. Biol. Chem., Vol. 275, Issue 37, 28363-28370, September 15, 2000

DNA Binding Specificity Studies of Four ETS Proteins Support an Indirect Read-out Mechanism of Protein-DNA Recognition

Blair R. Szymczyna and Cheryl H. Arrowsmith

From the Ontario Cancer Institute and Department of Medical Biophysics, University of Toronto, Toronto, Ontario M5G 2M9, Canada

Received for publication, May 18, 2000

    ABSTRACT

Members of the ETS family of transcription factors are involved in several developmental and physiological processes, and, when overexpressed or misexpressed, can contribute to a variety of cancers. Each family member has a conserved DNA-binding domain that recognizes DNA sequences containing a G-G-A trinucleotide. Discrimination between potential ETS-binding sites appears to be governed by both the nucleotides flanking the G-G-A sequence and protein-protein interactions. We have used an adaptation of the "length-encoded multiplex" approach (Desjarlais, J. R., and Berg, J. M. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 11099-11103) to define DNA binding specificities for four ETS proteins: Fli-1, SAP-1, PU.1, and TEL. Our results support a model in which cooperative effects among neighboring bases flanking the central G-G-A site contribute to the formation of stable ETS/DNA complexes. These results are consistent with a mechanism for specific DNA binding that is partially governed by an indirect read-out of the DNA sequence, in which a sequence-specific DNA conformation is sensed or induced.

    INTRODUCTION

The ETS gene family encodes a group of more than 45 proteins, each with a highly conserved 85-amino acid DNA-binding domain initially mapped to v-ets-1, the member for which this group is named (1, 2). Members of the family identified to date share between 36 and 97% sequence identity with the Ets-1 DNA-binding domain and have been found in species ranging from lower invertebrates to humans. ETS family members can function as transcriptional activators or repressors and are involved in a wide range of tissue-specific developmental processes. In humans, they are involved in hematopoiesis (3-5), ossification (6, 7), myogenesis (8), and angiogenesis (9, 10). ETS proteins have also been implicated in several types of cancer and other human diseases (11). Because ETS proteins have overlapping DNA binding specificities and because their expression is often tissue type-specific, inappropriate expression or altered forms of a given ETS protein will likely activate genes that are normally not expressed. Thus, an understanding of the sequence specificity of ETS proteins is important for understanding the mechanism of deregulation in ETS-related cancers.

All ETS DNA-binding domains recognize a purine-rich G-G-A sequence, yet each family member has specificity for characteristic bases flanking this segment (2, 12-19). The recent solution and crystal structures of the ETS domains of Fli-1 (20), Ets-1 (21-23), PU.1 (24), GABPα/β (25), SAP-1 (26), and Elk-1 (27) have established that the ETS proteins constitute a subgroup within the super-family of "winged helix-turn-helix" DNA-binding proteins. Structures of these six domains complexed with DNA show that helix 3 lies in the major groove centered at the G-G-A recognition site. In the crystal structures of PU.1 (24), GABPα (25), SAP-1 (26), and Elk-1 (27), two conserved Arg residues within helix 3 make direct hydrogen bonds with the bases of the G-G-A motif. Importantly, the pattern of hydrogen bonds from these conserved arginines is not the same in the different high resolution crystal structures. This suggests that the ETS domain may have some degree of flexibility and diversity in its mode of interaction with DNA. Regions of the ETS domain flanking helix 3 interact with phosphates along the minor groove both upstream and downstream of the G-G-A element, further stabilizing the complex and bending the DNA around the protein. Variation in DNA bending from 11 to 28° for the SAP-1 and PU.1 complexes, respectively, and few unique direct contacts to the bases flanking G-G-A suggest a possible "indirect read-out" mechanism of DNA recognition, wherein the ETS domain recognizes a sequence-dependent structure that is either induced or already present in DNA. This model is in contrast to a "direct read-out" mechanism of DNA binding specificity, in which protein residues recognize and interact with unique base pairs within an ETS-binding site. In both mechanisms water molecules could potentially mediate contacts between the protein and the DNA.

To better understand the DNA-binding sequence specificity within the ETS family of transcription factors, we have investigated the specificity of four distantly related ETS domains (Fig. 1) under identical conditions using the same experimental technique. The ETS domains of Fli-1 and SAP-1 are the most similar of the proteins tested (62% identity) and differ considerably from TEL (44 and 40% identity, respectively) and PU.1 (36 and 41%, respectively). There is only 35% identity between the ETS domains of TEL and PU.1. Fli-1, SAP-1, and PU.1 were chosen because the three-dimensional structures of these three proteins bound to DNA have been determined experimentally (20, 24, 26), and there is a large body of literature on their DNA binding specificity. No known structural or specificity studies have been published for TEL. We have used an experimental approach, the length-encoded multiplex method (28), which can rapidly screen the DNA binding specificity of multiple proteins. Our data reveal subtle differences in DNA binding specificity among these four ETS proteins and support an indirect read-out mechanism of protein-DNA recognition. These results have important implications for understanding the role of misexpressed oncogenic ETS proteins in transcriptional regulation.

    EXPERIMENTAL PROCEDURES

Cloning and Expression of ETS Proteins-- Polymerase chain reaction-amplified DNA from murine Fli-1 (residues 231-373, Fli-B), human SAP-1 (residues 1-156, SAP-B), murine PU.1 (residues 106-271, PU.1-P), and human TEL (residues 330-452, TEL-E) (Fig. 1) was cloned into the pET-15b (29) expression vector using NdeI and BamHI restriction sites. The resulting plasmids expressed the recombinant ETS protein with an N-terminal hexahistidine tag followed by a thrombin cleavage site. Proteins were expressed in BL21(DE3) pLysS (Novagen) Escherichia coli cells and purified using Ni2+ affinity resin (Qiagen). The histidine tag was removed by thrombin cleavage. Aliquots (50 µl) of stock protein were stored in 10% glycerol, 1 mM phenylmethylsulfonyl fluoride, 10 mM dithiothreitol, and 1× Complete™ EDTA-free protease inhibitor mixture tablet solution (Roche Molecular Biochemicals) at -20 °C. The concentrations of protein stock solutions ranged from 0.1 to 0.5 mg/ml.


Fig. 1.   A, gene structure of several ETS family members showing functional domains. Black bars across the top of each protein indicate recombinant protein fragments used in the multiplex analysis. B-box, PEST, and Pointed domains are protein-protein interaction domains. B, alignment of ETS domains used in multiplex selection studies, relative to ETS1. Identical residues are represented by dashes, whereas gaps are represented by dots. The α-helices and β-sheets of the ETS domain are represented by bars under the protein sequences.

Synthesis and Preparation of DNA-- Desalted oligonucleotides for the multiplex and affinity studies were purchased from Life Technologies, Inc. Multiplex oligonucleotides each have a G-G-A ETS-binding sequence flanked by six randomized bases, except for a single position that is held constant. The identity and position of the defined base relative to the G-G-A triplet was length-encoded. As a negative control, the G-G-A triplet in one oligonucleotide was replaced with C-A-T, and all six flanking positions were randomized. The sequence of the negative control oligonucleotide was GTC TAT GCA TTG GCT CTT ATN NNC ATN NNG GTA CCA CTT TTG TGG TAC (48 bases, unextended). The other multiplex oligonucleotides were successively shorter by one or two bases (3' to 5') from the sequence indicated by bold type above (see Fig. 2). The initial +1/-1 groups of oligonucleotides (34-37 bases, unextended) proved to be too short for optimal binding by the ETS domains; the results were biased toward binding to the longest multiplex oligonucleotide, which encoded a thymidine. Consequently, results for the +1 and -1 positions were repeated using a longer set of oligonucleotides (39-42 bases, unextended).

Multiplex oligonucleotides were further purified using 20% polyacrylamide (1:19) denaturing gel (8 M urea) electrophoresis in 1× TBE (Tris-boric acid-EDTA) (30). Oligonucleotide bands were identified by soaking the gel in 1 µg/ml ethidium bromide solution for 15 min and exposing it to UV light, or by UV shadowing at 254 nm. Crushed gel pieces were incubated in 2 volumes of extraction buffer (500 mM ammonium acetate, 10 mM magnesium acetate, 1 mM EDTA, 0.1% SDS) overnight (14-16 h) and in a subsequent 1 volume for 2 h at 37 °C. Both washes were pooled, and DNA was concentrated into 20-40 µl of TE (Tris-EDTA) buffer using the QIAEX II gel extraction kit (150) (Qiagen). DNA concentration was measured spectrophotometrically (A260 nm).

Multiplex pools were made by combining equimolar amounts (2 × 10⁻¹¹ mol) of each oligonucleotide. The + and - pools contained oligonucleotides in which a base downstream and upstream of the G-G-A core was held constant, respectively. The altered +1 and -1 groups were run independently. The oligonucleotides in the multiplex pools were 5' end-labeled with [γ-32P]ATP (Amersham Pharmacia Biotech) and 1 unit of T4 polynucleotide kinase (Life Technologies, Inc.) by incubating at 37 °C for 30 min. Excess radiolabeled ATP was removed from the multiplex oligonucleotide solution using NucTrap® Probe Purification Columns (Stratagene). The oligonucleotide mixture was heated to 95 °C for 5 min and allowed to slowly cool to room temperature. Oligonucleotides were subsequently treated with DNA polymerase I (Klenow) (New England Biolabs) and an equimolar mixture of dNTPs to complete the hairpins. Hairpins were reannealed by heating to 95 °C for 5 min and slowly cooling.

The following oligonucleotides and their complementary strands were purchased from Life Technologies, Inc. for quantitative binding studies: consensus, GCA AAA CCG GAA GTG AGG C; +1 mutant, GCA AAA CCG GAT GTG AGG C; +2 mutant, GCA AAA CCG GAA CTG AGG C; -1 mutant, GCA AAA CGG GAA GTG AGG C; -2 mutant, GCA AAA GCG GAA GTG AGG C; and -1-2 mutant, GCA AAA GGG GAA GTG AGG C. After annealing, the double-stranded oligonucleotides were purified using nondenaturing 20% polyacrylamide (1:19) gel electrophoresis with 0.25× TBE buffer. The double-stranded oligonucleotides were extracted and quantified as described above. After 5' end labeling with 32P (as above), the oligonucleotides were reannealed in TE buffer with 0.3 M NaCl by heating to 95 °C for 5 min and slowly cooling to room temperature.

Multiplex Binding Assays-- The ETS protein of interest was incubated with 1-2 µg of poly(dI-dC) in 15 µl of binding buffer (20 mM HEPES-KOH (pH 7.9), 60 mM KCl, 0.2 mM EDTA, 6 mM MgCl2, 1 mM dithiothreitol, 10% (v/v) glycerol) for 15 min at room temperature (SAP-B and PU.1-P) or 4 °C (Fli-B and TEL-E), followed by the addition of approximately 50 ng of the multiplex oligonucleotide pool. SAP-B and PU.1-P were then incubated for 20 min at room temperature, whereas Fli-B and TEL-E were incubated at 4 °C for 1 h and then 20 min at room temperature. The final concentration of proteins used resulted in less than 10% of the oligonucleotides shifting: Fli-B, 10-40 ng/µl; SAP-B, 0.3-3 ng/µl; PU.1-P, 2-20 ng/µl; TEL-E 70-100 ng/µl. The protein-DNA mixture was partitioned on a 5% polyacrylamide (1:19, 16 cm) native gel (0.25× TBE) electrophoresed at 200 V for 1.5-2 h at room temperature using 0.25× TBE running buffer.

Protein-DNA complexes and free DNA were visualized by autoradiography and excised from the gel. DNA was eluted as described above and subsequently concentrated to 10 µl using Microcon-3 Microconcentrators (Amicon). 5 µl of the mixture was added to 2 µl of denaturing loading buffer (0.25% bromphenol blue, 0.25% xylene cyanol, 20% glycerol, 80% formamide) and heated to 95 °C for 5 min. Using 1× TE running buffer, samples were run on a 12% denaturing polyacrylamide (1:29) sequencing gel (1× TBE, 5 M urea, 50% (v/v) deionized formamide). Formamide was deionized by stirring with 2.6 g of AG® 501-X8 (D) Resin (Bio-Rad) for 3 h. For comparison, the original multiplex pool was run as a standard. Gels were electrophoresed in 1× TBE for 10-13 h at 40 W. Gels were washed in DNA fixing solution (10% ethanol, 10% acetic acid) to remove urea and dried for 2 h at 80 °C. Dried gels were imaged using a STORM® PhosphorImager (Molecular Dynamics) and analyzed with ImageQuant® V1.11 (Molecular Dynamics). Band intensities were determined by measuring the peak heights off a plot of counts versus position.

Relative fractional saturation values were calculated by normalizing band intensities to the corresponding band intensity in the original multiplex pool. The normalized base propensity of each base was calculated by dividing the relative fractional saturation value of the base by the sum of all four relative fractional saturation values for the corresponding position. The change in free energy associated with changing base Y to base X at position i was calculated from the propensity values using the equation $\Delta\Delta G_{X}^{i} = -RT\ln(f_X/f_Y)$ (28), where R is the universal gas constant, T is the temperature (K), $f_X$ is the propensity of base X, and $f_Y$ is the propensity of reference base Y. Results were compiled from 3-15 independent selection studies.
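
The propensity and free energy calculations described above can be sketched as follows. This is a minimal illustration rather than the analysis procedure actually used: the band intensities are hypothetical, the temperature of 298.15 K is an assumed stand-in for room temperature, and the function names are illustrative only.

```python
import math

R = 1.987e-3  # universal gas constant in kcal/(mol*K)

def base_propensities(bound, pool):
    """Return normalized base propensities for one position.

    bound, pool: band intensities keyed by base for the protein-bound
    fraction and the original (unselected) multiplex pool.
    """
    # Relative fractional saturation: bound intensity / pool intensity.
    rel = {base: bound[base] / pool[base] for base in bound}
    total = sum(rel.values())
    # Propensity: each value divided by the sum over all four bases.
    return {base: value / total for base, value in rel.items()}

def ddg(propensities, base, reference, temperature=298.15):
    """Free energy change for replacing `reference` with `base`:
    ddG = -RT ln(f_X / f_Y). The temperature default is an assumption."""
    ratio = propensities[base] / propensities[reference]
    return -R * temperature * math.log(ratio)

# Hypothetical band intensities (arbitrary units), not data from this study.
bound = {"A": 400.0, "C": 100.0, "G": 200.0, "T": 100.0}
pool = {"A": 100.0, "C": 100.0, "G": 100.0, "T": 100.0}

f = base_propensities(bound, pool)
ddg_a_to_c = ddg(f, "C", "A")  # positive: C is less favored than A here
print(f, ddg_a_to_c)
```

With this sign convention, a base that is less favored than the reference base gives a positive free energy change.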

Quantitative Protein-DNA Binding Assays-- 10 nM of purified, 32P-labeled oligonucleotide was incubated in binding buffer with protein at concentrations ranging from 0.1 nM to 10 µM, as described above. Samples were loaded on a 5% polyacrylamide gel (1:19, 16 cm) and electrophoresed at 200 V for 1-1.5 h at room temperature. Gels were subsequently dried and imaged on a PhosphorImager. Band intensity was determined by volume integration. Binding isotherms were generated by plotting fraction bound as a function of protein concentration. Dissociation constants were calculated by fitting the isotherms to the following equation.


\[
\Phi = \frac{1 + \dfrac{[\mathrm{P}]}{K_D} + \dfrac{[\mathrm{D}]}{K_D} - \sqrt{\left(1 + \dfrac{[\mathrm{P}]}{K_D} + \dfrac{[\mathrm{D}]}{K_D}\right)^{2} - \dfrac{4[\mathrm{P}][\mathrm{D}]}{K_D^{2}}}}{2\left(\dfrac{[\mathrm{D}]}{K_D}\right)} \qquad \text{(Eq. 1)}
\]
where $\Phi$ is the fraction of DNA bound, [P] and [D] are the total molar concentrations of active protein and double-stranded oligonucleotide, respectively, and $K_D$ is the dissociation constant. $K_D$ values were obtained from 3-6 independent experiments for each protein/oligonucleotide combination. Dissociation constants were converted to the free energy of complex formation, $\Delta G$, using $\Delta G = -RT\ln(1/K_D)$, where R is the universal gas constant and T is the reaction temperature (K). The predicted free energy change upon mutating one base pair is calculated by subtracting corresponding $\Delta G$ values.
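
As an illustration of Eq. 1 and the conversion to free energy, the sketch below evaluates the binding isotherm and recovers a dissociation constant from synthetic data. It rests on stated assumptions: the fitting software used in the study is not specified, so a simple least-squares scan over a log-spaced grid stands in for nonlinear regression; the "true" dissociation constant of 5 nM, the temperature of 298.15 K, and all function names are hypothetical. Only the 10 nM oligonucleotide concentration and the 0.1 nM to 10 µM protein range are taken from the text.

```python
import math

R = 1.987e-3  # universal gas constant in kcal/(mol*K)

def fraction_bound(p_total, d_total, kd):
    """Eq. 1: fraction of DNA bound, given total protein and total DNA
    concentrations (all three arguments in the same molar units)."""
    p = p_total / kd
    d = d_total / kd
    s = 1.0 + p + d
    return (s - math.sqrt(s * s - 4.0 * p * d)) / (2.0 * d)

def delta_g(kd_molar, temperature=298.15):
    """dG = -RT ln(1/K_D), with K_D in mol/L. Temperature is assumed."""
    return -R * temperature * math.log(1.0 / kd_molar)

def fit_kd(protein_concs, fractions, d_total, lo=1e-11, hi=1e-4, steps=2001):
    """Crude least-squares scan over log-spaced K_D values.
    A stand-in for proper nonlinear regression, for illustration only."""
    best_kd, best_err = None, float("inf")
    for i in range(steps):
        kd = lo * (hi / lo) ** (i / (steps - 1))
        err = sum((fraction_bound(p, d_total, kd) - y) ** 2
                  for p, y in zip(protein_concs, fractions))
        if err < best_err:
            best_kd, best_err = kd, err
    return best_kd

# Synthetic, noise-free isotherm generated from a hypothetical K_D of 5 nM.
d_total = 10e-9                                      # 10 nM oligonucleotide
concs = [1e-10 * 10 ** (k / 2) for k in range(11)]   # 0.1 nM to 10 uM
true_kd = 5e-9
observed = [fraction_bound(p, d_total, true_kd) for p in concs]

kd_fit = fit_kd(concs, observed, d_total)
print(kd_fit, delta_g(kd_fit))
```

Because Eq. 1 uses total rather than free concentrations, it remains valid when the oligonucleotide concentration is comparable to the dissociation constant, which is the situation in this synthetic example.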

    RESULTS

Diverse Ets Domains Have Similar but Distinct Consensus Binding Sites-- We adapted the use of a "length-encoded multiplex" of oligonucleotides (28) for assessing the relative affinity of ETS proteins for each of six base positions, three upstream and three downstream of the G-G-A core ETS recognition element. Each of the six sites was randomized except for one. The position of the non-random base relative to the G-G-A (-3, -2, -1, +1, +2, or +3) and its identity (A, C, G, or T) were length-encoded (Fig. 2). A hairpin loop was designed at the 3' end to serve as a self-primer for DNA extension using DNA polymerase I (Klenow fragment). This ensures the oligonucleotide is double-stranded and fully complementary within the randomized region. The ETS proteins of interest were individually incubated with the pools of length-encoded oligonucleotides, and the fraction bound was isolated electrophoretically. Protein-bound oligonucleotides were subsequently extracted and run on a highly denaturing polyacrylamide gel. The relative amount of each set of length-encoded oligonucleotides was read from the ladder of bands (Fig. 3). Provided less than 10% of the total multiplex pool was gel shifted, the intensity pattern of the protein-bound oligonucleotides remained the same regardless of protein concentration (χ² test, p < 0.05). This intensity pattern was significantly different from the unselected pool (χ² test, p < 0.001). Intensity patterns of the original multiplex pool before and after extraction from the native gel were similar (χ² test, p < 0.05).


Fig. 2.   The length-encoded multiplex pool for the +1, +2, and +3 variable base positions with the ETS binding site shown as XXX GGA XXX, where X is a random mixture of all four bases. The dotted line represents the region that is extended enzymatically. The sequence of the longest oligonucleotide is GTC TAT GCA TTG GCT CTT ATX XXC ATX XXG GTA CCA CTT TTG TGG TAC CXX XAT GXX XAT AAG ACG CAA TGC ATA GAC. Underlined bases are represented by the solid line in the figure, whereas bases in bold type form the complementary strand of the hairpin. Each extended hairpin is shorter by one base pair upstream of the G-G-A sequence (see "Experimental Procedures"). An analogous set of oligonucleotides was prepared for the -1, -2 and -3 positions.


Fig. 3.   Representative results for Fli-1. A, a representative electrophoretic mobility shift assay used for isolating bound oligonucleotides (upper bands) from the unbound multiplex pool (lower band). B and C, the intensity patterns for Fli-B-bound (lanes 2-4 and 6) and unbound (lanes 1 and 5) oligonucleotides from the +1, +2, +3, -1, -2 and -3 base pair multiplex pools, respectively. Lanes 5 and 6 correspond to the longer +1/-1 oligonucleotides required for analyzing these positions (see "Experimental Procedures").

In initial experiments, the groups of oligonucleotides defining the +1 and -1 positions suggested thymidine is preferred over an adenosine at position +1, and cytidine is greatly favored at -1 for each ETS protein studied (Fig. 3). To discern whether the results for the +1/-1 positions were influenced by the shorter lengths of these oligonucleotides, their encoded lengths were increased to those of the +2/-2 oligonucleotides, and run separately (Fig. 3, lanes 5 and 6). Propensity results from the new set of oligonucleotides indicated that an adenosine is preferred to a thymidine at position +1. This suggests the length of the original oligonucleotide (25 base pairs) that encodes adenosine at the +1 position may have been too short for optimal binding. On the other hand, the propensity pattern at the -1 position was not significantly altered by changing the oligonucleotide lengths.

The C-A-T oligonucleotide (G-G-A core replaced by C-A-T) serves as a negative control. Although the C-A-T band was always observed in the bound pools, its normalized intensity was always similar to the band intensities of weakly bound oligonucleotides. Bands corresponding to the 37-base pair C-A-T oligonucleotide from the initial experiments and its 32-base pair analog in the -1/+1 multiplex experiments are also similar in relative intensity (data not shown). Because binding of the C-A-T oligonucleotides does not appear to be length-dependent, it likely does not arise from nonspecific binding but from the presence of low affinity G-G-A sites within the randomized regions.

Band intensities in the ETS-bound oligonucleotide patterns were converted to normalized base propensities (Fig. 4), and a consensus sequence was derived for each ETS construct (Fig. 5). Positions -1, +1, +2 and +3 typically have strong base preferences, with all proteins preferring an adenosine at position +1 and all except TEL-E preferring cytosine at the -1 position. A comparison of the 95% confidence limits for the base propensities indicates that no two selection patterns are alike.


Fig. 4.   Results of multiplex DNA binding experiments. Band intensities from the protein-bound fractions are normalized to those of the corresponding bands in the original multiplex pool. Data are presented as a histogram of base propensities. Error bars represent 95% confidence intervals.


Fig. 5.   Comparison of consensus sequences derived from multiplex, SELEX and promoter alignment results. For multiplex results, bases with propensities greater than 25% are indicated with the most prevalent base indicated in capital letters. Capital letters indicate those bases that are most favored at a site (>80% of the frequency of the most favored base, left-most letter). Lowercase letters indicate a frequency less than 80% of the most favored base. Bases that have statistically similar propensities according to the Student-Newman-Keuls test (p < 0.05) are underlined. SELEX derived consensus sequences for Fli-1 (16), SAP-1 (19), and PU.1 (18) are presented using similar criteria. Unlike the published SELEX sequences, the SELEX sequences presented here include bases from the primer sequences of the SELEX oligonucleotides. For PU.1, a consensus sequence was also derived from the in vivo promoter sequences reported in the literature (Supplementary Table IV).
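
The notation rules in the Fig. 5 legend can be sketched in code. This is one possible reading of the legend, offered as an assumption rather than the procedure used to prepare the figure: bases with a propensity above 25% are listed in order of decreasing propensity, in capitals if they exceed 80% of the most favored base and in lowercase otherwise. The function name, the fallback symbol "N", and the example propensities are hypothetical, and the statistical underlining is not modeled.

```python
def consensus_symbol(propensities, include_above=0.25, capital_ratio=0.80):
    """Build the consensus notation for one position from base propensities.

    propensities: mapping of base -> normalized propensity (sums to 1).
    Returns bases ordered from most to least favored.
    """
    ranked = sorted(propensities.items(), key=lambda kv: kv[1], reverse=True)
    top = ranked[0][1]
    symbols = []
    for base, value in ranked:
        if value <= include_above:
            continue  # not enriched above the 25% cutoff
        if value > capital_ratio * top:
            symbols.append(base.upper())   # among the most favored bases
        else:
            symbols.append(base.lower())   # enriched, but clearly less favored
    # "N" for a position with no enriched base is an assumption made here.
    return "".join(symbols) or "N"

# Hypothetical propensities, not values from Fig. 4.
two_strong = consensus_symbol({"A": 0.40, "G": 0.35, "C": 0.15, "T": 0.10})
one_strong = consensus_symbol({"C": 0.45, "A": 0.30, "G": 0.15, "T": 0.10})
no_choice = consensus_symbol({"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25})
print(two_strong, one_strong, no_choice)
```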

Subtle Differences between Published and Multiplex-derived Consensus Sequences-- To evaluate how well the multiplex consensus sequences correspond to those derived from other methods, we compared our data to ETS-binding sites identified in other studies. Results from "SELEX-like" studies for Fli-1, SAP-1, and PU.1 are presented in Fig. 5. SELEX experiments involve the isolation of individual high affinity sequences from a randomized pool of oligonucleotides (31). To a first approximation, both methods agree with one another, with the multiplex consensus sequences having more redundancy at many positions. However, subtle differences between the two methods are observed for preferred bases at positions -3, +2 and +3. These discrepancies are likely due to the different experimental approaches and may reveal important features of ETS-DNA recognition. Whereas SELEX experiments detect individual DNA sequences with high affinities, multiplex analysis measures the contribution of a single base to complex stabilization without knowledge of the sequence context. A consensus sequence derived from the alignment of naturally occurring PU.1 target sequences is also presented in Fig. 5. Promoter regions that have been shown to bind PU.1 in vivo were aligned using the core PU.1 binding element, G-G-A or A-G-A, as a reference. The resulting consensus promoter sequence differs from the consensus sequences derived from both multiplex and SELEX data at the -1 position.

Quantitative Binding Data Confirm the Multiplex Results-- To rule out the possibility that consensus sequence differences are due to experimental errors, we measured dissociation constants for Fli-B, SAP-B, and PU.1-P with six ETS binding sites using quantitative electrophoretic mobility shift assays (Fig. 6, Table I, and Supplementary Table III). The control oligonucleotide contains a high affinity sequence, ACC GGA AGT, whereas the five other oligonucleotides differ from this sequence by only one or two bases. A qualitative comparison of KD values shows good agreement with the multiplex-derived base preferences, except for the PU.1-P/-2 mutant complex (Table I). This KD value also disagrees with the SELEX consensus for PU.1. Interestingly, the KD measurement for SAP-B binding to the +2 mutant agrees with the multiplex consensus sequence (G or C equally preferred) as opposed to the SELEX result (G much preferred over C). Binding data from Brown et al. (13, 32) also support this preference. In vivo evidence is provided by a recent report of PU.1 binding to various mutant ETS-binding sites in the promoter regions of p47 and CD18 using a luciferase reporter gene to detect binding (15). The double mutation in the CD18 binding site yields a smaller percentage decrease in luciferase activity than each of the analogous single mutations in the p47 binding sequence. These data confirm many of the multiplex-derived base propensities and suggest that the differences observed between multiplex and SELEX data reflect real differences between the experimental approaches (see "Discussion").


Fig. 6.   Quantitative electrophoretic mobility shift assay results for SAP-B. A, representative electrophoretic mobility shift assay used in generating binding isotherms. Increasing protein concentration results in a slow moving band corresponding to the SAP-B/-1 mutant oligonucleotide complex. B, binding isotherms for SAP-B/oligonucleotide complexes. The curves represent the fitted isotherms, whereas data points represent the average fraction of oligonucleotide bound for a given concentration of SAP-B. ▲, control; ○, +2 mutant; ■, +1 mutant; △, -2 mutant; , -1 mutant; , -1-2 mutant. For the sake of clarity, error bars are not included. Considering the standard deviation distribution associated with the data, the median is 0.03 (5th percentile, 0.002; 95th percentile, 0.07).

                              
Table I
Comparison between multiplex-predicted and measured changes in free energies for Fli-B, SAP-B, and PU.1-P, relative to a high affinity oligonucleotide
Dissociation constants determined using mobility shift experiments for the indicated protein-DNA complex are also included.

Multiplex-predicted Free Energies Do Not Correspond to Experimental Values-- One advantage of the multiplex approach is its potential to quantitatively predict the affinity of a protein for a given binding site (28). This is possible only if each base contributes independently to complex formation. In this situation, the changes in free energy associated with the altered bases are additive, and the total change in free energy is the sum over all bases. By comparing predicted and measured free energies for specific DNA sequences, it should be possible to distinguish between a direct read-out mechanism of DNA binding and a mechanism in which combinations of bases cooperate in recognition by specific ETS proteins. When we compared the multiplex-predicted and experimental ΔΔG values, we found less than 50% agreement (Table I). This inability of the multiplex assay to quantitatively predict the change in free energy associated with a point mutation suggests that individual bases do not independently contribute to complex stability.
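
The additivity assumption tested here can be made concrete with a short sketch. Under independent contributions, the predicted free energy change for a site is the sum of the per-position terms -RT ln(f_X/f_Y), and it can be compared with the measured value RT ln(K_D,mutant/K_D,reference), which follows from ΔG = -RT ln(1/K_D). All propensities and dissociation constants below are hypothetical, the temperature is assumed, and the function names are illustrative.

```python
import math

R = 1.987e-3   # universal gas constant in kcal/(mol*K)
T = 298.15     # assumed temperature (K)

def predicted_ddg(propensity_by_position, reference_site, mutant_site):
    """Sum the per-position ddG terms for every position that differs,
    i.e. the prediction under fully independent (additive) contributions."""
    total = 0.0
    for pos, table in propensity_by_position.items():
        ref_base, mut_base = reference_site[pos], mutant_site[pos]
        if ref_base != mut_base:
            total += -R * T * math.log(table[mut_base] / table[ref_base])
    return total

def measured_ddg(kd_mutant, kd_reference):
    """ddG from dissociation constants: dG_mutant - dG_reference."""
    return R * T * math.log(kd_mutant / kd_reference)

# Hypothetical propensities for positions -2 and -1 (not data from Fig. 4).
propensities = {
    -2: {"A": 0.2, "C": 0.4, "G": 0.2, "T": 0.2},
    -1: {"A": 0.1, "C": 0.6, "G": 0.2, "T": 0.1},
}
reference = {-2: "C", -1: "C"}   # as in the ACC GGA AGT control site
double_mutant = {-2: "G", -1: "G"}

pred = predicted_ddg(propensities, reference, double_mutant)
meas = measured_ddg(kd_mutant=2e-8, kd_reference=1e-9)  # hypothetical K_D values
print(pred, meas, meas - pred)
```

A large difference between the two values for a double mutant, relative to experimental error, is the kind of evidence that would argue against independent base contributions.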

Comparison of High Affinity ETS-binding Sites Suggests Cooperative Sequence Effects-- The comparisons described above suggest that the sequence context at each base position may be important in ETS domain-DNA interactions. To investigate this possibility we compared the series of high affinity binding sites identified for PU.1 and Fli-1 using the SELEX method (16, 18). Table II shows the frequency of bases at the -3 and -1 positions for PU.1-selected oligonucleotides that have either C or G at the -2 site. The two-tailed Fisher exact test confirms that the base composition at positions -1 and -3 is interdependent with the cytosine or guanosine base at -2. A cytosine at -2 is most often flanked by adenosine bases at both the -3 (p = 0.001) and -1 (p = 0.004) positions. On the other hand, guanosine at position -2 is most often flanked by a thymidine at -3 (p = 0.001) and a cytosine (p = 0.004) or guanosine (p = 0.04) at the -1 position. A highly significant coupling is also observed for Fli-1 at the +2 and +3 positions, where the AC and GT base combinations are favored (Table II; p = 0.005).
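
The two-tailed Fisher exact test used for this comparison can be sketched for a 2 × 2 table of base counts. The implementation below uses only the standard library and sums the hypergeometric probabilities of all tables that are no more likely than the observed one, which is one common definition of the two-tailed p value; whether the original analysis used this exact definition is not stated. The counts are hypothetical and are not the values in Table II.

```python
from math import comb

def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher exact p value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins whose probability does not exceed that of the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical counts: rows are sites with C or G at -2,
# columns are sites with A or another base at -3.
p_value = fisher_exact_two_tailed(9, 1, 2, 8)
print(p_value)
```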

                              
Table II
Cooperativity of bases in high affinity PU.1 and Fli-1 binding sequences
The frequency of bases at the -1 and -3 positions in PU.1 selected oligonucleotides is influenced by a C or G at the -2 position (18). Similarly, there is a correlation between the base frequencies at the +2 and +3 positions of oligonucleotides selected by Fli-1 (16). Bases from the primer sequences of the SELEX oligonucleotides were included in the data.


    DISCUSSION

We have carried out a detailed comparative analysis of the DNA binding specificity of four ETS domain proteins using several experimental techniques. We find that the apparent DNA binding specificity can depend on the experimental approach. Most specificity studies of ETS proteins to date have used various forms of SELEX experiments to identify individual high affinity oligonucleotides, which are subsequently aligned to determine a consensus binding sequence for a given protein (12-14, 16, 18, 19, 33-37). This technique does not reveal how and to what extent each base contributes to the stability of a protein-DNA complex. We have used an alternative technique, the multiplex method, which addresses this issue.

The length-encoded multiplex approach was introduced in 1994 to study the DNA binding specificity of zinc finger proteins (28). The multiplex approach is particularly useful for rapidly screening the sequence preference of multiple proteins. Furthermore, if each base in the sequence contributes independently to complex formation, the change in affinity associated with mutating a given site may be directly predicted. In the original multiplex study, a small trinucleotide region of a zinc finger binding site was tested (28). The results indicated that each nucleotide contributed independently to complex formation (r = 0.83). However, the multiplex results presented here for the ETS proteins yield no such correlation (r = 0.45), suggesting synergistic contributions from bases in the G-G-A flanking regions. Furthermore, the multiplex method was able to identify bases that contribute to high affinity sites that the SELEX methods did not detect. Two examples are the cytidine at the +2 position and a guanosine at the -3 position for SAP-B binding sequences.

Clues that G-G-A flanking regions synergistically contribute to ETS domain-DNA interactions have been reported. These include disagreement between SELEX-derived consensus sequences and direct affinity measurements (13, 15, 19, 32) and competitive binding between two different ETS proteins (33). As we have shown for PU.1 and Fli-1 (Table II), synergism within specific sequence combinations may also be detected by alignment of SELEX-derived oligonucleotides, provided enough sequences are available. Together with the multiplex data presented here, these observations suggest that a primary component of DNA recognition for ETS proteins is an indirect read-out mechanism in which proteins recognize sequence-dependent structural features.
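One way to look for such coupling in a set of aligned SELEX-derived oligonucleotides is to tabulate the bases at one position only among sequences that carry a particular base at another position. The sketch below assumes the sequences are already aligned and of equal length; the function name and the example sequences are hypothetical and are not the data of Table II. String indices are 0-based and do not follow the -3 to +3 numbering used in the text.

```python
from collections import Counter

def conditional_base_counts(aligned_sites, given_index, given_base, target_index):
    """Count the bases found at `target_index`, restricted to the sites that
    carry `given_base` at `given_index`."""
    return Counter(
        site[target_index]
        for site in aligned_sites
        if site[given_index] == given_base
    )

# Hypothetical aligned flanking sequences (placeholders, not Table II data).
example_sites = ["CAG", "CAG", "CAT", "GTG", "GTA", "GTG"]

# Bases at index 1 among sites with C at index 0, and among sites with G.
with_c = conditional_base_counts(example_sites, 0, "C", 1)
with_g = conditional_base_counts(example_sites, 0, "G", 1)
```

If the two conditional distributions differ markedly, as they do in this contrived example, the positions are not being selected independently. With few sequences such differences can arise by chance, which is why a sufficiently large set of aligned sequences is needed.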

Structural Determinants in ETS Protein-DNA Binding-- Structural studies lend support to the concept of indirect read-out for ETS domains. The structures of six ETS protein-DNA complexes, determined by x-ray crystallography and NMR, provide insight into how ETS proteins recognize their DNA targets (20, 22-27). These complexes have three main features. First, the central G-G-A sequence makes many direct contacts with highly conserved residues in helix 3 of the ETS domain, but the nature and orientation of these contacts depend on the protein studied. Solution NMR studies of ETS-1 and Fli-1 confirm that the conserved Arg residues that interact with G-G-A do not have a single defined conformation when bound to DNA (20, 23). Instead they appear to be in equilibrium between several conformational states. Second, interactions between the bases flanking the G-G-A core and both conserved and nonconserved residues are few and are often mediated by a water molecule. Third, conserved and nonconserved protein residues, mostly within the wings of the winged helix-turn-helix motif, mediate direct contacts with 5-7 backbone phosphates within the minor grooves flanking the G-G-A core. The pattern and number of phosphate contacts vary from one protein complex to another (Fig. 7).


Fig. 7.   Patterns of direct amino acid contacts with backbone phosphate groups for GABPα (25), SAP-1 (26), and PU.1 (24) bound to DNA. The numbering of protein residues is based on the alignment in Fig. 1B.

From comparison of these structures and biochemical studies, several proposals have been put forward to explain sequence specificity on the basis of specific base pair-amino acid interactions. However, these predictions rarely stand up to all the available evidence. For example, based on a comparison of the crystal structures, it has been suggested that the base preferred at position +1 depends on the residues corresponding to Lys-62, Ala-66, and Tyr-69 of SAP-1 (26) (Fig. 1). Fli-1 and SAP-1 have identical residues at these positions, yet Fli-1 strongly prefers an adenosine at the +1 position, whereas SAP-1 can also tolerate a thymidine. Similarly, for the +2 position it was hypothesized that Tyr-69 of SAP-1 recognizes a guanosine or an adenosine by forming van der Waals interactions with the cytidine or thymidine on the opposite strand. Although SELEX data for SAP-1 support this hypothesis, it is contradicted by multiplex and oligonucleotide-binding data, which suggest that a cytidine is also preferred at this site. These inconsistencies imply that direct and water-mediated interactions between protein residues and DNA bases do not fully account for ETS protein-DNA binding specificity.

The variable pattern of electrostatic contacts between ETS domains and the DNA phosphate backbone further supports an indirect read-out mechanism (38). The total footprint of ETS domains on the DNA covers approximately 13 base pairs (39), with the protein-phosphate contacts occurring along the two minor grooves that flank the central G-G-A major groove. The electrostatic interactions at the outer edges of the ETS domain footprint require slight bending of the DNA around the protein. Bending angles reported from crystal structures were between 11 and 28°. Because the propensity for DNA to bend is sequence-dependent (40), it is likely that the right fit between protein and DNA will require a specific sequence upstream and downstream of the G-G-A, one that may not necessarily be recognized by direct interactions between protein side chains and DNA bases.

Promoter Sites Do Not Always Correspond to Consensus Sites-- One goal of investigating the consensus sequence of a protein is to identify regions of DNA in the genome that interact with a given protein. Compiled from the literature, Supplementary Table IV lists confirmed promoter sequences bound by PU.1. From an alignment of these sequences, we tabulated the frequency of the bases at each position in a fashion similar to the multiplex and SELEX studies (Fig. 5). The consensus sequence according to promoter alignment is At G Ag GGA A G T, which does not agree with multiplex or SELEX results. A similar inconsistency with promoter alignment was observed for Elf-1, another ETS protein (14).
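The tabulation of base frequencies from an alignment can be sketched as follows. This is a minimal illustration that assumes the sites are already aligned and of equal length. The 0.6 threshold for writing a base in upper case and the example sequences are hypothetical; they are not the convention or the promoter sequences used for Fig. 5 or Supplementary Table IV.

```python
from collections import Counter

BASES = "ACGT"

def base_frequencies(aligned_sites):
    """Return one dict per alignment column giving the fraction of each base."""
    length = len(aligned_sites[0])
    if any(len(site) != length for site in aligned_sites):
        raise ValueError("all sites must have the same length")
    table = []
    for column in range(length):
        counts = Counter(site[column] for site in aligned_sites)
        total = sum(counts.values())
        table.append({base: counts.get(base, 0) / total for base in BASES})
    return table

def consensus(table, strong=0.6):
    """Most frequent base per column: upper case if its frequency is at least
    `strong`, lower case otherwise. Ties resolve in A, C, G, T order."""
    letters = []
    for column in table:
        best = max(BASES, key=lambda base: column[base])
        letters.append(best if column[best] >= strong else best.lower())
    return "".join(letters)

# Hypothetical aligned sites (placeholders, not sequences from this study).
example_sites = ["AGGAA", "AGGAT", "CGGAA", "AGGAC"]
example_table = base_frequencies(example_sites)
example_consensus = consensus(example_table)
```

A consensus string discards most of the information in the frequency table, so when selection data and promoter alignments are compared it is usually more informative to compare the tables column by column.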

The discrepancy between selection studies and promoter alignment data may arise from two factors. First, it is possible that very high affinity sites are undesirable in the cell. Perhaps the proper expression of some ETS-dependent genes depends on a high concentration of the ETS protein for induction. Because the high affinity sites of many ETS proteins appear to be similar, differentiation between target genes may require discrimination between lower affinity binding sites in addition to tissue-specific expression of the proteins. For the proteins studied here, mutating apparently crucial bases in the regions flanking the G-G-A motif results in only a modest decrease in affinity. The largest measurable change is the 50-fold decrease in affinity for SAP-1 caused by mutating the bases at the -1 and -2 positions in the ETS binding site, whereas most of the observed and predicted changes involve less than a 5-fold change in affinity. Binding studies with ETS-1 yielded similar affinities and affinity changes (34, 35). A second major factor in ETS-DNA recognition is cooperative protein-protein interaction between ETS proteins and accessory transcription factors at adjacent DNA sites (1). Most ETS proteins identified to date participate in such interactions, which can alter either the affinity or the specificity of the ETS domain for a composite DNA site. Thus, it is not surprising that natural promoter sites do not correspond to high affinity consensus sites derived from in vitro studies of isolated ETS proteins. This fact necessarily complicates the use of high affinity consensus sites for identifying biologically relevant promoter sites. On the other hand, knowledge of high affinity consensus ETS sites should be useful for identifying sequences that may be inappropriately activated by misexpressed oncogenic ETS proteins.
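To put the 50-fold and 5-fold affinity changes mentioned above on an energy scale, a fold change in affinity can be converted to a difference in binding free energy with the standard relation ΔΔG = RT ln(fold change). The sketch below assumes a temperature of 298.15 K, which is an illustrative choice and not a condition reported in this section; the function name is hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1

def ddg_from_fold_change(fold_change, temperature_k=298.15):
    """Free energy difference (kcal/mol) corresponding to a fold change in
    affinity, using ddG = R * T * ln(fold_change)."""
    if fold_change <= 0:
        raise ValueError("fold change must be positive")
    return GAS_CONSTANT_KCAL * temperature_k * math.log(fold_change)

# Fold changes quoted in the text; the temperature is an assumption.
ddg_50_fold = ddg_from_fold_change(50)
ddg_5_fold = ddg_from_fold_change(5)
```

At the assumed temperature, a 50-fold change corresponds to roughly 2.3 kcal/mol and a 5-fold change to a little under 1 kcal/mol.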

Consensus Sequence for TEL-- The oncogenic properties of TEL fusion proteins are widely studied, but little is known about the role of wild-type TEL as a transcription factor. Although TEL has been identified as a transcriptional repressor, its repression properties do not require the presence of the ETS domain (41). The DNA-binding domain of TEL is able to bind the sequence A TAA ACA GGA AGT GG (42), but no specificity studies have been reported. Our multiplex analysis suggests that the consensus sequence of TEL is distinct from those of the other ETS proteins investigated here. TEL, like the other proteins, prefers an AGT sequence downstream of the G-G-A but, unlike the other ETS proteins, does not tolerate an adenosine at the +2 position or a cytidine at -3. Upstream of the G-G-A motif, TEL has equal preference for adenosine and cytidine at the -1 position, whereas the other ETS proteins prefer a cytidine at this position.

    ACKNOWLEDGEMENTS

We thank Drs. Robert Macgregor, Xenia Morin, and Paul Morin for fruitful discussions and Peter Yin for excellent technical assistance.

    FOOTNOTES

* This work was supported by the Medical Research Council of Canada. The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The on-line version of this article (available at http://www.jbc.com) contains supplementary tables.

‡ To whom correspondence should be addressed: Ontario Cancer Inst. and Dept. of Medical Biophysics, University of Toronto, 610 University Ave., Toronto, ON M5G 2M9, Canada. Tel.: 416-946-2017; Fax: 416-946-6529; E-mail: carrow@uhnres.utoronto.ca.

Published, JBC Papers in Press, June 23, 2000, DOI 10.1074/jbc.M004294200

    REFERENCES

1. Sharrocks, A. D., Brown, A. L., Ling, Y., and Yates, P. R. (1997) Int. J. Biochem. Cell Biol. 29, 1371-1387
2. Wasylyk, B., Hahn, S. L., and Giovane, A. (1993) Eur. J. Biochem. 211, 7-18
3. Barton, K., Muthusamy, N., Fischer, C., Ting, C. N., Walunas, T. L., Lanier, L. L., and Leiden, J. M. (1998) Immunity 9, 555-563
4. Scott, E., Simon, M. C., Anastasi, J., and Singh, H. (1994) Science 265, 1573-1577
5. Wang, L. C., Swat, W., Fujiwara, Y., Davidson, L., Visvader, J., Kuo, F., Alt, F. W., Gilliland, D. G., Golub, T. R., and Orkin, S. H. (1998) Genes Dev. 12, 2392-2402
6. Kola, I., Brookes, S., Green, A. R., Garber, R., Tymms, M., Papas, T. S., and Seth, A. (1993) Proc. Natl. Acad. Sci. U. S. A. 90, 7588-7592
7. Maroulakou, I. G., Papas, T. S., and Green, J. E. (1994) Oncogene 9, 1551-1565
8. Taylor, J. M., Dupont-Versteegden, E. E., Davies, J. D., Hassell, J. A., Houle, J. D., Gurley, C. M., and Peterson, C. A. (1997) Mol. Cell. Biol. 17, 5550-5558
9. Iwasaka, C., Tanaka, K., Abe, M., and Sato, Y. (1996) J. Cell Phys. 169, 522-531
10. Oda, N., Abe, M., and Sato, Y. (1999) J. Cell Phys. 178, 121-132
11. Dittmer, J., and Nordheim, A. (1998) Biochim. Biophys. Acta 1377, F1-F11
12. Bemark, M., Martensson, A., Liberg, D., and Leanderson, T. (1999) J. Biol. Chem. 274, 10259-10267
13. Brown, L. A., Amores, A., Schilling, T. F., Jowett, T., Baert, J. L., de Launoit, Y., and Sharrocks, A. D. (1998) Oncogene 17, 93-104
14. John, S., Marais, R., Child, R., Light, Y., and Leonard, W. J. (1996) J. Exp. Med. 183, 743-750
15. Li, S. L., Schlegel, W., Valente, A. J., and Clark, R. A. (1999) J. Biol. Chem. 274, 32453-32460
16. Mao, X., Miesfeldt, S., Yang, H., Leiden, J. M., and Thompson, C. B. (1994) J. Biol. Chem. 269, 18216-18222
17. Pio, F., Assa-Munt, N., Yguerabide, J., and Maki, R. A. (1999) Protein Sci. 8, 2098-2109
18. Ray-Gallet, D., Mao, C., Tavitian, A., and Moreau-Gachelin, F. (1995) Oncogene 11, 303-313
19. Shore, P., and Sharrocks, A. D. (1995) Nucleic Acids Res. 23, 4698-4706
20. Liang, H., Mao, X., Olejniczak, E. T., Nettesheim, D. G., Yu, L., Meadows, R. P., Thompson, C. B., and Fesik, S. W. (1994) Nat. Struct. Biol. 1, 871-875
21. Donaldson, L. W., Petersen, J. M., Graves, B. J., and McIntosh, L. P. (1996) EMBO J. 15, 125-134
22. Werner, M. H., Clore, G. M., Fisher, C. L., Fisher, R. J., Trinh, L., Shiloach, J., and Gronenborn, A. M. (1995) Cell 83, 761-771
23. Werner, M. H., Clore, G. M., Fisher, C. L., Fisher, R. J., Trinh, L., Shiloach, J., and Gronenborn, A. M. (1997) J. Biomol. NMR 10, 317-328
24. Kodandapani, R., Pio, F., Ni, C. Z., Piccialli, G., Klemsz, M., McKercher, S., Maki, R. A., and Ely, K. R. (1996) Nature 380, 456-460
25. Batchelor, A. H., Piper, D. E., de la Brousse, F. C., McKnight, S. L., and Wolberger, C. (1998) Science 279, 1037-1041
26. Mo, Y., Vaessen, B., Johnston, K., and Marmorstein, R. (1998) Mol. Cell 2, 201-212
27. Mo, Y., Vaessen, B., Johnston, K., and Marmorstein, R. (2000) Nat. Struct. Biol. 7, 292-297
28. Desjarlais, J. R., and Berg, J. M. (1994) Proc. Natl. Acad. Sci. U. S. A. 91, 11099-11103
29. Studier, F. W. (1991) J. Mol. Biol. 219, 37-44
30. Sambrook, J., Fritsch, E. F., and Maniatis, T. (1989) Molecular Cloning: A Laboratory Manual, 2nd Ed., pp. 6.44-6.48, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY
31. Klug, S. J., and Famulok, M. (1994) Mol. Biol. Rep. 20, 97-107
32. Brown, L. A., Yang, S. H., Hair, A., Galanis, A., and Sharrocks, A. D. (1999) Oncogene 18, 7985-7993
33. Brown, T. A., and McKnight, S. L. (1992) Genes Dev. 6, 2502-2512
34. Fisher, R. J., Mavrothalassitis, G., Kondoh, A., and Papas, T. S. (1991) Oncogene 6, 2249-2254
35. Nye, J. A., Petersen, J. M., Gunther, C. V., Jonsen, M. D., and Graves, B. J. (1992) Genes Dev. 6, 975-990
36. Urness, L. D., and Thummel, C. S. (1990) Cell 63, 47-61
37. Woods, D. B., Ghysdael, J., and Owen, M. J. (1992) Nucleic Acids Res. 20, 699-704
38. Strauss-Soukup, J. K., and Maher, L. J., III (1997) J. Biol. Chem. 272, 31570-31575
39. Gross, P., Arrowsmith, C. H., and Macgregor, R. B., Jr. (1998) Biochemistry 37, 5129-5135
40. Dickerson, R. E. (1992) Methods Enzymol. 211, 67-111
41. Lopez, R. G., Carron, C., Oury, C., Gardellin, P., Bernard, O., and Ghysdael, J. (1999) J. Biol. Chem. 274, 30132-30138
42. Poirel, H., Oury, C., Carron, C., Duprez, E., Laabi, Y., Tsapis, A., Romana, S. P., Mauchauffe, M., Le Coniat, M., Berger, R., Ghysdael, J., and Bernard, O. A. (1997) Oncogene 14, 349-357


Copyright © 2000 by The American Society for Biochemistry and Molecular Biology, Inc.