Biochemical and biophysical properties of the core-binding factor alpha2 (AML1) DNA-binding domain.

The Runt domain is the DNA-binding domain defining a small family of transcription factors that are involved in important developmental processes. Developmental pathways controlled by Runt domain proteins include sex determination, neurogenesis, segmentation, and eye development in Drosophila and hematopoiesis in mammals. In addition to binding DNA, the Runt domain also mediates heterodimerization with another subunit called the core-binding factor beta (CBFbeta) subunit. In this study we overexpress the Runt domain from the mouse CBFalpha2 (AML1) protein in Escherichia coli, and purify it from the insoluble fraction. We determine the equilibrium constants for Runt domain binding to two different DNA sequences by surface plasmon resonance technology. Circular dichroism spectroscopy demonstrates that the Runt domain is a folded beta-domain with essentially no alpha-helical content. The single tryptophan residue in the CBFalpha2 Runt domain at amino acid 79 is shown by tryptophan fluorescence spectroscopy to reside in a polar environment. Finally, we demonstrate that ATP can be UV cross-linked to the Runt domain and that ATP binding is sensitive to an amino acid substitution in the putative Kinase-1a motif (P-loop).

The core-binding factors (CBFs) are DNA-binding transcriptional activators that bind the consensus sequence PyGPyGGT (for review, see Refs. 1 and 2). CBF consists of two unrelated subunits, α and β, that form a heterodimeric complex both in solution and on DNA (3-7). The CBFα subunit is the DNA-binding subunit of the complex and is capable of binding DNA in vitro in the absence of its partner protein, CBFβ (3-6). CBFβ stabilizes binding of CBFα to DNA without contacting the DNA directly (6,8).
Three genes encode mammalian CBFα subunits, CBFA1, CBFA2 (otherwise known as AML1), and CBFA3 (4,9-12). The CBFα subunits contain a highly conserved 128-amino acid region called the "Runt" domain, named for the Drosophila homologue of the CBFα proteins, Runt (13,14). The Runt domain is the DNA-binding domain of the CBFα and Runt proteins, and also contains the heterodimerization domain for the CBFβ subunit (4,5,14). The Runt domain has no homology to other known DNA-binding domains (e.g. zinc fingers, helix-turn-helix, helix-loop-helix, b-Zip); thus the CBFα subunits comprise a new family of DNA-binding proteins. The only identifiable structural motif within the Runt domain is a Kinase-1a motif (also known as a Walker A consensus sequence or P-loop), found in many proteins that bind nucleotides (15-17). The Kinase-1a motif is a glycine-rich sequence that characteristically folds into a flexible loop between an α-helix and a β-strand, and interacts with one of the phosphate groups of the bound nucleotide.
Various members of the CBFA gene family function in distinct developmental pathways. The Drosophila runt gene plays a role in three developmental pathways: sex determination, segmentation, and neurogenesis (18-21). A Drosophila homologue of the runt gene called lozenge functions in the pathway that specifies photoreceptor cell identity in the Drosophila eye (22). Homozygous disruption of the mammalian CBFA2 gene in mice identified an essential role for the gene in at least two developmental processes. Mice homozygous for a mutation of the Cbfa2 gene die in midgestation from extensive hemorrhaging in the central nervous system, which is preceded by cellular necrosis (23). Mutation of the Cbfa2 gene also completely blocked definitive hematopoiesis, resulting in the absence of progenitors capable of differentiating into mature erythroid, myeloid, and lymphoid cells of all lineages (23,24). An essential role for CBF in hematopoiesis was also predicted from the involvement of both the CBFA2 and CBFB genes in various forms of human leukemias (for review, see Nucifora and Rowley (25) and Liu et al. (26)). The CBFA2 (AML1) gene is encoded on human chromosome 21 and is disrupted by the t(8;21)(q22;q22) associated with de novo acute myeloid leukemia (M2 subtype) (9,27), by the t(12;21)(p13;q22) in de novo acute lymphocytic leukemia (pre-B cell) (28,29), and by the relatively rare t(3;21)(q26;q22) associated with therapy-related leukemias and myelodysplasias (30,31). The CBFB gene is also disrupted in human leukemias (M4 eosinophil) by the inv(16)(p13;q22) (32). All of these chromosomal rearrangements result in the production of chimeric proteins that are thought to act as dominant negative inhibitors of CBF and thereby inhibit hematopoiesis.
Since the Runt domain appears to represent a new structural motif for recognizing DNA, it is an interesting subject for biophysical and structural analyses. Here we characterize several biochemical and biophysical properties of the Runt domain of the CBFα subunit encoded by the murine Cbfa2 (AML1) gene.
We describe a protocol for expressing the Runt domain in bacteria and purifying the protein to homogeneity from the insoluble fraction. We have determined the association and dissociation rate constants (k_on and k_off), as well as the equilibrium dissociation constants (K_d) of the purified Runt domain for two independent target sequences by surface plasmon resonance methods. Unfolding of the domain in urea monitored by tryptophan fluorescence spectroscopy shows a cooperative transition indicative of a stable folded structure. Circular dichroism spectroscopy of the purified domain indicates that the protein is a β-domain with virtually no α-helical content. Finally, we demonstrate that ATP can be cross-linked to the Runt domain, and that ATP binding is sensitive to mutation of a conserved lysine in the Kinase-1a motif.

Expression and Purification of the CBFα2 Runt Domain and Runt Domain Mutants
Bacterial Expression-DNA sequence corresponding to amino acid residues 41-190 (11) of the murine CBFα2 protein was amplified by polymerase chain reaction (PCR) from a cDNA template using the following primers: 5′-CGGAATTCCCATATGGCCAGCAAGCTGAGGAGC-3′ (5′ Runt) and 5′-CGGGATCCTTACCCGGGCTTGGTCTGATC-3′ (3′ Runt). The 476-base pair PCR product was digested with NdeI and BamHI, subcloned into the corresponding sites of the pET-3C expression vector (Novagen), and transformed into the bacterial strain XL1blue (Stratagene).
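As a quick consistency check on the stated 476-base pair product, the expected amplicon length can be worked out from the residue range and the primer tails. The sketch below assumes that the last 18 nucleotides of each primer anneal to the template (an inference from the sequences, not stated in the text) and that the hyphen inside the printed 5′ primer is a line-break artifact. The function and constant names are illustrative only.

```python
# Primer sequences as given in the text, written 5' to 3'.
FWD_PRIMER = "CGGAATTCCCATATGGCCAGCAAGCTGAGGAGC"  # 5' Runt
REV_PRIMER = "CGGGATCCTTACCCGGGCTTGGTCTGATC"      # 3' Runt

# Assumption: the 3' portion of each primer that matches the template.
ANNEAL_LEN = 18


def expected_product_length(first_residue, last_residue,
                            fwd=FWD_PRIMER, rev=REV_PRIMER,
                            anneal_len=ANNEAL_LEN):
    """Coding region length plus the non-annealing 5' tails of both primers."""
    coding_bp = (last_residue - first_residue + 1) * 3
    fwd_tail = len(fwd) - anneal_len
    rev_tail = len(rev) - anneal_len
    return coding_bp + fwd_tail + rev_tail


product_bp = expected_product_length(41, 190)
has_ndei = "CATATG" in FWD_PRIMER   # NdeI recognition site
has_bamhi = "GGATCC" in REV_PRIMER  # BamHI recognition site
print(product_bp, has_ndei, has_bamhi)
```

Under these assumptions the calculation gives 476 bp, in agreement with the product size stated above, and both primers carry the restriction site used for subcloning.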
A cDNA clone encoding the human CBFα2 (AML1) protein with a single amino acid substitution in the Runt domain (K144M) was kindly provided by Scott Hiebert. A fragment encoding residues 41-190 encompassing the Runt domain was amplified from this cDNA clone by PCR, as described above, and subcloned into the NdeI/BamHI sites of pET-3C.
A Runt domain with a lysine to arginine substitution at amino acid 144 (K144R) was prepared by PCR amplification (33) using the two primers listed above, plus the following two mutant primers (mutations underlined): 5′-GCTCCTGCCTCTACCGCTCCG-3′ (5′-K144R) and 5′-CGGAGCGGTAGAGGCAGGAGC-3′ (3′-K144R). The 5′ end of the murine Runt domain was amplified using primers 5′-Runt and 3′-K144R, and the 3′ end was amplified using primers 5′-K144R and 3′-Runt. The products were purified and used in a ratio of 1:1 as template for a second round of PCR with primers 5′-Runt and 3′-Runt. The product from the second round of PCR, which contains the Runt domain incorporating the mutation, was purified and subcloned into the pET-3C vector as described above.
All PCR-amplified regions were sequenced using the PRISM TM sequencing kit (Perkin-Elmer Corp.) according to the manufacturer's instructions, and the sequence was analyzed on an Applied Biosystems Automated Sequencer Stretched 373A (Perkin-Elmer Corp.), run by the Dartmouth College Molecular Biology Core Facility.
Positive clones were transformed into the Escherichia coli strain BL21(DE3)LysS (34). A starter culture was made by inoculating 300 ml of LB plus antibiotics (100 µg·ml⁻¹ carbenicillin and 50 µg·ml⁻¹ chloramphenicol) with 2-5 µl of a 1:1000 dilution from a scraping of a glycerol stock of transformed BL21(DE3)LysS cells. The cells were grown approximately 12 h at 37°C until the A600 reached a value of 0.2. This culture was then used to inoculate 1 liter of 2 × YT medium containing the same antibiotics. The culture was allowed to grow until it reached an A600 = 0.4-0.6, at which point protein production was induced by the addition of 1 mM isopropyl-1-thio-β-D-galactopyranoside. Cells were harvested between 6 and 8 h postinduction.
Extract Preparation and Protein Renaturation-The CBFα2 Runt domain was solubilized and renatured from bacterial inclusion bodies (35). Unless otherwise indicated, all operations were carried out at 4°C. The bacterial culture prepared as described above was placed on ice for 15 min, then the bacteria were harvested by centrifugation at 4800 × g for 10 min. The resulting bacterial cell pellet was resuspended in 1-2 packed cell volumes of buffer A (50 mM Tris-HCl, pH 7.5, 10 mM EDTA, 25% sucrose). The bacteria were treated with lysozyme (2.5 mg·ml⁻¹) (Boehringer Mannheim) and benzonase (25 units·ml⁻¹) (E. Merck, Darmstadt, Germany) for 30 min. Triton X-100 (Mallinckrodt) was added in four aliquots to a final concentration of 1%, with a 5-10-min incubation between detergent additions. The volume of the lysate was adjusted to 40-50 ml with buffer A containing 1% Triton X-100, and subjected to six to eight rounds of sonication (6-8 s on, 54-52 s off). After sonication, the homogenate was centrifuged at 13,000 × g for 30 min. The inclusion body pellet was washed three times with buffer A containing 1% Triton X-100, followed by one to two washes with buffer B (50 mM Tris-HCl, pH 7.5, 100 mM NaCl, 1 mM EDTA, 1 mM EGTA, 0.5 mM DTT) containing 0.5 M urea. Washes were performed by resuspending the inclusion body pellet in 10-15 ml of wash buffer, then subjecting the pellet to 10-15 strokes of Dounce homogenization with a loose pestle. Additional wash buffer was added to the resuspended inclusion bodies to bring the volume to 40-50 ml, and the inclusion bodies were collected by centrifugation at 13,000 × g for 15 min. The final washed pellet of inclusion bodies was resuspended in 10-15 ml of buffer B containing 7 M urea and 300 mM NaCl. The resuspended pellet was Dounce homogenized with both the loose and tight pestles (10-15 strokes each) at room temperature. The solubilized protein was incubated at room temperature for 30 min, then centrifuged at 13,000 × g for 15 min at 25°C.
Chromatography-Ten to fifteen milliliters of hydrated hydroxylapatite (Bio-Rad) was equilibrated in buffer D. The renatured protein was batch-adsorbed to the hydroxylapatite for 15 min. The resin was poured, and the column was washed with 10 column volumes of buffer D followed by 10 column volumes of buffer D minus Triton X-100. The protein was eluted with a linear gradient of 20 to 320 mM phosphate in buffer D in a total of 5 column volumes. The protein eluted in a broad peak around 150 mM phosphate.
The hydroxylapatite pool was concentrated to 5-10 ml on a bed of polyethylene glycol or by ultrafiltration and applied to a 2.5 × 30-cm Sephacryl-S100 column (Pharmacia) equilibrated in buffer E (50 mM potassium phosphate, pH 7.5, 150 mM KCl, 10% glycerol, 0.1 mM EDTA, 1.0 mM DTT, 0.1% NaN3). The protein peak was pooled and concentrated by ultrafiltration for storage at 4°C.

Determination of Molar Extinction Coefficient
For determination of the molar extinction coefficient (ε) at 280 nm for the Runt domain, a sample of protein with known absorbance was subjected to hydrolysis and amino acid analysis (Anaspec, Inc.). The concentration of protein present was determined on the basis of the concentrations of nonhydrolyzable amino acids, and the molar extinction coefficient was calculated from this concentration and the measured absorbance at 280 nm. A value of 11,000 M⁻¹·cm⁻¹ was obtained in this manner. The concentrations of all three Runt domains (wild type, K144M, K144R) were normalized by absorbance at 280 nm using this extinction coefficient, since the content of tryptophan and tyrosine was the same in all three proteins.
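The normalization by absorbance rests on the Beer-Lambert relation, A = ε·c·l. The following minimal sketch shows the conversion using the extinction coefficient reported here; the 1-cm path length default and the example absorbance are assumptions chosen for illustration, not measurements from this study.

```python
EPSILON_280 = 11_000.0  # M^-1 cm^-1, value reported in the text


def concentration_molar(a280, epsilon=EPSILON_280, path_cm=1.0):
    """Protein concentration (M) from absorbance via Beer-Lambert: c = A / (epsilon * l)."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return a280 / (epsilon * path_cm)


# Hypothetical reading: A280 = 0.11 in a 1-cm cuvette.
example_uM = concentration_molar(0.11) * 1e6
print(f"{example_uM:.1f} uM")
```

With these illustrative inputs the calculated concentration is about 10 µM.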

Electrophoretic Mobility Shift Assays
Assays were performed as described previously (6). Preparation of DNA probes was also as described (6), except that the probe was purified through a Bio-Spin 6 column (Bio-Rad) to remove unincorporated nucleotide. All equilibrium binding reactions were loaded onto running gels. Results were obtained by PhosphorImager analysis of dried gels on a Molecular Dynamics PhosphorImager 445SI scanner (Molecular Dynamics, Sunnyvale CA), and then quantified in the scientific image processing program IPLab gel (Signal Analytics Corp., Vienna, VA) to determine relative amounts of bound and free DNA.
The percentage of active Runt domain was determined by DNA titration using a fixed amount of protein. The DNA site used in the titrations was from the T cell receptor β-chain enhancer (5′-GGATATATGTGGTTTGCA-3′). Binding reactions (20 µl total volume) were performed at 4°C for 20 min in binding buffer containing 10 mM Tris-HCl, pH 7.5, 100 mM NaCl, 1 mM EDTA, 5 mM DTT, 0.2 µg·ml⁻¹ bovine serum albumin, 0.05% Triton X-100, and 4% glycerol. Fifteen microliters of the binding reactions were loaded onto a running gel (10%) at 4°C. The results were quantified by PhosphorImager scanning and analysis. The amount of total DNA (D_t) was determined from a standard curve of known DNA concentrations. The fraction of bound DNA (PD) was calculated from the free DNA (D_f) remaining in each lane (PD = 1 − (D_f/D_t)).
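The bound fraction and the resulting estimate of active protein can be sketched as follows. The first function is the relation given above. The second assumes that, at saturating DNA, the plateau concentration of bound DNA equals the concentration of active protein in a 1:1 complex, which is an interpretation of the titration and not a formula stated in the text. All numerical inputs are hypothetical.

```python
def fraction_bound(free_dna, total_dna):
    """PD = 1 - (D_f / D_t), with both quantities in the same units."""
    if total_dna <= 0:
        raise ValueError("total DNA must be positive")
    if not 0 <= free_dna <= total_dna:
        raise ValueError("free DNA must lie between 0 and total DNA")
    return 1.0 - free_dna / total_dna


def percent_active(bound_dna_plateau_nM, protein_nM):
    """Assumed 1:1 binding: active protein equals bound DNA at saturation."""
    if protein_nM <= 0:
        raise ValueError("protein concentration must be positive")
    return 100.0 * bound_dna_plateau_nM / protein_nM


# Hypothetical lane: 25 units of free DNA out of 100 units total.
pd_example = fraction_bound(25.0, 100.0)
# Hypothetical plateau: 9 nM DNA bound by 10 nM total protein.
active_example = percent_active(9.0, 10.0)
print(pd_example, active_example)
```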

Effector Study
Electrophoretic mobility shift assays were performed as described above, except that assays were performed in 10 mM Tris-HCl, pH 7.5, 15% ethylene glycol buffer (15 µl final volume), to which various effectors (mono- and divalent salts, detergents, denaturants, and so forth) were added at various concentrations. Binding reactions were performed at room temperature for 20 min. The protein concentration used in all the analyses was approximately equal to the K_d (5-10 nM) for the binding site that was used (HA) (36), as determined in the buffer minus effectors.

Surface Plasmon Resonance
Surface plasmon resonance experiments were conducted on Biosensor instrumentation (BIAcore™) from Pharmacia Biosensor. Oligonucleotides for surface plasmon resonance (purchased from Midland Certified Reagent Company, Midland, TX) were designed so that only the top strand was biotinylated on the 5′ end. Complementary strands were annealed in a ratio of top strand:bottom strand of 1:4 by boiling for 5 min in 15 mM sodium citrate, pH 7.0, 150 mM NaCl, and then cooling slowly to room temperature before final storage at 4°C. The CM5 sensorchip (Pharmacia Biosensor) was modified with streptavidin (Sigma) according to the BIAcore instruction manual. Briefly, 40 µl of a 1:1 mix of N-hydroxysuccinimide and N-ethyl-N′-(dimethylaminopropyl)carbodiimide (Pharmacia) were injected onto the chip surface at a flow rate of 5 µl·min⁻¹. Following activation of the chip surface, 40 µl of streptavidin (200 µg·ml⁻¹ in 10 mM sodium acetate, pH 4.5) were injected, immediately followed by a 35-µl injection of ethanolamine to block any residual coupling groups. Addition of streptavidin typically gave a response unit (RU) change of 3000-4000. The surface was then washed two to three times with 5-µl injections of regeneration solution (0.1% SDS). The annealed DNA was added to the surface to create either a low density surface for kinetic binding experiments (RU = 130-140) or high density surface for equilibrium binding experiments (RU = 140-260). Typically, a stock concentration of 0.4 µg·ml⁻¹ or 4 µg·ml⁻¹ of biotinylated double-stranded DNA was used for creating either the low or high density surface, respectively. Repeat injections of 5-10 µl at a flow rate of 5 µl·min⁻¹ were performed until the desired RU change was obtained.
Equilibrium binding experiments were performed at a flow rate of 10 µl·min⁻¹. Kinetic rate experiments were performed at a flow rate of 50 µl·min⁻¹ to reduce the effects of mass transport. Buffer for surface plasmon resonance experiments contained 20 mM sodium phosphate, pH 7.5, 150 mM NaCl, and 0.005% P-20 detergent. All binding experiments were conducted at 25°C. The association phase (about 5-30 s) was analyzed by nonlinear least squares analysis using the BIAevaluation (Pharmacia Biosensor) software to yield a k_s value at each concentration. Dissociation measurements were initiated by injection of buffer lacking the Runt domain. In order to prevent protein from reassociating with the bound DNA, nonbiotinylated, double-stranded HA oligonucleotide (8-10 × 10⁻⁷ M) was included in the washes. For generating a protein concentration series, the protein was diluted in the same buffer plus 1 mM DTT and 0.1% bovine serum albumin.

Fluorescence Spectroscopy
Fluorescence spectroscopy of the Runt domain or N-acetyltryptophanamide (NATA) was performed on a Shimadzu fluorescence spectrometer RF-1501 (Shimadzu Scientific Instrument, Inc., Columbia, MD). All determinations were done in a 1-cm path length quartz cuvette at 25°C, with both the excitation and emission slit widths set at 10 nm. To monitor tryptophan fluorescence in the native and denatured state, an excitation wavelength of 295 nm was used, with a protein concentration of 7.3 µM. The protein or NATA was diluted in buffer that contained 25 mM Tris-HCl (pH 7.5), 200 mM NaCl, 0.05 mM EDTA, and 0.125 mM DTT. Denaturation was performed in the same buffer in the presence of 6 M urea for 2-4 h at 25°C. Fluorescence spectra of buffer with and without 6 M urea were also recorded and subtracted from the experimental curves to yield net relative fluorescence spectra.
Urea-induced unfolding of the Runt domain was monitored using an excitation wavelength of 285 nm, and a protein concentration of 0.9 µM. The more sensitive 285-nm excitation wavelength shifted the emission maximum of the native protein to 325 nm. Denaturation was performed in the same buffer described above in the presence of various concentrations of urea, and the loss of fluorescence at 325 nm was used to monitor the process of unfolding. A 60% change in relative fluorescence was observed between native and denatured states of the protein under these conditions.

Circular Dichroism
Circular dichroism spectra were collected at 20°C in a Jasco 715 spectrometer calibrated using 10-camphorsulfonic acid. Mean amide Δε values were calculated using the known protein sequence and concentrations derived from absorption measurements at 280 nm on an ultraviolet absorption spectrometer.
The protein solution was dialyzed extensively against 25 mM potassium phosphate, pH 7.3, 0.1 mM EDTA, 0.5 mM DTT prior to CD measurements. Measurements in the far ultraviolet (176-260 nm) were performed on a sample of 56 µM protein in a quartz cell of 0.05 mm path length. The data were corrected by subtraction of a spectrum of the buffer alone. A total of eight scans were recorded at 1-nm resolution from 265 to 175 nm for both protein and buffer at a rate of 10 nm·min⁻¹ with a 16-s response time. The resulting data for 178-260 nm were fit using the variable selection protocol of Johnson (37) and Manavalan and Johnson (38), with software provided by Dr. Johnson. Three proteins at a time were removed from the 33-protein database, and the resulting 5456 combinations were examined for total percentage of secondary structure and root mean square error. Ten combinations were finally selected, all of which gave values of 100% for total secondary structure, and which had root mean square error values less than 0.099.
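The figure of 5456 combinations follows directly from choosing 3 proteins to remove out of 33. The sketch below confirms the count and shows how the reduced reference sets could be enumerated. It illustrates only the combinatorics and is not the variable selection software itself; the function name and the integer indexing of the reference proteins are placeholders.

```python
from itertools import combinations
from math import comb

N_REFERENCE_PROTEINS = 33  # size of the reference database, from the text
N_REMOVED = 3              # proteins removed at a time, from the text

n_subsets = comb(N_REFERENCE_PROTEINS, N_REMOVED)


def reduced_basis_sets(n_proteins=N_REFERENCE_PROTEINS, n_removed=N_REMOVED):
    """Yield tuples of protein indices retained after removing n_removed."""
    all_indices = range(n_proteins)
    for removed in combinations(all_indices, n_removed):
        removed_set = set(removed)
        yield tuple(i for i in all_indices if i not in removed_set)


n_enumerated = sum(1 for _ in reduced_basis_sets())
print(n_subsets, n_enumerated)
```

Each enumerated set retains 30 reference proteins and would be used for one fit of the spectrum.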

UV Cross-linking of [γ-32P]ATP to the Runt Domain
The CBFα2 Runt domain (0.063 mg·ml⁻¹), with or without CBFβ(1-141) (0.043 mg·ml⁻¹) (expression and purification of the interaction domain (amino acids 1-141) of the non-DNA-binding CBFβ subunit from bacteria will be described elsewhere) was preincubated in a buffer containing 10 mM Tris-HCl, pH 7.5, and 15% ethylene glycol (120 µl final volume) for 15 min at 4°C before addition of nucleotide. Adenylate kinase (Sigma) at a concentration of 0.125 mg·ml⁻¹ was used as a positive control for nucleotide binding. [γ-32P]ATP was added to the Runt domain (plus or minus CBFβ(1-141)), to CBFβ(1-141), and to adenylate kinase at a final concentration of 67 nM (specific activity = 1.33 µCi·µl⁻¹), and the reaction was incubated for an additional 15 min at 4°C. UV cross-linking was performed using a UV light (model UVGL-25, UVP Inc., San Gabriel, CA) at 254 nm and at a distance of 6-8 cm. Aliquots (15 µl) were removed at various times during UV cross-linking (0, 5, 10, 20, 40, and 80 min), combined with an equal volume of 2 × sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) sample buffer, and 5 µl of the sample were electrophoresed through a 15% SDS-PAGE gel. The proteins were stained with Coomassie Brilliant Blue and destained, and the gel was dried. PhosphorImager analysis was used to determine the relative incorporation of radioactivity.
Photoaffinity Labeling with [8-azido-α-32P]ATP-The photoaffinity label [8-azido-α-32P]ATP (RPI Corp., Mount Prospect, IL) was vacuum dried and resuspended in 10 mM sodium phosphate, pH 7.5, along with excess nonradiolabeled 8-azido-ATP, to a final concentration of 2.5 mM with a specific activity of 100 µCi·ml⁻¹. Wild type and mutant Runt domains (K144M, K144R) were dialyzed overnight into buffer containing 20 mM sodium phosphate, pH 7.5. Proteins (1.8 µM) were incubated with concentrations of [8-azido-α-32P]ATP ranging from 0.0625 to 1.0 mM for 60 min on ice in a buffer containing 10 mM sodium phosphate (pH 7.5), 5 mM MgCl2 (final volume = 50 µl). Binding reactions were then transferred from Eppendorf tubes to separate wells of a Falcon 3912 assay plate and exposed to short wave UV light (model UVGL-25, UVP Inc.) at a distance of 6-8 cm for 7 min. Aliquots (8 µl) of the reactions were removed before and after exposure to UV light, combined with an equal volume of 2 × SDS-PAGE sample buffer, and electrophoresed through 15% SDS-PAGE gels without prior boiling, according to the manufacturer's instructions (RPI Corp.). SDS-PAGE gels were vacuum-dried and subjected to autoradiography and/or PhosphorImager analysis.

Expression and Purification of the CBFα2 Runt Domain-
The Runt domain was expressed in the E. coli strain BL21(DE3)LysS from a pET-3C vector (Fig. 1A). The expressed domain (amino acids  contains the region conserved in the Drosophila and mammalian proteins (amino acids 51-178). In addition, N- and C-terminal extensions that were found to improve expression levels in bacteria and stabilize DNA-binding by the isolated domain were included. The expressed Runt domain accounted for 25-30% of the total bacterial cell fraction (Fig. 1B, lane 2); however, the soluble fraction contained only 20-25% of the expressed protein (Fig. 1B, lane 3), while the remaining 75-80% partitioned to the insoluble fraction (Fig. 1B, lane 4). Previous work had demonstrated that the mammalian CBFα2 subunit could be subjected to denaturation and renaturation (6,8,36). Therefore, we took advantage of the fact that most of the protein partitioned with the insoluble fraction, since this provided a tremendous initial purification away from bulk cytosolic proteins. After cell lysis and centrifugation, the insoluble fraction was washed several times in buffers containing 1% Triton X-100 followed by 0.5 M urea to remove loosely associated proteins. The insoluble fraction was then completely solubilized in a buffer containing 7 M urea and subjected to DEAE-Sephacel chromatography to remove nucleic acids. Protein in the flow-through fraction from the DEAE-Sephacel column was renatured by diluting 7-fold to a concentration of 1 M urea in buffer without urea and dialyzed to remove the urea. The loss of protein during renaturation varied, but typically 20-25% of the Runt domain was recovered. The renatured protein was further purified by sequential chromatography on hydroxylapatite and Sephacryl-S100 to remove traces of contaminating proteins. Fig. 1B and Table I detail a typical purification of the Runt domain starting with cells from a 1-liter bacterial culture. The overall yield of the protein is 5-8 mg·L⁻¹ of bacterial culture.
The percent active protein determined by DNA titration in electrophoretic mobility shift assays was between 80 and 100% (Fig. 1C) and is equivalent to that of the Runt domain prepared from the soluble fraction under native conditions.
Effectors of DNA Binding Activity-We examined the ability of several agents to stabilize DNA binding activity of the Runt domain (Fig. 2). All salts tested were inhibitory to DNA binding activity. The trend from least inhibitory to most inhibitory (taken at the 50% inhibition point) in the monovalent series was NaCl/KCl < CsCl < LiCl/RbCl (Fig. 2A). The divalent salts MgCl2 and CaCl2 inhibited at equivalent concentrations (Fig. 2B), but were clearly more inhibitory than any of the monovalent salts.
Several effectors were found that stabilize protein-DNA complex formation. A dramatic stabilization of DNA binding activity was observed in the presence of polyethylene glycol and nonionic detergents, such as Triton X-100 (Fig. 2, C and D). Stabilization by Triton X-100 is not observed until the detergent reaches its critical micellar concentration (0.24 mM). Dramatic stabilization was also observed for two other nonionic detergents, Tween-20 and Nonidet P-40, as well as a zwitterionic detergent, CHAPS, whereas the anionic detergents SDS and Sarkosyl completely inhibited DNA binding activity at similar concentrations (not shown).
DNA-binding Properties of the Runt Domain-Surface plasmon resonance technology was used to determine the equilibrium and kinetic rate constants for Runt domain binding to two different DNA sequences coupled to the surface. The DNA surfaces included a high affinity (HA) site originally identified by Thornell et al. (39) and a site derived from the Moloney murine leukemia virus enhancer (WT) (36). A mutant site known not to bind CBF was used as a negative control (36). Sensorgrams were recorded at a variety of Runt domain concentrations on both the HA and WT surfaces (Fig. 3, A and B). Equilibrium binding constants were determined for both the HA and WT sites (Fig. 3D). A nearly 10-fold difference in the K_d values of the Runt domain for the HA site (50 nM) and WT site (440 nM) was determined by Scatchard analysis, confirming that the WT sequence from the Moloney murine leukemia virus is a lower affinity site than the HA site. Binding to a third DNA surface containing a mutant DNA site was indistinguishable from binding to a surface that contained only the cross-linked streptavidin. Either the mutant DNA surface or the streptavidin surface was used as a negative control to subtract out refractive index changes due to buffer components and nonspecific binding to the surface. The RU change on the HA surface was 89% of that predicted for a 1:1 Runt domain-DNA complex.
We used lower density DNA surfaces to determine apparent association and dissociation rates for Runt domain-DNA binding. Plots of k_s versus protein concentration yield a straight line where the slope is equal to k_on (Fig. 3C). The value of k_on obtained for the HA surface was 2.53 × 10⁶ M⁻¹ s⁻¹, and for the WT surface was 1.15 × 10⁶ M⁻¹ s⁻¹. Both of these values are approximations due to limitations of the system. The validity of the kinetic analysis we used is based on the assumption that the concentration of the Runt domain available for binding to DNA immobilized on the dextran surface is equal to the concentration of the injected Runt domain. Experimental precautions were taken to minimize mass transport effects, including injections at a very high flow rate (50-100 µl/min) and utilization of low density surfaces for kinetic analysis. However, because the association rates for the Runt domain to the DNA surface were so high, mass transport effects cannot be entirely eliminated. According to Hall et al. (40), the assumption that the concentration of analyte in the flowing phase remains constant at its injected value is valid for systems with effective association rate constants ≤1 × 10⁵ M⁻¹ s⁻¹, and values of k_on > 1 × 10⁶ M⁻¹ s⁻¹ are most likely to be underestimates because of mass transport effects. We conclude, therefore, that k_on = 1.15 × 10⁶ M⁻¹ s⁻¹ for the WT surface is approximately correct, particularly in view of the fact that the plot of k_s versus concentration (Fig. 3C) intercepts the ordinate at a positive value of k_s. On the other hand, the value of k_on = 2.53 × 10⁶ M⁻¹ s⁻¹ for the HA surface is considerably underestimated, since its plot intercepts the ordinate at a negative value, which would imply a physically impossible negative k_off.
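The linear analysis described here corresponds to the pseudo-first-order relation k_s = k_on·C + k_off, so that the slope of k_s against protein concentration C gives k_on and the ordinate intercept gives k_off. A minimal sketch of that fit is shown below. The concentration series and rate constants are synthetic values chosen for illustration, not data from Fig. 3C, and the helper name is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic example (assumed values, not measured data).
TRUE_K_ON = 1.0e6   # M^-1 s^-1
TRUE_K_OFF = 0.5    # s^-1
concentrations_M = [50e-9, 100e-9, 200e-9, 400e-9]
k_s_values = [TRUE_K_ON * c + TRUE_K_OFF for c in concentrations_M]

k_on_fit, k_off_fit = fit_line(concentrations_M, k_s_values)
print(f"k_on = {k_on_fit:.3g} M^-1 s^-1, intercept k_off = {k_off_fit:.3g} s^-1")
```

A negative fitted intercept, as seen for the HA surface, signals that the simple model does not hold over the measured range.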
Nonlinear least squares curve fitting was also used to analyze the dissociation phase (120-150 s) for the same set of experiments. To measure a dissociation rate constant, it was necessary to coinject competitor DNA at the time of dissociation to prevent rebinding of the protein to the DNA site at the surface. A k_off for the HA surface of 0.097 s⁻¹ (±0.019 s⁻¹) was determined from the first 5 s of real dissociation. We were limited to 2 s of real dissociation from the WT surface, for which we obtained a k_off of 0.517 s⁻¹ (±0.069 s⁻¹). The k_off for the WT surface approaches the absolute limits of the instrumentation, and the higher standard deviation on the WT surface is indicative of the lower reliability of the kinetic determination. Mass transport effects also limited our measurement of dissociation rate constants. Since mass transport effects should be approximately equivalent in k_on and k_off analyses, the k_off/k_on ratio should still be consistent with the value obtained by equilibrium binding measurements. The apparent k_off/k_on ratios we observed, ≈38 nM and ≈450 nM for HA and WT sites, respectively, are in good agreement with the K_d values measured by equilibrium binding analysis (50 nM and 440 nM). This confirms that whatever mass transport effects are present affect the k_on and k_off measurements equivalently. Based on the kinetic analysis, the Runt domain binds DNA in a fast-on, fast-off fashion. The difference in the dissociation rate constants accounts for most of the difference in the affinity of the Runt domain for the HA and WT sites.
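The kinetic estimates of K_d quoted above can be reproduced from the reported rate constants with the relation K_d = k_off/k_on. The short sketch below uses only the values given in the text; the variable and function names are illustrative.

```python
def kd_from_rates(k_off, k_on):
    """Equilibrium dissociation constant (M) from kinetic rate constants."""
    if k_on <= 0:
        raise ValueError("k_on must be positive")
    return k_off / k_on


# Rate constants reported in the text (k_off in s^-1, k_on in M^-1 s^-1).
kd_ha_nM = kd_from_rates(0.097, 2.53e6) * 1e9
kd_wt_nM = kd_from_rates(0.517, 1.15e6) * 1e9

# Equilibrium values from Scatchard analysis, for comparison.
KD_HA_EQUILIBRIUM_NM = 50.0
KD_WT_EQUILIBRIUM_NM = 440.0

print(f"HA: kinetic {kd_ha_nM:.0f} nM vs equilibrium {KD_HA_EQUILIBRIUM_NM:.0f} nM")
print(f"WT: kinetic {kd_wt_nM:.0f} nM vs equilibrium {KD_WT_EQUILIBRIUM_NM:.0f} nM")
```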
The Single Tryptophan in the Runt Domain Is in a Polar Environment-Fluorescence spectroscopy was employed to probe the local environment of the single tryptophan residue in the Runt domain (Trp79). Fluorescence spectra of the Runt domain and NATA were recorded in the absence and presence of 6 M urea (Fig. 4A). The native protein exhibits approximately the same fluorescence intensity as the urea-denatured protein, with a shift in the maximum of the fluorescence spectrum from 340 to 350 nm upon urea denaturation. Both the emission maximum and the quantum yield, i.e. intensity, have been shown to be sensitive to the polarity of the local environment around a fluorophore (41,42), with more hydrophobic environments typically displaying large blue shifts in the emission maximum and increased quantum yield relative to a fluorophore fully exposed to a polar solvent. A large range of emission maxima is observed for the tryptophans in proteins, ranging from 308 nm for the protein azurin to 350 nm for the peptide glucagon (41). The tryptophan (Trp79) in the Runt domain shows a modest shift in the emission maximum upon denaturation and does not show any change in the fluorescence intensity, suggesting that Trp79 is located in a partially shielded, polar environment and that it is unlikely to be buried in the hydrophobic core of the protein.
The decrease in fluorescence at the emission maximum for the native protein (325 nm) was used to detect the extent of denaturation of the Runt domain in the presence of increasing concentrations of urea (Fig. 4B). The denaturation curve obtained was consistent with a cooperative transition between two states, as expected for a folded protein. The denaturation curve paralleled the loss of DNA binding activity that was observed in equilibrium binding reactions, with the half point of denaturation occurring at approximately 3 M urea (Fig. 4B). These results show that the Runt domain, which was purified under denaturing conditions and renatured, has a folded structure. The folded nature of the protein was further confirmed by circular dichroism (see below). From the data in Fig. 4B, a ΔG_D(H2O) can be calculated as outlined by Pace (43). The value obtained using the linear extrapolation method on data from the transition zone was 3.79 ± 0.48 kcal·mol⁻¹ (15.89 ± 2.07 kJ·mol⁻¹). For comparison, the value of ΔG_D(H2O) for other proteins such as ribonuclease T1 is 4.7 kcal·mol⁻¹ and for the α subunit of tryptophan synthase is 3.6 kcal·mol⁻¹ (43).
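The linear extrapolation method can be sketched as follows for a two-state transition: the fraction unfolded at each urea concentration is converted to a free energy, ΔG = −RT ln(f_u/(1 − f_u)), and a straight line through the transition-zone points is extrapolated to zero denaturant. The data below are synthetic, generated from an assumed ΔG in water of 3.79 kcal·mol⁻¹ and an assumed midpoint of 3 M urea. The implied slope (m value) is therefore an illustration and not a reported result, and all function names are hypothetical.

```python
import math

R_KCAL = 1.987e-3   # gas constant, kcal mol^-1 K^-1
T_KELVIN = 298.15   # 25 degrees C, the temperature of the fluorescence measurements


def delta_g_unfolding(fraction_unfolded, temperature=T_KELVIN):
    """Two-state free energy of unfolding from the fraction unfolded."""
    if not 0.0 < fraction_unfolded < 1.0:
        raise ValueError("fraction unfolded must lie strictly between 0 and 1")
    k_unfold = fraction_unfolded / (1.0 - fraction_unfolded)
    return -R_KCAL * temperature * math.log(k_unfold)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def linear_extrapolation(urea_molar, fractions_unfolded):
    """Return (dG in water, m value, transition midpoint in M urea)."""
    dg_values = [delta_g_unfolding(f) for f in fractions_unfolded]
    slope, intercept = fit_line(urea_molar, dg_values)
    m_value = -slope
    return intercept, m_value, intercept / m_value


# Synthetic transition-zone data built from assumed parameters.
ASSUMED_DG_H2O = 3.79                # kcal mol^-1
ASSUMED_MIDPOINT = 3.0               # M urea
ASSUMED_M = ASSUMED_DG_H2O / ASSUMED_MIDPOINT


def synthetic_fraction_unfolded(urea):
    dg = ASSUMED_DG_H2O - ASSUMED_M * urea
    k_unfold = math.exp(-dg / (R_KCAL * T_KELVIN))
    return k_unfold / (1.0 + k_unfold)


urea_points = [2.5, 2.75, 3.0, 3.25, 3.5]
fractions = [synthetic_fraction_unfolded(u) for u in urea_points]
dg_h2o, m_value, midpoint = linear_extrapolation(urea_points, fractions)
print(f"dG(H2O) = {dg_h2o:.2f} kcal/mol, m = {m_value:.2f}, midpoint = {midpoint:.2f} M")
```

Because the synthetic points lie exactly on the assumed line, the fit simply recovers the input parameters; with real data the scatter of the transition-zone points determines the uncertainty of the extrapolated value.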
CD Analysis Indicates That the Runt Domain Is Primarily a β Structure-The far ultraviolet CD spectrum for the Runt domain (Fig. 5) was analyzed using the variable selection protocol of Johnson (37) and Manavalan and Johnson (38) to obtain estimates for the content of various secondary structures in the Runt domain. As shown in Fig. 5, the Runt domain is predominantly a β protein with 7% of parallel β-strand and 29% of antiparallel β-strand secondary structure. Virtually no α-helical content was observed, so the Runt domain appears to fall into the small class of β proteins that bind DNA sequence specifically.
The Runt Domain Binds Nucleotides-The Runt domain contains an ATP/GTP binding motif GRSGRGKS at amino acids 138-145 (Fig. 1) that conforms to the Kinase-1a consensus sequence GXXXXGKS/T/G. The GRSGRGKS sequence is 100% conserved in all Runt domain proteins, and raises the question of whether the Runt domain binds nucleotides. We performed UV cross-linking experiments to determine if the Runt domain binds ATP. The amount of [γ-32P]ATP cross-linked to the Runt domain increased in a linear fashion with increasing times of UV exposure, both in the absence and presence of the CBFβ(1-141) subunit (Fig. 6, lanes 1-12). CBFβ(1-141) contains the N-terminal 141 amino acids of CBFβ, including its heterodimerization domain (3). Adenylate kinase, a known nucleotide binding protein, was used as a positive control for ATP binding (Fig. 6, lanes 19-24). The amount of [γ-32P]ATP cross-linked to the Runt domain (3,874 PhosphorImager units/pmol at t = 20 min) was approximately 2-fold greater than that cross-linked to adenylate kinase (1,606 PhosphorImager units/pmol). In contrast, [γ-32P]ATP cross-linked poorly to equivalent amounts of the non-DNA-binding CBFβ(1-141) subunit (Fig. 6, lanes 7-18). A small amount of labeled species migrating at a molecular mass of approximately 23 kDa was seen in samples containing CBFβ(1-141) (asterisk). This could be caused either by photodimerization of a proteolytic product of CBFβ and trapping of the [γ-32P]ATP (the molecular mass of CBFβ(1-141) is 16.7 kDa, thus a dimer of this protein would migrate at approximately 33 kDa) or, more likely, by a minor contaminant in the CBFβ preparation that efficiently binds ATP. A contaminant in the adenylate kinase samples was also detected (Fig. 6, lanes 20-24, asterisk); however, since the intensity of this band also increased with time of UV exposure in Coomassie Blue-stained gels, this contaminant is probably a photodegradation product of adenylate kinase.
To determine if ATP-binding is sensitive to mutations in the Kinase-1a motif, we performed UV cross-linking of [8-azido-α-32P]ATP to Runt domains that contained two different amino acid substitutions for the lysine at position 144, which is the only invariant amino acid in the Kinase-1a consensus sequence (16). Mutant K144M contains a methionine in place of the invariant lysine (7). The other mutant, K144R, contains an arginine at the position of the conserved lysine. Although this lysine is invariant in Kinase-1a motifs, other closely related motifs found, for example, in phosphofructokinase, pyruvate phosphate dikinases, cAMP-dependent protein kinase, and aminoglycoside 3′-phosphotransferases have an arginine at this position (16, 45-48).
The DNA binding activity of the mutated Runt domains was first examined by electrophoretic mobility shift assay (Fig. 7A). The K144R protein bound DNA as well as the wild type Runt domain (Lys144), whereas the K144M mutant had attenuated DNA binding activity, consistent with results reported by Lenny et al. (7). Increased smearing of DNA behind the free DNA band in lanes containing the K144M protein indicates that the protein-DNA complex dissociates more rapidly in the gel matrix than protein-DNA complexes containing the wild type or K144R protein. Still, the detectable DNA-binding activity indicated that the overall structure of the K144M Runt domain is not completely disrupted by the introduced mutation.
The wild type and mutated Runt domains were incubated with increasing concentrations of [8-azido-α-32P]ATP, subjected to UV light, and analyzed by SDS-PAGE (Fig. 7B). Tenfold more [8-azido-α-32P]ATP [...] conservative substitution to arginine (Lys144 → Arg), an amino acid that is often found in related nucleotide binding loops at that position, did not disrupt nucleotide binding by the Runt domain. These results suggest that ATP-binding by the Runt domain requires amino acids in the Kinase-1a motif.
DISCUSSION
The Runt domain proteins constitute a small family of transcription factors that contain a conserved DNA-binding domain that also mediates heterodimerization with the non-DNA-binding CBFβ subunit. The Runt domain proteins can bind DNA independently, but their affinity for DNA increases upon association with the CBFβ subunit. The three-dimensional structures of the Runt domain and its CBFβ partner are unknown, and are likely to be unique given the lack of amino acid sequence homology to other DNA-binding proteins and dimerization motifs. Mutations in the genes encoding the CBFα2 (AML1) Runt domain protein and the non-DNA-binding CBFβ subunit are associated with a large number of leukemias. Determining the structures of these proteins and their mode of interaction with each other and with DNA should facilitate the development of new drugs that could be used to treat the leukemias associated with variant forms of these proteins.
We have overexpressed, purified, and characterized the DNA-binding Runt domain from the murine CBFα2 (AML1) protein. We reproducibly obtained 5-8 mg of purified Runt domain per liter of bacterial culture, which is well within the range needed for structural determinations. Although the protein partitioned to the insoluble fraction, we were able to solubilize and renature the protein and recover 80-100% of its DNA binding activity. CD spectroscopy and urea unfolding experiments also demonstrated that the purified and renatured Runt domain has a folded structure.
The Runt domain binds a high affinity DNA site with a K_d of 5 × 10⁻⁸ M, and a lower affinity site (WT) with a K_d of 4.4 × 10⁻⁷ M. The high affinity (HA) site is an experimentally derived site (39), and few in vivo sites have been found that bind the Runt domain with equivalent affinity (39). The WT site from the Moloney murine leukemia virus enhancer is more typical of sites found in vivo. 6 [...] dissociation constant of 3.1 × 10⁻⁹ M for the CBFα1 (PEBP2αA1) Runt domain protein using a high affinity site from the polyomavirus enhancer. The K_d determined by Ogawa et al. is approximately 10-fold lower than the K_d that we obtained for the isolated Runt domain on a site (HA) of equivalent affinity. However, meaningful comparison of these K_d values is not possible given the differences in the assay conditions and proteins. DNA binding by the Runt domain is inhibited by high salt concentrations, as is the case for most DNA-binding proteins, suggesting that DNA binding by the Runt domain is highly dependent on salt bridges to the phosphate backbone (49).
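To put the two dissociation constants reported above in perspective, the following sketch computes the fold difference in affinity between the HA and WT sites, the corresponding difference in binding free energy, and the expected site occupancy at a single protein concentration. It assumes simple 1:1 binding with protein in large excess over DNA; the temperature (25 °C), the 100 nM protein concentration, and the function names are hypothetical choices made for illustration only.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)
TEMP_K = 298.15    # ASSUMPTION: 25 C; not taken from the binding experiments

KD_HA = 5e-8       # M, high affinity (HA) site, as reported in the text
KD_WT = 4.4e-7     # M, WT site, as reported in the text


def fraction_bound(protein_m, kd_m):
    """Fraction of DNA sites occupied for 1:1 binding with protein in excess."""
    return protein_m / (protein_m + kd_m)


def delta_delta_g(kd_weak, kd_tight):
    """Difference in binding free energy between two sites, in kcal/mol."""
    return R_KCAL * TEMP_K * math.log(kd_weak / kd_tight)


fold = KD_WT / KD_HA
ddg = delta_delta_g(KD_WT, KD_HA)
protein = 1e-7     # HYPOTHETICAL free protein concentration (100 nM)

print(f"HA site binds {fold:.1f}-fold more tightly than WT")
print(f"ddG = {ddg:.2f} kcal/mol")
print(f"occupancy at 100 nM: HA {fraction_bound(protein, KD_HA):.2f}, "
      f"WT {fraction_bound(protein, KD_WT):.2f}")
```

Under these assumptions a site is half-occupied when the free protein concentration equals its K_d, so the WT site requires roughly ninefold more protein than the HA site to reach the same occupancy.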
Circular dichroism spectroscopy shows the domain to be composed predominantly of β-strands with virtually no α-helical content. A number of proteins have been shown to contact DNA through interactions mediated by β-strands, including the members of the ribbon-helix-helix family such as the MetJ and Arc repressors, and the TATA box-binding protein (50-52). However, in all of these cases, the DNA-binding domain contains a significant α-helical content. To our knowledge, the only sequence-specific DNA-binding domains that are almost exclusively β domains are the Rel homology domain such as that found in NF-κB and the core domain of p53 (53-55). In both cases, the domains are composed of a β-sandwich with very low α-helical content. Both of these proteins are known to exist as oligomers in solution. CBFα, in contrast, can bind DNA as a monomer (3), 2 thus the DNA-binding domain of CBFα appears to represent a novel β-domain that can bind sequence specifically to DNA in a monomeric form.
We also demonstrated that ATP can be UV cross-linked to the Runt domain and that cross-linking was sensitive to a nonconservative amino acid substitution at the invariant lysine in the Kinase-1a motif. These results suggest that the Runt domain binds nucleotides and that this binding is mediated, at least in part, by the Kinase-1a motif.
The Kinase-1a motif in nucleotide-binding proteins is a glycine-rich loop positioned between a β-strand and an α-helix, with the invariant lysine at the first position of the α-helix (16,17). The core of the nucleotide-binding fold in many nucleotide binding proteins contains a sheet of mostly parallel β-strands with helices above and below the plane of the sheet (16). However, secondary structure prediction of the Runt domain sequence predicts that a β-strand, and not an α-helix, immediately follows the glycine-rich loop, and CD analysis indicates that the Runt domain has very little α-helical content. The structure of the nucleotide fold in the Runt domain is therefore likely to diverge significantly from the fold found, for example, in adenylate kinase, p21ras, and phosphoglycerate kinases (44,56,57). The putative Kinase-1a motif in the Runt domain also has an F immediately following the GXXXXGKS/T/G sequence, which has not yet been found in other proteins with Kinase-1a motifs (16). There are other examples of nucleotide-binding proteins that contain glycine-rich loops flanked by β-strands. The glycine-rich loop in cAMP-dependent protein kinase joins two antiparallel strands at the beginning of a β-sheet (47). Therefore, although the glycine-rich region in the Runt domain may not completely conform to a Kinase-1a motif, it may still be part of a nucleotide binding fold.
Nucleotide binding proteins contain several motifs that together form the active site, the glycine-rich loop comprising only one of these motifs. The demonstration that ATP can be cross-linked to the Runt domain suggests that other sequences in the Runt domain will participate in nucleotide binding. Additional nucleotide-binding motifs, such as Kinase-2 and Kinase-3 motifs, generally reside C-terminal to the glycine-rich loop (16). A candidate sequence in the Runt domain that may participate in nucleotide binding is a KVTVD sequence at amino acids 167-171 in the CBFα2 protein, which diverges from a consensus Kinase-2 motif at only one position (V) (16). The related CBFα1 protein has a conservative substitution at that position to KITVD, which conforms to the Kinase-2 motif. The KV/ITVD sequence is also among the most highly conserved sequences in the Runt domain.
What might be the role of nucleotide binding by the Runt domain? To date, no role has been demonstrated, although in preliminary experiments we could detect stabilization of DNA binding by the Runt domain in the presence of ATP (not shown). The nucleotide binding fold could also be part of the recognition motif for DNA. Interestingly, the amino acid sequence of the glycine-rich loop and putative β-strand that follows the loop is highly conserved in the Runt domain proteins, and DNA binding by the Runt domain is particularly sensitive to amino acid substitutions in that region (7).