Characterization of the Conserved Interaction between GATA and FOG Family Proteins

The N-terminal zinc finger (ZnF) from GATA transcription factors mediates interactions with FOG family proteins. In FOG proteins, the interacting domains are also ZnFs; these domains are related to classical CCHH fingers but have a His → Cys substitution at the final zinc-ligating position. Here we demonstrate that different CCHC fingers in the FOG family protein U-shaped contact the N-terminal ZnF of GATA-1 in the same fashion although with different affinities. We also show that these interactions are of moderate affinity, which is interesting given the presumed low concentrations of these proteins in the nucleus. Furthermore, we demonstrate that the variant CCHC topology enhances binding affinity, although the His → Cys change is not essential for the formation of a stably folded domain. To ascertain the structural basis for the contribution of the CCHC arrangement, we have determined the structure of a CCHH mutant of finger nine from U-shaped. The structure is very similar overall to the wild-type domain, with subtle differences at the C terminus that result in loss of the interaction in vivo. Taken together, these results suggest that the CCHC zinc binding topology is required for the integrity of GATA-FOG interactions and that weak interactions can play important roles in vivo.

Zinc binding domains (often referred to as zinc fingers, or ZnFs) are extremely common in eukaryotes; for example, perhaps 3-4% of human genes encode proteins that contain such domains. At least 20 different classes of zinc binding domains have been identified, differing in the number of zinc ions they bind and the identity and spacing of the ligating amino acids.
Since the initial discovery of the classical zinc fingers in TFIIIA (1), it has become clear that many zinc binding domains function as sequence-specific DNA (or RNA) binding motifs. However, it has also become apparent that many classes of zinc binding domains, including LIM domains and PHD domains, have roles as mediators of protein-protein interactions. GATA and classical ZnFs have also been shown to be capable of contacting other proteins, especially in the context of the regulation of gene expression (for review, see Ref. 2).
It has been demonstrated, both in Drosophila melanogaster and in mammals, that interactions between ZnFs of GATA and FOG family transcription factors are essential for normal development. In mammals, GATA-1 and FOG-1 cooperate to drive erythroid development (3), whereas the Drosophila proteins Pannier (a GATA family protein; Ref. 4) and U-shaped (a FOG family protein; Refs. 5 and 6) combine to direct the development of both the heart (7) and sensory bristles (6).
GATA proteins (Fig. 1A; Ref. 8) typically contain two ZnFs with a CX₂CX₁₇CX₂C consensus. The more C-terminal of these two domains (CF) binds DNA sequences containing GATA sites, whereas the N-terminal zinc finger (NF) may contact promoters with different motifs (9). In addition, the zinc finger domains have been shown to mediate interactions with other proteins, including CREB-binding protein (10), erythroid Krüppel-like factor (11), and members of the FOG family (12-15). FOG family proteins (Fig. 1B) contain either eight or nine ZnFs that are related to the classical CCHH ZnFs (which have a conserved CX₂₋₅CX₁₂HX₂₋₅H sequence). Several of the fingers in each FOG protein, however, have an altered consensus sequence in which the final zinc binding histidine is replaced with a cysteine.
Interestingly, it is only these variant CCHC ZnFs that mediate interactions with GATA NFs. The structures of two of these domains (fingers 1 and 9 from D. melanogaster U-shaped) have recently been determined (16) and were found to resemble classical CCHH fingers in structure. The only notable difference is a short extended portion of the polypeptide backbone before the fourth zinc ligand (Fig. 1C). The GATA binding surface of one CCHC finger (finger 1 from U-shaped; USH-F1) has been determined (16,17). Two of the residues implicated in GATA binding are located on the extended portion of the backbone, immediately preceding the final cysteine, and mutation of the cysteine to histidine (changing the finger to a classical CCHH ZnF) is sufficient to abolish the GATA-FOG interaction in vivo (18). The CCHC topology therefore appears to play a key role in facilitating the interaction, but the molecular explanation for this is currently unknown.
The residues that are implicated in GATA binding are largely conserved across FOG family CCHC ZnFs (Fig. 1D), although three of the residues that affect the binding affinity show some variation. This variation gives rise to strong and weak interactors, as assigned using a yeast two-hybrid assay (17). It is unclear, however, whether such affinity differences are real or result from the indirect and qualitative nature of the assay.
The interactions between the GATA and FOG proteins have become a topic of considerable interest because natural mutations that interfere with the interactions are associated with human genetic disorders. The GATA-1 gene is on the X chromosome, and males carrying defective GATA-1 alleles exhibit various hematological abnormalities, including anemia, thrombocytopenia, and β-thalassemia trait. Five different mutations in the GATA-1 N-finger have now been described (19-22). Four of these interfere with the interaction between the GATA-1 N-finger and FOG. Of these, two mutations (V205M, G208S) map to residues that directly mediate GATA-FOG contacts, whereas two others (D218G, D218Y) map to a region close to the FOG binding surface. The fifth mutation (R216Q) interferes with the DNA binding activity of the GATA-NF but does not affect the interaction with FOG. The four mutations that do impair the interaction with FOG do so to different extents, and all five mutations cause different genetic disorders. For instance, V205M disrupts the binding of the GATA-1 NF to FOG fingers 1, 6, and 9 and causes a severe anemia and thrombocytopenia (22). G208S (21) and D218G (19), on the other hand, significantly disrupt the interaction with FOG finger 9 but have little effect on the complex formed with FOG fingers 1 and 6; these mutations cause thrombocytopenia without significant anemia. The D218Y mutation (20) also affects the interaction with FOG, although this effect has not yet been correlated with the binding of specific FOG fingers. It is likely that an understanding of the molecular details of how different FOG fingers contact GATA-1 may help explain these different phenotypes.
Here we characterize the interaction between U-shaped finger 9 (USH-F9; assigned as a "weak" interactor) and the N-terminal ZnF of Pannier (Pnr-NF). The residues involved in the interaction are identified using a ¹H,¹⁵N HSQC titration and compare closely with those defined for the complex formed between U-shaped finger 1 (USH-F1; a "strong" interactor) and murine GATA-1 NF. These results suggest a common mechanism of interaction among weakly and strongly interacting FOG fingers. We also use isothermal titration calorimetry to directly measure the association constants for a number of GATA·FOG complexes. All of these association constants are relatively weak (~10⁴-10⁵ M⁻¹), and the USH-F9·Pnr-NF complex is found to be weaker than the USH-F1·Pnr-NF complex. Finally, the importance of the CCHC topology in FOG zinc fingers is investigated. The introduction of a C1142H mutation in USH-F9 weakens but does not abolish its interaction with Pannier. Determination of the structure of this CCHH-type mutant finger reveals the basis for this difference, a subtle change in the conformation at the C-terminal end of the α-helix.

EXPERIMENTAL PROCEDURES
Expression and Purification-USH-F1, USH-F9, and murine GATA-1 NF were expressed and purified as previously described (16,23). Pannier NF (Pannier (165-208)) was subcloned, overexpressed, and purified using the same protocol as that used for the GATA-1 NF, except that the template for PCR of the construct was obtained from a D. melanogaster cDNA library. The C1142H mutant of U-shaped finger 9 (C32H), encoding residues 1113-1146 of U-shaped, was produced by single-primer mismatch PCR using a 3′ primer encoding the Cys → His substitution. The amplified oligonucleotide was inserted into the Escherichia coli expression vector pGEX-2T (Amersham Biosciences). The peptide (with residues numbered 1-36, including Gly-1 and Ser-2, which are the product of the thrombin recognition sequence) was overexpressed and purified by glutathione-affinity chromatography and reverse-phase high performance liquid chromatography as previously described (23). The purification yielded ~2 mg of >95% pure peptide per liter of culture. The identity of the peptide was confirmed using electrospray mass spectrometry (M_theor = 4201.8 Da; M_obs = 4202.0 Da). Overexpression of ¹⁵N-labeled C32H was performed using a bacterial fermenter as previously described (16).
Sample Preparation for NMR-Samples for HSQC titrations or structure determination were prepared by dissolving 0.4 mg of ¹⁵N-labeled USH-F9, 2.6 mg of Pnr-NF, or 2.1 mg of C32H in H₂O/D₂O (95:5; 500 μl) containing 1.5 molar eq of both tris(2-carboxyethyl)phosphine·HCl and ZnSO₄; this gave sample concentrations of 200 μM and 1 mM for USH-F9 and Pnr-NF/C32H, respectively. The pH was adjusted to 5.0 using 0.1 and 0.01 M NaOH.
NMR Spectroscopy-NMR experiments were performed on a Bruker DRX-600 spectrometer equipped with a 5-mm triple-resonance gradient probe. Spectra were acquired at 293 K using spectral widths of 12 ppm for ¹H and 30 ppm for ¹⁵N. The following homonuclear two-dimensional spectra were recorded on the unlabeled sample: total correlation spectroscopy (24) with MLEV mixing (T_m = 70 ms), NOE spectroscopy (T_m = 50 and 250 ms; Ref. 25), double quantum-filtered COSY (26), and E.COSY (27). Using the ¹⁵N-labeled sample, HSQC (28,29) and three-dimensional HNHA (30) experiments were recorded. For the detection of two- and three-bond scalar couplings involving the histidine side chain nitrogens, an HSQC was recorded in which the dephasing delay 1/4J was set to 11 ms (31), and the carrier frequency and spectral width were set to 200 ppm and 6000 Hz, respectively. Solvent suppression was achieved using either pulsed-field gradients or presaturation (double quantum-filtered COSY and E.COSY). NMR data were processed as previously described (23) and analyzed using XEASY (32). The ¹H frequency scale of all spectra was directly referenced to sodium 3-trimethylsilyl[2,2,3,3-²H]propionate (d₄-TSP) at 0.00 ppm, whereas the ¹⁵N frequency scale was indirectly referenced to liquid NH₃ via the ¹H frequency of the d₄-TSP resonance (33).
FIG. 1 (legend, partially recovered). ... (16). The extended region just before the fourth zinc ligand is indicated with a bracket. D, alignment of the amino acid sequences of murine FOG-1 and D. melanogaster U-shaped. Fingers that interact with GATA-1 NF contain a conserved motif. Residues essential for the interaction, judging from mutagenesis studies, are shown in a dotted box; residues that are important (but not essential) for the interaction are shown in a dashed box. Zinc ligands are underlined. The numbering from the native proteins is indicated beside the sequences, with the numbering system used in the text shown at the bottom.
HSQC Titrations-Samples were prepared as described above. The binding of USH-F9 (U) to Pnr-NF (P) can be described by the equilibrium

$$\mathrm{U} + \mathrm{P} \rightleftharpoons \mathrm{UP} \qquad (1)$$

where the association constant, K_a, is given by

$$K_a = \frac{[\mathrm{UP}]}{[\mathrm{U}][\mathrm{P}]} \qquad (2)$$

If U_0 and P_0 are the total concentrations of USH-F9 and Pnr-NF, respectively, at a given point in the titration, then K_a and [UP] can be expressed as

$$K_a = \frac{[\mathrm{UP}]}{(U_0 - [\mathrm{UP}])(P_0 - [\mathrm{UP}])} \qquad (3)$$

and

$$[\mathrm{UP}] = \frac{1}{2}\left\{\left(U_0 + P_0 + \frac{1}{K_a}\right) - \sqrt{\left(U_0 + P_0 + \frac{1}{K_a}\right)^{2} - 4U_0P_0}\right\} \qquad (4)$$

If the product of the lifetime of the complex (τ_UP) and the difference in chemical shift between the free and bound states (δ_UP − δ_U) is much less than one (i.e. τ_UP·Δδ << 1), then the system is considered to be in fast exchange. Under these conditions, a single resonance is observed at the population-weighted average chemical shift (δ_obs) of the nuclei in the free and bound states,

$$\delta_{obs} = f_U\,\delta_U + f_{UP}\,\delta_{UP} \qquad (5)$$

where δ_U is the chemical shift of a nucleus in free USH-F9, δ_UP is the chemical shift of the same nucleus in the USH-F9·Pnr-NF complex, and f_U and f_UP are the mole fractions of the free and bound species, respectively (given by f_UP = [UP]/U_0 and f_U = 1 − f_UP). By substituting Equation 4 into Equation 5, the resulting expression can be used to fit a plot of δ_obs against P_0 using nonlinear least-squares methods. Data from the titration of USH-F9 with Pnr-NF were fitted using the nonlinear least squares package provided in Origin (Microcal, MA); the values of K_a and (δ_UP − δ_U) were allowed to vary during the fit. For resonances in the intermediate exchange region,

$$\tau_U\,\lvert \nu_U - \nu_{UP} \rvert \approx 1 \qquad (6)$$

where τ_U is the lifetime of free USH-F9 and |ν_U − ν_UP| is the magnitude of the frequency difference between the free and bound states. The peak position for a resonance in intermediate exchange does not accurately reflect the weighted average between the free and bound states of the protein and can no longer be determined by a simple expression such as Equation 5.
However, the amplitude of a signal from USH-F9 as a function of frequency for a particular Pnr-NF concentration can be determined from the imaginary component of the complex quantity G(ν) (34), which is given by

$$G(\nu) = iC\,\frac{1 + \tau\,(f_{UP}\,\alpha_U + f_U\,\alpha_{UP})}{f_U\,\alpha_U + f_{UP}\,\alpha_{UP} + \tau\,\alpha_U\,\alpha_{UP}} \qquad (7)$$

where τ = f_U/k_off, k_off is the off-rate for the complex, and C is a scaling factor (which scales the intensity of the signal but has no bearing on its line-shape or position). The quantities α_U and α_UP are given by

$$\alpha_U = \frac{1}{T_{2U}} + 2\pi i\,(\nu_U - \nu) \qquad (8)$$

and

$$\alpha_{UP} = \frac{1}{T_{2UP}} + 2\pi i\,(\nu_{UP} - \nu) \qquad (9)$$

where T_2U and T_2UP are the transverse relaxation times for free and bound USH-F9, respectively. To simulate the binding curve for a resonance in intermediate exchange, the two-site intermediate exchange model was coded into Mathematica (Wolfram Research, IL). For each point in the titration (i.e. each pair of values of U_0 and P_0), the line-shape was calculated as the imaginary part of G(ν), and the frequency at which the signal amplitude was a maximum was determined. This simulated binding curve was compared with the experimental data by calculating the sum of the squares of the differences between the simulated and actual data (χ²). The values of ν_UP − ν_U, K_a, and k_off were varied in a grid search to find the best fit of the data to the model.
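The fast-exchange analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the titration points are synthetic and noise-free, the total USH-F9 concentration is held constant (dilution is ignored), and a coarse grid search stands in for the nonlinear least-squares routine in Origin. The function names and all numerical values are hypothetical.

```python
import math

def complex_conc(u0, p0, ka):
    """[UP] (M) for a 1:1 complex, from total concentrations u0, p0 (M) and Ka (M^-1)."""
    b = u0 + p0 + 1.0 / ka
    return 0.5 * (b - math.sqrt(b * b - 4.0 * u0 * p0))

def delta_obs(u0, p0, ka, delta_u, delta_up):
    """Population-weighted chemical shift observed in fast exchange (ppm)."""
    f_up = complex_conc(u0, p0, ka) / u0
    return (1.0 - f_up) * delta_u + f_up * delta_up

def grid_fit(points, delta_u, ka_grid, shift_grid):
    """Return (sse, Ka, delta_UP - delta_U) with the smallest sum of squared errors."""
    best = None
    for ka in ka_grid:
        for dd in shift_grid:
            sse = sum((delta_obs(u0, p0, ka, delta_u, delta_u + dd) - obs) ** 2
                      for u0, p0, obs in points)
            if best is None or sse < best[0]:
                best = (sse, ka, dd)
    return best

# Synthetic titration: hypothetical Ka, shift difference, and free-state shift.
U0 = 200e-6
TRUE_KA, TRUE_DD, DELTA_U = 2.0e4, 0.50, 120.00
titration = [(U0, p0, delta_obs(U0, p0, TRUE_KA, DELTA_U, DELTA_U + TRUE_DD))
             for p0 in (0.0, 1e-4, 2e-4, 4e-4, 6e-4, 1e-3)]

fit = grid_fit(titration, DELTA_U,
               ka_grid=[1e4, 2e4, 4e4, 8e4],
               shift_grid=[0.25, 0.50, 0.75])
```

Here `fit` holds the best grid point; with real data one would refine around it or use a proper least-squares optimizer.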
Isothermal Titration Calorimetry (ITC)-Lyophilized peptides were dissolved in H₂O/D₂O (95:5) to a final concentration of 150 μM and treated with both 1.5 molar eq of ZnSO₄ and 2 molar eq of tris(2-carboxyethyl)phosphine·HCl. The pH was adjusted to 5.0 (using 10 mM NaOH), and sodium acetate (pH 5.0) was added to a final concentration of 20 mM. The samples were then concentrated by vacuum centrifugation (at 55°C) to the required concentrations (see Table I). One-dimensional ¹H NMR spectra of each sample were acquired to confirm that they were correctly folded, and samples were dialyzed overnight into a buffer containing sodium acetate (20 mM, pH 5.0), ZnSO₄ (1.5 mM), and tris(2-carboxyethyl)phosphine·HCl (100 μM). Experiments at pH 7 were conducted in 20 mM MOPS buffer. Samples were degassed for 5 min immediately before the titrations. Titrations were carried out on a Microcal VP-ITC microcalorimeter. Of the two protein samples to be used in each titration, 1.8 ml of the less concentrated protein sample was transferred into the cell, whereas the automatic injector was filled with the more concentrated protein (the titrant). Titrations involving USH-F1 and USH-F9 were conducted at 278 and 293 K, respectively. The titrant was automatically injected into the cell containing either its binding partner or buffer alone. For each titration, 15-20 injections were made at 5-min intervals, with each injection consisting of 10-15 μl of titrant. The reference power was set at 5 μcal s⁻¹, and the cell was stirred continuously at 310 rpm. Data were analyzed using the Origin isothermal titration calorimetry analysis software (Microcal Software, Northampton, MA). A linear fit of the isotherm of the protein injected into buffer alone was subtracted from the experimental isotherms to account for the heat of dilution and mixing.
A nonlinear least squares fit to a single binding site model was used to obtain values for the binding constant, stoichiometry, and heat of binding.
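To illustrate what a single binding site model predicts for such a titration, the following sketch computes the heat released at each injection for a 1:1 complex. It is a simplified model, not the Origin implementation: the volume is assumed simply to grow with each injection (no displaced-volume correction), and the concentrations, enthalpy, and volumes used are hypothetical placeholders rather than values from Table I.

```python
import math

def complex_conc(m_tot, x_tot, ka):
    """Concentration (M) of a 1:1 complex from total concentrations and Ka (M^-1)."""
    b = m_tot + x_tot + 1.0 / ka
    return 0.5 * (b - math.sqrt(b * b - 4.0 * m_tot * x_tot))

def injection_heats(ka, dh, m0, x_syringe, v_cell, v_inj, n_inj):
    """Heat (cal) per injection; dh in cal/mol, concentrations in M, volumes in L.

    Assumption: each injection simply adds volume v_inj to the cell contents.
    """
    heats = []
    q_prev = 0.0
    volume = v_cell
    m_moles = m0 * v_cell          # moles of the protein initially in the cell
    x_moles = 0.0                  # moles of titrant added so far
    for _ in range(n_inj):
        x_moles += x_syringe * v_inj
        volume += v_inj
        bound_moles = volume * complex_conc(m_moles / volume, x_moles / volume, ka)
        q_total = dh * bound_moles  # cumulative heat of complex formation
        heats.append(q_total - q_prev)
        q_prev = q_total
    return heats

# Hypothetical example: Ka of the order reported in this work, made-up enthalpy.
heats = injection_heats(ka=1.9e4, dh=-10000.0, m0=100e-6,
                        x_syringe=1.5e-3, v_cell=1.8e-3, v_inj=10e-6, n_inj=20)
```

Fitting would consist of adjusting `ka`, `dh`, and a stoichiometry factor until the computed heats match the integrated, dilution-corrected data.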
Structure Determination for C32H-Resonance assignment was carried out using the sequential assignment method. The three-dimensional HNHA experiment was used to assign the ¹H,¹⁵N HSQC of C32H. Cross-peaks in the two-dimensional NOE spectra were integrated in XEASY and converted to upper distance limits using the CALIBA module of DYANA (35) and the default DYANA parameters. ³J_NHα coupling constants were measured by analysis of the three-dimensional HNHA experiment. In combination with distance restraints, a set of allowable dihedral angles was generated using the GRIDSEARCH module of DYANA. Stereospecific assignments for 13 pairs of methylene protons were obtained from an analysis of E.COSY and short mixing time NOE spectra. Stereospecific assignments in conjunction with interproton distance restraints were used to generate a set of allowable χ1 dihedral angle restraints in the GRIDSEARCH module of DYANA. DYANA was used to calculate 50 structures from random starting conformers. Calculations were performed in the absence of zinc ligation restraints. The conformer with the lowest penalty function value was used as input for refinement by restrained molecular dynamics/simulated annealing calculations in CNS (36). Zinc was incorporated into calculations by the introduction of covalent restraints to maintain tetrahedral geometry. Sγ-Zn and Nε2-Zn bond lengths were constrained to 2.3 and 2.0 Å, respectively, with force constants of 250 kcal mol⁻¹ Å⁻². Bond angles defining the zinc coordination site were constrained to the following values: 112° for Sγ-Zn-Sγ bond angles, 108° for Cβ-Sγ-Zn angles, 111° for Sγ-Zn-Nε2 angles, and 102° for Nε2-Zn-Nε2 angles. These values were applied with a force constant of 50 kcal mol⁻¹ deg⁻². The standard protein all-hydrogen force field was used, and covalent geometry was constrained using standard CNS parameters. Calculations were performed in torsion-angle space, with randomized initial atomic velocities.
The first stage consisted of a search of torsion-angle space at high temperature (50000 K), with 1000 time steps of 0.015 ps and a low weight on the repulsive energy term (w_vdW = 0.1). Force constants for calculation of the square-well NOE and dihedral angle potentials were 150 kcal mol⁻¹ Å⁻² and 100 kcal mol⁻¹ deg⁻², respectively. The second stage consisted of slow cooling torsion-angle dynamics, with 1000 time steps of 0.015 ps, in which the temperature decreased in 250-K steps. The weight on the repulsive energy term was increased to one, and force constants remained unaltered. The third and final stage consisted of 10 cycles of conjugate gradient minimization (for 1000 steps), with force constants for NOE and dihedral angle potentials changed to 50 kcal mol⁻¹ Å⁻² and 300 kcal mol⁻¹ deg⁻², respectively. A total of 200 structures were calculated, and the 20 conformers with the lowest total energy were used to represent the solution structure of C32H. The structures were visualized and analyzed using the programs MOLMOL (37) and PROCHECK-NMR (38).

RESULTS
The GATA Binding Face of Different FOG Family CCHC Fingers Is Conserved-We first sought to determine the surface of USH-F9 that was responsible for its interaction with GATA family N-fingers. The GATA binding face of USH-F1 had previously been identified by carrying out a ¹H,¹⁵N HSQC titration of ¹⁵N-labeled USH-F1 with unlabeled murine GATA-NF (16). Initially, the same methodology was employed for USH-F9. However, the addition of GATA-1 NF to ¹⁵N-labeled USH-F9 resulted in substantial line broadening in the ¹H,¹⁵N HSQC, which prevented the assignment of peaks in the HSQC spectrum. Furthermore, it was clear that a great deal more GATA-NF would have been required to reach completion. Together these observations indicated that the kinetics and affinity of this interaction were different from those for formation of the USH-F1·GATA-1 NF complex.
In an attempt to circumvent this problem, the titration was repeated using the Pannier N-finger (Pnr-NF). Pannier is the biological partner of U-shaped; its N-finger shares ~90% sequence identity with that of murine GATA-1 N-finger, including all of the residues previously implicated in FOG binding (16,39). Titration with Pnr-NF resulted in a similar pattern of changes in the USH-F9 spectrum but was completed after the addition of five equivalents of Pnr-NF (Fig. 2A). Resonances in the resulting spectrum were much narrower than for the GATA-1 NF titration, allowing essentially full assignments to be made.
The chemical shift changes observed during the titration were summarized by calculating an average chemical shift change (averaged over Hᴺ and N atoms; Fig. 2B). Residues were judged to have undergone a substantial chemical shift change if the average chemical shift change was >0.3 ppm, and these (Tyr-10, Phe-18, Thr-23, His-27, and Lys-33) are mapped onto the structure of USH-F9 in Fig. 2C (red). Some residues that were expected to be involved in the interaction (Ile-16, Phe-30, and Tyr-31), judging from mutagenesis studies using FOG-F1, did not show a significant change, perhaps because their bulky hydrophobic side chains shield the Hᴺ and N from the effects of Pannier binding. A similar result was observed for USH-F1 (Fig. 2B; Ref. 16), where the backbone resonances of residues Phe-18, Pro-21, Thr-23, Leu-24, His-27, and Tyr-31 each showed significant chemical shift changes. In addition, the side chain resonances of Ile-16 and Tyr-30 in USH-F1 underwent substantial shifts.
FIG. 2 (legend, partially recovered). A, ... ¹⁵N HSQC spectrum of USH-F9 alone is shown in black, whereas the ¹H,¹⁵N HSQC spectrum of USH-F9 in the presence of 5 molar eq of Pnr-NF is shown in red. Signals that shifted significantly are labeled; Tyr-10 and His-27 are only visible at lower contour levels. Spectra were recorded at 293 K and pH 5. B, plots of average chemical shift change versus residue number for USH-F9 and USH-F1. A dotted line indicates the lower limit for residues defined as having undergone substantial change. Average chemical shift changes were calculated using the Hᴺ and N atoms of USH-F9 and the Hᴺ, N, Cα, and C′ atoms of USH-F1. C, space-filling model of USH-F9 with residues identified in B highlighted in red (except for Tyr-24, which lies on the rear surface of the structure as shown). Additional residues that are thought to be important for binding (16,17) are shown in yellow.
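The selection of substantially shifted residues can be expressed as a short calculation. The sketch below assumes that the average is a simple unweighted mean of the absolute Hᴺ and N shift changes (the text does not specify the weighting) and uses the 0.3 ppm cutoff given above; the residue labels and shift values are invented purely for illustration.

```python
def average_shift_change(free, bound):
    """Mean of |delta(HN)| and |delta(N)| in ppm for residues present in both states.

    free, bound: dict mapping residue label -> (HN shift, N shift) in ppm.
    Assumption: unweighted average of the two nuclei.
    """
    changes = {}
    for res, (h_free, n_free) in free.items():
        if res in bound:
            h_bound, n_bound = bound[res]
            changes[res] = (abs(h_bound - h_free) + abs(n_bound - n_free)) / 2.0
    return changes

def substantially_shifted(changes, threshold=0.3):
    """Residues whose average change exceeds the threshold (ppm)."""
    return sorted(res for res, value in changes.items() if value > threshold)

# Hypothetical shifts for three residues (not experimental data).
free_shifts = {"resA": (8.10, 120.0), "resB": (8.00, 118.0), "resC": (7.50, 115.0)}
bound_shifts = {"resA": (8.40, 121.5), "resB": (8.02, 118.1), "resC": (7.90, 115.6)}

changes = average_shift_change(free_shifts, bound_shifts)
shifted = substantially_shifted(changes)
```

A residue whose signal disappears on binding has no bound-state entry and is skipped, so such cases need to be tracked separately.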
The additional GATA-interacting residues that were identified in the FOG-F1·GATA-1 NF mutagenesis study (Ile-16, Asn-19, Phe-30, and Tyr-31) are highlighted in yellow in Fig. 2C. The commonality of residues involved in NF binding of USH-F1, USH-F9, and FOG-F1 (17) suggests that a common mechanism of interaction between GATA NFs and FOG CCHC ZnFs exists. Note that the changes observed for Tyr-10 and Tyr-24 during the USH-F9·Pnr-NF titration probably reflect conformational differences that arise as a result of NF binding, given that the side chains of these residues lie on the opposite face of USH-F9 to the majority of the affected residues.
Quantifying the USH-F9·Pnr-NF Interaction Using NMR Methods-Sedimentation equilibrium experiments performed on a USH-F1 and GATA-NF mixture (16) suggested that the interaction between them might be of moderate affinity. We therefore sought to estimate the affinity of the ¹⁵N-labeled USH-F9·Pnr-NF interaction by analyzing our ¹H,¹⁵N HSQC titration data. [...] Some line broadening was observed for residues that appeared to be in fast exchange during the course of the titration. This suggested that the lifetime of the complex approached the reciprocal of the frequency difference between the free and bound states for these resonances. It is also evident that the binding curves for these resonances deviate from the rectangular hyperbolic binding curves characteristic of a system in fast exchange (Fig. 3B). These observations are indicative of intermediate chemical exchange (40), and the model described by Equations 7-9 was therefore fitted to these data. In the case of Val-21, fitting of the change in ¹⁵N chemical shift against P_0 gave a K_a of 2.5 × 10⁴ M⁻¹, a ν_UP − ν_U of 64 Hz, and a k_off of 185 s⁻¹. It is clear from the fitted curves that the intermediate exchange model is more appropriate for this resonance (Fig. 3C). For this fit, values of 32 and 20 ms were used for T_2U and T_2UP, respectively. These were determined from ¹⁵N line widths of the Val-21 signal in the free and complexed forms in high resolution HSQC spectra.
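A two-site exchange line shape of this kind can be simulated directly. The sketch below uses the standard steady-state solution of the Bloch-McConnell equations for two exchanging sites and takes the absorption signal as the real part of the resulting expression, which is equivalent to the imaginary part of a quantity carrying a prefactor of i. It is written in Python rather than Mathematica, uses the Val-21 parameters quoted above, and the function names and frequency grid are hypothetical.

```python
import math

def lineshape(nu, nu_u, nu_up, f_up, k_off, t2_u, t2_up):
    """Absorption amplitude at frequency nu (Hz) for two-site exchange.

    f_up is the bound mole fraction (must be < 1); k_off in s^-1; T2 values in s.
    """
    f_u = 1.0 - f_up
    k_f = k_off * f_up / f_u                     # forward rate from detailed balance
    a_u = complex(1.0 / t2_u, 2.0 * math.pi * (nu_u - nu))
    a_up = complex(1.0 / t2_up, 2.0 * math.pi * (nu_up - nu))
    numerator = f_u * a_up + f_up * a_u + k_f + k_off
    denominator = a_u * a_up + a_u * k_off + a_up * k_f
    return (numerator / denominator).real

# Val-21 parameters from the text; free-state frequency set to 0 Hz for convenience.
DNU, KOFF, T2U, T2UP = 64.0, 185.0, 0.032, 0.020
grid = [-50.0 + 0.5 * i for i in range(329)]     # -50 to 114 Hz in 0.5-Hz steps

def peak_position(f_up):
    """Frequency on the grid at which the simulated signal is largest."""
    return max(grid, key=lambda nu: lineshape(nu, 0.0, DNU, f_up, KOFF, T2U, T2UP))

peak_free = peak_position(0.0)   # free USH-F9 only
peak_half = peak_position(0.5)   # half of USH-F9 bound
```

Repeating `peak_position` for the bound fraction at each titration point gives a simulated binding curve that can be compared with the measured peak positions.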
Comparing the Affinities of Different GATA·FOG Complexes-To obtain an independent estimate of the affinity of the USH-F9·Pnr-NF complex and to compare this to the affinities of the USH-F1·GATA-1 NF complex (16) and the USH-F1·Pnr-NF complex, titrations were carried out in an isothermal titration calorimeter. Analysis of the data from USH-F9·Pnr-NF (Fig. 4A) and USH-F1·Pnr-NF (Fig. 4B) titrations yielded association constants of (1.9 ± 0.1) × 10⁴ M⁻¹ and (1.45 ± 0.1) × 10⁵ M⁻¹, respectively. Titrations of the same two USH fingers with the murine GATA-1 NF that was used for previous studies (16) showed that the affinities for this domain are similar to those for the "same organism" interactions (K_a for USH-F1·GATA-1 NF = (2.3 ± 0.1) × 10⁵ M⁻¹ and K_a for USH-F9·GATA-1 NF = (1.9 ± 0.1) × 10⁴ M⁻¹). Furthermore, it appeared that pH had a relatively small effect on the K_a in the range 5.0-7.0 (data not shown). This is an important result given that all structural studies on this system have been conducted in the pH range 5.0-5.5.
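For readers more accustomed to dissociation constants, the association constants above can be converted as follows. This is a minimal sketch: the free energies assume a 1 M standard state and use the titration temperatures stated under "Experimental Procedures" (278 K for USH-F1 and 293 K for USH-F9); the function names are illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1

def kd_micromolar(ka):
    """Dissociation constant (micromolar) from an association constant (M^-1)."""
    return 1e6 / ka

def delta_g(ka, temp_k):
    """Standard free energy of binding (kcal/mol), assuming a 1 M standard state."""
    return -R_KCAL * temp_k * math.log(ka)

# Association constants reported for the two Pannier complexes.
kd_f9 = kd_micromolar(1.9e4)     # USH-F9 with Pnr-NF, roughly 53 micromolar
kd_f1 = kd_micromolar(1.45e5)    # USH-F1 with Pnr-NF, roughly 7 micromolar
dg_f9 = delta_g(1.9e4, 293.0)
dg_f1 = delta_g(1.45e5, 278.0)
```

The roughly eightfold difference in K_a between the two fingers therefore corresponds to a difference in binding free energy of less than 1 kcal/mol.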
The Structure of the C32H Mutant of USH-F9-It has previously been shown in a yeast two-hybrid assay that mutants of murine FOG fingers 1 and 9 in which the final zinc-ligating cysteine was mutated to a histidine (to simulate a classical CCHH configuration) were unable to interact with the GATA-1 NF (18). This result, together with the observation that all known GATA-interacting fingers in FOG family proteins possess the CCHC topology, led us to ask about the role of this distinctive ligand arrangement. To do this, we mutated the final zinc-ligating cysteine of USH-F9 to histidine (Cys-1142 → His or Cys-32 → His in the current numbering), creating a CCHH domain similar to classical ZnFs. Circular dichroism and one-dimensional ¹H NMR data confirmed that the peptide was folded (data not shown). We then determined the structure of C32H using homo- and heteronuclear NMR methods. The positions of most resonances were essentially unchanged compared with wild-type USH-F9 (16); changes were restricted to the loop between the two zinc-ligating histidines (His-27 and His-32) and to residues Thr-13 and Cys-14, which are in the immediate vicinity of Cys-32 in wild-type USH-F9. The tetrahedral nature of the zinc coordination site and the identity of the ligands were confirmed by calculating structures in the absence of any zinc coordination restraints. On the basis of these structures, it was evident that the thiol sulfurs of Cys-11 and Cys-14 and the Nε2 atoms of His-27 and His-32 were the zinc binding atoms and that the coordination sphere was approximately tetrahedral in nature. The identities of the zinc binding nitrogen atoms were further confirmed by the patterns observed for the histidine side chains in an HSQC spectrum optimized for long range (²J and ³J) couplings. In the final structure calculations, the geometry of the zinc binding site was defined using standard interatomic distances and angles.
Table II summarizes the experimental constraints used in this final round of structure calculations. No hydrogen-bonding constraints were used in the calculations.
A total of 200 structures were calculated in CNS, and the 20 structures with lowest overall energies were used to represent the solution structure of C32H (Fig. 5A). The structures display good covalent geometry, judging from the small deviations from ideal bond lengths and angles, and good non-bonded contacts, as shown by the low value of the mean Lennard-Jones potential (Table II). There are no violations of distance or angle constraints greater than 0.11 Å and 1.5°, respectively. A PROCHECK-NMR analysis shows that for residues Tyr-10 to Lys-33, 98% of backbone φ/ψ pairs lie within the most favored or additionally allowed regions of the Ramachandran plot. Over the same region, atomic root mean square differences for the final 20 structures with respect to the mean coordinate positions are 0.20 ± 0.08 Å for the backbone atoms and 0.80 ± 0.13 Å for all heavy atoms. The atomic coordinates for this family of conformers have been deposited with the Protein Data Bank (accession code 1JN7).
The structure of C32H (Fig. 5B) comprises a short β hairpin (residues Tyr-10 to Cys-11 and Ile-16 to Ser-17) followed by an α-helix (residues Val-21 to Phe-30) and is essentially the same as the structure of USH-F9 (Fig. 6A). As for USH-F9, the strands of the hairpin in C32H are connected by a type I β turn, with a type IV turn connecting the hairpin and the helix. The expected backbone hydrogen bonds are observed throughout the helix (unlike USH-F9, where hydrogen bonds are not observed in some positions) and across the hairpin. However, whereas in USH-F9 the carbonyl group of Cys-11 forms hydrogen bonds with the amide protons of both Asp-15 and Ile-16, the Cys-11 carbonyl group only forms hydrogen bonds with the amide of Ile-16 in C32H. This is most probably because of a slight change in conformation of the two zinc binding cysteines, a consequence of the cysteine to histidine substitution, which alters the zinc coordination site. This is the only change in this region of the structure.
FIG. 4. Analysis of GATA·FOG complexes using isothermal titration calorimetry. A, raw (upper traces) and integrated (lower traces) data from a titration of Pnr-NF into USH-F9. The fit to a simple 1:1 binding model is also shown. B, raw and integrated data from a titration of Pnr-NF into USH-F1. The fit to a simple 1:1 binding model is also shown.
TABLE II (fragment; remainder not recovered). Atomic root mean square differences (Å): backbone atoms, 0.20 ± 0.08; all heavy atoms, 0.80 ± 0.13. a, Idealized geometry is defined by the CHARMM force field as implemented in CNS. b, Atomic differences are given as the average root mean square against the mean coordinate structure. All energies, violations, and root mean square differences are given as the mean ± S.D.
A comparison of the zinc binding sites shows that there are subtle changes in the position of the zinc binding atoms or the zinc ion itself (Fig. 6B). Cys-11, Cys-14, and His-27 adopt the same conformation in both USH-F9 and C32H, whereas a slight change in the conformation of His-32 is apparent. In the mutant, His-32 takes up a similar position to the corresponding histidine of classical CCHH ZnFs, with the more helical conformation of this region of C32H causing the histidine ring to approach the zinc ion from a different direction compared with USH-F9. The incorporation of the histidine does not affect the conformations of residues in the hydrophobic core (residues Cys-11, Ile-16, Thr-13, Tyr-24, and His-27).
The α-helices of C32H and USH-F9 are superimposable up to residue Gln-29, after which the effect of the Cys → His substitution becomes apparent (Fig. 6A). Although the residues immediately preceding the final zinc binding ligand exist in an extended conformation in USH-F9 (φ30 = −150° and φ31 = −109°), the same residues adopt a conformation that is closer to α-helical in C32H (φ30 = −104° and φ31 = −75°). Residues Cys-32 and Lys-33 in USH-F9 also adopt an extended conformation and do not form any regular secondary structure. In C32H, residues His-32 and Lys-33 both appear to form another half-turn of α-helix, although the turn is not strictly defined as α-helix. This change in conformation probably arises from the difference in sizes of the cysteine and histidine side chains. The distance from Cα to the zinc binding atom is 2.8 and 4.5 Å, respectively, in the two residues. This difference clearly requires different packing to maintain the zinc binding atom of this ligand in the same position in space, which preserves the overall fold of the finger. Despite the changes in packing, however, the H-bonding patterns observed in this region are unchanged; both Tyr-31 Hᴺ and Cys-32 Hᴺ make H-bonds with His-27 O.
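Backbone dihedral angles such as those compared above are obtained from four consecutive atomic positions (for φ of residue i: C of residue i−1, then N, Cα, and C of residue i). The following sketch computes a dihedral angle with the usual IUPAC sign convention using plain Python; the coordinates in the example are hypothetical test points, not atoms from either structure.

```python
import math

def sub(a, b):
    """Component-wise difference a - b of two 3-vectors."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    """Vector product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p1, p2, p3, p4):
    """Dihedral angle (degrees, -180 to 180) defined by four points."""
    b1, b2, b3 = sub(p2, p1), sub(p3, p2), sub(p4, p3)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Hypothetical points giving a +90 degree and a trans (180 degree) arrangement.
angle_gauche = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
angle_trans = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (-1.0, 0.0, 1.0))
```

Applied to each conformer of an ensemble, the same function gives the spread of φ values for a given residue.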
The impact of the histidine substitution is immediately apparent when examining the positions of the aromatics on the loop, Phe-30 and Tyr-31. Although Phe-30 undergoes a subtle change of conformation and still occupies almost the same region of space, Tyr-31 is shifted well away from its original position (Fig. 6A). The loop preceding His-32 is less pronounced, and the positions of the Cα atoms of Phe-30 and Tyr-31 now superimpose well with the corresponding residues of classical CCHH ZnFs.
Interaction of C32H with Pan-NF-A ¹H,¹⁵N HSQC titration was carried out to investigate the ability of C32H to bind Pnr-NF. After the addition of 5 molar eq of unlabeled Pnr-NF to ¹⁵N-labeled C32H, a number of signals had either decreased in intensity or disappeared completely. These were the same signals that changed in the wild-type USH-F9 titration, indicating that the basic mode of interaction was unchanged. Despite this, HSQC spectra of USH-F9 and C32H, each containing 1 molar eq of Pnr-NF, indicated that the C32H-Pnr-NF interaction was substantially weaker. Peaks that disappeared rapidly in the titration of USH-F9 were still clearly visible in the titrations of C32H (Fig. 7).
Protein Binding Versus DNA Binding in Classical Zinc Fingers-Finally, we sought to determine whether classical-type zinc fingers that contact other proteins (including the CCHC fingers of FOG proteins) have diverged significantly from DNA binding classical zinc fingers. A sequence alignment of 64 zinc finger domains that are known to interact with DNA together with the protein binding zinc fingers of Roaz (41), FOG-1 (15,17), Ikaros (42), and YY1 (43) was created using the Pileup program from the Wisconsin sequence analysis package (Supplemental Fig. 1). The program performs comparisons between each possible pair of sequences and creates a final alignment in which the most closely related sequences are nearest to each other in the alignment, whereas the more distantly related ones are separated. This alignment suggests that the sequences of many of the protein binding domains are distinct overall from DNA binding zinc fingers. In part, this presumably reflects the fact that particular residues important for DNA recognition (for example, the basic residue in position 12 that contacts the sugar-phosphate backbone) are not conserved in the protein-binding fingers.
DISCUSSION
We have shown here that the molecular details of the interaction between the N-terminal zinc finger of GATA family proteins and CCHC-type zinc fingers from FOG family proteins are highly conserved between different fingers and across different phyla. Thus, the predominantly hydrophobic surface on finger 9 from D. melanogaster U-shaped that contacts the N-terminal ZnF on its in vivo partner, Pannier, is the same as that identified previously in a U-shaped finger 1-GATA-1 NF complex (16). This presumably reflects the conservation of key contact residues in the CCHC fingers (Fig. 1) as well as conservation among the N-fingers of GATA family proteins.
Interestingly, two of the contact residues (Phe-30 and Tyr-31) are located on the loop preceding the final zinc-ligating residue. Thus, the conformation induced by the His → Cys substitution in this subclass of classical zinc fingers may be required to position the two aromatic side chains in such a way as to allow specific contacts with GATA NFs. These two aromatics are largely conserved across FOG family CCHC ZnFs that are capable of interacting with the NF, and it has been noted previously (44) that protein-protein interfaces exhibit a higher concentration of aromatic residues than protein surfaces as a whole. These residues, however, are not conserved in other classical-type zinc finger domains that are known to mediate protein-protein interactions (such as certain fingers from Roaz, Ikaros, and YY1). In fact, there appears to be substantially more sequence variation among protein binding classical fingers than among DNA binding fingers. This probably reflects the diversity of protein partners that are contacted by these zinc fingers compared with the relative homogeneity of a DNA substrate. In this context, it is also interesting to note that the sequences of many DNA binding zinc fingers are conserved not only on the surface involved in DNA recognition but also on the other surfaces. This may indicate that such domains have additional recognition surfaces; the third finger of erythroid Kruppel-like factor falls into this category because it mediates interactions both with DNA and with GATA-1 (11).
A combination of calorimetric and NMR methods has been used to demonstrate that these interactions are relatively weak, with affinities in the range 10⁴-10⁵ M⁻¹, depending on the domains involved. These affinities are substantially lower than those of the majority of well characterized biological interactions, and it has become increasingly clear in recent years that weak interactions can play very important roles in biology. The GATA-FOG interactions are also obviously specific; the C-terminal zinc finger from GATA-1 is unable to bind FOG proteins (15), and some FOG CCHC fingers cannot interact with GATA proteins (17). Recent work has demonstrated that FOG-GATA interactions are essential for normal development (45), and it is interesting to speculate as to how weak interactions such as these are formed between proteins that are probably at low concentration in the nuclear milieu. It is possible that high local concentrations of transcription factors exist as a result of either nuclear compartmentalization or recruitment by other factors or by DNA. It is also possible that the very nature of the physical environment in the nucleus increases the strength of these interactions substantially (through, for example, molecular crowding). On the other hand, it is feasible that off rates such as that measured here for the USH-F9·Pan-NF complex (185 s⁻¹, corresponding to an average lifetime for the complex of ~5 ms) allow rapid exchange of binding partners and are appropriate for the precise control of gene expression that is required during development.
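The numbers quoted in this paragraph can be related to one another with a few lines of arithmetic. The sketch below is illustrative only and assumes simple 1:1 binding with first-order dissociation. The function names and the 1 µM free partner concentration are hypothetical choices made for the example, not values from this study.

```python
import math


def dissociation_constant(ka_per_molar):
    """Convert an association constant (M^-1) into a dissociation constant (M)."""
    return 1.0 / ka_per_molar


def mean_lifetime(k_off_per_second):
    """Mean lifetime (s) of a complex that dissociates with first-order rate k_off."""
    return 1.0 / k_off_per_second


def fraction_bound(free_partner_molar, kd_molar):
    """Fraction of a protein in complex for 1:1 binding at a given free partner concentration."""
    return free_partner_molar / (free_partner_molar + kd_molar)


# Affinity range reported in the text: 10^4 to 10^5 M^-1.
for ka in (1e4, 1e5):
    kd = dissociation_constant(ka)
    print(f"Ka = {ka:.0e} M^-1 -> Kd = {kd * 1e6:.0f} uM")

# Off rate reported in the text for the USH-F9/Pan-NF complex.
tau = mean_lifetime(185.0)
print(f"k_off = 185 s^-1 -> mean lifetime = {tau * 1e3:.1f} ms")

# Hypothetical free partner concentration (assumption, not a measured value).
assumed_free_partner = 1e-6  # 1 uM
for ka in (1e4, 1e5):
    occupancy = fraction_bound(assumed_free_partner, dissociation_constant(ka))
    print(f"Ka = {ka:.0e} M^-1, 1 uM partner -> fraction bound = {occupancy:.3f}")
```

Under these assumptions the affinity range corresponds to dissociation constants of roughly 10-100 µM, and the reciprocal of the off rate gives a lifetime of about 5.4 ms, consistent with the ~5 ms stated above. The low occupancy computed at a micromolar partner concentration illustrates why local concentration effects are invoked in the discussion.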
We previously showed that mutation of the final cysteine in FOG-F1 to histidine (effectively creating a classical CCHH ZnF) abolishes the interaction of FOG-F1 with GATA-1 in a yeast two-hybrid assay (18). To understand the molecular basis for this result, we determined the solution structure of the analogous C32H mutant of USH-F9. This mutant forms a stable structure with a fold very similar to that of the wild-type protein. It therefore appears that, although the effects of the histidine substitution are subtle in terms of the protein fold, the small change in the conformations of Phe-30 and Tyr-31 has a substantial effect on the GATA-FOG interaction. A comparison of the NF binding faces of USH-F9 and C32H shows that, with the exception of the two aromatics, the binding faces are otherwise largely unperturbed. The two aromatics lie to one side of the binding face of wild-type USH-F9 but are rotated somewhat in the binding face of C32H. This change in position appears to be enough to substantially decrease the affinity of the interaction.
The results presented here underscore the role of zinc finger domains as protein-protein interaction domains and support the idea that weak interactions can play important roles in the regulation of cellular development. The loss of in vivo activity (16) that results from the C → H substitution (despite the fact that binding affinity is probably only decreased by ~10-fold) may, however, suggest that such interactions are finely balanced and not tolerant to mutation. Conversely, one can see how novel protein-protein interactions can evolve as a result of small changes in domains that may have originally had a predominant role in DNA binding.
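To put the ~10-fold decrease in affinity on an energetic scale, the standard relation ΔΔG = RT ln(fold change) can be evaluated. This is a minimal sketch; the temperature of 298.15 K and the function name are assumptions introduced for illustration and are not taken from the experiments described here.

```python
import math

R_J_PER_MOL_K = 8.314462618  # molar gas constant, J mol^-1 K^-1


def binding_free_energy_change(fold_decrease, temperature_k=298.15):
    """Return the change in binding free energy (J/mol) for a given fold decrease
    in the association constant, using ddG = R * T * ln(fold_decrease)."""
    return R_J_PER_MOL_K * temperature_k * math.log(fold_decrease)


# ~10-fold decrease in affinity, as estimated in the text for the C -> H substitution.
ddg = binding_free_energy_change(10.0)
print(f"ddG = {ddg / 1000:.1f} kJ/mol ({ddg / 4184:.2f} kcal/mol)")
```

At the assumed temperature a 10-fold loss of affinity corresponds to about 5.7 kJ/mol (about 1.4 kcal/mol), an energy comparable to that of a single weak non-covalent contact, which fits the view that the interaction is finely balanced.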