Disorder in a Target for the Smad2 Mad Homology 2 Domain and Its Implications for Binding and Specificity

The Smad2 Mad homology 2 (MH2) domain binds to a diverse group of proteins which do not share a common sequence motif. We have used NMR to investigate the structure of one of these interacting proteins, the Smad binding domain (SBD) of Smad anchor for receptor activation (SARA). Our results indicate that the unbound SBD is highly disordered and forms no stable secondary or tertiary structures. Additionally we have used fluorescence binding studies to study the interaction between the MH2 domain and SBD and find that no region of the SBD dominates the interaction between the MH2 and the SBD. Our results are consistent with a series of hydrophobic patches on the MH2 that are able to recognize disordered regions of proteins. These findings elucidate a mechanism by which a single domain (MH2) can specifically recognize a diverse set of proteins which are unrelated by sequence, lead to a clearer picture of how MH2 domains function in the transforming growth factor-beta-signaling pathway and suggest possible mechanisms for controlling interactions with MH2 domains.

The TGF-β superfamily of cytokines plays an essential role in a variety of cellular responses including differentiation, cell fate specification, and growth inhibition (1-4). Dysregulation of TGF-β signaling has been associated with many diseases, such as human cancers, fibrosis, hereditary hemorrhagic telangiectasia (2,5), and Marfan syndrome (6). TGF-β signal transduction is mediated intracellularly by the Smad proteins, including the R-Smads (receptor-regulated Smads), the I-Smads (inhibitor Smads), and the Co-Smad (common Smad). Ligand binding to the TGF-β receptor complex or a homologous complex leads to receptor phosphorylation of the R-Smads (7,8). Subsequently, the phosphorylated R-Smads can interact with the Co-Smad, Smad4, and accumulate in the nucleus (9,10). In the nucleus the Smad proteins bind to transcription factors and promoter regions and play a role in the transcription of various genes (11,12).
The R-Smads and Smad4 consist of a conserved N-terminal Mad homology 1 (MH1) domain and a conserved C-terminal MH2 domain joined by a poorly conserved linker region. The MH1 domain is known to function in binding DNA (13). The MH2 domain appears to exhibit many different roles and binds to many different proteins. Although many modular protein domains involved in signal transduction, such as SH3, SH2, WW, PTB, and PDZ domains, interact with a recognizable sequence motif (14), MH2 domains appear to be able to bind to a wide variety of ligands that do not share a single common motif. For example, the MH2 domain of Smad2 binds a plethora of non-homologous proteins including SARA (15), the TGF-β receptors, FoxH1 (16), Mixer (17), TGIF (18,19), CBP (20), AML1 (21,22), Ski (23,24), and SIP1 (25). These interacting partners represent a diverse group of protein types, encompassing receptors, membrane anchoring proteins, and transcription factors. Furthermore, they have no region of sequence or structural similarity in common. A key question is how the MH2 domain functions in such a versatile manner, i.e. what properties allow the MH2 domain to maintain specificity despite being able to recognize a diverse group of proteins.
To examine the unique properties of the MH2, we have focused on the interaction between Smad2 and SARA because it is the best characterized. SARA is an anchoring protein that specifically binds to Smad2 and Smad3, transporting them to the receptor complex and increasing the efficiency of phosphorylation by the receptors (15). Crystallographic studies of Smad2 and Smad3 MH2 bound to the 60-residue SBD of SARA have provided evidence for an extensive hydrophobic interaction surface on the MH2 (26,27). The bound SBD wraps around the MH2, forming an extended structure composed of a proline-rich rigid coil region, an α-helix, and a β-strand. These three structural elements bind extensive hydrophobic grooves on the MH2 covering ~2600 Å².
The extended structure of the SBD in complex with MH2 suggests that the unbound SBD is disordered in solution or that it undergoes an unfolding transition before interaction with the MH2. Although the prevailing paradigm had been that proteins must form folded structures to function, the relevance of protein disorder in biological systems has been recognized recently (28). Using a neural network program to identify disordered regions on the basis of protein primary sequence, 35-51% of eukaryotic proteins are estimated to have disordered regions of 40 residues or more (29). Disordered protein regions are thought to play roles in the regulation of many protein-protein and protein-DNA interactions. In many cases disordered or partially ordered proteins undergo folding transitions during target recognition. Examples include GCN4 (30), LEF1 (31), and bacteriophage protein N (32). In another case, folding induced by phosphorylation is thought to control the interaction between the kinase-inducible activation domain of CREB and the KIX domain of CREB-binding protein (33). Disordered regions have been postulated to function in thermodynamic fine-tuning, plasticity, and control of protein interactions (28).
To investigate the structural requirements of a MH2-interacting partner, we have examined the structure of the unbound SBD of SARA, compared it to the bound form, and probed the binding of MH2 and SBD. NMR analysis indicates that the unbound SBD is highly disordered, allowing the SBD to make extensive contacts with the MH2. Individual contacts provide low binding affinity; however, together these interactions provide a substantial free energy of interaction. No region of the SBD appears to make a dominant contribution to the interaction between the SBD and MH2. These results indicate that the disorder of the SBD facilitates an interaction with the MH2 and suggest a binding mechanism that enables the MH2 to recognize a diverse group of disordered regions.

EXPERIMENTAL PROCEDURES
Disorder Prediction-Disorder predictions were made using the neural network program PONDR (Predictor Of Native Disordered Regions; Molecular Kinetics), which predicts probable disordered regions of proteins based on their primary sequence (34,35). Specifically we used the VLXT predictor (www.pondr.com/background.html).
Protein and Peptide Preparation-Initially SBD was purified using a non-denaturing purification protocol. SARA-SBD (amino acids 663-751) was amplified using PCR and cloned into pRSETB, a His-tag vector, using the NheI and XhoI restriction sites. The protein was expressed in Escherichia coli BL21(DE3) pLysS cells, then purified using a nickel nitrilotriacetic acid affinity column (Amersham Biosciences) in a solution of 20 mM imidazole, 25 mM Tris-HCl, pH 7.4, 50 mM NaCl, 0.5% Triton X-100, and 10 mM β-mercaptoethanol. The protein was eluted using 50 mM EDTA, 25 mM Tris-HCl, pH 7.5, and 10 mM β-mercaptoethanol. After concentration, the protein was purified on a Superdex 75 gel filtration column (Amersham Biosciences) in 75 mM NaCl, 25 mM Tris-HCl, pH 7.4. The construct severely affected the growth rate of the bacteria, and yield from the purification was less than 1 mg/liter in minimal medium. HSQC NMR spectra were recorded on a ¹H,¹⁵N-labeled sample in two different buffers, one containing 420 mM NaCl and 50 mM NaHPO₄, pH 6.0, and one containing only 100 mM NaHPO₄, pH 6.0. These spectra were recorded at 5, 10, and 25°C.
Because of poor yield from this construct we began using a denaturing purification with a different construct after establishing that the SBD is disordered (as described under "Results"). SARA-SBD (amino acids 663-721) was amplified using PCR, cloned into pGEX 4T1 (Amersham Biosciences) using the BamHI and XhoI restriction sites, and confirmed by sequencing. The last 30 residues of the previous construct were not included in this construct, as these residues are not involved in the interaction with the Smad2 MH2 domains (26,27). The GST-SBD fusion was expressed in E. coli BL21(DE3) cells at 30°C for 4 h. A ¹H,¹⁵N,¹³C-labeled SBD sample was prepared by expressing the protein in M9 minimal medium containing 1 g/liter [¹⁵N]NH₄Cl and 3 g/liter [¹³C]glucose as the sole nitrogen and carbon sources, respectively. Similarly, a ²H,¹⁵N-labeled sample was prepared by expressing the protein in M9 minimal medium containing 1 g/liter [¹⁵N]ammonium chloride and 99.9% D₂O. For the D₂O growth only, 2 liters of M9 were inoculated with E. coli BL21(DE3) cells transformed with pGEX 4T1 SBD to an OD600 of 0.17. This was grown at 37°C until an OD of 0.5 was reached (~4.5 h), then the temperature was lowered to 30°C over a 1-h period followed by induction with 0.25 mM isopropyl-β-D-thiogalactoside and harvest after an additional 5 h. Cells from all SBD growths were lysed by sonication in a solution of 50 mM Tris-HCl, pH 8.0, 150 mM NaCl, 0.5% Triton X-100, and 1 mM EDTA and purified on glutathione-Sepharose 4B resin (Amersham Biosciences). SBD was cleaved from GST using thrombin and purified on a C4 HPLC column (Phenomenex) using an acetonitrile gradient. The yield of the final product was 2 mg/liter of minimal medium.
There were no significant changes in the HSQC spectra of SBD purified using this denaturing purification when compared with SBD purified using the native purification except for the absence of peaks corresponding to the additional residues in the His-tagged SBD. Identity of the protein and isotope labeling efficiency were confirmed by mass spectrometry and amino acid analysis.
A pET3d Smad2 MH2 construct was a gift from Yigong Shi (Princeton University). MH2 was cotransformed with the Magic Plasmid, a gift from Cheryl Arrowsmith (University of Toronto), into BL21(DE3) cells. The Magic Plasmid contains tRNA genes for codons that are rare in E. coli. After induction with 1 mM isopropyl-β-D-thiogalactoside, the protein was expressed for 16 h at 25°C, harvested, and purified immediately. Cells were lysed in 25 mM MES, pH 6.0, 5 mM dithiothreitol with Complete EDTA-free protease inhibitors (Roche Applied Science) and purified on a Mono S fast protein liquid chromatography column (Amersham Biosciences) using a NaCl gradient. The collected fraction was then purified on a phenyl-Superose column (Amersham Biosciences) and eluted from the column using 50 mM Na₃PO₄, pH 7.4, 5% ethanol. The protein was concentrated using MacroSep centrifugal concentrators (Pall Filtron). Protein quantification was carried out using the method of Gill and von Hippel (36).
NMR Spectroscopy-Assignment experiments were performed at 10°C on a Varian INOVA 500-MHz spectrometer equipped with a pulsed field gradient unit and a triple resonance probe. NMR spectra were processed and analyzed using nmrPipe/nmrDraw (37), NMRView (38), and PIPP (39) software. HNCO (40), HNCACB (41), HBCBCACONNH (42), HACAN (43), and CCC-TOCSY (44) experiments were used to assign the ¹⁵N, ¹³C′ (carbonyl), ¹³Cα, ¹³Cβ, ¹HN, and ¹Hα nuclei. Nuclear Overhauser effect (NOE) data were collected using NOESY-HSQC (45) at 500 MHz on a ¹H,¹⁵N,¹³C-labeled sample with a mixing time of 150 ms. HSQC-NOESY-HSQC (46) spectra were recorded on a Varian INOVA 800-MHz spectrometer at 10°C with mixing times of 150 and 600 ms on a ¹⁵N sample that was deuterated at all non-exchangeable positions. Assignment of resonances from both major and minor conformers corresponding to proline isomers was completed by traditional backbone assignment methods (47). The presence of a cis peptide bond was confirmed by the proline carbon side-chain resonance frequencies (48,49). Proline isomer ratios were measured by comparing well resolved peak volumes in the HBCBCACONNH and HSQC using PIPP. We did not account for possible differences in magnetization transfer rates in these experiments. The hydrodynamic radius of SBD was measured using pulsed field gradient diffusion experiments (50,51). The diffusion rate of the SBD was measured and compared with that of an internal standard, dioxane, a molecule of known size, to determine the hydrodynamic radius of the SBD.
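The comparison against an internal standard can be sketched as follows. This is a minimal illustration, assuming the Stokes-Einstein relation (diffusion rate inversely proportional to hydrodynamic radius) for two species in the same sample. The dioxane radius of 2.12 Å is an assumed literature value that is not stated in the text above, and the function name is hypothetical.

```python
# Sketch: hydrodynamic radius from diffusion rates measured relative to an
# internal standard. Assumes Stokes-Einstein behaviour, so that for two species
# in the same sample D_protein / D_reference = Rh_reference / Rh_protein.

DIOXANE_RH_ANGSTROM = 2.12  # ASSUMED reference radius for dioxane, not from the text


def hydrodynamic_radius(d_protein, d_reference, rh_reference=DIOXANE_RH_ANGSTROM):
    """Return the protein hydrodynamic radius in the units of rh_reference.

    d_protein and d_reference are diffusion rates (or the fitted decay rates of
    signal intensity versus gradient strength squared) in the same arbitrary units.
    """
    if d_protein <= 0 or d_reference <= 0:
        raise ValueError("diffusion rates must be positive")
    return rh_reference * d_reference / d_protein


if __name__ == "__main__":
    # Illustrative numbers only: a standard that diffuses 12 times faster
    # than the protein.
    print(hydrodynamic_radius(d_protein=1.0, d_reference=12.0))
```

Only the ratio of the two rates enters, so effects that scale both rates equally, such as sample viscosity and temperature, cancel when the standard is internal.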
Fluorescence Binding Studies-Fluorescence binding experiments were carried out at 16°C using an AVIV Ratio spectrofluorometer model ATF105 and a Microlab 500 Series automated titrator. Smad2 MH2 protein was extensively dialyzed into a filtered and argon-purged buffer containing 50 mM Na₃PO₄, pH 7.4, 100 mM NaCl, and 5% ethanol. Fluorescence levels were monitored as peptide was added to the MH2 sample using an excitation wavelength of 295 nm and an emission wavelength of 350 nm. Data were fit to a binding equation in which A is the total MH2 protein, B is the total ligand added, Kd is the dissociation constant, F is the fluorescence, and Fmax is the fluorescence maximum.
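The fitting equation itself is not reproduced in this text. As a hedged illustration, the sketch below assumes the standard single-site model with ligand depletion, F = Fmax × [(A + B + Kd) − sqrt((A + B + Kd)² − 4AB)] / (2A), which uses the same variables but may not be the exact form the authors fit. F is taken here as the fluorescence change relative to the ligand-free sample. The function names and the grid-search fit are illustrative choices, not the authors' procedure.

```python
import math

# Sketch of an ASSUMED 1:1 binding model with ligand depletion.
# A = total MH2, B = total ligand, Kd = dissociation constant (same units).


def fraction_bound(a_total, b_total, kd):
    """Fraction of A in complex at equilibrium (root of the binding quadratic)."""
    s = a_total + b_total + kd
    return (s - math.sqrt(s * s - 4.0 * a_total * b_total)) / (2.0 * a_total)


def fit_kd(a_total, b_values, f_values, kd_grid):
    """Grid-search Kd; for each trial Kd the best Fmax is a linear least-squares scale.

    Returns (kd, f_max) for the grid point with the smallest sum of squared residuals.
    """
    best = None
    for kd in kd_grid:
        theta = [fraction_bound(a_total, b, kd) for b in b_values]
        denom = sum(t * t for t in theta)
        f_max = sum(t * f for t, f in zip(theta, f_values)) / denom
        sse = sum((f - f_max * t) ** 2 for t, f in zip(theta, f_values))
        if best is None or sse < best[0]:
            best = (sse, kd, f_max)
    return best[1], best[2]


if __name__ == "__main__":
    # Synthetic, noise-free titration (illustrative values, concentrations in µM).
    ligand = [0.1 * i for i in range(1, 41)]
    signal = [100.0 * fraction_bound(1.0, b, 0.24) for b in ligand]
    grid = [0.01 * i for i in range(1, 101)]
    print(fit_kd(1.0, ligand, signal, grid))
```

A grid search keeps the sketch free of external dependencies. In practice a nonlinear least-squares routine would give a continuous Kd estimate and parameter uncertainties.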
GST Pull-down Assays-GST fusions with truncated and mutant SBDs were amplified using PCR, cloned into pGEX 4T1 (Amersham Biosciences) using the BamHI and XhoI restriction sites, and confirmed by sequencing. GST fusion constructs included SBD (amino acids 669-705), Δrigid coil (amino acids 677-705), which is missing a portion of the rigid coil region, and Δβ-strand (amino acids 669-694), which is missing the β-strand region. Additionally, several SBD fusions were generated with chimeric or mutant α-helix regions. The mutations in these constructs are shown in Table III. GST-SBD fusions were overexpressed in E. coli DH5α in LB medium containing 100 µg/ml ampicillin to an OD600 of 0.7 at 37°C. Protein expression was induced with the addition of 0.25 mM isopropyl-β-D-thiogalactoside for 5 h. The bacterial pellet was lysed by sonication in a buffer containing 50 mM Tris-HCl, pH 7.4, 150 mM NaCl, 1 mM EDTA, 25 mM NaF, 10 mM sodium pyrophosphate, 1 mM phenylmethylsulfonyl fluoride, 1 mM Na₃VO₄, pepstatin (0.01 mg/ml), leupeptin (0.01 mg/ml), antipain (0.01 mg/ml), benzamidine hydrochloride (0.1 mg/ml), soybean trypsin inhibitor (0.1 mg/ml), and 0.5% Triton X-100. After clearing by centrifugation the lysate was incubated with glutathione-Sepharose 4B resin (Amersham Biosciences) for 1 h, after which the resin was washed with buffer containing 50 mM Tris-HCl, pH 7.4, 150 mM NaCl, 1 mM EDTA, and 0.1% Triton. The purified proteins were subsequently incubated with 293T cell lysates expressing FLAG-tagged Smad2 for 1 h, then washed with a buffer solution of 50 mM Tris-HCl, pH 7.4, 150 mM NaCl, 1 mM EDTA, 1 mM phenylmethylsulfonyl fluoride, and 0.1% Triton. Protein complexes were released from GST resin by boiling in SDS sample buffer and separated by SDS-PAGE. Bound FLAG-Smad2 was visualized with anti-FLAG M2 monoclonal antibody (Sigma) and chemiluminescence as recommended by the manufacturer (ECL kit; Amersham Biosciences).

RESULTS
Analysis of MH2-interacting Regions-Because many modular binding domains bind to a recognizable sequence motif, we attempted to identify a consensus target motif for the Smad2 MH2 domain. To this end we tried to align the regions that have been shown to be necessary for interaction with Smad2 MH2 from seven different proteins. Because of the unrelated nature of the sequences, no meaningful alignment was possible; inspection of these regions (Fig. 1a) reveals that, although it is possible to identify similarities between pairs of sequences, there is no universal MH2 interaction motif.
One previously identified pairwise similarity is the Smad interaction motif (SIM) (52), shown in red (Fig. 1). The SIM has the sequence PNxxxxxahxxxIPPh (where a is an acidic residue, h is a hydrophobic residue, and x is any residue) and is shared by SARA and a subset of the Mixer family of transcription factors and partially shared by FoxH1. The Mixer SIM, like the SARA SBD, is sufficient for an interaction with the MH2 domains of Smad2 and Smad3 and likely occupies some of the same binding sites since they compete for binding to MH2. Several of these residues including Pro-672, Tyr-680, Pro-686, and Leu-687 of SARA were shown to contact the surface of Smad2 MH2 (Fig. 1b). However, Asn-673, Glu-679, Ile-684, and Pro-685 do not contact the MH2 surface. Significantly, Mixer residues Asn-293 and Pro-305, which align with Asn-673 and Pro-685 in SARA, are required for a high affinity interaction with MH2, as mutations at these positions significantly decrease the ability of Mixer to interact with the MH2 (52). These observations indicate that the SIM does not fit the definition of a sequence recognition motif per se, since not all of the residues bind to the MH2. One possibility is that the Mixer SIM and SARA SIM bind in different conformations. Alternatively, they bind in the same conformation, and those residues that do not contact the MH2 surface yet are required for a high affinity interaction are involved in maintaining structural features of the SIM, which make it amenable to interacting with the MH2.
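For readers who want to scan sequences for the SIM pattern, the motif PNxxxxxahxxxIPPh can be written as a regular expression. The sketch below is illustrative only: the set of residues counted as hydrophobic is an assumption (tyrosine is included because Tyr-680 occupies an h position in SARA), and the test sequence is synthetic, not a real protein sequence.

```python
import re

# Sketch: the SIM pattern PNxxxxxahxxxIPPh as a regular expression.
ACIDIC = "DE"              # a = acidic residue
HYDROPHOBIC = "AVILMFWYC"  # h = hydrophobic residue; this residue set is an ASSUMPTION

SIM_PATTERN = re.compile(
    rf"PN.{{5}}[{ACIDIC}][{HYDROPHOBIC}].{{3}}IPP[{HYDROPHOBIC}]"
)


def find_sim(sequence):
    """Return (0-based start index, matched segment) for each non-overlapping SIM match."""
    return [(m.start(), m.group()) for m in SIM_PATTERN.finditer(sequence.upper())]


if __name__ == "__main__":
    synthetic = "GGPNAAAAAEYGGGIPPLGG"  # made-up sequence for illustration
    print(find_sim(synthetic))
```

As the surrounding text argues, a match to this pattern is neither necessary nor sufficient for MH2 binding; the expression only encodes the pairwise similarity described above.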
SARA SBD Is Natively Disordered-Because the affinity of an interaction between two proteins is dependent on the difference in energy between the bound state and the free state, we decided to investigate the unbound SBD to increase our understanding of the interaction between the MH2 and SARA SBD. Previous studies have shown that the SBD interacts with the MH2 in an extended conformation and does not form its own hydrophobic core. Therefore, the SBD must either undergo an unfolding transition before binding or exist in solution as a disordered region. An analysis of the sequence of the SBD using the program PONDR was used to assess whether the SBD has a propensity to be natively disordered (35). PONDR is a neural network program trained on a data base of disordered proteins and used to predict disordered regions of proteins based on their primary sequence. Disordered regions are correctly identified with 79% accuracy for stretches of 21 amino acids or longer. PONDR gives a convincing, uninterrupted prediction of disorder for the entire SBD (Fig. 1c). Consistent with this, the SBD sequence is of low complexity and is enriched in amino acid types thought to be disorder-promoting including nine Pro, seven Ser, four Ala, and four Gln residues and has a paucity of residues thought to be order-promoting such as Trp, Phe, Ile, and Leu (53).
PONDR was also used to predict the propensity of other Smad2-interacting regions to be disordered (Fig. 1c). Although both Mixer and FoxH1 are expected to bind to the MH2 in an extended conformation similar to the SARA interaction (52), neither Mixer nor FoxH1 SIMs are predicted to have significant lengths of disorder by PONDR. Although the SIMs might still be disordered, these results raise the possibility that they undergo an unfolding transition before binding to the MH2. Significantly, three of the other MH2-interacting regions of the partners TGIF, CBP, and AML1 are predicted to have disordered regions of 20 residues in length or more in the region that interacts with the MH2. Thus, the sequence comparison data and PONDR predictions suggest that the MH2 may have some propensity for binding to natively disordered or unfolded regions of interacting proteins.
NMR is an extremely useful tool for investigating disordered states. Its uses include but are not restricted to determining whether a protein is disordered, probing for elements of secondary structure, detection of long range interactions, and measuring hydrodynamic properties. We have used NMR approaches to determine experimentally whether SARA SBD is disordered and to further probe its structural properties. In disordered or unfolded proteins the amide protons are mostly solvent-exposed and experience a similar chemical environment, unlike amide protons in folded proteins. As a result they resonate in a narrow range between 8.0 and 8.8 ppm. The proton-nitrogen (¹H,¹⁵N) HSQC spectrum of SARA SBD revealed a narrow range of amide proton chemical shifts characteristic of disordered proteins (Fig. 2). Similar results were seen when a longer SBD comprising residues 663-751, which was purified under non-denaturing conditions, was used. Additionally, different buffers and temperatures were used to record spectra of this construct as described under "Experimental Procedures." These data provide strong evidence that the unbound SBD is disordered in solution.
In the crystal structure of the SBD bound to Smad2 MH2, the extended SBD forms three secondary structure elements, a proline-rich rigid coil region, an α-helix, and a β-strand. To determine whether these secondary structure elements are present in the free SBD, we analyzed the Cα and Cβ chemical shifts (Fig. 3). Cα chemical shifts more than 0.7 ppm above random coil values and Cβ chemical shifts more than 0.7 ppm below random coil are indicative of α-helices (54). The opposite holds true for β-strands. Our results, which have been corrected for nearest neighbor effects of alanine and proline (55), demonstrate a complete lack of α-helix in the free SBD, indicating that the α-helix must undergo a folding transition prior to binding to the MH2. In contrast, Cα and Cβ chemical shifts in the region that forms a β-strand when bound to the MH2 are consistent with β-strand values. This region probably samples many different conformations but appears to have some propensity for φ, ψ angles characteristic of a β-strand. The Cα and Cβ chemical shifts are, thus, consistent with an extended structure.
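The 0.7 ppm criterion can be expressed as a small classifier over secondary chemical shifts (observed minus random coil). This is a sketch under stated assumptions: it requires both the Cα and Cβ deviations to agree before assigning a propensity, which is a stricter reading than using either nucleus alone, and it takes the secondary shifts as inputs without applying the nearest-neighbor corrections mentioned above. The function name and example values are hypothetical.

```python
# Sketch: classify a residue from its secondary chemical shifts
# (observed shift minus random-coil shift, in ppm).


def classify_residue(delta_ca, delta_cb, threshold=0.7):
    """Return "helix", "strand", or "coil" for one residue.

    ASSUMPTION: both nuclei must pass the threshold in the consistent direction.
    """
    if delta_ca > threshold and delta_cb < -threshold:
        return "helix"   # Calpha shifted up, Cbeta shifted down
    if delta_ca < -threshold and delta_cb > threshold:
        return "strand"  # the opposite pattern
    return "coil"


if __name__ == "__main__":
    # Made-up secondary shifts for illustration: (delta Calpha, delta Cbeta)
    examples = [(1.2, -0.9), (-1.0, 1.1), (0.2, -0.1)]
    print([classify_residue(ca, cb) for ca, cb in examples])
```

Because chemical shifts in a disordered chain are population-weighted averages, a "strand" call here indicates a propensity for extended φ, ψ angles rather than a stably formed β-strand.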
Unlike chemical shift data, which reflect the population-weighted average of the ensemble of structures in the disordered state, NOEs provide a probe for specific interactions that may not be present in the majority of conformers. The NOE is a through-space transfer of magnetization that depends on the population of the interacting conformers and on the inverse sixth power of the distance between the nuclei. Consequently the presence of an NOE between two protons indicates that these atoms are close in space (less than 6 Å apart) for a period of time. We used protein deuterated at all positions other than the amide positions, which are exchangeable, to perform NOESY experiments with long NOE transfer times. Deuteration reduces magnetic relaxation rates, allowing one to increase the length of time for NOE transfer and permitting detection of weak NOEs from interactions that may be transient or long range. This methodology has successfully been used to detect interactions in disordered states (56-58). In the case of the SBD we detected NOEs between consecutive residues and residues that were separated by one (i, i + 2) or two (i, i + 3) residues (Fig. 4). Weak NOEs between residues that are separated by two residues are found in the regions of the SBD immediately after the region that is α-helical in the bound state. This suggests that an α-helix or a turn may be transiently formed in this region. We were unable to detect any long range NOEs, indicating an absence of significantly populated secondary and tertiary structures in the free state of the SBD. A comparison of the intensities of the two sequential NOEs, αN and NN, using a ¹H,¹⁵N,¹³C-labeled sample showed that NOEs between amide protons were much weaker than NOEs between the amide proton and the Cα proton of the previous residue. This provides additional evidence for φ, ψ angles consistent with an extended structure and a lack of persistent α-helical structure.
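The steep distance dependence explains why a weak NOE can report either a long distance or a short distance that is populated only transiently. The sketch below is a deliberately simplified illustration: it assumes the intensity scales as population × r⁻⁶ relative to a reference contact and ignores spin diffusion, internal dynamics, and mixing-time effects. The reference distance of 2.5 Å and the function name are arbitrary assumptions.

```python
# Sketch: relative NOE intensity under a simplified population * r**-6 scaling.


def relative_noe(distance, population=1.0, reference_distance=2.5):
    """NOE intensity relative to a fully populated contact at reference_distance.

    Distances are in angstroms; reference_distance = 2.5 is an ARBITRARY choice.
    """
    if distance <= 0 or reference_distance <= 0:
        raise ValueError("distances must be positive")
    if not 0.0 <= population <= 1.0:
        raise ValueError("population must lie between 0 and 1")
    return population * (reference_distance / distance) ** 6


if __name__ == "__main__":
    # Doubling the distance reduces the intensity 64-fold ...
    print(relative_noe(5.0))
    # ... which is weaker than a close contact formed in only 10% of conformers.
    print(relative_noe(2.5, population=0.1))
```

Under this scaling, a close contact present in a small fraction of the ensemble can give a stronger NOE than a persistent but distant one, which is why long mixing times on a deuterated sample are useful for detecting transient structure.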
To further support our conclusion that the SBD does not have a compact structure, we measured the hydrodynamic radius of the unbound SBD using pulsed field gradient diffusion experiments (50,51). To determine this we compared diffusion rates for SBD and dioxane, a molecule of known size. We obtained a hydrodynamic radius of 25.6 ± 0.7 Å for the 60-residue SBD. The correlation between hydrodynamic radius (Rh) and polypeptide chain length (N) for polypeptides denatured in 6 M guanidinium chloride has been reported as Rh = (2.21 ± 1.07)N^(0.57 ± 0.02) Å (51), which would give 22.8 ± 11.0 Å for a 60-residue protein. For folded polypeptides the hydrodynamic radius is dependent on the shape of the molecule but generally fits the empirical equation Rh = (4.75 ± 1.11)N^(0.29 ± 0.02) Å, or 15.6 ± 3.6 Å for 60 residues (51). Thus, the measured hydrodynamic radius of the SBD is significantly larger than that expected for a folded protein and above the norm for a polypeptide dissolved in 6 M guanidinium chloride. This corroborates the chemical shift and NOE data suggesting that the ensemble of SBD conformations includes many highly extended structures.
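The two empirical scaling relations quoted above can be evaluated directly to reproduce the expected radii for a 60-residue chain. The sketch uses only the central values of the prefactors and exponents given in the text; the function names are illustrative, and the quoted uncertainties are not propagated.

```python
# Sketch: expected hydrodynamic radius (angstroms) from chain length N, using the
# central values of the empirical relations quoted in the text (reference 51).


def rh_denatured(n_residues):
    """Rh = 2.21 * N**0.57 for chains denatured in 6 M guanidinium chloride."""
    return 2.21 * n_residues ** 0.57


def rh_folded(n_residues):
    """Rh = 4.75 * N**0.29 for folded polypeptides."""
    return 4.75 * n_residues ** 0.29


if __name__ == "__main__":
    n = 60
    measured = 25.6  # measured Rh of the free SBD, from the text
    print(f"denatured: {rh_denatured(n):.1f} A")
    print(f"folded:    {rh_folded(n):.1f} A")
    print(f"measured / folded ratio: {measured / rh_folded(n):.2f}")
```

The measured 25.6 Å exceeds both central values, in line with the conclusion that the free SBD samples highly extended conformations.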
Proline Isomer Ratios Indicate Some Conformational Restriction-The peptide bond preceding a proline residue can exist in either the cis or trans conformation. The trans form of the bond is generally slightly lower in energy and, thus, is the preferred orientation. The percentage of peptide bonds in the cis conformation can vary between 6.0 and 37.7% in disordered peptides depending on the identity of the residue preceding the proline (59). In folded proteins the percentage of bonds in the cis conformation ranges from 1.8 to 12.4% because a single conformation is usually adopted that restricts the bond to one or the other isomer. Similarly, in disordered proteins conformational restriction may result in a reduced population of peptide bonds in the cis conformation. Comparison of the populations of cis and trans for the peptide bonds that precede prolines in the SBD with the values seen in random coil peptides suggests a limited form of conformational restriction (Table I). Both the cis and trans conformations can be accessed, but the population in the cis conformation is reduced in a number of cases. In particular, prolines 674 and 677 in the rigid coil region and prolines 701, 706, 713, and 720 in the C-terminal portion of the SBD show conformational restriction. Nonetheless, all of the prolines that we observed were able to access both the cis and trans conformation, confirming a lack of stable structure.
No Region of the SBD Makes a Dominant Contribution to the Interaction with Smad2 MH2-To determine whether a specific region of the SBD dominates the interaction with the MH2, we measured the affinities of peptides corresponding to the proline-rich rigid coil region, the α-helix, and the β-strand for the MH2 using intrinsic fluorescence (Table II). Although the full SBD bound with a dissociation constant of 240 ± 30 nM, the affinities for any single element of the SBD were ~2500- to 8500-fold lower. Of note, the interaction between the α-helix and MH2 was particularly weak and may reflect energy lost due to formation of the helix, which is not present in the free SBD and is, thus, presumably energetically unfavorable on its own. The data demonstrate that no region of the SBD makes a dominant contribution to the interaction with Smad2 MH2. We also examined the free energy of the interactions. Interestingly, the sum of the ΔG values calculated for binding of the individual regions (−11.8 kcal/mol) was significantly more favorable than interaction between the full SBD and MH2 (−8.8 kcal/mol). This may be due to the greater loss of configurational entropy associated with the full SBD binding to the MH2. Alternatively, joining these regions might introduce some sterically or electrostatically unfavorable interactions in the bound state. For example, the α-helical region might be forced into a slightly less favorable conformation on the MH2 surface due to being tethered at either end by the rigid coil and β-strand regions. Thus, a high affinity interaction between the SBD and MH2 is achieved through extensive contacts rather than a few high energy contacts.
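The relation between the dissociation constant and the binding free energy can be checked with ΔG = RT ln(Kd). The sketch below assumes a 1 M standard state and the 16°C temperature of the fluorescence measurements; the text does not state which temperature was used for the ΔG values, so that choice is an assumption, as are the function names.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)


def delta_g(kd_molar, temp_celsius=16.0):
    """Binding free energy in kcal/mol from Kd in mol/liter (1 M standard state).

    ASSUMPTION: the default of 16 C matches the fluorescence experiments.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL_PER_MOL_K * (temp_celsius + 273.15) * math.log(kd_molar)


def delta_delta_g(fold_weaker, temp_celsius=16.0):
    """Free energy penalty in kcal/mol for a Kd that is fold_weaker times larger."""
    return R_KCAL_PER_MOL_K * (temp_celsius + 273.15) * math.log(fold_weaker)


if __name__ == "__main__":
    print(f"full SBD: {delta_g(240e-9):.1f} kcal/mol")
    print(f"2500-fold weaker binding costs {delta_delta_g(2500):.1f} kcal/mol")
```

With Kd = 240 nM this gives approximately −8.8 kcal/mol, matching the value quoted for the full SBD.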
To confirm that all three regions contribute to the interaction between the SBD and MH2 domain, we examined the interaction of full-length Smad2 with wild type SBD and a series of mutants using GST pull-down experiments (Table III and Fig. 5). An interaction between MH2 and wild type SBD was readily detectable; however, we were unable to detect an interaction between MH2 and two truncated forms of SBD in which a portion of the rigid coil (Δrigid coil) or β-strand region (Δβ-strand) was removed. Because we could not remove the α-helical region without affecting the interaction of either the rigid coil or β-strand region, we constructed several GST-SBDs with mutated α-helical regions. In Δα-helix A we made conservative substitutions in this region, whereas in Δα-helix B we substituted with Ala and Gly. We could not detect an interaction with MH2 for either of these constructs. These data imply that all three regions of the SBD are required for a high affinity interaction with the MH2.

TABLE I
Proline isomer ratios of the SARA SBD, indicating some conformational restriction
Experimentally determined percentages of cis are given together with the residue context (residue preceding the proline) and the expected cis population for a random coil when the proline is preceded by the given residue. A significantly different percentage in the cis conformation relative to the random coil provides evidence for conformational restriction at that site.

TABLE III Interaction of SARA SBD mutants with Smad2 MH2
The interactions between MH2 and SBD mutants were tested using GST pull-down experiments. Interactions are classified as positive (+) or not detectable (−). The symbols in the first row delineate the secondary structure boundaries of the rigid coil, α-helix, and β-strand observed in the crystal structures of the bound SBD.

Results provide no evidence for significantly populated secondary or tertiary structures. NOEs were measured from a ¹H,¹⁵N HSQC-NOESY-HSQC of a ¹⁵N-perdeuterated sample of SBD with a mixing time of 600 ms at 800 MHz (A) and a NOESY-HSQC of a ¹H,¹⁵N,¹³C sample of SBD with a mixing time of 150 ms at 500 MHz (B). Dashed lines represent ambiguous NOE data. No NOEs were observed between amide protons separated by more than three residues in the primary sequence.

FIG. 5. GST pull-down analysis of Smad2-SBD interaction.
Purified GST-SBD and mutated SBD were incubated with 293T cell lysates expressing FLAG-tagged Smad2. After washing, protein complexes were released from the GST resin by boiling in SDS sample buffer and separated by SDS-PAGE. Bound FLAG-Smad2 was visualized with anti-FLAG antibody. Representative experiments for the data in Table III are shown. The amount of Smad2 used in each pull-down is shown in the "Smad2 input" lane.
Although the interaction of MH2 with Δα-helix A and Δα-helix B was undetectable, we noted that the α-helical region of Drosophila SARA is significantly different from its mammalian orthologs. Therefore, we examined additional α-helical mutants. dSARA-chimA and dSARA-chimB are chimeras in which the human α-helical region was substituted with the α-helical region from the Drosophila homologue. Because this region is shorter in the Drosophila protein, we made a long (A) and short (B) construct. In both cases the interaction was detectable. This suggests that there is considerable flexibility in the sequences that can interact with the corresponding MH2 surface. To confirm this we made a third construct, Δα-helix C, in which we substituted a portion of an α-helix taken from Aspergillus oryzae ribonuclease T1 (60). We selected this sequence because it is known to form an independently stable α-helix. Δα-helix C also made a detectable interaction with Smad2 MH2. Altogether, these data suggest that although several of the residues in the wild type α-helix do interact with the MH2 (the four residues on the opposite face of the helix do not interact), the interaction is primarily hydrophobic and can presumably be satisfied by a number of different residues. Furthermore, removal of three residues in dSARA-chimB did not abrogate the interaction with the MH2, indicating that there is sufficient "slack" in the wild type SBD to accommodate removal of these residues or, alternatively, that dSARA does not form an α-helix. As a consequence of the large size of the interaction interface between MH2 and SBD and the lack of inherent structure in the SBD, there appears to be flexibility in the exact primary sequence of the SBD required for binding, at least in the α-helical region. The different sequences of other MH2 partners suggest that Smad2 has an ability to accommodate divergent sequences.
DISCUSSION

We have shown here that the free SBD is disordered in solution and has very little propensity for any stable secondary structure. NMR-based NOE, chemical shift, and pulsed field gradient diffusion measurements all confirm the disordered and extended nature of the SBD conformational ensemble. Additionally we have found that the rigid coil, α-helical, and β-strand regions of the SBD all make significant contributions to the free energy of binding based on fluorescence binding and GST pull-down experiments. Importantly, no region appears to make a dominant contribution.
The MH2 domain of Smad2 recognizes a wide variety of protein ligands that do not share a common sequence motif. Our demonstration that the unbound SBD is disordered and that no region of the SBD energetically dominates the interaction with MH2 suggests that the MH2 domain, with its large binding area, is amenable to interacting with a diverse group of disordered hydrophobic ligands. X-ray crystallography studies have shown that the SBD of SARA binds to the MH2 domain of Smad2 in an extended conformation stretching around the MH2 and contacting a large primarily hydrophobic surface area on the MH2. Protein regions that adopt an extended conformation are potentially able to interact with a larger portion of this surface area than a folded protein containing a similar number of residues. This may give the MH2 a preference for binding disordered regions of proteins. PONDR predictions that CBP, TGIF, and AML1 have propensity for disorder and data implying that FoxH1 and Mixer also bind to Smad2 MH2 in an extended conformation support the idea that the MH2 has a large binding site with affinity for multiple disordered ligands.
The idea that a protein can have a surface that is a preferred site for binding multiple ligands is not unique to this study of Smad2 MH2. Delano et al. (61) examined protein and peptide interactions with the Fc fragment of human immunoglobulin G. Screening a library of random peptides for those that bound to the Fc fragment led to the identification of two peptides, both of which were found to bind the hinge region of the Fc fragment. Additionally, the hinge region interacts with four other proteins that are unrelated by structure or sequence. The hinge region of the Fc fragment was characterized as being an accessible hydrophobic surface area with few sites for polar interactions, which impose geometric constraints. Delano et al. (61) propose that these properties make the hinge region a preferred binding site for a wide variety of different hydrophobic ligands. We propose here that the MH2 domain has similar preferred binding sites across the portion of its surface that recognizes the SBD. These hydrophobic binding patches are able to recognize a wide variety of targets, which do not share a common sequence motif, provided that they have a threshold of appropriately spaced hydrophobic residues. The spacing of hydrophobic residues in the ligand with respect to the spacing of hydrophobic patches on the MH2 may be as important as the specific identity of the residue, thus conferring considerable plasticity on MH2 domain interactions. Consistent with this, we have found that no region of the SBD makes a dominant contribution to the interaction. Rather, the combination of several weak interactions leads to a high-affinity interaction. This is also supported by the diverse sequences that bind to the MH2.
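The statement that several weak interactions combine into a high-affinity interaction can be illustrated numerically. The sketch below assumes, as a simplification, that the regional contributions to the binding free energy are strictly additive, and it uses hypothetical values of -3 kcal/mol per region; these numbers are chosen for illustration only and are not the measured values from this study. The function names `kd_from_dg` and `combined_kd` are likewise illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def kd_from_dg(dg_kcal_per_mol, temp_k=298.15):
    """Dissociation constant (M, 1 M standard state) for a binding free
    energy in kcal/mol, using dG = RT ln(Kd); negative dG is favorable."""
    return math.exp(dg_kcal_per_mol / (R_KCAL * temp_k))


def combined_kd(contributions, temp_k=298.15):
    """Kd for a ligand whose regional contributions are assumed to add."""
    return kd_from_dg(sum(contributions), temp_k)


# Hypothetical per-region contributions (kcal/mol), NOT data from the paper.
regions = {"rigid coil": -3.0, "alpha-helix": -3.0, "beta-strand": -3.0}

if __name__ == "__main__":
    for name, dg in regions.items():
        print(f"{name:12s} alone: Kd ~ {kd_from_dg(dg):.2e} M")
    print(f"all regions       : Kd ~ {combined_kd(regions.values()):.2e} M")
```

Under these assumptions each region alone would bind only in the millimolar range, whereas the three together reach the sub-micromolar range, because additive free energies multiply the corresponding dissociation constants. Real systems deviate from strict additivity, so this is a qualitative picture rather than a quantitative model of the SBD.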
The free energy of interaction is the difference between the free energy of the complex and that of the unbound components, emphasizing the importance of the structures of the free components. Because the SBD binds to the MH2 domain in an extended conformation, it must proceed through an extended or disordered state before binding. If the SBD were natively folded it would have to proceed from a compact folded structure through an extended disordered state before binding the MH2. Thus, whether the SBD is natively folded or disordered, there is a disorder-to-order transition and a considerable loss of configurational entropy upon binding. However, binding by a natively folded SBD would be less favorable because of the additional free energy cost associated with unfolding the SBD. Thus, the disordered state of the free SBD allows it to interact with a large surface area on the MH2 domain with no energetic penalty associated with an unfolding event.
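The argument in the preceding paragraph can be written as a simple thermodynamic cycle. The symbols below are introduced here for illustration and do not appear in the original text; the decomposition assumes that a hypothetical folded SBD must fully unfold before it binds in the extended conformation.

```latex
% Binding free energy as complex minus unbound components
\Delta G_{\mathrm{bind}} = G_{\mathrm{complex}} - \left( G_{\mathrm{MH2}} + G_{\mathrm{SBD}} \right)

% Hypothetical natively folded SBD: unfolding precedes binding
\Delta G_{\mathrm{bind}}^{\mathrm{folded}}
  = \Delta G_{\mathrm{unfold}} + \Delta G_{\mathrm{bind}}^{\mathrm{disordered}},
\qquad \Delta G_{\mathrm{unfold}} > 0
```

Because the unfolding term is positive for a stably folded protein, the folded ligand would bind less favorably than the disordered one by that amount, while the entropic cost of ordering the extended chain on the MH2 surface is paid in both cases.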
Further analysis is required to determine whether these principles are also applicable to other MH2 interactions. Certainly the interaction between disordered proteins and the large hydrophobic surfaces on the MH2 can account for the exceptionally diverse binding capabilities of the MH2. Disordered ligands are flexible and can potentially access a larger proportion of the surface area of an interacting protein than can a structured protein. There is also more plasticity in the arrangement and identity of particular residues in the sequence of a disordered protein. Coupled with the loose requirement for appropriately spaced hydrophobic residues, this could translate into a large subset of amino acid sequences which can be recognized by the MH2. Future work will involve developing an algorithm to design MH2 binding partners based on this model. Our model also suggests how the affinity of interactions with the MH2 may be controlled. First, the affinity of a particular ligand for the MH2 may depend on the proportion of the MH2 binding areas that are occupied. For example, the FoxH1/Mixer SIM may interact with the rigid coil binding area and the α-helix binding area but not the β-sheet binding site. Additionally, it is known that MH2 domains have binding surfaces outside of the areas recognized by the SBD, which are involved in binding other MH2 domains, receptors, and other interacting proteins. Therefore, there may be many possible extended binding sites on the MH2. This is relevant in the identification of the MH2 binding region in an interacting partner, as the minimum sequence required for a detectable interaction is not necessarily the full MH2 binding region.
A second implication of our studies is that interaction with the MH2 may be controlled by the structure of the ligand. Not all of the MH2 ligands are predicted to have extensive natively disordered regions, and thus, an interaction with the MH2 may require an unfolding event, which could represent a layer of control for MH2 interactions. The model also emphasizes that specific residues that do not physically interact with the MH2 may be important because they function in maintaining the structure or disorder of the ligand.
It is noteworthy that SARA binds with high affinity to Smad2 and -3 but not to Smad1, -4, -6, or -7. Despite Smad2 MH2 being able to bind a diverse group of ligands, specificity is maintained. There are proteins that bind to Smad2 but not Smad1 and vice versa. Further work will be required to demonstrate whether MH2 domains from Smad1, -4, -6, or -7 have a similar mechanism of target recognition. It is clear that the complexity of responses to the relatively simple Smad pathway is governed by a large number of interacting proteins. The unique recognition mechanism employed by Smad MH2 domains may facilitate exquisite control of binding affinity with a diverse set of targets.