Interaction of Smad Complexes with Tripartite DNA-binding Sites*

The Smad family of transcription factors functions as effectors of transforming growth factor-β signaling pathways. Smads form heteromultimers capable of contacting DNA through the amino-terminal MH1 domain. The MH1 domains of Smad3 and Smad4 have been shown to bind to the sequence 5′-GTCT-3′. Here we show that Smad3 and Smad4 complexes can contact three abutting GTCT sequences and that arrays of such sites elevate reporter expression relative to arrays of binding sites containing only two GTCTs. Smad3/4 complexes bound synergistically to probes containing two of the four possible arrangements of three GTCT sequences and showed a correlated ability to synergistically activate transcription through these sites. Purified Smad3 and Smad4 were both able to contact three abutting GTCT sequences, and reporter experiments indicated that either protein could mediate contact with all three GTCTs. In contrast, the Smad4 MH1 domain was essential for reporter activation in combination with Smad1. Together, these results show that Smad complexes are flexible in their ability to interact with abutting GTCT triplets. In contrast, Smads have high affinity for only one orientation of abutting GTCT pairs. Functional Smad-binding sites within several native response elements contain degenerate GTCT triplets, suggesting that trimeric Smad-DNA interaction may be relevant in vivo.

Here we examine how Smad box number and arrangement affect the ability of Smad complexes to bind DNA and activate transcription. The results show that individual Smad complexes are capable of contacting up to three abutting Smad boxes. We also found that contact with three Smad boxes can stabilize binding and enhance transcriptional activation relative to contact with two Smad boxes. While Smads interact efficiently with only one orientation of two Smad boxes, there is surprising flexibility in the ability to interact with various orientations of three Smad boxes. Furthermore, a subset of triple Smad box arrangements that resemble sequences of some natural Dpp and TGF-β response elements exhibits a high degree of synergism in response to heteromeric Smad complexes. We also show that Smad1/4 complexes, although similar to Smad3/4 complexes in their ability to recognize Smad box sites, are more dependent on Smad4 for DNA contact. Together, these results show how Smad box number and orientation affect direct Smad-DNA contact and provide a framework for investigating the role of direct Smad contact with native response elements.

AGACGTCTTTGG; RRR, ACTTGTCTGTCTGTCTTTGAATTCTTAGTCTGTCTGTCTTTGG; RLL, ACTTGTCTAGACAGACTTGAATTCTTAGTCTAGACAGACTTGG; LRR, ACTTAGACGTCTGTCTTTGAATTCTTAAGACGTCTGTCTTTGG; LRmR, ACTTAGACATCTGTCTTTGAATTCTTAAGACATCTGTCTTTGG; RLmR, ACTTGTCTAGATGTCTTTGAATTCTTAGTCTAGATGTCTTTGG; RLmL, ACTTGTCTAGATAGACTTGAATTCTTAGTCTAGATAGACTTGG.
To constitutively express proteins in Drosophila S2 cells we used the Act5C promoter plasmid pPacPL. Smad cDNAs used to generate effector plasmids were generously provided by Y. Zhang, R. Derynck, and J. Massague. To generate Smad3NLC, the Smad3 coding sequence, possessing a BamHI site immediately adjacent to the initiator ATG, was cloned into the BamHI and KpnI sites of pPac. The construct Smad3LC is identical to Smad3NLC except that 142 amino acids were removed from the MH1 domain and a new BamHI site and ATG were introduced at the EagI site using a pair of oligos. The sequence between the BamHI and EagI sites is 5′-GGATCCATGTCGTCCCCGGCCG-3′. FLAG-Smad4NLC was built by 1) cloning FLAG-Smad4NLC from pCMV5 into the HindIII and BamHI sites of pBluescript KS+ and then 2) transferring the resulting HindIII (blunt)-NotI fragment into the EcoRV and NotI sites of pPac. To construct FLAG-Smad4LC, the Smad4 DNA-binding domain was removed by an internal deletion of 573 nucleotides between two XcmI sites within FLAG-Smad4NLC. The 3′ overhangs generated by XcmI cleavage were removed with Klenow prior to religation to preserve the reading frame. The Smad1 coding sequence was introduced into the BamHI and KpnI sites of pPac to generate Smad1NLC. For the construct Smad1LC, 147 amino acids were removed from the amino terminus and a new ATG was introduced by polymerase chain reaction. Primers for polymerase chain reaction were TTTTCTGCAGTATTCATGCCTCAGCACAGCCTC (5′ primer) and TCCAATATGCCGCCTGGTGTTTTC (3′ primer). Activated versions of the ActRIB and Thickveins receptors, ActRIB(TD) and Tkv(QD), were kindly provided by J. Massague and Y. Chen, respectively. Each was expressed using pPac.
Transient Transfections and β-Galactosidase Assays-For each reporter assay 8 × 10⁵ Drosophila S2 cells were transfected with a total of 1 µg of DNA (50 ng of hsplacCaSpeR reporter, 50 ng of ActRIB(TD) or Tkv(QD) expression plasmid, 100 ng of each appropriate Smad expression plasmid, and pPac to bring total DNA to 1 µg) using DOTAP (Sigma). After 36 h cells were lysed in 50 µl of phosphate-buffered saline containing 0.1% Nonidet P-40. β-Galactosidase activity was assayed from 10 µl of extract mixed with 90 µl of 85 mM potassium phosphate (pH 7.3), 9 mM KCl, 1 mM MgCl₂, 54 mM 2-mercaptoethanol, 5 mM CPRG (Roche Molecular Biochemicals). Activity was measured as the rate of color change over time at 595 nm using a microplate reader (Bio-Rad).
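The readout described above, a rate of color change at 595 nm, corresponds to the slope of absorbance plotted against time. The text does not say how the slope was computed, so the following is only a minimal sketch assuming an ordinary least-squares fit; the function name, time points, and absorbance readings are hypothetical illustrations, not data from this study.

```python
def rate_of_color_change(times_min, absorbances):
    """Least-squares slope of A595 versus time (absorbance units per minute).

    Assumption: the reaction is in its linear range over the supplied readings.
    """
    n = len(times_min)
    if n != len(absorbances) or n < 2:
        raise ValueError("need at least two paired readings")
    mean_t = sum(times_min) / n
    mean_a = sum(absorbances) / n
    # Sum of squared deviations in time, and the time/absorbance cross term.
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - mean_t) * (a - mean_a)
              for t, a in zip(times_min, absorbances))
    return sxy / sxx


# Hypothetical plate-reader trace: one well read every 10 min.
example_rate = rate_of_color_change([0, 10, 20, 30], [0.10, 0.15, 0.20, 0.25])
print(f"{example_rate:.4f} A595/min")
```

Fitting a slope over several readings, rather than taking a single end-point value, keeps the estimate less sensitive to differences in starting absorbance between wells.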
Gel Shifts-Smad3 and Smad4 were individually expressed in SF9 cells as His-tagged fusion proteins using a baculovirus expression system. Smad3 was coexpressed with the Alk5 receptor possessing the activating mutation. The Smads were purified to near homogeneity using Talon metal-affinity resin (CLONTECH).
For gel shift assays, purified protein was mixed with ≥0.01 pmol of probe in a buffer containing 20 mM Tris (pH 7.5) or 20 mM Hepes (pH 7.5), 100 mM NaCl, 2% Ficoll, 1 mM dithiothreitol, 0.01% Nonidet P-40, and 500 µg/ml bovine serum albumin. Binding was allowed to proceed for 30 min at room temperature. Protein-DNA complexes were separated from free probe on 5% nondenaturing polyacrylamide gels cross-linked at a ratio of 60:1 (acrylamide:bisacrylamide). Dried gels were visualized with a Molecular Dynamics Storm Model 860.
When S2 extracts were used for gel shifts, bovine serum albumin was omitted from the buffer and 6.6 µg/ml dI-dC was included. Whole cell extracts were prepared from 1 × 10⁷ S2 cells transfected with a total of 7.5 µg of DNA (2.5 µg of each appropriate Smad expression plasmid, 2.5 µg of ActRIB(TD) or Tkv(QD), and pPac to bring total DNA to 7.5 µg). After 36 h cells were lysed in 300 µl of 50 mM Tris (pH 7.5), 100 mM NaCl, 0.5% Nonidet P-40, 1 mM EDTA, 2 mM dithiothreitol, 1 mM sodium orthovanadate, 1 mM benzamidine, 1 mM phenylmethylsulfonyl fluoride, and 1 µg/ml pepstatin. Each 30-µl binding reaction contained 5 µl of extract. Reactions were allowed to proceed on ice for 45 min. Electrophoresis was done at 4°C for 135 min at 175 V. For supershifts, 1 µl of the antibodies (2-3 µg) was added to the reactions after 20 min. Smad3 was supershifted with monoclonal anti-Smad1/2/3 (Santa Cruz Biotechnology) while FLAG-Smad4 was supershifted with monoclonal anti-FLAG M2 (Sigma) or monoclonal anti-Smad4 (Santa Cruz Biotechnology). Protein expression levels were examined by Western blot using an ECL chemiluminescent kit (Amersham). Probes for gel shift were prepared by end labeling of oligos with ³²P.

Structure of Native Smad-binding Sites-Fig. 1A shows that Smad-binding sites within native TGF-β/activin response elements of PAI-1, collagenase, COL7A1, Mix.2, goosecoid, and JunB contain abutting perfect (solid arrows) and degenerate (dotted arrows) Smad boxes arranged primarily as direct repeats. We refer to this direct repeat arrangement as right-right (RR), as shown in Fig. 1B. Among the characterized TGF-β/activin targets, the right-left (RL) arrangement corresponding to the GTCTAGAC sequence identified by binding site selection occurs only once in the goosecoid element, but occurs five times in the Dpp response element of tinman, including one example of a match to the consensus site. The goosecoid and JunB elements contain examples of three abutting Smad box/degenerate box sequences in an RRR direct repeat arrangement. COL7A1 contains an RR arrangement with an inverted third degenerate box, an arrangement we refer to as left-right-right (LRR). The LRR arrangement also occurs in Mad/Medea-binding sites of the vestigial, Ubx, and tinman Dpp response elements (LLR is LRR in reverse orientation), along with RL and LR pairs and one example of RLL (Fig. 1A). The LRR sequences of vg and Ubx can also be modeled as RR arrangements in the opposite direction.

FIG. 1. A, GTCT Smad boxes (solid arrows) or related Smad box-like sequences (dashed arrows) can be identified in sites directly bound by Smads as demonstrated by gel shift, footprinting, and mutational analysis. Each is labeled "R" for rightward, or "L" for leftward. The vestigial (26), Ubx (32), and tinman (33) sequences are from Dpp responsive elements. PAI-1 (29), collagenase (15), COL7A1 (53), Mix.2 (46), goosecoid (36), and JunB (34) are responsive to TGF-β or activin. B, arrangements of Smad boxes in synthetic binding sites used to generate two-Smad box reporter constructs. C, configuration of Smad boxes (arrows) in the RR reporter. Smad boxes were positioned adjacent to a basal hsp70 promoter driving lacZ. Oligo-derived inserts, each encoding two sets of Smad boxes separated by 11 base pairs, were multimerized three times with the middle insert inverted relative to the flanking inserts. Spacing between Smad boxes of adjacent inserts is 8 base pairs.

These multiple arrangements of Smad boxes and Smad box-like sequences within functional response elements suggest that Smad complexes might interact flexibly with two or three abutting Smad boxes in various orientations. Such flexibility could perhaps derive from the linker that separates the DNA-contacting MH1 domains from the trimerizing MH2 domains.
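The R/L shorthand for Smad box arrangements can be made concrete with a short script. This is an illustrative sketch only, not a tool used in the study: it labels each perfect GTCT as "R" and its reverse complement AGAC as "L", then reports the longest run of abutting boxes. The function names are hypothetical, and degenerate Smad box-like sequences are deliberately not recognized.

```python
# Perfect Smad boxes only; AGAC is the reverse complement of GTCT.
BOXES = {"GTCT": "R", "AGAC": "L"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def box_arrangement(seq):
    """Return the longest run of abutting 4-bp Smad boxes as R/L labels."""
    seq = seq.upper()
    best = ""
    for start in range(len(seq) - 3):
        labels = []
        pos = start
        # Extend the run while the next 4 bases form a perfect box.
        while seq[pos:pos + 4] in BOXES:
            labels.append(BOXES[seq[pos:pos + 4]])
            pos += 4
        if len(labels) > len(best):
            best = "".join(labels)
    return best


# First halves of three of the synthetic binding-site oligos listed in the text.
print(box_arrangement("ACTTGTCTGTCTGTCTTTG"))  # RRR
print(box_arrangement("ACTTGTCTAGACAGACTTG"))  # RLL
print(box_arrangement("ACTTAGACGTCTGTCTTTG"))  # LRR
# Reading an LRR site from the other strand gives LLR.
print(box_arrangement(reverse_complement("AGACGTCTGTCT")))  # LLR
```

The last call illustrates why LLR and LRR are treated as the same arrangement: they are one site read from opposite strands.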
Activation by Smad3 and Smad4 Is Dependent upon the Orientation of Paired Smad Boxes-To determine how Smad box orientation affects recognition by Smads we constructed a series of lacZ reporter genes in which a basal Drosophila hsp70 promoter was positioned adjacent to six pairs of Smad boxes arrayed in the RR, LR, or RL configurations (Fig. 1, B and C). These reporter plasmids were co-transfected into Drosophila S2 cells together with effector plasmids that expressed human Smad3, Smad4, and an activated form of the ActRIB receptor, ActRIB(TD) (Fig. 2). ActRIB is specific for the activin/TGF-β pathway (41) and was co-expressed to promote phosphorylation of Smad3 in S2 cells.
In the absence of ActRIB(TD) neither Smad3 nor Smad4 alone activated reporter transcription substantially above background, while co-transfection with both Smads resulted in 13-, 4-, and 80-fold activation of RR, LR, and RL, respectively, over no-Smad control transfections (Fig. 2A). For RL, this represents a 24-fold stimulation of reporter activity by coexpressed Smad3 and Smad4 over the combined levels of activation observed when Smad3 and Smad4 are expressed individually.
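Two different ratios are used in the comparison above: fold activation over the no-Smad control, and stimulation over the combined (additive) activities of the two Smads expressed individually. The sketch below shows one way to compute both. It assumes plain ratios of raw reporter activities with no background subtraction, which the text does not specify; the function names and input values are hypothetical and are not measurements from Fig. 2.

```python
def fold_activation(activity, control_activity):
    """Reporter activity relative to the no-Smad control transfection."""
    if control_activity <= 0:
        raise ValueError("control activity must be positive")
    return activity / control_activity


def synergy_ratio(both, first_alone, second_alone):
    """Activity with both Smads divided by the sum of the individual activities.

    Values above 1 indicate a greater-than-additive (synergistic) response;
    values at or below 1 indicate an additive or less-than-additive response.
    """
    additive = first_alone + second_alone
    if additive <= 0:
        raise ValueError("summed individual activities must be positive")
    return both / additive


# Hypothetical activities in arbitrary units (illustration only).
control, smad3_only, smad4_only, smad3_plus_smad4 = 1.0, 2.0, 3.0, 50.0
print(fold_activation(smad3_plus_smad4, control))               # 50.0
print(synergy_ratio(smad3_plus_smad4, smad3_only, smad4_only))  # 10.0
```

Keeping the two ratios separate matters for interpreting the reporters: a site can show high fold activation with little synergy, or modest fold activation with strong synergy.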
Inclusion of ActRIB(TD) in transfections stimulated activation of all reporters by coexpressed Smad3 and Smad4 at least 5-fold (Fig. 2B, note difference in scales for β-galactosidase activity). In contrast to the results obtained without receptor, the RL reporter was strongly activated by individually transfected Smad3 or Smad4, while co-transfection with both Smads resulted in levels of induction comparable to levels observed for Smad3 alone. Activation of RL by Smad4 alone as a consequence of being coexpressed with ActRIB(TD) suggests an interaction with an endogenous r-Smad (42, 43) since Smad4 is not phosphorylated by receptors. Increasing the spacing between the RL Smad boxes by a single base pair abolished activation (Fig. 2B inset, compare RL to RL+1). In contrast to RL, the RR and LR reporters responded synergistically to coexpression of Smad3 and Smad4 in the presence of ActRIB(TD), although the overall levels of induction were 8- and 20-fold lower than for RL, respectively. The observed failure of RR and LR reporters to respond to Smad3 alone is consistent with previous reports that without Smad4, Smad3 has little effect on the PAI-1, collagenase I, or JunB promoters (15, 29, 34), each of which contains functional RR sites (Fig. 1). Together, these results indicate that the level of Smad-dependent transcriptional activation depends on both the orientation and spacing of Smad boxes. Furthermore, the RR and LR orientations of Smad boxes require both Smad3 and Smad4 for maximal transcriptional activation, while RL is able to respond to Smad3 alone.
All Orientations of Three Abutting Smad Boxes Mediate High Levels of Activation by Smad3 and Smad4-As discussed above, potentially tripartite Smad-binding sites exist in both TGF-β/activin and Dpp response elements. To determine whether interaction with a third Smad box would enhance activation, we constructed a set of reporters in which each binding site consisted of three Smad boxes. Each of the four possible arrangements of three Smad boxes was tested with ActRIB(TD) in the presence or absence of co-transfected Smads (Fig. 3A). Addition of a third Smad box to RR, either as RRR or LRR, elevated reporter expression in response to Smad3 plus Smad4 in comparison to the levels detected for RR and LR (e.g. compare Fig. 3A, bar 4, LR, RR, and LRR). For RRR and LRR, coexpression of both proteins produced responses 2- and 5-fold greater than the additive response of Smad3 and Smad4 expressed separately (e.g. compare Fig. 3A, LRR bar 4 to bars 2 and 3). Reporter activation was also elevated by addition of a third Smad box to RL, either as RLR or RLL, although the effect was less dramatic than for RRR and LRR. Like RL, RLR and RLL had a less than additive (non-synergistic) response to the coexpression of both Smad3 and Smad4 (in Fig. 3A compare bar 4 with bars 2 and 3 for RLR and RLL reporters).
While these data suggest that Smad3/4 complexes are capable of recognizing triple-Smad box sites, it was also possible that in each case the observed levels of reporter activity resulted from interaction with only the two outside Smad boxes.
To determine whether the middle Smad box was dispensable for activation we generated a set of reporter constructs in which the sequence of the middle Smad box was changed from GTCT to ATCT (Fig. 3B). The Smad3 MH1 domain has been shown to make specific contacts to the guanine at the first position in the Smad box (30) and mutation to adenine has been shown to disrupt binding of GST-Smad3MH1 or GST-Smad4MH1 in the context of an RL pair (28). We found that the mutated reporter constructs were not induced by co-transfection with Smad3 and Smad4 regardless of the orientation of the outside Smad boxes (Fig. 3B).
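The middle-box mutation can be written out explicitly. In the mutant oligos listed with the synthetic binding sites (LRmR, RLmR, RLmL), a rightward middle box changes from GTCT to ATCT, while a leftward middle box changes from AGAC to AGAT, which is the same G-to-A substitution viewed from the opposite strand. The sketch below reproduces those three core sequences; the function name is hypothetical, and it assumes a 12-bp site of three perfect, abutting boxes.

```python
def mutate_middle_box(site):
    """Mutate the middle box of a three-box site (12 bp, top strand).

    GTCT -> ATCT for a rightward (R) middle box.
    AGAC -> AGAT for a leftward (L) middle box, i.e. the same G-to-A
    change on the opposite strand.
    """
    site = site.upper()
    if len(site) != 12:
        raise ValueError("expected three abutting 4-bp boxes (12 bp)")
    left, middle, right = site[:4], site[4:8], site[8:]
    if middle == "GTCT":
        mutated = "ATCT"
    elif middle == "AGAC":
        mutated = "AGAT"
    else:
        raise ValueError("middle box is not a perfect GTCT or AGAC box")
    return left + mutated + right


print(mutate_middle_box("AGACGTCTGTCT"))  # LRR -> LRmR core: AGACATCTGTCT
print(mutate_middle_box("GTCTAGACGTCT"))  # RLR -> RLmR core: GTCTAGATGTCT
print(mutate_middle_box("GTCTAGACAGAC"))  # RLL -> RLmL core: GTCTAGATAGAC
```

Because only the middle box is altered, any residual activity of such a site would have to come from the two outside boxes, which is the question these constructs were designed to answer.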
Activated Smad3/4 Complexes Bind Cooperatively to Two Arrangements of Triple Smad Boxes-To assess the effects of binding site organization on Smad DNA binding affinity, gel shift assays were performed using whole cell extracts from S2 cells co-transfected with Smad3, FLAG-Smad4, or both Smads. ActRIB(TD) was included in each transfection. Unlike the reporters, which had six Smad-binding sites, each gel shift probe contained only a single Smad-binding site consisting of two or three Smad boxes. The RR and LR probes were not bound by Smad3, Smad4, or both proteins in the same extract (Fig. 4A, lanes 2-4 and 6-8). In contrast, the RRR and LRR probes gave rise to weak gel-shift bands with Smad3 or Smad4 alone (Fig. 4A, lanes 16, 17, 20, and 21), and a stronger novel band when both proteins were coexpressed (asterisks in Fig. 4A, lanes 18 and 22). This result provides direct evidence that native Smad3/4 complexes can bind cooperatively to three tandem Smad boxes, and suggests that the correlated effects on reporter activation described above are due in part to differences in DNA binding affinity.
Lighter exposure in the bottom half of Fig. 4A shows that the novel gel-shift band is also specific to Smad3/4 extracts in lanes 12, 26, and 30 that contain RL, RLR, and RLL probes, respectively. This novel band was eliminated by incubation with antibody against Smad3 or against the Flag tag at the amino terminus of Smad4 (Fig. 4A, lanes 13 and 14), evidence that both proteins were contained in this complex. Anti-Flag gave a stronger super-shifted band than anti-Smad3 (compare bands marked "S" in Fig. 4A, lanes 13 and 14), suggesting that anti-Smad3 might have acted primarily to disrupt formation of the Smad3/4 complex. Consistent with this, we failed to detect a super-supershift when both antibodies were included in the binding reaction (data not shown).
Although the stoichiometry of the Smad3/4 gel shift complex is unknown, the ability of the complex to bind RRR and LRR probes but not RR or LR probes suggests it consists of at least three subunits (compare lanes 18 and 22 with lanes 4 and 8). The slower mobility Smad4 band was weaker in the shift of the RRR probe (lane 18) suggesting that RRR has a higher affinity for the Smad3/4 complex than for Smad4 alone, consistent with the synergistic response to coexpression of Smad3 and Smad4 shown in Fig. 3A. A faint band from the control extract migrated at the same position as the Smad3 band and may represent binding by endogenous Drosophila Smads (compare lanes 9 and 10, for example).
The binding activity of Smad3 in whole cell extracts was approximately 50-fold lower than what was observed for Smad4. Western blot analysis (Fig. 4B) showed that this reflected a difference in binding activity since Smad3 protein was only about 2-fold lower in concentration than the Smad4 protein in these extracts. Relative to Smad4, the low DNA binding activity of Smad3 in whole cell extracts might be explained by incomplete activation by co-transfected ActRIB(TD) since the DNA binding activity of the Smad3 MH1 domain has been shown to be inhibited in non-activated full-length Smad3 generated by bacterial expression (29). Alternatively, the high DNA binding affinity of baculovirus-derived Smad3 (see below) may have been an artifact of its modified sequence (i.e. it was tagged at the NH₂ terminus with hexahistidine) or of abnormal post-translational modification resulting from overexpression.
It is not clear why Smad3/4 complexes migrate at a faster rate than complexes with only Smad3 or Smad4, but this observation may be related to the report that Smad2-Smad4 complexes appear smaller than homomeric Smad2 complexes when fractionated by gel filtration (39). The faster mobility indicates that the Smad3/4 complexes detected in gel shifts are probably not larger than homomeric Smad3 or Smad4 complexes, and are therefore not likely to be hexamers.

FIG. 4. A, gel shift assays in which 5 µl of extract was used for each binding reaction. Each ³²P-labeled DNA probe contained a single binding site of two or three adjacent Smad boxes. Gel shift complexes were identified as containing Smad3 (3), FLAG-Smad4 (4), or both Smads (asterisk). For supershifts (S) of Smads bound to the RL probe, antibodies to Smad3 (3) or the FLAG epitope (F) were added to the binding reaction. Top panel is a longer exposure to visualize shifts of RRR and LRR probes. Note the faster mobility of bands containing Smad3 and Smad4 (asterisks) compared with band shifts with Smad3 or FLAG-Smad4 alone. Also note that RRR and LRR are bound much more weakly than RL by the Smad3/4 complex but are activated nearly as well in reporter assays. Likewise, FLAG-Smad4 binds RL, RLR, and RLL with much higher affinity than Smad3. B, expression levels of Smad3 and FLAG-Smad4 were compared by Western blotting using 18 ng of purified His-Smad3 (S3) or His-Smad4 (S4) to estimate the amount of Smad protein present in 2.5 µl of extract. C, gel shifts using 50 ng or 250 ng of purified His-Smad3 or His-Smad4. Proteins were expressed and purified using a baculovirus expression system. His-Smad3 was coexpressed with a constitutively active version of the Alk-5 receptor.
Interaction of Purified Smad3 and Smad4 Complexes with Three Abutting Smad Boxes-Direct binding to the GTCT arrays was also tested by gel shift assays using purified full-length Smad3 and Smad4 generated using a baculovirus expression system. Smad3 was coexpressed with activated Alk-5 receptor to promote phosphorylation. Smad3 and Smad4 bound efficiently to the RL probe but not to the RR or LR probes (Fig. 4C). Each protein also bound with higher affinity to the triple Smad box probes LRR and RRR. These data show that activated full-length Smad3 is capable of binding to DNA directly with affinity and specificity for Smad box orientation that is similar to that of Smad4. The high affinity of Smad3 and Smad4 for RRR and LRR versus LR and RR (e.g. compare lane 23 to lanes 8 and 13) indicates that both proteins are capable of forming complexes that can contribute three MH1 domains to DNA contact. The Smad-DNA complexes migrated similarly regardless of whether the probe had two or three Smad boxes, evidence that each probe was bound by a protein complex of the same size. This indicates that the number of subunits present in each Smad complex is fixed, with at least three DNA-binding domains available to contact DNA.
Together, the results of the reporter and DNA binding assays are consistent in indicating that, when present in three-box sites, RR and LR pairs contribute to strong Smad-DNA interaction and transcriptional activation, both of which respond synergistically to the combined activities of Smad3 and Smad4.
Requirement of the MH1 Domain for Reporter Activation-The results presented above indicate that Smad3 and Smad4 are both capable of contributing to transcriptional activation through direct DNA contact. To test this, we expressed versions of Smad3 and Smad4 lacking the MH1 DNA-binding domain (Smad3LC and Smad4LC, respectively). When the mutant Smads were expressed individually or together, no induction of reporter activity was observed (Fig. 5, bars 3, 5, and 9). Western blots showed that Smad3LC and Smad4LC were present in cell extracts at concentrations just as high as the full-length proteins. Furthermore, co-transfection of Smad3LC dramatically stimulated the ability of full-length Smad4 to activate transcription (compare Fig. 5, LRR bar 4 versus bar 3 versus bar 7). This potent stimulation is likely to have resulted from increased nuclear translocation and derepression of the MH2 trans-activation function (1, 8, 12). The inability of Smad3LC to activate transcription without coexpressed Smad4 provides additional evidence that reporter activation in response to full-length Smad3 (i.e. without Smad4) results from Smad3-DNA contact. Co-transfection of Smad4LC with Smad3NLC stimulated transcription of reporters containing RL inverted repeat sites (RL and RLR) to levels similar to what was observed for full-length proteins (compare bar 8 with bar 6 for RL and RLR reporters). Thus, for RL and RLR, it appears that the Smad3 MH1 domain is sufficient for DNA contact. Conversely, coexpression of Smad4LC reduced the response of the RR, RRR, LR, and LRR reporters (Fig. 5, compare bar 8 with bar 6 for these reporters), indicating that these arrangements favor DNA contact by Smad4. Nonetheless, the combination of Smad4LC and Smad3NLC still resulted in levels of induction that were slightly higher than those observed for Smad3 alone (compare bar 8 with bar 3), possibly reflecting a weak contribution of the Smad4 MH2 domain toward stabilization of Smad complexes or to transcriptional activation. The failure of Smad4LC to enhance the activity of Smad3NLC is consistent with the ability of activated r-Smads to translocate to the nucleus independently of Smad4 (12), but also fits with evidence presented here that homomeric complexes of Smad3 are capable of activating transcription, particularly through recognition of RL-binding sites.
The failure of Smad3LC and Smad4LC to activate reporters demonstrates that DNA contact is mediated by MH1 domains of the exogenously expressed human Smads in this experimental system. The dramatic enhancement of reporter activation by co-transfection of Smad3LC with Smad4NLC confirms that Smad3LC is stable and capable of activation when complexed with Smad4. While the MH1 domain of either Smad3 or Smad4 is sufficient for recognition of any of the reporters, the RR, RRR, and LRR sites exhibit a stronger dependence on the Smad4 MH1 than do the RL, RLR, and RLL sites. This suggests that Smad4 may bind or activate RR, RRR, and LRR more efficiently than Smad3.
A gel shift assay using whole cell extracts of transfected S2 cells was used to determine whether Smad4LC forms a DNA-bound complex in combination with Smad3 (Fig. 6). Smad4LC alone failed to form a complex with the RL DNA probe (lane 5). However, Smad4LC combined with Smad3 to form a complex intermediate in mobility between a complex containing Smad3 alone and one containing Smad3 plus full-length Smad4 (lane 9, 3/4LC arrow, compare with bands in lanes 2 and 7). This Smad3-Smad4LC complex was super-shifted to a slower mobility band by inclusion of anti-Smad4 antibody in the binding reaction (compare lane 10 with lane 9 of Fig. 6). It is not clear why the Smad3/4 complex migrates faster than complexes containing Smad3 or Smad4 alone, or why removal of the Smad4 MH1 domain decreases the mobility of this complex.
Smad1 Prefers Binding Sites with Three Smad Boxes-Smad1 mediates BMP-specific biological responses that differ from the TGF-β/activin responses regulated by Smad2 and Smad3 (4). Nevertheless, Smad1 β-hairpin DNA contact residues are identical to those of Smad3, and the MH1 domains of both proteins bind to a GTCT probe with similar affinity (30). Similarity in Smad1 and Smad3 DNA binding specificity is also suggested by the responsiveness of a multimerized JunB promoter sequence, 4X(CAGACAGT), to both TGF-β and BMP2 (34), although DNA contact may be mediated by Smad4, as discussed below.
We tested the ability of Smad1 to activate the Smad box reporters in the presence or absence of Smad4. An activated form of the Drosophila BMP Type 1 receptor homolog Thickveins, Tkv(QD), was included in transfections to promote phosphorylation of Smad1. Overall, the results obtained from coexpression of Smad1 with Smad4 were similar to the results from coexpression of Smad3 with Smad4 (Fig. 7A). In both instances there was a preference for binding sites possessing the RL arrangement and a synergistic activation of the RRR and LRR reporters. The LRR reporter exhibited an 8-fold synergism when co-transfected with Smad1 and Smad4 (compare bar 4 with bars 2 and 3 for LRR) and also yielded the highest levels of activity with a 50-fold induction over the no-Smad control. In the absence of Tkv(QD), reporter expression was dramatically reduced for all reporters (Fig. 7B and data not shown).
We also examined the requirement for DNA binding by Smad1 using an MH1-deleted construct, Smad1LC. When full-length Smad1 was co-transfected with Smad4LC and the RL reporter, a 15-fold reduction in activity was observed compared with transfections done with full-length Smad1 and Smad4 (Fig. 7C). This level of activation was lower than the level observed for Smad1 alone (Fig. 7A, RL bar 2), indicating that Smad4LC inhibited Smad1, possibly by forming complexes in which Smad4 fails to provide DNA contact. Conversely, co-transfection of Smad1LC with full-length Smad4 resulted in a 6-fold elevation above the level of reporter expression obtained using both full-length proteins. Thus, while Smad1 is capable of activating through RL sites without Smad4, removal of its MH1 domain dramatically enhances its synergy with Smad4, possibly because an inhibitory MH1-MH2 interaction has been eliminated (23).
The ability of Smad1 to bind DNA was tested by gel shift using whole cell extracts from S2 cells (Fig. 8). Smad1 bound the RL probe weakly when expressed alone (lane 1). As with Smad3, coexpression of Smad1 with Smad4 resulted in a novel, faster migrating band (asterisk in lane 3) that was supershifted by antibodies specific for either Smad1 or Smad4 (S, lanes 6 and 9). The Smad1/4 complex also preferred RLR, RLL, and RL probes over LRR or RRR probes as was seen with Smad3/4 complexes (data not shown).

DISCUSSION

We have shown that individual Smad complexes are capable of contacting three abutting Smad boxes in all orientations. Smads have much higher affinity for three-box sites and RL inverted repeat two-box sites than for two-box sites arranged in the tandem RR or LR orientations. These differences in binding affinity correlate with differences in reporter activation, indicating that Smad box number and arrangement are potentially important features of Smad response elements. The mechanistic basis for the differential affinity of Smads for LR, RR, and RL sites remains to be determined, but may reflect steric limitations imposed by MH2 oligomerization. Coexpression of Smad3 and Smad4 or of Smad1 and Smad4 has a strongly synergistic effect on the RR, RRR, and LRR reporters, but not on reporters driven by sites containing an RL arrangement. This synergistic effect depends on the Smad4 MH1 domain but not on that of Smad3, an indication that the Smad4 MH1 is the more effective partner at contacting RR, RRR, and LRR sites. Like Smad3, Smad1 activates the RR, RRR, and LRR reporters synergistically in combination with Smad4. However, unlike Smad3, Smad1 appears to be inhibited by Smad4LC, perhaps an indication that such complexes are unable to contact DNA through Smad1. Thus, while Smad1 and Smad3 appear to have very similar preferences for Smad box organization, oligomerization with Smad4 appears to severely diminish the ability of Smad1 to contribute to DNA contact.
Although the crystal structure of the Smad4 MH2 domain reveals that it forms a trimer (38), the oligomerization state of the native Smad DNA binding complex has not been completely resolved. One model proposes that Smad homotrimers interact to form heterohexamers (38) while another model favors formation of heterotrimers of Smad4 and receptor associated Smads (39). Consistent with either model is genetic evidence for nonredundant function of Sma2, Sma3, and Sma4 in regulating Caenorhabditis elegans body size (44), and evidence that Smad2, Smad3, and Smad4 can form a single, functional complex (45). Here we demonstrate that Smad complexes can synergistically interact in vivo and in vitro with three GTCT sequences in RRR and LRR sites, indicating that these complexes contain a minimum of three subunits. Taken alone, this finding is consistent with either the heterohexamer or heterotrimer models. However, this synergism is retained in vivo when the Smad3 MH1 domain is deleted, indicating that RRR and LRR can be activated by Smad3LC-Smad4 complexes containing at least three Smad4 subunits. This observation is consistent with the heterohexamer model but not with the trimer model. An alternative explanation is that Smad3LC stimulates reporter activation solely by increasing the nuclear localization of Smad4, and that once inside the nucleus Smad4 can activate independently of Smad3. This explanation is consistent with the apparent ability of homomeric Smad3 and Smad4 complexes to contact all three GTCT sites of RRR and LRR probes in gel shift experiments.
How do the effects of Smad-binding site organization on Smad binding affinity and synergism relate to the sequences of native TGF-β, activin, BMP, or Dpp response elements? As shown in Fig. 1, the majority of functional Smad-binding sites within TGF-β/activin response elements appear to be composed of RR arrangements of GTCT or GTCT-like sequences, while the Dpp response elements of vg and Ubx contain RL, LR, and LRR arrangements. Our results show that, while less responsive than RL, RLL, or RLR to elevated levels of Smad3, Smad4, or Smad1, the RR, RRR, and LRR configurations exhibit a higher degree of synergism in response to coexpression of Smad4 with Smad3 or with Smad1. Thus RR, RRR, and LRR sites may perform some special role in responding specifically to the signaling-dependent formation of Smad3/4 or Smad1/4 complexes.
Smad-cofactor interactions are known to play an important role in the activation of some Smad targets (11, 15, 24, 36, 46). A single strong RL Smad-binding site is necessary but insufficient for activation when the cofactor FAST-1 is bound to a nearby site (28, 47). Activation of the mouse goosecoid promoter by Smad2/4 and the cofactor FAST-2 has been shown to be dependent upon a FAST-2-binding site and nearby sites for Smad4 (36). The two regions footprinted by Smad4 contain GC-rich RRR, RR, and RL sites. Other TGF-β/activin/Dpp responsive promoters have been shown to depend on non-RL Smad-binding sites and nearby cofactor-binding sites (15, 32, 48, 49). It remains to be determined how Smad-binding site arrangement affects regulation within the context of these natural elements. In addition to the inherent differences in Smad binding affinity and synergism described here, Smad-binding site organization may be important for the topology of Smad-cofactor interactions. Cofactor interactions with the MH1 domain, e.g. the c-Jun-Smad3 MH1 interaction (15), are likely to be especially sensitive to MH1 orientation imposed by contact with DNA.
While it is remarkable that Smads are able to bind DNA through contact with three consecutive Smad boxes, there is precedent for such a trimeric interaction with DNA. Trimers of the heat shock factor (HSF) are capable of stable contact with either two or three abutting repeats of the 5-base pair monomer-binding site for this inducible activator (50, 51). A difference between the two is that HSF has similar affinity for pairs of binding sites that are inverted in either orientation, whereas Smads interact efficiently only with the RL configuration. A flexible linker also connects the POU domain and homeodomain of POU transcription factors, resulting in an analogous degeneracy in binding site recognition that allows POU protein function in a variety of distinct binding site and cofactor contexts (52). While the present study shows that Smads are capable of flexible interaction with tripartite binding sites, it will be important to determine whether, as with POU proteins, such flexibility is significant in native regulatory contexts.

FIG. 8. Shown is the autoradiographic image of a gel shift experiment testing the ability of Smad1 and FLAG-Smad4 to bind to a labeled RL DNA probe, individually and together. DNA binding reactions were set up using ³²P-labeled RL probe and whole cell extracts from transfected S2 cells, as described in the legend to Fig. 4. Gel shift complexes were identified as containing Smad1 (1), FLAG-Smad4 (4), or both Smads (asterisk). Where indicated, binding reactions were incubated in the presence of polyclonal anti-Smad1 or monoclonal anti-FLAG antibodies (Santa Cruz Biotechnology; Sigma). Supershifted complexes are indicated by brackets.