Dimerization of Doublesex Is Mediated by a Cryptic Ubiquitin-associated Domain Fold

Male- and female-specific isoforms of the Doublesex (DSX) transcription factor regulate somatic sexual differentiation in Drosophila. The isoforms (DSXM and DSXF) share an N-terminal DNA binding domain (the DM motif), broadly conserved among metazoan sex-determining pathways. DM-DNA recognition is enhanced by a C-terminal dimerization domain. The crystal structure of this domain, determined at a resolution of 1.6 Å, reveals a novel dimeric arrangement of ubiquitin-associated (UBA) folds. Although this α-helical motif is well characterized in pathways of DNA repair and subcellular trafficking, to our knowledge this is its first report in a transcription factor. Dimerization is mediated by a non-canonical hydrophobic interface extrinsic to the putative ubiquitin binding surface. Key side chains at this interface, identified by alanine scanning mutagenesis, are conserved among DSX homologs. The mechanism of dimerization is thus unrelated to the low affinity domain swapping observed among ubiquitin-associated CUE domains. The unexpected observation of a ubiquitin-associated fold in DSX extends the repertoire of α-helical dimerization elements in transcription factors. The possibility that the ubiquitination machinery participates in the regulation of sexual dimorphism is discussed.

the DM domain in metazoan proteins related to sexual differentiation suggests that mechanisms of sexual dimorphism are in part universal (3).
The C-terminal domains of DSXF and DSXM (CTDF and CTDM) are conserved among insect homologs. The domains contain a common dimerization element and sex-specific extensions proposed to mediate recruitment of transcriptional co-regulatory factors (7,8). The strength of dimerization (as inferred from DNA binding) is <1 nM (9). Such dimerization enhances specific DNA binding (9); a mutation in CTDF that blocks dimerization is associated with intersexual development (7). The dimer does not contain a recognizable structural motif but has been predicted to form a coiled-coil (10). In this article we describe the crystal structure of a dimeric fragment of CTDF at 1.6 Å resolution and its scanning mutagenesis in a yeast two-hybrid (Y2H) system. The polypeptide (designated CTDF-p) spans residues 350-412 and so contains both shared and sex-specific sequences. Surprisingly, the structure reveals a novel dimeric arrangement of ubiquitin-associated-domain (UBA) folds (Protein Data Bank code 1ZV1). To our knowledge, this α-helical motif, although widely conserved among pathways regulating DNA repair and subcellular trafficking, has not previously been found in a transcription factor. Dimerization is mediated by an extensive nonpolar interface conserved among DSX homologs. The structure of CTDF-p extends the family of transcription-related α-helical dimerization elements and raises the possibility that the ubiquitination machinery participates in the regulation of sexual dimorphism.

EXPERIMENTAL PROCEDURES
Protein Crystallization-CTDF-p (65 residues; GS followed by DSXF residues 350-412) was designed based on Y2H studies (7,10), expressed in Escherichia coli (strain B834(DE3)pLysS) as a thrombin-cleavable fusion protein, and purified as described (11). For selenomethionine labeling, the protein was expressed in M9 minimal medium containing 50 mg/liter selenomethionine and all other amino acids (except methionine) at 40 mg/liter (11). Crystals were obtained by hanging-drop vapor diffusion in 4-μl drops containing equal volumes of protein stock (12 mg/ml) and reservoir buffer (1.8 M ammonium sulfate and 7% 2-propanol).
Data Collection and Structure Determination-Native data were collected at Advanced Photon Source beamline 14BMC at 100 K. Single-wavelength anomalous dispersion data were collected at NSLS beamline X9B at the selenium peak (0.9788 Å). Data collection statistics are given in TABLE ONE. Data were integrated and scaled with HKL2000 (12). Substructure determination and phasing were accomplished with the SHELX suite of programs (13). Initial model building employed ARP/wARP (14). Additional rounds of model building were performed using O (15). Initial refinement was accomplished using CNS, applying non-crystallographic symmetry, overall B-factor corrections, and bulk solvent corrections (16). Final rounds of refinement were performed with the CCP4 program REFMAC5 (17). Accuracy of the model was assessed with DDQ (18); statistics are given in TABLE TWO. Figures were generated using PYMOL (19).
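As a quick consistency check on the wavelength quoted above, the following minimal sketch converts an x-ray wavelength to photon energy using E = hc/λ. The function name and the constant name are illustrative and are not part of any data-processing package cited here. The value obtained for 0.9788 Å lies close to the selenium K absorption edge (about 12.66 keV), as expected for data collected at the selenium peak.

```python
# Product of the Planck constant and the speed of light, in keV * Angstrom.
HC_KEV_ANGSTROM = 12.398419843


def wavelength_to_energy_kev(wavelength_angstrom: float) -> float:
    """Return the photon energy (keV) for a wavelength given in Angstrom."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


if __name__ == "__main__":
    # Wavelength reported for the selenium-peak data set.
    energy = wavelength_to_energy_kev(0.9788)
    print(f"{energy:.3f} keV")
```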

RESULTS
Overview of Structure-The crystal structure of CTDF-p (residues 350-412) was determined at a resolution of 1.6 Å. Representative electron density is shown in Fig. 2A. Values of R and Rfree (20.8 and 25.7%, respectively) in part reflect disorder of the N- and C-terminal segments (Footnote 6). The N-terminal five residues of CTDF-p (including two residues derived from thrombin cleavage of the fusion protein) are omitted from the model. The structure exhibits good stereochemistry, with 94% of non-Gly, non-Pro residues occupying the most favored region of the Ramachandran plot and the remaining 6% occupying the additionally allowed region. No residues are observed in generously allowed or disallowed regions.
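For readers less familiar with the refinement statistics quoted above, the conventional crystallographic R factor compares observed and calculated structure-factor amplitudes. The expression below is the standard textbook definition and is given only for orientation; the symbols are not taken from this article.

```latex
R = \frac{\sum_{hkl} \bigl|\, |F_{\mathrm{obs}}(hkl)| - |F_{\mathrm{calc}}(hkl)| \,\bigr|}
         {\sum_{hkl} |F_{\mathrm{obs}}(hkl)|}
```

R free is computed with the same expression, but the sums run only over a test set of reflections excluded from refinement. The size of that test set is not stated in this section.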
Novel Dimer Interface-Dimerization is mediated by helices α2 and α3 (Fig. 2B). Whereas α1 and α3 define the surface of the dimer, α2 and α2′ pack at the interface; they are parallel and nearly perpendicular to [...] interactions are also observed (involving Asp-354, Leu-357, and Tyr-405′). The junction between sex-nonspecific and sex-specific portions of CTDF-p (G398) packs against α2′ across the dimer interface. This junction is a site of mutation (G398D) associated with intersexual development (7). The overall structure is comprised predominantly of non-sex-specific residues and so is likely to be similar in CTDM.
Footnote 6: The N- and C-terminal five residues of each protomer exhibit neither complete disorder nor two or more well defined conformations, resulting in uninterpretable positive [...]
Alanine Scanning Mutagenesis-Y2H studies of variant domains were undertaken to test the contribution of interfacial side chains to dimerization (TABLE THREE). Bait and prey plasmids were constructed using the Gal4 system (Fig. 8A); photomicrographs of representative X-gal indicator plates demonstrate a range of β-galactosidase activities (white, light blue, and blue colonies; Fig. 8, B and C). A negative control was provided by G398D (7). Colorimetric phenotypes of Y2H yeast colonies were in each case verified in cell extracts by quantitative β-galactosidase assays.
Seventeen Ala substitutions were tested; three substitutions in the core (Tyr-378, Ile-380, Ile-395) markedly impair β-galactosidase activity, whereas three substitutions at the surface (W371A, Y400A, and N403A) yield no perturbations. Ala substitutions at three interfacial valines in α2 and α3 exhibit partial (Val-379 and Val-402) or full (Val-401) activity; the resulting packing defects may be offset by the higher helical propensity of Ala. R394A (expected to disrupt inter- and intramolecular salt bridges) and K382A (expected to disrupt a dimer-specific charge-stabilized hydrogen bond) markedly impair activity. E396A and E397A are partially tolerated, whereas D383A has no effect. Interestingly, the two prolines, although not features of canonical UBA or CUE domains, appear essential. G398A enhances β-galactosidase activity, presumably by stabilizing α3.
Putative Ub Binding Surface-A putative Ub binding surface (inferred based on known monomeric UBA and CUE domains; gold surface in Fig. 7A) does not overlap with the dimer interface of CTDF-p (green surface). It is, thus, possible that the CTDF-p dimer presents two Ub-binding sites, one on each side. Heteronuclear NMR titration experiments suggest weak binding of the domain to Ub at protein concentrations >100 μM (supplemental Fig. S1). Alignment of one protomer with CUE2 as bound to Ub (21) yields a model of a complex between Ub and the CTDF-p dimer (Fig. 4C). This model suggests that potential UBA-Ub salt bridges are conserved: Asp-18 and Asp-40 of CUE2 align with Glu-365 and Glu-389 of CTDF-p. The model also permits a second Ub molecule to bind to the other protomer. We, thus, envisage that such a dimeric UBA fold could enable bidentate recognition of a poly-Ub chain.

DISCUSSION
The CTDs of DSX enhance DNA recognition by providing a strong dimer contact (7,10,22). Although the monomeric DM domain itself can bind DNA sites as a cooperative dimer (4,23), the CTDs enhance specific DNA binding by 35-fold (9). The biological importance of CTD dimerization is suggested by a mutation in CTDF (G398D) that blocks dimerization in association with intersexual development (7). Unexpectedly, the crystal structure of CTDF-p demonstrates that dimerization is mediated by a UBA-domain fold, previously unrecognized due to the absence of detectable sequence homology. The stability of the DSXF dimer (Kd 10⁻¹⁰ M; Ref. 9) reflects formation of an extensive hydrophobic interface flanked by intersubunit salt bridges and hydrogen bonds. The intersexual mutation G398D would insert an uncompensated charge into this interface. The CTD dimer, thus, extends to the realm of transcription the repertoire of UBA dimerization, previously implicated in cell-cycle control (24) and receptor trafficking (25). To our knowledge, this is the first description of the structure of a dimeric UBA fold. As discussed in turn below, this structure represents a novel class of helical dimerization domains, rationalizes results of mutagenesis, and suggests a possible role for Ub in sex determination.
FIGURE 3. Proline-induced kink in protomeric mini-core and α2-α2′ interaction. A, Pro-375 (red) induces a kink in α2 defining segments α2A (green) and α2B (blue). B, key residues in mini-core identified in prior Y2H studies (7). C, stereoview of α2-α2′ dimer interface. One protomer is shown in blue, and the other is in teal. Green segments represent the 3₁₀ helix (α2A).
Small α-helical dimerization elements are often observed in transcription factors, defining regulatory families (26). Examples include the leucine zipper (Fig. 9A; Ref. 27), the basic-region helix-loop-helix motif (Fig. 9B; Ref. 28), and the canonical four-helix bundle as exemplified by Lac repressor (Fig. 9C; Ref. 29). These structures share with CTDF-p the use of parallel α-helices and extensive interhelical contacts. In many (but not all) cases such modular motifs can mediate both homo- and heterodimerization and so enable combinatorial gene regulation (30). Although the prototypical bZIP transcription factor GCN4 forms only homodimers, for example, mammalian homologs (including proto-oncoproteins Fos and Jun) form networks of interacting proteins. UBA domains can likewise form homo- and heterodimers (24,31). Although DSX itself forms stable homodimers and is not known to form heterodimers with homologous DNA-binding proteins, it would be of interest should a family of UBA-containing transcription factors in Drosophila or other organisms be found to exhibit analogous combinatorial regulation. It is also possible that the DNA binding activity of DSX or related factors can be repressed through formation of an inactive UBA-mediated heterodimer. Proof of principle is provided by molecular genetic studies of transgenic flies in which non-native DSXF-DSXM heterodimers with altered gene regulatory properties are envisaged (32).
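For orientation, the two quantitative statements made at the start of this Discussion (a dimer dissociation constant of 10⁻¹⁰ M and a 35-fold enhancement of specific DNA binding) can be placed on a free-energy scale. The sketch below is illustrative only. It assumes a temperature of 298.15 K and a 1 M standard state, neither of which is stated in the text, and it treats the 35-fold enhancement as a ratio of equilibrium constants. The function names are hypothetical.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL_PER_MOL_K = 1.987204e-3


def association_free_energy(kd_molar: float, temp_k: float = 298.15) -> float:
    """Standard free energy of association (kcal/mol) from a dissociation
    constant in mol/liter, assuming a 1 M standard state."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL_PER_MOL_K * temp_k * math.log(kd_molar)


def fold_change_free_energy(fold: float, temp_k: float = 298.15) -> float:
    """Magnitude of the free-energy difference (kcal/mol) corresponding to a
    fold change in an equilibrium constant (assumed interpretation)."""
    if fold <= 0:
        raise ValueError("fold change must be positive")
    return R_KCAL_PER_MOL_K * temp_k * math.log(fold)


if __name__ == "__main__":
    # Assumed temperature of 298.15 K; values from the Discussion.
    print(f"dimer stability: {association_free_energy(1e-10):.1f} kcal/mol")
    print(f"35-fold enhancement: {fold_change_free_energy(35):.1f} kcal/mol")
```

Under these assumptions the dimer stability corresponds to roughly -13.6 kcal/mol and the 35-fold enhancement to roughly 2.1 kcal/mol; a different temperature would change both values proportionally.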
The present structure rationalizes the results of alanine-scanning mutagenesis (TABLE THREE), indicating that dimerization requires steric complementarity at a hydrophobic interface and is further stabilized by flanking salt bridges. Similar features stabilize the coiled-coil structure of leucine zippers (27,33). Our results are also in accord with previous Y2H studies in which error-prone PCR was employed to identify residues critical for dimerization (7). Although in that study the variant domains each contained two or three mutations (thus making uncertain the effect of any single substitution), the structural environments of these mutations are of interest. For L373Q and M377K, Leu-373 and Leu-373′ pack against each other to seal one edge of the dimer interface (Fig. 3C). Met-377 contributes to the mini-core of the protomer and an edge of the dimer interface (Fig. 3B). For L381S, K382R, and Q313K, whereas Gln-313 is outside of the dimerization domain, Leu-381 packs within the protomer (Fig. 5, A and B); Lys-382 forms a dimer-related charge-stabilized hydrogen bond with Glu-397′ (Fig. 6). Scanning substitution K382A in our hands likewise results in a loss of dimerization. For I395N, E418G, and R420P, Ile-395 packs within the mini-core of the protomer and at one edge of the dimer. Residues Glu-418 and Arg-420, not included in the crystallized fragment, are presumably disordered. It would be of future interest to extend this approach by random cassette mutagenesis; analysis of allowed and disallowed families of substitutions would further define the sequence requirements of UBA dimerization.
The presence of a UBA-like domain in DSX may represent the incidental recruitment of a common structural motif or indicate a functional role for ubiquitin (e.g. binding of mono- or polyubiquitinated proteins, engagement of the enzymatic machinery of ubiquitination and/or the proteasome) in DSX-mediated transcriptional regulation. Although no biological data are available, the striking structural similarity between CTDF-p and the classical UBA domain suggests a potential role for the ubiquitin system in sex-specific gene regulation. As demonstrated by Lipkowitz and co-workers (31) in studies of c-Cbl and Cbl-b, the sequence of a UBA-like domain does not predict whether or not it binds Ub. Although c-Cbl and Cbl-b share 85% sequence similarity, for example, only Cbl-b is able to bind Ub in vitro. Furthermore, the Cbl-b UBA domain contains two polar residues within the otherwise hydrophobic Ub binding patch characterized among Rad23-derived UBA domains. Consideration of the potential role of Ub binding by CTDF-p is, thus, tempered by the marked variability of UBA-like sequences and their heterogeneous biological functions. Indeed, although evidence for low affinity binding of Ub to CTDF-p is provided by preliminary NMR titration experiments (supplemental Fig. S1), the physiological significance of such binding remains unclear.
To our knowledge, the participation of the ubiquitination machinery in the Drosophila sex-determining hierarchy has not been previously suggested. Such involvement would nevertheless be plausible in light of growing evidence that ubiquitination and ubiquitinated proteins can play central roles in the regulation of eukaryotic gene expression. (i) Ub-triggered proteolysis can control transcription through "suicidal" regulation wherein each cycle of transcription is coupled to destruction of a specific transcription factor (34). (ii) Ubiquitination can regulate subcellular localization (35) and protein-protein interactions (36). (iii) Transcriptional initiation and functional mRNA processing can require degrons, proteolytic signaling elements within transcriptional activation domains that recruit Ub ligases (37). (iv) Non-destructive ubiquitination of activation domains can enable transcriptional activation (34). It is possible that one or more of these general processes operate in transcriptional regulation by the DSX isoforms but were missed in classical genetic screens for impaired sexual differentiation due to the pleiotropic functions of the ubiquitination machinery.
[...] Female- and male-specific C-terminal extensions of DSX are shown in red and purple, respectively. DSX binds to DNA as a dimer (green spheres, DM domain; blue and teal, CTD-p). Top, female-specific complex of DSXF occupies dsxA as an adjoining bZIP factor binds to bzip1 (light blue ribbon). Recruitment of IX (arrow) enables synergistic activation of transcription. Binding of DSXF or DSXM displaces AEF1 from its target site aef1. In females, fat-body expression of DSXF is higher than that of AEF1, and therefore, in the presence of bZIP1, yolk proteins are expressed. Expression is by contrast repressed in ovary due to higher levels of AEF1, which displaces DSXF. Bottom, male-specific repression of yolk proteins occurs as binding of DSXM, which is 122 residues longer than DSXF (purple tail; Ref. 2), occludes bzip1 or inactivates bound bZIP1 (44). Also pictured is non-tissue-specific activator REF1. D, schematic models of sex-specific recruitment of IX (orange rectangles) by DSXF. A cylinder model of CTDF-p dimer is shown in blue and teal. Weak DSXF-IX interaction (top) would be strengthened by bidentate recognition of tethered Ub moieties (bottom, pink triangles), providing a mechanism of ubiquitination-coupled assembly of a preinitiation complex.
Biochemical mechanisms of sex-specific gene regulation are not well characterized in Drosophila, and therefore, the relevance of Ub-related processes to DSX function cannot presently be evaluated. It is nonetheless intriguing that the topography and electrostatic potential surface of CTDF-p resemble those of the Ub binding surface of the CUE domain (Fig. 10). Given the stringent conservation of this surface among DSX homologs, we propose that the DSX UBA domain may contribute to the assembly of a specific preinitiation complex through non-covalent interactions with mono- or polyubiquitinated proteins. Such an interacting protein(s) is presently unidentified. One possibility is suggested by a model based on the possible ubiquitination of DSXF-associated transcriptional coactivator Intersex (IX; Ref. 8), the homolog of a mammalian Mediator subunit (38). A framework is provided by the Wensink model of sex- and tissue-specific expression of yolk protein genes in the fat body (Fig. 10C; Refs. 39 and 40). In this model DSXF and DSXM occupy the same sites in the fat body enhancer; distinct protein-protein interactions lead to opposing gene-regulating functions (upper and lower panels of Fig. 10C). A key (and newly recognized) component is provided by IX (8); its sex-specific binding to DSXF (and so presumably to CTDF) is required for female differentiation (32). In Y2H studies IX-DSXF binding is detectable but weak. The strength and avidity of such binding could be enhanced by dual recognition of IX and a tethered Ub molecule or Ub chain (Fig. 10D).
In the future the molecular functions of the DSX isoforms in Drosophila development may be addressed by structure-based molecular genetics. A critical test of whether DSXF functions in vivo as a Ub-binding protein, for example, could be provided by targeted mutations in its putative Ub binding surface that impair Ub binding but not dimerization or binding of unmodified IX. Should such mutations be obtained in vitro, they might provide in vivo probes for the general involvement of Ub-dependent regulatory mechanisms (such as but not limited to bidentate IX recruitment; Fig. 10D). Of particular importance would be the developmental phenotype of mutant flies homozygous for a corresponding variant dsx allele. Possible impairment of somatic sexual differentiation would be of broad interest as a model for a UBA-associated genetic disease. Deciphering the role of the transcription-associated ubiquitination machinery in developmental decisions represents an important general challenge. Should the UBA-like domains of DSX indeed function as Ub binding motifs, studies of sex-specific gene regulation in Drosophila would provide an opportunity to define the biochemical contribution of ubiquitination to the operation of a metazoan genetic switch.