Structure-Function Relationships in Human Testis-determining Factor SRY

Background: Testis-determining factor SRY provides a model of a bent protein·DNA complex.
Results: Mutation of an invariant Trp perturbs multiple biochemical, cellular, and transcriptional activities.
Conclusion: Folding and function of a sequence-specific HMG box require a core “aromatic buttress.”
Significance: Mutations in SRY causing human sex reversal probe the architecture and evolution of a DNA-bending motif.
Human testis determination is initiated by SRY, a Y-encoded architectural transcription factor. Mutations in SRY cause 46, XY gonadal dysgenesis with female somatic phenotype (Swyer syndrome) and confer a high risk of malignancy (gonadoblastoma). Such mutations cluster in the SRY high mobility group (HMG) box, a conserved motif of specific DNA binding and bending. To explore structure-function relationships, we constructed all possible substitutions at a site of clinical mutation (W70L). Our studies thus focused on a core aromatic residue (position 15 of the consensus HMG box) that is invariant among SRY-related HMG box transcription factors (the SOX family) and conserved as aromatic (Phe or Tyr) among other sequence-specific boxes. In a yeast one-hybrid system sensitive to specific SRY-DNA binding, the variant domains exhibited reduced (Phe and Tyr) or absent activity (the remaining 17 substitutions). Representative nonpolar variants with partial or absent activity (Tyr, Phe, Leu, and Ala in order of decreasing side-chain volume) were chosen for study in vitro and in mammalian cell culture. The clinical mutation (Leu) was found to markedly impair multiple biochemical and cellular activities as respectively probed through the following: (i) in vitro assays of specific DNA binding and protein stability, and (ii) cell culture-based assays of proteasomal degradation, nuclear import, enhancer DNA occupancy, and SRY-dependent transcriptional activation.
Surprisingly, however, DNA bending is robust to this or the related Ala substitution that profoundly impairs box stability. Together, our findings demonstrate that the folding, trafficking, and gene-regulatory function of SRY require an invariant aromatic “buttress” beneath its specific DNA-bending surface.

tained within the sex-determining region of the Y chromosome (2). Sry encodes an architectural transcription factor (TF)² whose expression in the embryonic gonadal ridge activates a developmental program leading to testis formation (3). The key initial step is Sry-directed transcriptional activation of autosomal gene Sox9 in pre-Sertoli cells (Fig. 1A) (4). Expression of Sox9 in turn regulates a gene-regulatory network (GRN) that distinguishes between testicular and ovarian programs of gonadogenesis (5). Hormones secreted by the fetal testis, once formed, direct regression of female primordia (Müllerian inhibiting substance/anti-Müllerian hormone; MIS/AMH) (6) and external virilization (testosterone) (7). Clinical mutations in this pathway, including in the genes encoding SRY, SOX9, MIS/AMH, and the androgen receptor, are associated with disorders of sexual differentiation (DSD) (8). Here, we have exploited a DSD-associated mutation in SRY (9) to dissect a key structure-function relationship in the signature domain of an architectural TF.³
Sry binds and bends specific DNA sites through a conserved protein motif, designated the high mobility group (HMG) box (10). In human SRY, the box is flanked by N- and C-terminal segments lacking conserved or recognizable structural motifs (Fig. 1B) (11,12). The HMG box itself contains an N-terminal β-strand, three helices (α1, α2, and α3), and a C-terminal basic tail (Fig. 1C). Its L-shaped structure is conserved within a eukaryotic superfamily of architectural TFs (13), which includes both sequence-specific and structure-specific families of DNA-binding proteins (14). The discovery of Sry in therian mammals enabled identification of metazoan TFs containing Sry-related HMG boxes (designated Sox) (15).
Broadly involved in the regulation of development (16), this family is of interest in relation to organogenesis (17), stem-cell biology (18), patterning of the brain (19), diverse human birth defects (20), and mechanisms of transcriptional deregulation in cancer (21,22). Structure-function relationships in human SRY are likely to generalize to the Sox family and thus provide insight into the evolution of an ancestral eukaryotic DNA-bending motif.
The HMG box contains major and minor wings, which together define an angular DNA-binding and DNA-bending surface (13, 23-27). Structures of specific DNA complexes containing the human SRY box or homologous SOX domains have been determined by multidimensional NMR spectroscopy (25,28) and x-ray crystallography (29). In such complexes, DNA bending is associated with partial intercalation of a protein side chain (designated the "cantilever") (30-32) at the crux of this angular surface (33). The cantilever (Ile-68 in human SRY; consensus position 13 of the HMG box as defined by Werner et al. (25)) is part of a conserved "hydrophobic wedge" of major wing side chains (Met-64, Phe-67, and Trp-98; respective HMG box positions 9, 12, and 43) that insert into a widened DNA minor groove (28). Additional determinants of DNA bending are provided by the minor wing and its basic C-terminal tail (34). These DNA-binding and -bending elements are mutational hot spots for DSD-associated mutations in SRY and SOX9 (35). Corresponding mutations in other SOX factors cause diverse developmental abnormalities reflecting their individual biological roles (16,22).
Left, stage- and cell type-specific inputs; right, selected male-specific outputs (including Müllerian inhibiting substance/anti-Müllerian hormone (MIS/AMH), prostaglandin D2 synthase (PTGDS), and fibroblast growth factor 9 (FGF9)). Positive regulation is indicated by arrow; MIS/AMH directs regression (⊣) of female primordia. B, domain organization of human SRY with central HMG box (orange). De novo mutant W70L is indicated by blue triangle. Within the HMG box, the following elements are highlighted: N-terminal bipartite nuclear localization signal (NLS) (split green lines), four-residue nuclear export signal (NES) (gray lines), a basic C-terminal tail (olive block) that contributes to the kinetic stability of the bent DNA complex, and monopartite nuclear localization signal within the tail (smaller green block).
C, structure of L-shaped HMG box (left) and bound DNA (right). Flanking termini and motifs are color-coded as in B. Core aromatic residues are shown as sticks (blue); Trp-70 (consensus position 15) is indicated by an arrow. Helices α1-α3 (ribbons) are orange. Coordinates were obtained from PDB entry 1J46 (28). D, mammalian Sry residues 2-51 (consensus numbering; residues 57-106 in human SRY). Trp is invariant at box position 15 (blue box). Adjoining wedge motif residues are boxed in black; a conserved Gly at consensus position 40 (red box) abuts the wedge motif.
This study has focused on an invariant position of the SRY HMG box that underlies the hydrophobic wedge. Trp-70 (box position 15) is a site of a clinical mutation (Leu) that arose de novo in paternal spermatogenesis (9). This residue is of structural interest in relation to the core of the major wing (Fig. 2, A-C). Contained within helix α1, the side chain of Trp-70 abuts two other aromatic side chains, Trp-98 in α2 and Phe-109 in α3 (respective box positions 43 and 54; Fig. 2D). This aromatic cluster is invariant among mammalian Sry sequences (Fig. 1D) and broadly conserved among metazoan Sox sequences (36,37). Because of such conservation and its DSD-associated mutation, we sought to investigate the contribution of Trp-70 to the structure and function of SRY.
This study focused on the in vitro properties of the variant SRY domains (the HMG box) in relation to the following: (a) its wild-type (WT) structure (25,28) and (b) the biological activities of the intact protein (38). The apparent lack of organized structure within the non-box segments of SRY (12), their general lack of conservation (11), and the similar DNA-binding and DNA-bending properties of intact protein and the isolated domain (39,40) support the logic of such a modular approach.⁴ Our experimental design thus employed systematic mutagenesis in a yeast one-hybrid (Y1H) system (41), followed by complementary investigation of representative variants in vitro and in mammalian cell culture. Because box position 15 is part of its hydrophobic core, priority was given to nonpolar and aromatic substitutions with particular attention to DSD-associated mutation W70L. These studies culminated in analysis of transcriptional regulatory activities in an embryonic rat pre-Sertoli cell line (42), previously characterized as a model of the bipotential gonadal ridge at a stage of development just prior to its morphological differentiation (38). This model enables dissection of distinct cellular processes (including TF expression, proteasome-mediated turnover, and nuclear import) underlying the operation of a sex-specific GRN. Because SRY functions as a nuclear protein (43-46), effects of the substitutions were evaluated with respect to the machinery of nuclear import (candidate mediators Exportin-4 (Exp4 (47)) and calmodulin (CaM (48))).⁵
Remarkably, our results have demonstrated that Trp-70 contributes to multiple biophysical properties and diverse biochemical activities of SRY. Each of the 19 possible amino acid substitutions at this site impairs, to a greater or lesser extent, the function of the SRY DNA-binding domain in our Y1H system.
Studies of representative variants demonstrated that the core tryptophan both contributes to the folding of the HMG box and enables critical cellular processes pertinent to biological function. Whereas significant residual function was observed among aromatic variants (W70F and W70Y) in accordance with sequence patterns in the HMG box superfamily (36), the clinical variant W70L and related aliphatic variant W70A were associated with pleiotropic molecular and cellular defects that together block SRY-directed transcriptional activation of Sox9.
⁴ Structural and biophysical studies of SRY and homologous SOX TFs by this laboratory (31,41,90) and other laboratories (28,91,92) have been restricted to their respective HMG box domains. In our hands, the intact human SRY readily aggregates in vitro leading to progressive loss of specific DNA binding activity (40). Such loss may underlie the higher sequence- or structure-specific DNA-binding affinities observed by Lippard and co-workers (39) in studies of the SRY HMG box relative to the intact protein. Such focus on isolated DNA-binding domains is broadly representative of an extensive literature concerning modular TFs containing conserved motifs (93).
⁵ Exp4 is a member of a class of proteins that mediate the export or import of cargo proteins (such as SRY) from the nucleus to the cytoplasm or from the cytoplasm to the nucleus. Although Exp4 was originally designated as an "exportin" in relation to its Ran-dependent role in the nuclear export of eukaryotic translational initiation factor 5A (94), Demmers and co-workers (47) have shown that Exp4 mediates nuclear import of SRY via interactions with the N-terminal segment of the HMG box (NLS residues 59-77 in intact human SRY); a homologous interaction contributes to nuclear import of SOX9 (69). Nuclear export of SRY and SOX9 is instead mediated by an interaction of CRM1 with a classical nuclear export signal (38,95).
Although the ability of the SRY variants to direct sharp DNA bending in vitro was in each case retained, altered core packing led to mild (W70F and W70Y), moderate (W70A), or severe (W70L) perturbations to the lifetime of the bent protein·DNA complex.
Deciphering the informational content of protein sequences can provide key insight into their function and evolution. Among the canonical families of eukaryotic DNA-binding motifs, aromatic side chains often provide conserved structural supports. The respective DNA recognition α-helices of homeodomains and zinc fingers, for example, are anchored by such core residues (49,50). The eukaryotic superfamily of HMG boxes is likewise remarkable for the broad conservation of an aromatic buttress beneath an angular DNA-binding surface (13, 23-27). Our findings, exploiting the growing human genome database, illuminate how this buttress contributes to the biophysical properties and biological activities of an ancestral DNA-bending motif. DSD-associated mutation W70L in SRY thus represents a molecular "perfect storm"⁶ (51) that impairs gonadogenesis leading to a human malignancy (9).

EXPERIMENTAL PROCEDURES
Y1H Screening-Reporter strains of Saccharomyces cerevisiae (derived from strain YM4271) were as described previously (41). Expression of an SRY HMG box fusion protein (Fig. 3A, gray oval) was directed by plasmid pGAD-T7 (Clontech). Transformants were selected on minimal medium lacking Leu and uracil. Colonies were isolated and grown in liquid medium under the same selection. The extent of SRY-dependent reporter activation (β-galactosidase (β-gal)) was determined as follows: (i) a qualitative assay on selective plates containing a chromogenic substrate (80 μg/ml 5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside (X-Gal)), onto which 5-7-μl aliquots of overnight broth were spotted, and (ii) a quantitative assay in solution using ortho-nitrophenyl-β-galactoside as described by the vendor (Clontech).
Protein Purification-WT and variant domains of human SRY were expressed in Escherichia coli strain T LYS (New England Biolabs, Inc., Ipswich, MA) and purified as described (52). Purity was determined in each case to be >98% by SDS-PAGE. Results of matrix-assisted laser-desorption ionization time-of-flight mass spectrometry were in agreement with the expected values.
Circular Dichroism-Far- and near-ultraviolet (UV) CD spectra were obtained at 4 and 37°C in a 1-mm path length quartz cuvette using an Aviv spectropolarimeter equipped with titrating unit (Aviv Biomedical, Lakewood, NJ). The domains were made 25 μM in 140 mM KCl and 10 mM potassium phosphate (pH 7.4; "standard buffer"). Thermal unfolding was monitored at a helix-sensitive wavelength of 222 nm. CD difference spectra were calculated as the buffer-corrected difference between the observed spectrum of a protein·DNA complex and the sum of the spectra of the free protein and free DNA site.
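The difference-spectrum calculation described above can be sketched as follows. This is a minimal illustration only, assuming that all spectra were recorded on one shared wavelength grid and that a separate buffer baseline is available; the function name and argument names are hypothetical and do not come from the instrument software.

```python
def cd_difference_spectrum(complex_obs, protein_obs, dna_obs, buffer_obs):
    """Buffer-corrected CD difference spectrum.

    Each argument is a list of ellipticities in the same units, sampled on
    the same wavelength grid (an assumption of this sketch). The buffer
    baseline is subtracted from every observed spectrum, and the result is
    complex - (free protein + free DNA) at each wavelength.
    """
    n = len(complex_obs)
    if not (len(protein_obs) == len(dna_obs) == len(buffer_obs) == n):
        raise ValueError("all spectra must share one wavelength grid")
    difference = []
    for c, p, d, b in zip(complex_obs, protein_obs, dna_obs, buffer_obs):
        difference.append((c - b) - ((p - b) + (d - b)))
    return difference
```

A difference spectrum close to zero at all wavelengths would indicate that complex formation causes little net change in ellipticity; nonzero features reflect changes in protein structure, DNA structure, or both.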
Protein Stability-Fractional protein unfolding was monitored as a function of guanidine hydrochloride concentration by CD ellipticity at 222 nm. The domains were made 5 μM in standard buffer in a titrating cuvette. The same protein concentration was used in a titrant reservoir containing 7.8 M guanidine HCl in the same buffer. To probe effects of low concentrations of guanidine HCl (0-1 M), titrations were also performed in which the titrant reservoir contained 1 M guanidine HCl. Thermal unfolding of free domain and equimolar protein·DNA complexes (25 μM in "standard buffer") was monitored using a 12-bp consensus DNA duplex (5′-GTGATTGTTCAG-3′ and complement); CD spectra (200-320 nm) were measured from 4 to 90°C at 2.5°C increments.
Tryptophan Fluorescence Spectroscopy-Intrinsic tryptophan (Trp) fluorescence of the free SRY domains and their DNA complexes was determined in standard buffer at 15°C at a concentration of 5 μM. Emission spectra were acquired from 300 to 500 nm following excitation at 295 nm. In studies of protein·DNA complexes, the specific DNA site was included in the buffer for baseline correction. Extent of an inner filter effect under these conditions was estimated through control studies of the WT domain in the presence or absence of the free deoxynucleotides (dAMP, dCMP, dGMP, and dTMP) with a molar ratio equivalent to that in the specific DNA site (which contains eight AT bp and seven GC bp) and at a concentration such that absorbance at 295 nm was equal to that of the DNA site at 5 μM. Standard errors in the measurement of integrated fluorescent intensities were estimated in five independent replicates of the spectrum of the WT domain.
Fluorescence Resonance Energy Transfer (FRET)-Protein-directed DNA bending was probed by steady-state FRET as described (34). This protocol employed a 15-bp DNA duplex of sequence 5′-TCGGTGATTGTTCAG-3′ ("upper strand") and complement; consensus target site is underlined. Use of a 15-bp DNA site restricted protein binding to the 1:1 high affinity complex (41). To provide a fluorescent donor, the upper strand was extended at its 5′ terminus by 6-carboxyfluorescein (6-FAM); the dye was flexibly linked to the DNA through a hexanyl linker. To provide a corresponding acceptor, the lower strand was extended at its 5′-end by tetramethylrhodamine (TAMRA), also via a hexanyl linker. The labeled DNA strands were purchased from Oligos, Etc., Inc. (Wilsonville, OR). Photophysical control studies verified the mobilities of the probes and excluded non-FRET-related mechanisms of donor quenching (34). For FRET studies, duplex DNA sites were made 3 μM at pH 8.4 in 10 mM potassium phosphate, 10 mM Tris-HCl, 140 mM KCl, 1 mM EDTA, and 1 mM dithiothreitol ("FRET buffer"). With the exception of the W70L domain, studies of the protein·DNA complexes were undertaken under conditions in which the donor/acceptor-labeled DNA site was >98% bound, and so the contribution of residual free DNA to observed FRET efficiencies was negligible. To achieve >98% binding of the low affinity W70L domain, FRET studies were also performed at a combined protein and DNA concentration of 30 μM by addition of 27 μM unlabeled W70L domain·DNA complex, which shifted the equilibrium toward formation of the variant protein·DNA complex. Previous studies have established that the increase in FRET efficiency observed in this system on binding of the WT SRY domain corresponds to a DNA bend angle of ~80° as inferred from permutation gel electrophoresis (34).
FRET-based DNA Binding Assays-Steady-state FRET was employed to determine protein-DNA dissociation constants (Kd); the DNA site was as above. Measurements were made in FRET buffer at 15°C and, where feasible, also at 37°C. Varying concentrations of the WT or variant SRY domain were titrated at a constant DNA concentration of 25 nM. Emission spectra were recorded from 500 to 650 nm following excitation at 490 nm. Estimates of Kd values were determined by plotting the change in fluorescence intensity at 520 nm against total protein concentration. Data were fit with a single-site ligand-binding model (Equation 1) as described (53) using Origin 8.0 software (OriginLab Corp., Northampton, MA).
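Equation 1 itself does not survive in this text. A standard single-site binding isotherm that accounts for depletion of free protein at a fixed DNA concentration, and that uses only the symbols defined in the accompanying description (ΔF, ΔF0, Kd, D0, and S), would take the following form. It is offered as a plausible reconstruction under that assumption, not as the authors' verbatim expression.

```latex
\Delta F \;=\; \Delta F_{0}\,
\frac{\left(K_{d} + D_{0} + S\right) - \sqrt{\left(K_{d} + D_{0} + S\right)^{2} - 4\,D_{0}\,S}}{2\,D_{0}}
\qquad (1)
```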
In this formalism, ΔF is the change in donor fluorescence observed on addition of the SRY domain relative to the baseline DNA fluorescence; ΔF0 is the maximum fluorescence change obtained in a 1:1 protein·DNA complex; Kd is the dissociation constant; D0 is the concentration of DNA (25 nM); and S is the concentration of SRY domain. In the case of the W70L domain, whose specific DNA-binding affinity was significantly weaker than those of the other variants, the protein concentration range was extended to 3 μM to achieve saturation.
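As an illustration of how such a titration could be analyzed, the sketch below implements a single-site model in its usual quadratic (ligand-depletion) form and estimates Kd by a coarse grid scan. It is a minimal sketch under stated assumptions: the quadratic form is the common textbook expression rather than a confirmed copy of Equation 1, the grid scan stands in for the nonlinear least-squares routine of a package such as Origin, and all function names are hypothetical.

```python
import math


def delta_f(s, kd, d0, df0):
    """Predicted fluorescence change at total protein concentration s.

    kd, d0, and s must share one concentration unit (for example nM).
    The quadratic term gives the concentration of the 1:1 complex.
    """
    b = kd + d0 + s
    bound = (b - math.sqrt(b * b - 4.0 * d0 * s)) / 2.0
    return df0 * bound / d0


def fit_kd(s_values, df_values, d0, kd_grid):
    """Coarse least-squares estimate of (Kd, dF0) over a trial grid of Kd.

    For each trial Kd the best-fit amplitude dF0 has a closed form
    (linear least squares), so only Kd needs to be scanned.
    """
    best = None
    for kd in kd_grid:
        basis = [delta_f(s, kd, d0, 1.0) for s in s_values]
        denom = sum(x * x for x in basis)
        if denom == 0.0:
            continue
        df0 = sum(x * y for x, y in zip(basis, df_values)) / denom
        sse = sum((df0 * x - y) ** 2 for x, y in zip(basis, df_values))
        if best is None or sse < best[0]:
            best = (sse, kd, df0)
    if best is None:
        raise ValueError("no usable trial Kd in kd_grid")
    return best[1], best[2]
```

Because the DNA concentration (25 nM) is comparable to the WT dissociation constant, the depletion-corrected form matters here; a simple hyperbola in total protein concentration would tend to overestimate Kd for tight binders.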
Nonspecific DNA Binding-To test whether the binding of variant SRY domains to the consensus FRET DNA target site (above) represented a specific or nonspecific mode of DNA binding (31), the FRET-labeled protein·DNA complexes were challenged by addition of equimolar unlabeled DNA target site (15 bp) or a 10-fold molar excess of an unrelated 15-bp DNA site derived from the left operator site of bacteriophage λ (5′-ATCACCGCCAGTGGT-3′ and complement; site OL1) (54). The WT SRY domain exhibited no binding to this DNA site as assessed by a gel mobility shift assay.
Stopped-flow Kinetic FRET Assay-Rates of protein-DNA dissociation were measured with an Aviv double-mixing stopped-flow apparatus at a fixed temperature (6, 15, 25, or 37°C); the instrument contained a thermoelectric temperature controller. Fluorescence emission was monitored at 520 nm following excitation at 490 nm using an Aviv ATF 105 spectrofluorometer (34). In brief, a 20-fold molar excess (final stoichiometries) of unlabeled DNA in FRET buffer was employed to sequester the WT or variant SRY domain on rapid mixing. Estimates of dissociation rate constants (koff) were obtained by fitting the traces to a mono-exponential equation; values represent the mean and standard error of 4-5 replicates. Control studies of the WT domain indicated that similar time-dependent recovery of donor emission was observed irrespective of unlabeled DNA in the molar excess range 10-50-fold relative to the specific protein-FRET-labeled DNA complex.
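A mono-exponential analysis of such a trace might look like the sketch below. It assumes that donor emission recovers as F(t) = F_plateau - A·exp(-koff·t) and that the plateau has already been estimated (for example, from the late-time average); both are assumptions of this illustration. The log-linear regression is a simple stand-in for the nonlinear fit used in practice, and the function names are hypothetical.

```python
import math


def estimate_koff(times, signal, plateau):
    """Estimate k_off from a recovery trace by log-linear regression.

    Model (assumed): signal(t) = plateau - A * exp(-k_off * t).
    ln(plateau - signal) is then linear in t with slope -k_off.
    """
    xs, ys = [], []
    for t, f in zip(times, signal):
        gap = plateau - f
        if gap > 0.0:  # the logarithm is defined only below the plateau
            xs.append(t)
            ys.append(math.log(gap))
    if len(xs) < 2:
        raise ValueError("need at least two points below the plateau")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


def mean_and_sem(values):
    """Mean and standard error of the mean for replicate estimates."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)
```

The lifetime of the complex follows as 1/koff, so a larger rate constant corresponds to a shorter-lived bent complex.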
NMR Spectroscopy-15N-labeled WT and variant Sry domains were prepared by growth of an overexpression strain of E. coli (34) in minimal medium containing [15N]ammonium sulfate as sole nitrogen source. The proteins were dissolved in a nitrogen-purged buffer containing 10 mM potassium phosphate buffer (10% D2O, pH 6.5) and 50 mM NaCl and placed in a 280-μl Shigemi NMR tube. To obtain a spectrum of the unfolded state, the variant 15N-labeled domain was also dissolved in phosphate-buffered saline solution containing 5.4 M urea (10% D2O, pH 7.4). Two-dimensional 1H-15N heteronuclear single-quantum coherence (HSQC) spectra were acquired at 25°C using a BRUKER AVANCE 700 MHz spectrometer.
Mammalian Plasmids-Plasmids expressing full-length human SRY or variants were constructed by polymerase chain reaction (PCR) (41). Following the initiator Met, the cloning site encoded a hemagglutinin (HA) tag in triplicate. In selected constructs, an element encoding a heterologous nuclear localization signal (NLS; sequence PRRRKV as derived from the large T antigen of simian virus 40 (SV40)) was inserted after the HA-related codons (38). Mutations in SRY were introduced using QuikChange™ (Stratagene). Constructions were verified by DNA sequencing.
Mammalian Cell Culture-CH34 cells (kindly provided by Dr. P. K. Donahoe, Massachusetts General Hospital, Boston) (38,42) were cultured in Dulbecco's modified Eagle's medium (DMEM) containing 5% heat-inactivated fetal bovine serum at 37°C under 5% CO2. For proteasome-inhibitor studies, transfected cells were maintained for 24 h in serum-free conditions and then treated with the proteasome inhibitor MG132 for 6 h followed by 18 h of incubation in 5% serum-containing medium.
Transient Transfection-Transfections were performed using FuGENE6 (Hoffmann-La Roche) as described by the vendor. After 24 h in serum-free medium, cells were recovered using fresh DMEM containing 5% heat-inactivated fetal bovine serum. Transfection efficiencies were determined by ratio of green fluorescent protein (GFP)-positive cells to untransfected cells following co-transfection with pCMX-SRY and pCMX-GFP in equal amounts (38). Subcellular localization was visualized by immunostaining 24 h post-transfection following treatment with 0.01% trypsin (Invitrogen) and plating on 12-mm coverslips. SRY expression was monitored by Western blot via its triplicate HA tag (see above).
Cycloheximide Chase Assay and Western Blot-At 24 h post-transfection, cells were split evenly into 6-well plates and treated with cycloheximide to a final concentration of 20 μg/ml in DMEM for the indicated times; cells were then lysed in radioimmunoprecipitation (RIPA) buffer (Hoffmann-La Roche). After protein normalization, cell lysates were subjected to 12% SDS-PAGE and Western blot using anti-HA antiserum (Sigma) at a dilution ratio of 1:5000 with α-tubulin as loading control. Quantification was performed by ImageJ software (National Institutes of Health, website rsbweb.nih).
Transcriptional Activation Assay-Following transient transfection (see above), SRY-mediated transcriptional activation of Sox9 was measured in triplicate by quantitative real-time RT-PCR (qPCR) as described (41). Cellular RNA was extracted using RNeasy (Qiagen, N.V., Hilden, Germany). The transfection protocols were performed with the following: (i) SRY expression plasmid only, at 1 μg of WT or variant SRY-encoding plasmid per million cells ("1×" conditions) and (ii) a mixture of the empty mammalian parent (0.98 μg) and target SRY-encoding plasmid (0.02 μg), with the overall transfected mass of plasmids retained at 1 μg ("50×" conditions). Such dilution provided a control for potential artifacts of TF overexpression (38).
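The plasmid masses quoted for the diluted condition follow from simple arithmetic, sketched below for convenience. The helper name is hypothetical; it assumes only that the total transfected mass is held constant and that the SRY-encoding plasmid is diluted by the stated factor with empty parent vector.

```python
def plasmid_mix(total_ug, dilution_factor):
    """Split a fixed total plasmid mass into SRY plasmid and empty vector.

    Returns (sry_ug, empty_vector_ug) for the given fold dilution.
    """
    if dilution_factor < 1:
        raise ValueError("dilution factor must be at least 1")
    sry_ug = total_ug / dilution_factor
    return sry_ug, total_ug - sry_ug


# 1 ug total per million cells, 50-fold dilution of the SRY plasmid
sry_ug, empty_ug = plasmid_mix(1.0, 50)
```

With these inputs the split corresponds to the 0.02 μg and 0.98 μg masses given in the text.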
Immunocytochemistry-Transfected cells were plated evenly on 12-mm coverslips, fixed with 3% paraformaldehyde in phosphate-buffered saline (PBS at pH 7.4) on ice for 30 min, treated with cold permeability buffer solution (PBS containing 10% goat serum and 1% Triton X-100; Sigma) for 10 min, blocked with 10% goat serum and 0.1% Tween 20 in cold PBS (Sigma), and incubated overnight at 4°C with FITC-conjugated anti-HA antibody (diluted to 1:400 ratio; Santa Cruz Biotechnology, Santa Cruz, CA). After washing and 4′,6-diamidino-2-phenylindole (DAPI) staining, cells were visualized by fluorescence microscopy. Nuclear localization was evaluated by the ratio of cells exhibiting nuclear HA-tagged SRY to the total number of GFP-positive cells.
Chromatin Immunoprecipitation (ChIP)-CH34 cells were transiently transfected with WT or variant SRY constructs, exposed to MG132, and subjected to ChIP. In brief, recovered cells were cross-linked in wells by formaldehyde, collected, and lysed after quenching the cross-linking reaction. Chromatin lysates were sonicated to generate 300-400-bp DNA fragments and immunoprecipitated with anti-HA antiserum (Sigma) coupled with protein A slurry (Santa Cruz Biotechnology) after pre-clearing; a nonspecific antiserum (Santa Cruz Biotechnology) served as a control. After reversal of cross-linking at 65°C overnight, fragments were treated with proteinase K and RNase (Hoffmann-La Roche), followed by extraction with 1:1 phenol/chloroform-isoamyl alcohol (CIAA) solution. A high fidelity PCR protocol was provided by the vendor.
Co-immunoprecipitation (Co-IP) Assays-CH34 cells expressing HA-tagged SRY variants were treated with MG132 and lysed using complete Lysis-M buffer containing a protease inhibitor mixture as described by the vendor (Hoffmann-La Roche). Two co-IP targets were investigated. (i) In SRY-CaM studies, lysates were precipitated with monoclonal anti-HA-agarose beads (Sigma). Following 12% SDS-PAGE, Western blots employed an anti-CaM antiserum (Abcam, Cambridge, MA). Equal CaM loading was verified by Western blot. (ii) In SRY-Exp4 studies, transfected cells were co-transfected with pCMX-FLAG-human Exp4. MG132-treated cell lysates were immunoprecipitated by monoclonal anti-FLAG agarose using the vendor's protocol (Sigma). After analysis by 10% SDS-PAGE and electroblotting, a hybridization solution containing horseradish peroxidase (HRP)-conjugated anti-HA antiserum (Hoffmann-La Roche) was used to investigate Exp4-bound SRY. Anti-FLAG antiserum was used to monitor Exp4 expression; respective antisera against HA tag and α-tubulin (Sigma) provided SRY input and general loading controls.

RESULTS
Our study had three parts. We first screened all 19 possible substitutions at SRY residue 70 (box position 15) in a Y1H system in which expression of β-gal was dependent on high affinity binding of the SRY HMG box to consensus DNA target sites (triplicate 5′-ATTGTT-3′ and complement) (41). We next characterized the biochemical and biophysical properties of four representative variant domains. Because box position 15 (helix α1) contributes to the hydrophobic core of the HMG box, these in vitro studies focused on nonpolar side chains of high α-helical propensity (55) but progressively smaller size as follows: Tyr, Phe, Leu, and Ala.
To relate structure to biological function, our final studies employed an embryonic pre-Sertoli cell line (rodent XY cell line CH34 (33,42)) to probe, following transient transfection, SRY-directed transcriptional activation of the endogenous Sox9 gene (38,41). Employing qPCR, this assay measured the time-dependent accumulation of mRNAs encoded by Sox9, the principal target gene of murine Sry in vivo (4). This cellular model also permitted analysis of protein turnover, nuclear localization, and occupancy of the testis-specific core enhancer of Sox9 (TESCO (4)) as described (38).
Y1H Screening Distinguished between Aromatic and Nonaromatic Substitutions-The yeast system employed the following: (i) plasmid-based expression of WT or variant SRY fusion proteins, and (ii) a chromosomally integrated reporter constructed to express β-gal under the transcriptional control of triplicate SRY target sites (Fig. 3A). An initial screen of all 19 possible amino acid substitutions was performed on X-Gal indicator plates (Fig. 3B). Whereas the WT fusion protein gave rise to a deep blue colony, the variant colonies were either light blue (Phe and Tyr) or white (the remaining 17 substitutions, including Leu and Ala). Quantitative assessment of selected β-gal enzymatic activities was undertaken in cell lysates (Fig. 3C). In this system, the low enzymatic activities associated with the W70L and W70A (consensus box position 15) fusion proteins were indistinguishable from that of an empty vector control (em in Fig. 3C). In accordance with visual inspection of the colonies, W70F and W70Y fusion proteins gave rise to enzymatic activities reduced by 50-70% relative to WT.
Substitutions Weakened Specific Protein-DNA Affinities-Specific protein-DNA dissociation constants were measured based on steady-state FRET (34); these studies employed a similar 15-bp DNA duplex. Changes in distance between respective 5′-ends were monitored by changes in FRET efficiency between donor (6-FAM) and acceptor (TAMRA) (Fig. 3D). Dissociation constants were determined by titration of the 6-FAM/TAMRA-labeled DNA substrate (25 nM) with increasing amounts of WT or variant domains in the concentration range 1 nM to 3 μM. At 15°C, the W70L domain exhibited the weakest affinity (Kd 330(±30) nM) relative to WT (Kd 14(±3) nM) (Fig. 3E and Table 1). The affinity of the W70A domain was similar to that of the W70F domain (Kd values 49(±5) and 53(±6) nM, respectively). The W70Y variant was least perturbed (Kd 32(±3) nM). At 37°C, the WT Kd was similar to its value at 15°C, whereas the aromatic variants exhibited 2-fold increased affinity (Kd 24(±2) and 26(±4) nM, respectively) at the higher temperature. In contrast, at 37°C, the W70L and W70A domains exhibited lower affinities (Table 1).
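To put these dissociation constants on a free-energy scale, the fold change in Kd can be converted to a binding penalty via ΔΔG = RT ln(Kd,variant/Kd,WT). The sketch below applies this standard thermodynamic relation to the 15°C values quoted above; the conversion is an illustrative addition rather than an analysis reported in the text, and the names are hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

# Kd values (nM) at 15 C as quoted in the text above
KD_15C_NM = {"WT": 14.0, "W70Y": 32.0, "W70A": 49.0, "W70F": 53.0, "W70L": 330.0}


def binding_penalty(kd_variant, kd_wt, temp_c):
    """Return (fold change in Kd, ddG of binding in kcal/mol) versus WT."""
    fold = kd_variant / kd_wt
    ddg = R_KCAL * (temp_c + 273.15) * math.log(fold)
    return fold, ddg


if __name__ == "__main__":
    for name, kd in KD_15C_NM.items():
        fold, ddg = binding_penalty(kd, KD_15C_NM["WT"], 15.0)
        print(f"{name}: {fold:.1f}-fold, ddG = {ddg:.2f} kcal/mol")
```

On this scale the clinical variant W70L, with a Kd more than 20-fold that of WT, corresponds to a penalty somewhat below 2 kcal/mol, whereas the aromatic variants lie well under 1 kcal/mol.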
Substitutions Perturbed Structure and Impaired Stability-Structures of the free domains were investigated by far-UV CD (as a probe of secondary structure) and intrinsic Trp fluorescence (a probe of tertiary structure and quenching). The free W70L domain was also probed by heteronuclear two-dimensional 1H-15N NMR spectroscopy.
CD spectra of the free domains exhibited two classes of perturbations (Fig. 4A). Whereas the W70L and W70A domains were largely disordered at 4°C and essentially without organized secondary structure at 37°C (Fig. 4B), at 4°C the W70F and W70Y domains exhibited only a subtle attenuation of α-helix-associated features with decreased ratio of ellipticities at 222 and 208 nm (Fig. 4A); this perturbation was more marked at 37°C (Fig. 4C). Mean residue ellipticities at 37°C are given in Table 2. The temperature dependence of mean residue ellipticity at 222 nm indicated partial (W70F and W70Y) or marked (W70A and W70L) loss of thermal stability (Fig. 4, A-C).
Stabilities were further assessed by guanidine-induced unfolding at 4°C as monitored by CD. Whereas variants W70F and W70Y (Fig. 4D, blue and green lines) were only moderately less stable (leftward shift in denaturation transition) than WT (black), the W70L and W70A domains (red and gray) exhibited noncooperative unfolding transitions characteristic of nascent α-helical peptides or molten globules (Fig. 4D) (56,57). The latter deviation from classical two-state unfolding was especially prominent at low concentrations of guanidine HCl (0.015 to 0.9 M; inset in Fig. 4D) as no pre-transition baseline could be defined. Application of a two-state model to the aromatic variants yielded estimates of ΔGu (Table 3). Relative to the baseline stability of the WT domain (ΔGu 3.0(±0.1) kcal/mol), decrements in stability (ΔΔGu) were 0.6(±0.2) kcal/mol (W70F) and 1.0(±0.2) kcal/mol (W70Y). Two-state modeling of the aliphatic variants was not undertaken. These trends were recapitulated in studies of thermal unfolding (Fig. 4, E-H). The W70A and W70L domains exhibited incremental and progressive attenuation of residual α-helical structure with increasing temperatures in the range 4-40°C, whereas the aromatic variants retained cooperative thermal unfolding transitions (Fig. 4G) with reduced Tm values (Fig. 4H and Table 3). The partially folded state of the W70L domain was corroborated at 15°C by comparison of its 1H-15N HSQC fingerprint spectrum (blue cross-peaks in the two-dimensional NMR spectrum shown in Fig. 4I) to corresponding spectra of its unfolded state (in 5.4 M urea; red cross-peaks) and the WT domain under native conditions (black cross-peaks). Chemical shift dispersion in the W70L spectrum was thus intermediate between that of the WT domain under native or urea-unfolded conditions.
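The practical meaning of these ΔGu values can be illustrated with a two-state calculation. The sketch below assumes the commonly used linear extrapolation model, in which the unfolding free energy falls linearly with denaturant concentration; the text does not state which two-state formalism was applied, and the m-value used in the usage lines is a purely hypothetical placeholder. Function and variable names are likewise illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fraction_unfolded(dg_u, m_value, denaturant_molar, temp_c):
    """Two-state fraction unfolded under a linear extrapolation model.

    dg_u: unfolding free energy without denaturant (kcal/mol)
    m_value: slope of free energy versus denaturant (kcal/mol/M);
             hypothetical here, since no m-value is quoted in the text
    """
    dg = dg_u - m_value * denaturant_molar
    k_unfold = math.exp(-dg / (R_KCAL * (temp_c + 273.15)))
    return k_unfold / (1.0 + k_unfold)


# Stabilities at 4 C from the text: WT 3.0; W70F 3.0 - 0.6; W70Y 3.0 - 1.0
DG_U = {"WT": 3.0, "W70F": 2.4, "W70Y": 2.0}
M_ASSUMED = 1.0  # kcal/mol/M, placeholder value for illustration only

unfolded_in_buffer = {
    name: fraction_unfolded(dg, M_ASSUMED, 0.0, 4.0) for name, dg in DG_U.items()
}
```

In the absence of denaturant the m-value drops out, so the buffer-only populations depend solely on the quoted ΔGu values: under this model each of the three domains remains predominantly folded at 4°C, with the unfolded fraction rising as ΔGu decreases.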
The SRY domain contains three Trp residues (Trp-70, Trp-98, and Trp-107; respective box positions 15, 43, and 52). Trp-70 and Trp-98 are largely buried within the major wing, whereas Trp-107 projects from its back surface. Hydrophobic core-specific insight was obtained from studies of intrinsic Trp fluorescence (Fig. 5, A and B). The excitation wavelength (295 nm) and path length (0.3 cm) were chosen to minimize the DNA-dependent inner filter effect (see "Experimental Procedures"). Emission maxima (λmax) and integrated amplitudes are given in Table 4.
The Trp fluorescence emission spectrum of the WT domain contains a major contribution from the exposed and mobile indole ring of Trp-107 (box position 52) on the back surface of the domain (see Footnote 7). The integrated intensity of the Trp fluorescence emission spectrum of the WT complex at 15°C was 2-fold less than that of the free domain, providing evidence for the quenching of one or more Trp side chains; such quenching is presumably due to stabilization of the domain on specific DNA binding leading to more complete desolvation of its hydrophobic core. In accordance with this interpretation, the maximum of the native WT spectrum (355 nm) was blue-shifted relative to its λmax in 8 M guanidine (364 nm, data not shown).
Comparative studies of Trp fluorescence demonstrated that the integrated intensities of the spectra of the free variant domains were each larger than that of the WT domain, despite the added presence of Trp-70 in the latter. Relative intensities followed the order W70L > W70A > W70Y = W70F > WT (Fig. 5A). With the exception of W70L > W70A, this trend correlates with side-chain volume. A red shift in λmax was observed for aliphatic variants W70L and W70A (359(±0.2) and 357.8(±0.2) nm, respectively; Δλ 4(±0.4) and 3(±0.4) nm relative to WT), whereas aromatic substitutions W70Y and W70F were associated with blue shifts (λmax 340.2(±0.2) nm and 344.2(±0.2) nm; Δλ −15(±0.4) nm and −11(±0.4) nm relative to WT). In control experiments (data not shown) Trp emission spectra of the denatured variant domains (unfolded in 8 M guanidine) were attenuated by one-third relative to WT in accordance with the number of Trp residues (two versus three); each exhibited the same emission maximum (λmax 362 nm).
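The reported shifts in emission maximum follow from simple subtraction. The sketch below is illustrative only; it assumes that the rounded WT maximum quoted in the text (355 nm) is the reference value, since that choice reproduces the rounded shifts given above, and it adds the two ±0.2 nm uncertainties linearly to match the quoted ±0.4 nm.

```python
WT_LAMBDA_MAX_NM = 355.0   # rounded WT emission maximum quoted in the text
STEP_ERROR_NM = 0.2        # per-spectrum uncertainty set by the 0.2-nm step size

# Emission maxima of the free variant domains, from the text
VARIANT_LAMBDA_MAX_NM = {"W70L": 359.0, "W70A": 357.8, "W70Y": 340.2, "W70F": 344.2}


def emission_shift(variant_nm, wt_nm=WT_LAMBDA_MAX_NM):
    """Shift in emission maximum relative to WT (positive = red shift)."""
    return variant_nm - wt_nm


shifts = {name: round(emission_shift(value)) for name, value in VARIANT_LAMBDA_MAX_NM.items()}
shift_error_nm = 2 * STEP_ERROR_NM   # linear sum of the two +/-0.2 nm uncertainties

for name, shift in shifts.items():
    direction = "red" if shift > 0 else "blue"
    print(f"{name}: {shift:+d} nm ({direction} shift), +/-{shift_error_nm:.1f} nm")
```

Red shifts (aliphatic variants) indicate a more solvent-exposed Trp environment on average, whereas blue shifts (aromatic variants) indicate a more nonpolar one.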
On binding to the specific DNA site, Trp emission spectra of the W70F and W70Y domains exhibited changes similar to those observed in the WT complex (Fig. 5B). Despite the absence of Trp-70, their bound-state integrated amplitudes were only slightly larger than that of the WT complex, suggesting that the residue-specific contribution of Trp-70 to the emission spectrum of the WT complex is <7% of the total fluorescence intensity (Table 4). The spectra of the aliphatic variants retained the highest overall intensities. Unlike the free domains, however, relative intensities in the variant DNA complexes correlated with side-chain volume (W70A > W70L). In all cases, [...]
7 The contribution of Trp-107 to the WT fluorescence spectrum was inferred from comparison of the WT domain and a W107D variant (data not shown). Whereas the latter exhibited a native CD spectrum and native specific DNA-binding affinity below 30°C, its Trp emission spectrum was decreased by ~30% relative to WT. In 6 M urea, the spectrum of the variant was decreased by one-third as expected (two Trp residues versus three).
a This is defined as the difference between the observed spectrum of the protein·DNA complex and the sum of the spectra of the free DNA and free protein (see Fig. 7).
b The position and amplitude of the positive DNA band in the near-UV CD spectrum are sensitive to the geometry of the double helix (59); contributions from the protein are negligible in this region (see Figs. 5 and 6); values are ±0.5 nm.
c For clarity, the CD spectrum of the free DNA site (15 bp; see "Experimental Procedures") was normalized to match those of the free SRY domain and the specific protein·DNA complex at the same concentration; values do not represent mean residue (or bp) ellipticity.
d IA means inapplicable.
e Because of its weaker binding, the sample labeled "W70L complex" was 85% bound and 15% free.
The m value represents the slope d(ΔGu)/d(M) and correlates with extent of nonpolar surfaces exposed on denaturation.
c Cmid is defined as the denaturant concentration at which 50% of the protein is unfolded.
d Tm is the apparent midpoint unfolding temperature of the free domains.
e ND means not defined as these values could not be determined due to absence of a pre-transition state and noncooperative transition.
[...] 5C); the relative contribution of the DNA site to the spectrum in this region is smaller and of opposite sign (positive) (200-225 nm subregion) (58). The spectrum of the WT complex thus resembled the sum of the spectra of the free domain and free DNA, leading to a difference spectrum that was without significant features (see Fig. 7A, below).
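The CD difference spectrum is defined in the text as the observed spectrum of the protein·DNA complex minus the sum of the spectra of the free DNA and free protein. A minimal sketch of that subtraction is given below; it assumes that the three spectra were sampled at the same wavelengths and concentration, and the ellipticity values are hypothetical placeholders rather than data from Fig. 7.

```python
def difference_spectrum(complex_cd, free_protein_cd, free_dna_cd):
    """Point-by-point difference: complex - (free protein + free DNA)."""
    if not (len(complex_cd) == len(free_protein_cd) == len(free_dna_cd)):
        raise ValueError("spectra must be sampled at the same wavelengths")
    return [c - (p + d) for c, p, d in zip(complex_cd, free_protein_cd, free_dna_cd)]


# HYPOTHETICAL ellipticities at three wavelengths (arbitrary units)
wavelengths_nm = [208, 222, 275]
complex_cd = [-12.0, -10.0, 2.0]
free_protein_cd = [-9.0, -8.0, 0.0]
free_dna_cd = [1.0, 0.5, 2.5]

diff = difference_spectrum(complex_cd, free_protein_cd, free_dna_cd)
for wl, value in zip(wavelengths_nm, diff):
    print(wl, "nm:", value)
```

A difference spectrum near zero at all wavelengths indicates that binding changed neither partner's spectrum, as reported for the far-UV region of the WT complex at 4°C; negative values at 208 and 222 nm indicate DNA-dependent gain of α-helix.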
By contrast, in the near-UV region (320-250 nm) the DNA duplex makes the predominant contribution; its CD spectrum undergoes marked changes that resemble those associated with the canonical B → A transition (bracket in Fig. 5C) (59). Relative contributions from protein aromatic residues in this region, as inferred from the spectrum of the free domain (open circles in Fig. 5C), were negligible. Indeed, whereas the near-UV CD spectrum of the free DNA site (solid line in Fig. 5C) was typical of a mixed sequence DNA duplex (i.e. not a homopolymer), the spectrum of the bound DNA site (filled circles) exhibited A-like features, including a blue shift in band maximum (from 280 to 270 nm; Table 2). Such a spectral change has been observed in other SOX·DNA complexes (60) in accordance with the A-like structure of the bound DNA, which is under-wound and exhibits a widened minor groove relative to canonical B DNA (59).

Aromatic Buttress in an HMG Box
At 37°C the CD spectrum of the WT complex provided evidence for partial α-helical stabilization relative to the free domain (Fig. 5D) as observed in a previous study (41). Ellipticity at 222 nm is more negative in the complex than in the free domain (vertical gold arrow in Fig. 5D) despite negligible ellipticity at this wavelength in the spectrum of the free DNA site (solid line in Fig. 5D). Although a more significant contribution from the novel SRY-bound DNA conformation cannot be excluded, this possibility seems unlikely as reference CD spectra of canonical A, B, C, and Z DNA are weak in this region relative to that of an equimolar α-helical domain (59). Furthermore, the difference spectrum (Fig. 7B, below) exhibited an overall α-helical signature in the 200-250-nm region. Specific DNA-dependent accentuation of α-helical features in the CD spectrum of the WT complex at 37°C may in principle reflect either induced fit or conformational selection (62). The above DNA-dependent protein folding reflected partial thermal unfolding of the free WT domain at a temperature at which the specific DNA complex is stable.

TABLE 4 Trp fluorescence emission spectra
The excitation wavelength was set at 295 nm to minimize the inner filter effect on addition of DNA; control studies of the WT SRY domain in the presence or absence of a solution of free A, T, C, and G deoxynucleotides (with ratio equal to that in the specific DNA site and at a concentration adjusted so that the absorbance at 295 nm was equal to that of the free DNA site at 5 μM) indicated that under these conditions the inner filter led to a 2% attenuation of the Trp emission signal. Observed values were normalized to the Trp emission spectrum of the free WT domain and corrected for the inner filter effect.
a Spectra were recorded with a step size of 0.2 nm, and so formally values are ±0.2 nm. Five independent control spectra of the WT domain acquired with a step size of 0.2 nm yielded an estimated λmax = 354.6(±0.2) nm.
b Integrated amplitudes of the Trp emission spectrum (see Fig. 5, A and B) were normalized using five independent spectra of the free WT domain, with an average integral of 46.6 ± 0.7, set to a value of 100(±1.5)% (bold in column 4).
c This value represents the integrated amplitude of the Trp emission spectrum, normalized relative to the spectrum of the WT domain, and divided by the number of Trp residues in the polypeptide; the WT value is thus 100/3 (33.3; bold in column 6). Standard errors were estimated to be ±1.5% of the relative values shown based on five replicates of the WT spectrum.
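The per-residue normalization described in the Table 4 legend (integrated amplitude relative to WT, divided by the number of Trp residues) can be sketched as below. Only the WT mean integral (46.6) and the Trp counts (three in WT, two in the position-70 variants) come from the text; the variant integral in the example is a hypothetical placeholder, and the 2% inner filter correction is omitted because the legend does not specify how it was applied.

```python
WT_MEAN_INTEGRAL = 46.6   # mean of five WT spectra, from the Table 4 legend


def relative_amplitude(integral, wt_integral=WT_MEAN_INTEGRAL):
    """Integrated amplitude as a percentage of the free WT domain."""
    return 100.0 * integral / wt_integral


def per_trp_amplitude(integral, n_trp, wt_integral=WT_MEAN_INTEGRAL):
    """Relative amplitude divided by the number of Trp residues in the polypeptide."""
    return relative_amplitude(integral, wt_integral) / n_trp


print("WT per-Trp value:", round(per_trp_amplitude(WT_MEAN_INTEGRAL, 3), 1))

# HYPOTHETICAL integral for a two-Trp variant, for illustration only
example_variant_integral = 60.0
print("example variant per-Trp value:", round(per_trp_amplitude(example_variant_integral, 2), 1))
```

Dividing by the Trp count makes explicit that a variant with only two Trp residues can still emit more per residue than WT, which is the comparison drawn in the text.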
Relative thermal stabilities of the WT and variant DNA complexes were thus probed by monitoring ellipticity at 222 nm (Fig. 5E). Apparent midpoint unfolding temperatures (Tm) are given in Table 1; these values depend on the concentration of the complex (25 μM in this study). Under these conditions, the Tm of the WT complex was 62°C (black dashed line in Fig. 5E). Tm values of the W70L and W70A complexes were markedly reduced (42 and 44°C, respectively; red and gray dashed lines in Fig. 5E), whereas those of the W70F and W70Y complexes were similar to that of the WT complex (59°C; blue dashed line in Fig. 5E).

Differential perturbations to the thermal stabilities of the variant domains and their DNA complexes motivated assessment of specific DNA-induced protein folding in each case. Observed CD spectra are shown in Fig. 6 and calculated difference spectra in Fig. 7. α-Helical stabilization was especially pronounced in the W70L complex at 4°C (Fig. 6A). Its far-UV CD spectrum (red filled circles in Fig. 6A) and its calculated difference spectrum (open black triangles in Fig. 7C) exhibited a prominent α-helical signature at 208 and 222 nm. In the near-UV region, an A-like CD feature of the bound DNA (red filled circles in Fig. 6A) was observed, similar to that of the WT complex (black filled circles in Fig. 5C). In each case, the band maximum was blue-shifted relative to the spectrum of the free DNA site (brackets in each panel and Table 3).
At 37°C, each of the variant domains exhibited DNA-dependent induction or stabilization of α-helix (Fig. 6, B-E). Although DNA binding is weaker at this temperature (Table 1), CD studies were performed under conditions in which the variant domains were >95% bound as inferred from their respective dissociation constants. The W70L complex appeared to contain less α-helix than the W70A complex as inferred from the magnitude of the negative bands at 208 and 222 nm (Fig. 6, B and C). Far-UV CD spectra of the W70L and W70A complexes were in turn attenuated relative to spectra of the W70F and W70Y complexes (Fig. 6, D and E) in accordance with their respective Tm values (Fig. 5E). As expected, based on disproportionate unfolding of the WT and variant free domains above 30°C, CD difference spectra at 37°C (Fig. 7, B, D, and F) in general exhibited more marked features than at 4°C (Fig. 7, A, C, and E). The near-UV CD signature of an altered DNA structure was attenuated in the W70L complex at 4°C (red line in Fig. 7G) relative to the WT or W70F complexes (black and blue lines in Fig. 7G). At 37°C, both the W70L and W70F complexes exhibited attenuated near-UV signatures relative to that of the WT complex (Fig. 7H). Such attenuation, more marked than could be attributed to partial dissociation of the variant complexes, was not accompanied by a shift in band maximum toward that of the free DNA site.

Specific DNA Bending Was Retained in the Variant SRY Domain·DNA Complexes-To evaluate SRY-directed DNA bending, equimolar solutions of the domain and the 15-bp FRET-labeled DNA duplex were prepared at a sufficiently high concentration (5 μM) to ensure at least 98% binding of the Tyr, Phe, and Ala variants and 90% binding of the Leu variant at 15°C. Binding of the WT domain led to an increase in FRET efficiency from 10% (free DNA) to 47% (complex) in accordance with past studies (34). Similar increases in FRET efficiency were observed on binding of the W70Y and W70F domains (Fig. 8A). The variant W70L domain·DNA complex at a nominal concentration of 3 μM exhibited an apparent FRET efficiency of 34% (Fig. 8B); however, because 10% of the DNA would be unbound at this concentration (given Kd 0.3 μM; Table 1), this value represents an underestimate. To ensure at least 98% binding, steady-state FRET was re-measured at an effective concentration of 30 μM as accomplished by addition of the corresponding unlabeled DNA·domain complex to a concentration of 27 μM. Under these conditions, the FRET efficiencies of the W70L and WT complexes were identical (Fig. 8C), indicating preservation of sharp DNA bending.
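To illustrate why a rise in FRET efficiency reports DNA bending, the standard Förster relation E = 1/(1 + (r/R0)^6) can be inverted to give the apparent donor-acceptor distance in units of the Förster radius R0. This sketch is an interpretive aid and not an analysis performed in the text: it assumes a single fixed R0 and freely rotating dyes, and it reports only the ratio r/R0 so that no R0 value for the 6-FAM/TAMRA pair needs to be assumed. The efficiencies (10% free, 47% bound) are those quoted above.

```python
def distance_over_r0(efficiency):
    """Apparent donor-acceptor distance in units of R0, from E = 1 / (1 + (r/R0)**6)."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


E_FREE_DNA = 0.10     # FRET efficiency of the free DNA site, from the text
E_WT_COMPLEX = 0.47   # FRET efficiency of the WT complex, from the text

r_free = distance_over_r0(E_FREE_DNA)
r_bound = distance_over_r0(E_WT_COMPLEX)

print("free DNA   r/R0 =", round(r_free, 3))
print("WT complex r/R0 =", round(r_bound, 3))
print("apparent end-to-end shortening:", round(100 * (1 - r_bound / r_free)), "%")
```

Because efficiency depends on the sixth power of distance, equal efficiencies for the W70L and WT complexes at full occupancy imply closely similar apparent end-to-end distances and hence similar bend angles.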
Substitutions Reduced the Lifetime of the Bent Domain·DNA Complex-Dissociation rates of the protein·DNA complexes were determined using a stopped-flow FRET-based assay. Preformed bent protein·DNA complexes were rapidly mixed with 20-fold excess of unlabeled target DNA (Fig. 8D). On mixing with the labeled complex, an increase in 6-FAM (donor) fluorescence was observed, reflecting the dissociation rate constant (koff). Representative rates of WT, W70F, and W70Y domain·DNA complexes at 15°C are illustrated in Fig. 8E; aliphatic variants are shown in Fig. 8F. Dissociation rates were determined at 6, 15, 25, and 37°C (Table 5). The substitutions each led to accelerated dissociation. The extent of acceleration was greatest for the W70A variant with order of substituents as follows: Ala > Leu > Tyr, Phe > WT complexes. As expected, the protein·DNA complexes each exhibited faster rates of dissociation with increasing temperature. This temperature dependence was exaggerated in the case of the W70A complex, consistent with a lower free energy barrier to disassembly. For the unstable W70L complex above 15°C, koff could not be determined due to the limited time scale of our stopped-flow technique (Table 5); only a lower bound of 0.001 s⁻¹ could be estimated.
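The stopped-flow traces were analyzed by mono-exponential fitting, with lifetimes taken as 1/koff. A minimal sketch of such an analysis is given below, assuming a noise-free trace and a known plateau value so that a log-linear least-squares fit suffices; the rate constant and fluorescence levels used to simulate the trace are hypothetical, and the authors' fitting software and procedure are not specified here.

```python
import math


def donor_trace(t, f0, f_inf, k_off):
    """Donor fluorescence rising mono-exponentially from f0 toward the plateau f_inf."""
    return f_inf - (f_inf - f0) * math.exp(-k_off * t)


def fit_k_off(times, signal, f_inf):
    """Estimate k_off from the slope of ln(f_inf - F(t)) versus t (least squares)."""
    xs, ys = [], []
    for t, f in zip(times, signal):
        gap = f_inf - f
        if gap > 0:                      # the logarithm is defined only below the plateau
            xs.append(t)
            ys.append(math.log(gap))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope


# HYPOTHETICAL trace: 180 s of data, as in the measurements described
TRUE_K_OFF = 0.05                        # s^-1, illustrative value only
times = [float(t) for t in range(0, 181)]
signal = [donor_trace(t, f0=0.4, f_inf=1.0, k_off=TRUE_K_OFF) for t in times]

k_fit = fit_k_off(times, signal, f_inf=1.0)
lifetime_s = 1.0 / k_fit
print("k_off =", round(k_fit, 4), "s^-1; lifetime =", round(lifetime_s, 1), "s")
```

With real data the plateau is itself a fitted parameter and noise near the plateau dominates the logarithm, so a nonlinear fit of all three parameters would ordinarily be preferred; the log-linear form is used here only to keep the example dependency-free.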
Inferred association rate constants at 15°C (calculated from Kd = koff/kon) are 2.4(±0.6) × 10⁶ M⁻¹ s⁻¹ (WT domain), 2.2(±0.4) × 10⁶ M⁻¹ s⁻¹ (W70Y domain), and 1.3(±0.2) × 10⁶ M⁻¹ s⁻¹ (W70F). More marked changes were observed for the aliphatic variants as follows: 4.9(±0.6) × 10⁶ M⁻¹ s⁻¹ (W70A) and 6(±0.6) × 10⁵ M⁻¹ s⁻¹ (W70L). That the W70A complex exhibited slightly more rapid disassembly than the W70L complex, despite its greater affinity, indicates that the association rate constant (kon) of the Leu variant is disproportionately perturbed relative to that of the Ala variant. Moreover, relative to the WT domain, the increased association rate of the W70A domain in part mitigated the effect of its accelerated disassembly on the equilibrium dissociation constant. These findings imply that pre-organized native structure in the major wing of the domain retards specific DNA association.
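The inference of kon from Kd = koff/kon can be sketched as follows. The input values are hypothetical placeholders (the measured Kd and koff values reside in Tables 1 and 5, which are not reproduced here), and the propagation of uncertainties in quadrature is an assumption rather than the authors' stated method.

```python
import math


def association_rate(k_off, k_d):
    """k_on (M^-1 s^-1) from k_off (s^-1) and K_d (M), using K_d = k_off / k_on."""
    return k_off / k_d


def association_rate_error(k_off, k_off_err, k_d, k_d_err):
    """Uncertainty in k_on, assuming independent errors combined in quadrature."""
    relative = math.sqrt((k_off_err / k_off) ** 2 + (k_d_err / k_d) ** 2)
    return association_rate(k_off, k_d) * relative


# HYPOTHETICAL inputs for illustration only
k_off, k_off_err = 0.030, 0.003      # s^-1
k_d, k_d_err = 15e-9, 3e-9           # M

k_on = association_rate(k_off, k_d)
k_on_err = association_rate_error(k_off, k_off_err, k_d, k_d_err)
print(f"k_on = {k_on:.2e} +/- {k_on_err:.1e} M^-1 s^-1")
```

Because kon is a ratio, a variant can dissociate faster than WT yet retain comparable affinity if its association is correspondingly faster, which is the compensation described for W70A.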

Variant Domains Distinguished between Specific and Nonspecific DNA Sites-To evaluate nonspecific DNA binding, we challenged the FRET-labeled DNA complexes by addition of a competing unlabeled specific or nonspecific DNA duplex. An increase in 6-FAM (donor) fluorescence due to the addition of the nonspecific DNA site would signify competing nonspecific DNA binding. As shown in Fig. 9 (A, C, and E), for the aliphatic variants the increase (relative to WT) was negligible, providing evidence that the bent FRET-labeled complexes represent specific binding. In control studies employing an equimolar unlabeled specific DNA site, 6-FAM FRET emission was increased by 50% as would be predicted by re-equilibration of the domains between 6-FAM/TAMRA-labeled and unlabeled specific DNA sites (Fig. 9, B, D, and F).
Cellular Studies Demonstrated Perturbations in Multiple Biological Processes-Comparative CH34 cell-based studies of epitope-tagged human SRY and its variants (W70F, W70Y, W70A, and W70L) were undertaken following transient transfection of full-length constructs. Transfections were performed with either the SRY expression plasmid alone (1 μg of DNA per million cells; 1× protocol) or a 1:50 mixture of the expression plasmid and its empty parent (combined DNA mass of 1 μg; 50×); such dilution was designed to mitigate potential effects of protein overexpression (see Footnote 8). Whereas undiluted transient transfection of the WT SRY expression plasmid led, under these conditions, to accumulation of ~1 million SRY molecules per cell, its 1:50 dilution led to accumulation of ~1000-10,000 molecules per cell (see Footnote 9) (38) as is typical of TFs that regulate cell-fate decisions (63). Because variability in levels of WT SRY expression could occur when different aliquots of CH34 cells were employed at different dates (presumably reflecting differences in passage number since their original derivation), our comparative studies were restricted to side-by-side analysis of WT and variant SRY constructs on transient transfection of the same batch of CH34 cells on the same date.
Under these conditions, the relative extent of variability of WT SRY accumulation (or that of a given mutant SRY), as evaluated following undiluted transfection by Western blot (directed against the HA tag of the expressed SRY) with normalization based on an α-tubulin housekeeping control, was less than 10%. Such reproducibility required stringent attention to the timing of transfection, the length of cell culture, and the culture conditions.
8 Use of a strong viral promoter to drive expression of SRY in transient transfection studies (as employed here and by others (99)) typically leads to accumulation of ~10⁶ protein molecules per cell, which on nuclear localization corresponds to a concentration of ~1 μM. Such overexpression can alter the effects of TF mutations on the transcriptional regulation of target genes or reporter genes (38).
9 The number of WT or variant epitope-tagged SRY molecules per cell was estimated as described (38) by Western blot in reference to housekeeping control α-tubulin, assumed to be expressed at a level of 5 × 10⁷ molecules per cell (61).
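The correspondence in Footnote 8 between ~10⁶ molecules per cell and a nuclear concentration of ~1 μM can be checked with a short calculation. The nuclear volume used below (1.7 pL) is a hypothetical value chosen for illustration and is not stated in the text; the copy numbers are those quoted for the 1× and 50× protocols.

```python
AVOGADRO = 6.02214076e23          # molecules per mole


def molar_concentration(n_molecules, volume_liters):
    """Concentration (M) of n molecules confined to the given volume."""
    return n_molecules / (AVOGADRO * volume_liters)


NUCLEAR_VOLUME_L = 1.7e-12        # HYPOTHETICAL nuclear volume (1.7 pL), for illustration only

undiluted = molar_concentration(1e6, NUCLEAR_VOLUME_L)          # 1x protocol
diluted_low = molar_concentration(1e3, NUCLEAR_VOLUME_L)        # 50x protocol, lower estimate
diluted_high = molar_concentration(1e4, NUCLEAR_VOLUME_L)       # 50x protocol, upper estimate

print(f"1x:  {undiluted * 1e6:.2f} uM")
print(f"50x: {diluted_low * 1e9:.1f}-{diluted_high * 1e9:.1f} nM")
```

Under this assumed volume, the 50× protocol lowers the nuclear concentration from the micromolar range to roughly 1-10 nM.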
Mutations at position 70 (HMG consensus position 15) can markedly impair SRY expression and nuclear localization. Whereas aromatic substitutions W70Y and W70F were well [expressed] (Figs. 10A and 11A), aliphatic substitutions W70L and W70A led to ~8-fold reduction in the intensity of the corresponding HA-immunoblotted signal (Fig. 11B, lanes 2 and 3) and partial impairment of nuclear localization (Fig. 10A, panels f and g). Native-like levels of expression of W70L and W70A SRY could be regained on treatment of the cells with proteasome inhibitor MG132 (lanes 6 and 8 in Fig. 11B relative to WT control in lane 4) and maintained on modification of the constructs by inclusion of an exogenous NLS (NLS; green in Fig. 11B as defined in Fig. 11C). Although, following rescue of expression by MG132, the unmodified W70L and W70A SRY constructs exhibited impaired nuclear localization at a level intermediate between those of control NLS-defective constructs R62G (47) and R133W (45) (Fig. 10A, panels d and e; and Fig. 10B), on exogenous NLS fusion the W70L and W70A variants exhibited levels of nuclear localization (versus pan-cellular staining) similar to that of WT SRY (Fig. 10B; illustrated in the case of W70L in panel j of Fig. 10A). In contrast, aromatic substitutions W70Y and W70F had no significant effect on the pattern of subcellular localization (panels h and i in Fig. 10A and histogram in Fig. 10B). The above patterns of subcellular localization correlate with the relative strength of binding to epitope-tagged Exp4 (co-IP assay in Fig. 10C), whereas binding to endogenous CaM is unaffected by the substitutions at position 70 (Fig. 10D). Together, these findings suggest that an aromatic residue at position 70 contributes to the function of bipartite NLS1 in the HMG box and also contributes to the resistance of SRY to proteasomal degradation. Additional evidence that the W70L and W70A variants (but not the W70Y and W70F variants) undergo accelerated degradation was provided by analysis of time-dependent protein turnover following inhibition of ribosomal translation by cycloheximide (Fig. 11D).

FIGURE 8 (continued). Overlapping spectra were observed except for the W70L domain·DNA complex (red) due to its partial dissociation under these conditions. B and C, comparison of FRET spectra of the W70L·DNA complex at a nominal concentration of 3 μM (B) versus an effective concentration of 30 μM (C). At the latter concentration (100-fold higher than the perturbed Kd) FRET efficiency is indistinguishable from that of WT (C). D, stopped-flow FRET assay. Stopped-flow apparatus coupled to the fluorimeter and experimental design to measure dissociation of the SRY·DNA complex. One syringe contained preformed SRY-labeled DNA complex; a second syringe contained unlabeled DNA in 20-fold excess. E and F, time-dependent increase in donor fluorescence (6-FAM) due to dissociation of 6-FAM/TAMRA-labeled DNA·SRY complexes. E, dissociation reactions of the aromatic variants W70F (blue) and W70Y (green) relative to WT (black). F, W70A (gray) and W70L (red) relative to WT (black) at 15°C. Measurements were taken for 180 s until plateau was reached. Dissociation rate constants (koff) were based on mono-exponential fitting; values are given in Table 5. fluores. (arb. units), fluorescence arbitrary units.

TABLE 5 Dissociation rates and lifetimes of protein·DNA complexes
Respective dissociation rate constants (koff) were measured by stopped-flow FRET; lifetimes were determined according to (1/koff).

FIGURE 10 (continued). B, histograms describing nuclear (gray) and pan-cellular (white) patterns of variant SRY compared with WT in the presence of MG132, a proteasome inhibitor. For mutant W70L, nuclear accumulation was similar to WT with addition of exogenous nuclear localization signal derived from SV40 (W70L+NLS, far right). C, Exp4 co-IP assay. Nonaromatic substitutions, W70A and W70L, exhibit strongly impaired Exp4-dependent precipitation as characterized previously in an unrelated inherited mutation (38); housekeeping gene α-tubulin is provided for loading control. D, calmodulin (CaM) co-IP assay. CaM interactions with SRY are not significantly affected in a mammalian cell-based co-IP, independent of variant. *, p < 0.05; **, p < 0.01; ns, not significant.
The transcriptional regulatory activities of the SRY variants were assessed by quantitative RT-PCR in relation to transcriptional activation of the endogenous Sox9 gene (38). Correspondence between this cell-based assay and in vivo function (i.e. in situ activation of a male-specific GRN in the bipotential XY gonadal ridge (64)) has previously been validated (38). Initial studies employed transient transfection of the undiluted WT and variant SRY expression plasmids in the absence of MG132 and exogenous NLS fusion (black bars in Fig. 11E). Under these conditions, the activities of W70Y and W70F SRY were indistinguishable from that of WT SRY, whereas W70L and W70A variants were completely without detectable activity relative to an empty vector control. To exclude the possibility that the high activities of W70Y and W70F SRY variants represented an artifact of TF overexpression (38), their potencies were reassessed on 50× dilution (white bars in Fig. 11E, left). Such re-investigation confirmed the native-like activities of these variants; results were not affected by inclusion of an exogenous NLS in the construct (green-filled and open bars at right in Fig. 11E).
As expected, the inactive W70L and W70A SRY variants remained inactive on 50× dilution of their respective expression plasmids. To test whether such complete inactivity was due solely to impaired expression and nuclear entry or represented an intrinsic molecular defect of the mutant protein structure, we "rescued" nuclear accumulation by combination of MG132-mediated proteasomal inhibition and exogenous NLS fusion. Under these conditions, the extent of expression and nuclear localization under 1× conditions was similar to that of WT SRY (under 50× conditions respective cellular levels of expression were likewise similar to the 50× WT control, but nuclear localization could not be evaluated). Unlike the defective activity of inherited DSD-associated variant V60L as described previously (41), which was almost completely rescued by these manipulations (38), the W70L and W70A SRY variants remained completely inactive in accordance with their marked biophysical perturbations in vitro at 37°C (see above).
ChIP studies enabled evaluation of the occupancy of the WT or variant epitope-tagged SRY at TESCO regulatory sites of the endogenous Sox9 gene in rat chromosome 10; primer sets a and c (Fig. 12A) served as probes to monitor SRY occupancy, and set b (flanking a neighboring DNA segment lacking detectable SRY-binding sites) provided a negative control. As in the above qPCR studies, SRY variants were constructed with an in-frame heterologous NLS (+NLS in Fig. 12B), and the transfected cells were treated with MG132 to equalize nuclear expression of the WT or variant SRY proteins. TESCO occupancy by W70F and W70Y SRY was indistinguishable (within error) from WT (Fig. 12B and histogram in C). By contrast, the W70L and W70A variant proteins exhibited no detectable enhancer binding activity (Fig. 12B). In these studies, a control was provided by an unrelated inherited mutation (V60L), previously shown to exhibit partial TESCO occupancy under these conditions (38).

FIGURE 11 (continued). [...] 6 and 8, +MG132). C, diagram of full-length human SRY plasmids used in these studies; position of the exogenous NLS (for double-rescue experiments) is highlighted in sequences (green) below diagram. D, cycloheximide assay indicates that proteolysis is enhanced for the nonaromatic variants W70L and W70A (red and gray lines, respectively); however, W70F and W70Y (blue and purple lines, respectively) exhibit a similar cellular lifetime compared with WT (black line). E, relationship between Sox9 expression and SRY (WT and variant) at two doses indicated by solid (high concentration) and white (low concentration) transfected HA-SRY expression plasmid. qPCR data were obtained under standard conditions (left in black) and double-rescued (right in green) in the presence of MG132 with N-terminally fused SV40 nuclear localization signal (+NLS). **, p < 0.01; ns, not significant.

DISCUSSION
This study has focused on a de novo mutation in the HMG box of SRY associated with somatic sex reversal and gonadoblastoma (9). Because the mutation affects a core Trp residue that is invariant among Sry and Sox domains (consensus position 15 (36,37)) and is conserved as an aromatic residue (Phe or Tyr) among other families of HMG boxes (14), including Schizosaccharomyces pombe mating factor spMATA-mc (Fig. 13) (65), we exploited this clinical case report as a starting point for analysis of structure-activity relationships. The mutation (W70L) preserves the nonpolar character of this site but not its aromaticity. Although hydrophobic cores of globular proteins in general (such as globins (66) and lysozymes (67)) often tolerate interchange of aromatic and aliphatic side chains as alternative nonpolar packing schemes, we hypothesized that possible core packing in the HMG box is markedly restricted due to its small size, unusual angular shape, and functional role (as a DNA-bending motif).
Our approach began with an SRY-directed Y1H system to screen all 19 possible substitutions at box position 15 (position 70 of human SRY). Remarkably, only Trp conferred a blue colony on X-Gal plates, indicative of robust β-gal expression. Powder-blue colonies were conferred by aromatic substituents Tyr and Phe; the remaining 17 substitutions gave rise to white colonies. These results were corroborated by quantitative assays of enzymatic activity in yeast extracts. Based on this screen, we purified representative variant domains for biophysical study; in order of Y1H activity and side-chain volume, these were the WT domain, aromatic variants Phe and Tyr, and inactive variants Leu and Ala. Although marked differences were observed in the biophysical properties of the free domains, each variant exhibited detectable specific DNA-binding and near-native DNA bending activity as probed by FRET.
Biological Implications-SRY provides a model for a genetic switch that regulates an organ-specific GRN. Because a Y1H system is artificial and unlikely to reflect mechanisms of architectural gene regulation through long range mammalian enhancer elements (such as the TESCO site of the mammalian Sox9 gene (4)), we extended our studies to the effects of the SRY mutations in the intact protein on transient transfection of an embryonic pre-Sertoli cell line (38). Previous studies have shown that this cell line recapitulates essential aspects of the SRY-dependent GRN in the differentiating XY gonadal ridge, including immediate transcriptional activation of Sox9 and downstream activation of embryonic testis-specific Sox9 target genes (encoding fibroblast growth factor 9 and prostaglandin D2 synthase; Fgf9 and Ptgds) (38).
Our studies employed the endogenous Sox9 gene of this pre-Sertoli cell line as a transcriptional target of the transfected TF. Use of an endogenous target gene offered the advantages of providing a native enhancer element (TES) with core elements (TESCO) amenable to ChIP analysis (4); its chromatin structure and epigenetic marks are presumably similar to those found in the site and stage of SRY function in embryogenesis (42). This protocol also avoided potential artifacts associated with co-transfected reporter genes, such as nonphysiological promoter structure and elevated copy number. Although our interests focused on the gene regulatory activities of the variant SRYs, potential confounding perturbations were assessed with respect to protein expression and subcellular localization. Indeed, robust cellular expression of the variant SRYs required an aromatic residue at position 70 (consensus position 15). Variants W70L and W70A by contrast exhibited diminished steady-state protein levels rescuable by proteasome inhibitor MG132; accelerated turnover of the latter mutant proteins was verified by cycloheximide pulse-chase assays. Furthermore, the W70L and W70A variants exhibited decreased binding to Exp4 (47), resulting in impaired nuclear localization.

FIGURE 12. ChIP assays of Sox9 TESCO occupancy for human SRY variants. A, schematic model of rat Sox9 with testis-specific enhancer elements, including regions of TES and TESCO. TESCO fragments with SRY-binding sites are highlighted with gray boxes. Primer sets a and c associated with the fragments 4 and 8 (each contains two consensus SRY-binding elements highlighted with red dots) were probed for SRY occupancy; primer set b serves as negative control. B, representative gel shows PCR products of ChIP of SRY variants with fused exogenous SV40 NLS (green) and MG132 treatment. At left are nonspecific IgG controls; a negative control is provided by inactive human SRY variant I68A. C, histogram showing relative TESCO occupancies in each set by SRY variant proteins; the WT SRY signal is defined as 100%. Horizontal brackets designate statistical comparisons as in Fig. 10.
To assess the intrinsic gene regulatory activities of these mutant TFs in a mammalian cell, we first defined conditions such that the levels of WT and mutant proteins were similar (via MG132 proteasome inhibition) and the extent of nuclear localization was similar (via a fused SV40 NLS). Under these conditions, SRY-dependent transcriptional activation of the endogenous Sox9 gene was probed by qPCR at high and low levels of TF overexpression. The latter (corresponding to 1000-10,000 protein molecules per cell (38)) was accomplished by dilution of the expression vector by its empty parent plasmid (thus preserving the total amount of DNA per transfection). Through these experimental maneuvers, we found that the aromatic variants W70Y and W70F retain native gene regulatory activity, whereas the aliphatic variants W70L and W70A are completely inactive even when robustly expressed and properly localized in the nucleus. Such inactivity correlated with negligible accumulation of the mutant proteins at TESCO sites as monitored by ChIP.
This study has also provided mechanistic insight into the bipartite NLS in human SRY (residues 60-77 in intact human SRY; box positions 5-22) (68). This motif binds Exp4 and is conserved among SOX domains (47,69). Although such Exp4 complexes have not been characterized, analogous bipartite NLS motifs have been dissected in studies of importin-α (70) as exemplified by the NLS of cargo protein nucleoplasmin (71). The crystal structure of an Imp-α·cargo peptide complex demonstrated interactions not only with the two clusters of basic residues but also with intervening nonpolar residues (72). In SRY and SOX domains, Trp-70 is located at position 8 of a 12-residue linker between basic clusters. Our findings that its substitution by Leu or Ala markedly impairs binding to Exp4 in association with defective nuclear import strongly suggest that this conserved aromatic ring contacts Exp4. Although it is formally possible that the effects of the Leu and Ala substitutions are indirect (i.e. mediated via transmitted structural perturbations associated with the partial unfolding of the free domains), this possibility seems unlikely given that partial unfolding of the WT domain may be required in the Exp4·cargo complex. Impaired Exp4 binding and nuclear entry have also been described on substitution of Val-60 and Met-64 (respective box positions 5 and 9) by other nonpolar side chains (38,40). Together, these observations suggest that the bipartite NLS of SRY presents a specific signature of nonpolar side chains as well as canonical clusters of basic residues as a motif for Exp4 recognition. It would be of future interest to determine the crystal structure of a complex between Exp4 and a cargo peptide derived from an SRY or SOX domain.
Despite their perturbed efficiency of nuclear import, the present substitutions did not impair binding of SRY to CaM, previously proposed to mediate nuclear localization (48,73). It is possible that CaM and Exp4 are each required for nuclear entry of SRY such that its impaired binding to either protein could reduce its nuclear import and accumulation, in turn attenuating transcriptional activation of Sox9.
Biophysical Implications-Each substitution at position 70 (box position 15) creates a cavity in the HMG domain. Cavity-creating mutations in the cores of globular proteins are in general destabilizing (74); the extent of destabilization depends on the volume of the cavity, the presence of polar groups, and the potential for compensating changes in neighboring core positions (75). The potential cavity created at position 70 (when modeled with Gly in the absence of structural compensation) has a volume of 164 Å³; in such rigid models, each of the variants would leave a packing defect in the core of the HMG box ranging from 153 Å³ (Ala) to 32 Å³ (Tyr). Studies of the free domains demonstrated that although aromatic substituents Tyr and Phe enable near-native folding and stability, aliphatic substituents Leu and Ala were associated with attenuated helical content and marked reductions in thermodynamic and thermal stabilities. Matthews and co-workers (76) have estimated that cavities in globular domains impose a free-energy penalty of 24-33 cal/mol/Å³. This empirical relationship suggests, for example, that the potential cavity associated with W70A (in an otherwise native-like domain) would destabilize the domain by 3.7-5.0 kcal/mol. Because this estimate exceeds the baseline stability of the WT domain (3.0 (±0.1) kcal/mol), it seems reasonable that the variant domain partially unfolds. Analogous partial unfolding was observed in response to an Ala core substitution in an engineered insulin monomer (77), whose stability is similar to that of an HMG box. Despite the larger side-chain volume of Leu relative to Ala, the clinical W70L variant domain exhibited more profound biophysical perturbations as indicated by CD and Trp fluorescence. We speculate that its smaller predicted cavity penalty is offset by steric perturbations due to the nonplanar shape of the Leu side chain.
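The cavity-penalty arithmetic above can be reproduced with a few lines of Python. This is a minimal sketch rather than part of the original analysis: the function and variable names are illustrative, the cavity volumes are the rigid-model values quoted in this section (the Leu value of 106 Å³ is the one cited later in the discussion of DNA-dependent folding), and the penalty range is the 24-33 cal/mol/Å³ estimate attributed to Matthews and co-workers (76).

```python
# Rigid-model cavity volumes (cubic angstroms) quoted in the text for position 70.
CAVITY_VOLUME_A3 = {"Ala": 153.0, "Leu": 106.0, "Tyr": 32.0}

# Empirical free-energy penalty per unit cavity volume (cal/mol/A^3), low and high.
PENALTY_CAL_PER_MOL_A3 = (24.0, 33.0)

# Baseline stability of the WT domain quoted in the text (kcal/mol).
WT_STABILITY_KCAL = 3.0


def cavity_penalty_kcal(volume_a3):
    """Return the (low, high) estimated destabilization in kcal/mol."""
    low, high = PENALTY_CAL_PER_MOL_A3
    return (volume_a3 * low / 1000.0, volume_a3 * high / 1000.0)


def exceeds_wt_stability(volume_a3):
    """True if even the low-end penalty exceeds the WT stability."""
    return cavity_penalty_kcal(volume_a3)[0] > WT_STABILITY_KCAL


if __name__ == "__main__":
    for residue, volume in CAVITY_VOLUME_A3.items():
        low, high = cavity_penalty_kcal(volume)
        print(f"W70{residue[0]} ({residue}): {low:.1f}-{high:.1f} kcal/mol, "
              f"exceeds WT stability: {exceeds_wt_stability(volume)}")
```

For the Ala cavity this gives 3.7-5.0 kcal/mol, in agreement with the estimate in the text. The calculation assumes a rigid, otherwise native-like domain and so ignores the steric effects invoked for Leu.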
The above findings indicate that the segmental stability of α-helices in the HMG box is coupled to their coalescence to define a hydrophobic core as envisioned in diffusion-collision models of nascent protein folding (78). Because Leu and Ala exhibit high intrinsic α-helical propensity in isolated peptides (79), these variant domains may be viewed as equilibrium models of WT protein-folding intermediates (80). By analogy to partial folds associated with pairwise substitution of disulfide-bridged cysteines by Ala in classical models of protein folding (such as bovine pancreatic trypsin inhibitor (80), α-lactalbumin (81), or insulin (82)), the further biophysical characterization of these variant SRY domains may provide insight into the folding landscape of the HMG box.
Protein-DNA recognition is often associated with disorder-order transitions (83). A long-standing paradigm has been provided by the DNA-dependent stabilization of the basic α-helical "scissors" of the leucine zipper/bZIP motif (84,85). Disorder-order transitions have also been described on binding of sequence-specific HMG boxes to DNA target sites (86,87). Although such findings have focused on the minor wing (whose mini-core is stabilized in the specific domain·DNA complex (86,87)), we sought to investigate whether specific DNA binding could also promote the native-like folding of a variant major wing destabilized by a cavity-associated mutation. Remarkably, the W70A variant SRY domain exhibited near-complete recovery of native α-helical content on specific DNA binding; the extent of DNA-dependent folding was especially marked at 37°C. By contrast, however, the W70L domain was refractory to such specific DNA-dependent folding despite its larger side-chain volume (and hence smaller residual cavity volume; 106 Å³ versus 153 Å³). We speculate that the nonplanar γ-branch of the Leu side chain restricts alternative packing schemes, disallowing native-like DNA-dependent α-helical folding. Steric clash by one or the other δ-methyl groups of Leu-70 presumably frustrates the specific DNA-bound conformation. Such frustration may underlie the foreshortened lifetime of the Leu-70 domain·DNA complex relative to the Ala-70 domain·DNA complex. Near-native accommodation of Tyr and Phe (and their occurrence in other classes of HMG boxes at consensus position 15) may reflect the planarity of these rings as well as the opportunities for favorable weakly polar interactions (including edge-to-face aromatic-aromatic core packing) provided by their aromaticity (88).
A surprising outcome of this study was the finding that SRY-dependent DNA bending is robust to mutations (such as W70L and W70A) that profoundly impair the thermodynamic stability of the HMG box. Under conditions in which these variant free domains exhibit little organized secondary or tertiary structure (as probed by CD, fluorescence, and ¹H-¹⁵N NMR spectroscopy), native-like bending of a specific DNA target site is maintained as probed by FRET. This assay, which employs a 15-bp DNA duplex labeled at its respective 5′-ends by 6-FAM (donor) and TAMRA (acceptor), has previously been shown to correlate with the results of permutation gel electrophoresis (52). We imagine that specific DNA-dependent protein folding restores a well organized bent protein-DNA interface. It would be of future interest to investigate by time-resolved FRET and distance-distribution analysis (34) whether the precision of such sharp DNA bending is likewise unaffected.
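The logic of a FRET-based bending assay can be illustrated with the standard Förster relation, E = 1/(1 + (r/R0)^6): bending shortens the end-to-end distance r of the duplex and so raises the transfer efficiency. The sketch below is only a geometric illustration under stated assumptions, not the analysis used in this study. It models the 15-bp duplex as a circular arc of fixed contour length (3.4 Å rise per base-pair step, the canonical B-DNA value) and ignores dye linkers and helical phasing. The Förster distance (50 Å) and the bend angle (70°) are hypothetical placeholders and are not values reported here.

```python
import math

RISE_PER_BP_A = 3.4  # canonical B-DNA rise per base-pair step (angstroms)


def contour_length(n_bp):
    """Axial length of an n_bp duplex: (n_bp - 1) steps of 3.4 A each."""
    return (n_bp - 1) * RISE_PER_BP_A


def end_to_end_distance(n_bp, bend_deg):
    """Chord of a circular arc of fixed contour length and total bend angle."""
    length = contour_length(n_bp)
    theta = math.radians(bend_deg)
    if theta == 0.0:
        return length  # straight duplex
    return length * math.sin(theta / 2.0) / (theta / 2.0)


def fret_efficiency(distance_a, r0_a):
    """Forster relation: E = 1 / (1 + (r / R0)^6)."""
    return 1.0 / (1.0 + (distance_a / r0_a) ** 6)


if __name__ == "__main__":
    R0_HYPOTHETICAL_A = 50.0     # placeholder Forster distance, not from the paper
    BEND_HYPOTHETICAL_DEG = 70.0  # placeholder bend angle, not from the paper
    for label, angle in (("free DNA", 0.0), ("bent complex", BEND_HYPOTHETICAL_DEG)):
        r = end_to_end_distance(15, angle)
        e = fret_efficiency(r, R0_HYPOTHETICAL_A)
        print(f"{label}: r = {r:.1f} A, E = {e:.2f}")
```

Because efficiency depends on the sixth power of distance, modest shortening of the end-to-end distance near R0 yields a readily measurable change in signal, which is why steady-state FRET is a convenient proxy for bending.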
The counterintuitive robustness of protein-directed DNA bending to near-complete mutational unfolding of the SRY domain raises the intriguing possibility that the HMG box superfamily may contain members that are natively unfolded (89) and yet functional in gene regulation. In the case of SRY, two factors may have favored the evolution of an autonomous fold. First, destabilization of the HMG box at 37°C is associated with more rapid proteasome-mediated degradation of the intact variant protein in our rodent embryonic cell model. Such accelerated degradation may impair a genetic switch (by limiting enhancer occupancy) as exemplified by the clinical mutation W70L. Second, native unfolding of a free polypeptide would be expected to reduce its specific DNA-binding affinity; the free-energy cost of specific DNA-dependent protein folding would reduce the otherwise favorable binding free energy of a prefolded motif. Such an offsetting cost may contribute to the reduced affinities of the present variant domains, especially at 37°C. Because DNA-dependent protein folding can enhance specificity despite lower affinities (89), it is nonetheless possible that this biophysical strategy may be utilized elsewhere in the HMG box superfamily.
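The offsetting cost described above can be made concrete with a simple thermodynamic linkage. The following sketch assumes a two-state model in which only the folded domain binds DNA, so that the apparent dissociation constant is the intrinsic value divided by the fraction of folded protein. This model, the function names, and the negative unfolding free energies used for a destabilized variant are illustrative assumptions; only the WT stability of 3.0 kcal/mol comes from the text, and it is applied at 37°C here purely for illustration.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)


def fraction_folded(dg_unfold_kcal, temp_k):
    """Two-state folded population for a given unfolding free energy."""
    k_fold = math.exp(dg_unfold_kcal / (R_KCAL_PER_MOL_K * temp_k))
    return k_fold / (1.0 + k_fold)


def apparent_kd_factor(dg_unfold_kcal, temp_k):
    """Fold-increase in apparent Kd relative to a fully prefolded domain."""
    return 1.0 / fraction_folded(dg_unfold_kcal, temp_k)


def binding_penalty_kcal(dg_unfold_kcal, temp_k):
    """Free-energy cost of coupled folding: RT * ln(apparent Kd factor)."""
    return R_KCAL_PER_MOL_K * temp_k * math.log(apparent_kd_factor(dg_unfold_kcal, temp_k))


if __name__ == "__main__":
    T_37C = 310.15
    # 3.0 is the WT stability quoted in the text; the negative values are
    # hypothetical stand-ins for a predominantly unfolded variant.
    for dg in (3.0, 0.0, -1.0, -3.0):
        print(f"dG_unfold = {dg:+.1f} kcal/mol: "
              f"Kd factor = {apparent_kd_factor(dg, T_37C):.2f}, "
              f"penalty = {binding_penalty_kcal(dg, T_37C):.2f} kcal/mol")
```

In this model a stably folded domain pays essentially no penalty, whereas for a predominantly unfolded domain the penalty approaches the magnitude of the folding free energy, consistent with the qualitative argument in the text.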
Concluding Remarks-Specific residues in a protein may be conserved due to key roles in structure and function. In small motifs, such as the HMG box, a given side chain may play multiple roles, reinforcing its broad conservation among a family or superfamily of related folds. Trp-70 in human SRY (box position 15) provides a compelling example of such a residue. Clinical mutation W70L, originally described in association with gonadal dysgenesis and neoplasia (9), leads to a perfect storm of perturbations affecting domain folding, domain stability, resistance to proteasomal degradation, cellular trafficking, specific DNA-binding affinity, and enhancer occupancy. Amid this storm, selective preservation of specific DNA bending was unexpectedly observed. Although such bending is not sufficient to confer gene regulatory activity, we suggest that the coupling between DNA-dependent protein folding and sharp DNA bending may extend the repertoire of natively unfolded proteins to include anomalous members of the HMG box superfamily.
Clinical mutations in SRY and other TFs associated with birth defects and cancer provide valuable experiments of nature. Our studies of representative substitutions at box position 15 in SRY have thus highlighted the contribution of a conserved aromatic residue to the thermodynamic and kinetic stability of a bent protein·DNA complex. Analyses of these mutations in a cellular model of the differentiating gonadal ridge have illuminated the relationship between these biophysical features and the impaired transcriptional induction of the male program in a patient with Swyer syndrome and a gonadal malignancy.