Solution structure of the sequence-specific HMG box of the lymphocyte transcriptional activator Sox-4.

Two groups of HMG box proteins are distinguished. Proteins in the first group contain multiple HMG boxes, are non-sequence-specific, and recognize structural features as found in cruciform DNA and cross-over DNA. The abundant chromosomal protein HMG-1 belongs to this subgroup. Proteins in the second group carry a single HMG box with affinity for the minor groove of the heptamer motif AACAAAG or variations thereof. A solution structure for the non-sequence-specific C-terminal HMG box of HMG-1 has recently been proposed. Now, we report the solution structure of the sequence-specific HMG box of the SRY-related protein Sox-4. NMR analysis demonstrated the presence of three alpha-helices (Val10-Gln22, Glu30-Leu41 and Phe50-Tyr65) connected by loop regions (Ser23-Ala29 and Leu42-Pro49). Helices I and II are positioned in an antiparallel mode and form one arm of the HMG box. Helix III is less rigid, makes an average angle of about 90 degrees with helices I and II, and constitutes the other arm of the molecule. As in HMG1B, the overall structure of the Sox-4 HMG box is L-shaped and is maintained by a cluster of conserved, mainly aromatic residues.

The cloning of the RNA polymerase I transcription factor UBF 1 (1) originally led to the recognition of a novel type of DNA-binding domain, the so-called HMG box. The HMG box was named after its homology with high mobility group (HMG)-1 proteins and is defined by a loose consensus sequence of about 80 amino acids (2). At this moment, more than 60 proteins with one or more HMG boxes have been reported. An evolutionary study of the HMG box family indicated that two major subfamilies can be discriminated (3). One of these subfamilies contains proteins with a single HMG box, which binds with high sequence specificity to variants of the DNA sequence (A/T)(A/T)CAAAG. Members of this subfamily include products of the mammalian sex determinator Sry and related Sox genes (Sry HMG box-containing genes) (4,5), the Schizosaccharomyces pombe transcription factor Ste11+ (6), the lymphoid factors TCF-1 (7,8) and LEF-1 (9,10), and the products of several fungal mating type genes such as Mat-Mc of S. pombe (11) and Mt a1 of Neurospora crassa (12).
DNA binding occurs in the minor groove, as was shown for TCF-1, LEF-1, Mat-Mc, SRY, and Sox-4 by methylation- and diethyl-pyrocarbonate carboxylation interference footprinting and T(C/A)I nucleotide substitutions (13-16), and is accompanied by the induction of a strong bend in the DNA helix (14, 16-18). A bend-swap experiment demonstrated that LEF-1 and its specific DNA-binding motif can functionally replace the bending induced by the integration host factor at the attP locus in the phage integrase reaction (16).
The other subfamily includes proteins with multiple HMG boxes and with a rather nonspecific affinity for DNA, such as the HMG-1 and -2 proteins (19), UBF (1) and mtTF1 (20). Characteristic of these HMG boxes is their affinity for the cis-platinated -GG-adduct in DNA (21,22) and cruciform DNA (23,24), independent of sequence determinants. This suggested that the non-sequence-specific HMG boxes recognize DNA structure instead of DNA sequence (25).
Circular dichroism measurements and secondary structure prediction methods indicated a high α-helical content for sequence-specific HMG domains (17). This is consistent with NMR studies on the tertiary structure of the second HMG box of HMG1 (26,27) and HMG-D (28). The 60-amino acid core of these non-sequence-specific HMG boxes consists of three α-helices, which form an unusual L-shaped molecule. The angle between the two arms is 70-80° and is defined by a cluster of conserved, aromatic residues (26-28). Based on an identical secondary structure observed for the HMG box of Sox-5, a similar L-shaped structure has been suggested for this sequence-specific HMG domain (29).
Hydrophobic interactions of the HMG box of SRY with DNA by partial isoleucine side chain intercalation predict the positioning of an α-helix into a widened minor groove and might account for sequence specificity and DNA bending (30). Using the solution structure of rat HMG1B (26), a model for the SRY-DNA complex was proposed (31).
Since a detailed structure for a sequence-specific HMG box has not yet been determined, we have pursued the elucidation of the NMR solution structure of the HMG box domain of the lymphocyte transcriptional activator Sox-4. This HMG box shows high sequence-specific binding toward the AACAAAG DNA-binding motif with a Kd of 10⁻¹¹ M (15). The biological significance of the Sox-4 gene has recently been underscored in a gene disruption experiment. Mice carrying two null alleles of Sox-4 fail to develop functional valves in the heart and have a severe block in early lymphoid development.² The NMR data indicate that the secondary structure of the Sox-4 HMG box is closely related to that of Sox-5 (29) and that the overall fold compares well with that of HMG1B (26,27) and HMG-D (28).

MATERIALS AND METHODS
Plasmid Construction-The Sox-4 HMG box was cloned by PCR from pSox-4 DNA using the primers 5′-ATACATATGGCTAAGACGCCCAGTGGCCAC-3′ and 5′-CCCGGATCCTACGACCTTCTTTCG-3′ and inserted between the NdeI and BamHI sites of pET-3c (32,33). The identity of the subcloned HMG box fragment was confirmed by DNA sequencing. The resulting plasmid was transformed into Escherichia coli strain BL21(DE3).
Production and Purification of the Sox-4 HMG Box Peptide-The production and purification of the Sox-4 HMG box peptide were carried out essentially as described for the HMG box of TCF-1 (17). For the production of unlabeled HMG box peptide the transformed cells were grown at 37°C in LB, while for the production of uniformly ¹⁵N-labeled protein the cells were grown in minimal medium containing ¹⁵NH₄Cl. Both media contained 100 μg/ml ampicillin. In the midlog phase the cells were induced with 0.3 mM isopropyl-1-thio-β-D-galactopyranoside. After 3 h of induction the bacteria were harvested by centrifugation (20 min, 4000 × g, 4°C) and resuspended in ice-cold lysis buffer (50 mM Tris-HCl, 1 mM EDTA, 10% glycerol, 250 mM NaCl, 5 mM dithiothreitol, 4 mM CaCl₂, 40 mM MgCl₂, 0.5 mM phenylmethylsulfonyl fluoride, pH 8.0). Next, Triton X-100 (final concentration 0.1% (v/v)) was added, and the cells were lysed by sonication (10 × 2 min, 4°C). To reduce the viscosity of the cell lysate, the DNA was broken down with DNase I (10 μg/ml) for 15 min at room temperature. The cell debris was pelleted by centrifugation (15 min, 15,000 × g, 4°C). The DNA in the supernatant was precipitated with polyethyleneimine (final concentration 0.2% (v/v)). The HMG box peptide was collected in a 60% (NH₄)₂SO₄ precipitation. The precipitate was redissolved in 50 mM Tris-HCl, 1 mM EDTA, 10% glycerol, 50 mM NaCl, 1 mM NaN₃, 5 mM dithiothreitol, and 0.5 mM phenylmethylsulfonyl fluoride, pH 7.5, and dialyzed against the same buffer at 4°C. The dialyzed protein solution was applied to a 30 × 1-cm Accell Plus CM cation exchange column (Waters). The Sox-4 HMG box was eluted from the column with a 0.1-1 M NaCl gradient. The Sox-4 HMG box fractions were pooled, concentrated, and taken up in the desired buffer by Amicon ultrafiltration.
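As a quick sanity check on the cloning strategy, the following Python sketch searches the two primers for the recognition sequences of NdeI (CATATG) and BamHI (GGATCC). The primer strings are copied from the text with the hyphen inside the first primer removed, which assumes that hyphen is a line-break artifact; the function name find_site is purely illustrative.

```python
# Recognition sequences of the two restriction enzymes named in the text.
SITES = {"NdeI": "CATATG", "BamHI": "GGATCC"}

# Primers as printed (5'->3'). The hyphen inside the forward primer is
# assumed to be a line-break artifact and has been removed.
FORWARD = "ATACATATGGCTAAGACGCCCAGTGGCCAC"
REVERSE = "CCCGGATCCTACGACCTTCTTTCG"


def find_site(primer: str, site: str) -> int:
    """Return the 0-based index of `site` in `primer`, or -1 if absent."""
    return primer.upper().find(site.upper())


if __name__ == "__main__":
    print("NdeI in forward primer at index", find_site(FORWARD, SITES["NdeI"]))
    print("BamHI in reverse primer at index", find_site(REVERSE, SITES["BamHI"]))
```

Under that assumption, each primer carries its site a few bases in from the 5′ end, and the ATG within the NdeI site can serve as the start codon of the expressed peptide.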
SDS-Polyacrylamide Gel Electrophoresis-The purity of the isolated protein was checked by SDS-polyacrylamide gel electrophoresis on a Pharmacia PhastSystem using precast 20% SDS-polyacrylamide gels. The gels were developed by silver staining.
Protein Sequencing-The identity of the isolated protein was also checked by analysis of the first 10 amino acids of the protein sequence (34) using an Applied Biosystems model 476A protein sequencing system.
Gel Retardation Analysis-The biological activity was tested in a gel retardation experiment. For this purpose annealed oligonucleotides were labeled by T4 kinase with [γ-³²P]ATP. The probes were purified by nondenaturing polyacrylamide electrophoresis. For a typical binding reaction, 10 ng of purified protein was incubated in a volume of 15 μl containing 10 mM Hepes, 60 mM KCl, 1 mM EDTA, 1 mM dithiothreitol, and 12% glycerol. After a 5-min incubation at room temperature, probe (10,000-20,000 cpm, equalling 1 ng) was added, and the mixture was incubated for an additional 20 min. The samples were then electrophoresed through a nondenaturing 8% polyacrylamide gel in 0.25× TBE at room temperature.
Circular Dichroism Experiments-CD measurements were performed on a Jasco-600 spectropolarimeter equipped with a temperature-controlled water bath. The CD signal was calibrated with d-10-camphorsulfonic acid (35). The spectrum represents an average of 10 scans. The CD spectra were fitted as described elsewhere (36). The CD measurements were done at 293 K in 10 mM sodium phosphate, 100 mM NaCl, and 1 mM sodium azide, pH 7.4. The protein concentration of the CD sample was 48 μM.
NMR spectra were recorded on 500 and 600 MHz Bruker AMX spectrometers at 293 and 298 K. All spectra were acquired with solvent suppression during the relaxation delay. NOESY spectra (37) were recorded with mixing times of 100 and 150 ms. TOCSY spectra (38) were recorded with a clean MLEV17 pulse sequence (39) and spin-locking times of 20, 40, 60, and 85 ms. For these two-dimensional spectra 512 t₁ increments, each consisting of 96 transients per FID of 2048 data points, were collected. Two-dimensional ¹⁵N-¹H HSQC spectra were collected with 121-360 t₁ increments consisting of 2-144 transients per FID of 1024 data points. Three-dimensional ¹⁵N-¹H NOESY-HSQC spectra of 184 (t₁) × 64 (t₂) × 1024 (t₃) data points and 8 transients/FID with a mixing time of 150 ms and three-dimensional ¹⁵N-¹H TOCSY-HSQC spectra of 160 (t₁) × 64 (t₂) × 1024 (t₃) data points and 24 transients/FID with a spin-locking time of 50 ms and a clean MLEV17 pulse sequence were recorded. Pulsed field gradients were used for artifact suppression (40). Amide protons in fast exchange with water were identified from the difference between NH sensitivity-enhanced ¹⁵N-HSQC experiments recorded with and without presaturation (41). In this experiment 160 (t₁) × 1024 (t₂) points were collected. ¹⁵N backbone dynamics were determined using ¹H-¹⁵N heteronuclear NOE experiments (42,43). Gradient sensitivity-enhanced T₁ measurements (41,43) were done with relaxation times of 6, 12, 18, 24, 36, 54, 72, 96, 120, 150, and 192 ms. The ¹⁵N magnetization was spin-locked in the transverse plane during the relaxation period using a spin-lock field strength of 2.5 kHz. Spectra with 160 (t₁) × 1024 (t₂) data points were acquired.
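The relaxation delays listed above lend themselves to a single-exponential fit per residue. The Python sketch below only illustrates such a fit and is not the procedure used in this work: it assumes a mono-exponential decay I(t) = I0·exp(-t/T), fits it by linear least squares on ln I, and uses synthetic, noise-free intensities. The function name and the time constant of 80 ms are hypothetical.

```python
import math

# Relaxation delays quoted in the text, in milliseconds.
DELAYS_MS = [6, 12, 18, 24, 36, 54, 72, 96, 120, 150, 192]


def fit_monoexponential(delays, intensities):
    """Fit I(t) = I0 * exp(-t / T) by least squares on ln(I).

    Returns (I0, T). All intensities must be positive.
    """
    n = len(delays)
    ys = [math.log(v) for v in intensities]
    mean_t = sum(delays) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in delays)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(delays, ys))
    slope = sxy / sxx                    # slope equals -1 / T
    intercept = mean_y - slope * mean_t  # intercept equals ln(I0)
    return math.exp(intercept), -1.0 / slope


if __name__ == "__main__":
    # Synthetic peak intensities for a hypothetical time constant of 80 ms.
    synthetic = [1000.0 * math.exp(-t / 80.0) for t in DELAYS_MS]
    i0, t_fit = fit_monoexponential(DELAYS_MS, synthetic)
    print(f"I0 = {i0:.1f}, T = {t_fit:.1f} ms")
```

With real, noisy peak intensities a weighted or nonlinear fit would usually be preferable, since the logarithm distorts the error distribution at long delays.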
The spectra were processed on a Silicon Graphics workstation using the TRITON NMR software package developed at the Bijvoet Center, University of Utrecht. The two-dimensional spectra were processed using a π/2-shifted sine-bell window for t₁ and a π/3-shifted squared sine-bell window for t₂. The t₁ data of the two-dimensional spectra were zero-filled to 1024 points. The three-dimensional spectra were processed using a π/2.5-shifted sine-bell window for t₁, a π/2-shifted sine-bell window for t₂, and a π/2.5-shifted squared sine-bell window for t₃. The t₁ and t₂ data of the three-dimensional spectra were zero-filled to 256 and 128 points, respectively. Fourth-order polynomial base-line corrections were applied in each frequency domain (44). The ¹H chemical shift values were calibrated using the H₂O resonance with a chemical shift of 4.81 ppm relative to 3-(trimethylsilyl)propionate at 293 K; the ¹⁵N chemical shift values were referenced to the ¹⁵NH₄Cl signal at 22.3 ppm at 293 K. The spectra were analyzed using the program ALISON developed at the Bijvoet Center, University of Utrecht (45).
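To make the apodization and zero-filling steps concrete, the Python sketch below applies a shifted sine-bell window to a time-domain trace and pads it with zeros. The window definition w[k] = sin(φ + (π - φ)·k/(N - 1)) is one common convention and is an assumption here; the exact form implemented in TRITON is not stated in the text. Function names and the constant test signal are illustrative.

```python
import math


def sine_bell(n, shift, squared=False):
    """Shifted sine-bell window of length n.

    Assumed definition: w[k] = sin(shift + (pi - shift) * k / (n - 1)),
    so the window starts at sin(shift) and falls to zero at the last point.
    With squared=True every value is squared.
    """
    power = 2 if squared else 1
    return [math.sin(shift + (math.pi - shift) * k / (n - 1)) ** power
            for k in range(n)]


def apodize_and_zero_fill(fid, shift, target_len, squared=False):
    """Multiply a time-domain signal by the window, then pad with zeros."""
    window = sine_bell(len(fid), shift, squared)
    weighted = [x * w for x, w in zip(fid, window)]
    return weighted + [0.0] * (target_len - len(weighted))


if __name__ == "__main__":
    # 512 t1 increments zero-filled to 1024 points, as in the text.
    # The constant trace is a stand-in for real data.
    t1_trace = [1.0] * 512
    out = apodize_and_zero_fill(t1_trace, math.pi / 2, 1024)
    print(len(out), round(out[0], 3), round(out[511], 3))
```

A π/2 shift turns the sine bell into a cosine-like decay that leaves the first point unattenuated, which is why it is often chosen for indirect dimensions with few increments.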
For the generation and analysis of Sox-4 HMG box structures InsightII version 2.2.0β (Biosym Technologies Inc., San Diego, CA) was used. For distance geometry calculations we used the program DGII (46). Triangle smoothing for sequential pairs of residues with a wobble of 10° for the peptide bond planarity was used in generating the distance bounds matrix. The structures were embedded by prospective metrization in four dimensions using a uniform probability distribution for selecting trial distances. The fit of the embedded structures was improved by a weighted least-square fit of the distances in the newly embedded coordinates to the distances in the trial distance matrix using 10 Guttman transformations with constant distance weights. For optimization the structures were submitted to 10,000 iterations of simulated annealing using an initial energy of 2500 kcal/mol, a maximum temperature of 200 K, a time step of 0.2 ps, and atomic masses of 1 kDa. Finally, the structures were submitted to 2500 iterations of conjugate gradient energy minimization.
The structures were refined further by restrained energy minimization and molecular dynamics using Discover version 2.8 (Biosym Technologies Inc., San Diego, CA). The protocol consisted of an energy minimization phase using 500 iterations of steepest descent and 3000 iterations of conjugate gradient minimization, followed by molecular dynamics at 311 K of 10,000 iterations of 0.5 fs and a final energy minimization of 500 iterations of steepest descent and 2500 iterations of conjugate gradient minimization. The consistent valence force field was used without cross-correlation terms and Morse potentials. In the calculations, the weighting factors of all physical terms were set to 1, and a distance restraint force constant of 300 kcal·mol⁻¹·Å⁻² with a maximum force of 2000 kcal·mol⁻¹·Å⁻² was used. The peptide bonds were forced to trans with a force constant of 60 kcal·mol⁻¹·rad⁻².
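The triangle smoothing step mentioned above can be illustrated with a generic bound-smoothing routine. The Python sketch below is not the DGII implementation; it is a minimal, textbook-style pass that tightens upper bounds with u(i,j) ≤ u(i,k) + u(k,j) and lower bounds with l(i,j) ≥ l(i,k) - u(k,j). The function name and the toy bounds are assumptions for illustration.

```python
def triangle_smooth(lower, upper):
    """Tighten distance bounds with the triangle inequality.

    lower, upper: symmetric n x n lists of lists (distances in Å).
    Returns new (lower, upper) matrices; the inputs are not modified.
    One Floyd-Warshall-style pass each, O(n^3).
    """
    n = len(upper)
    lo = [row[:] for row in lower]
    up = [row[:] for row in upper]
    # Upper bounds: u[i][j] <= u[i][k] + u[k][j].
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if up[i][j] > up[i][k] + up[k][j]:
                    up[i][j] = up[i][k] + up[k][j]
    # Lower bounds: l[i][j] >= l[i][k] - u[k][j], and the mirrored case.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                cand = max(lo[i][k] - up[k][j], lo[k][j] - up[i][k])
                if lo[i][j] < cand:
                    lo[i][j] = cand
    return lo, up


if __name__ == "__main__":
    # Toy example with three atoms: 0-1 and 1-2 are restrained,
    # 0-2 starts with a very loose upper bound of 100 Å.
    lower = [[0.0, 1.8, 1.8], [1.8, 0.0, 1.8], [1.8, 1.8, 0.0]]
    upper = [[0.0, 2.75, 100.0], [2.75, 0.0, 2.75], [100.0, 2.75, 0.0]]
    lo, up = triangle_smooth(lower, upper)
    print("smoothed upper bound 0-2:", up[0][2])
```

In the toy example the loose 0-2 upper bound is pulled down to the sum of the two restrained distances, which is the effect bound smoothing has on atom pairs that carry no direct restraint.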
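The step counts of this refinement protocol can be tallied in a few lines. The Python sketch below simply encodes the numbers quoted above in a small data class (the class and field names are hypothetical, not Discover input syntax) and derives the totals that are quoted again under Tertiary Structure: 3500 and 3000 minimization iterations and 5 ps of dynamics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RefinementProtocol:
    """Settings quoted in the text; names are illustrative only."""
    em1_steepest_descent: int = 500
    em1_conjugate_gradient: int = 3000
    md_steps: int = 10_000
    md_timestep_fs: float = 0.5
    md_temperature_k: float = 311.0
    em2_steepest_descent: int = 500
    em2_conjugate_gradient: int = 2500

    def first_minimization_steps(self) -> int:
        return self.em1_steepest_descent + self.em1_conjugate_gradient

    def md_length_ps(self) -> float:
        # 1 ps = 1000 fs
        return self.md_steps * self.md_timestep_fs / 1000.0

    def final_minimization_steps(self) -> int:
        return self.em2_steepest_descent + self.em2_conjugate_gradient


if __name__ == "__main__":
    p = RefinementProtocol()
    print(p.first_minimization_steps(), "iterations,",
          p.md_length_ps(), "ps at", p.md_temperature_k, "K,",
          p.final_minimization_steps(), "iterations")
```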
The stereochemical quality of the structures was checked with the program PROCHECK (47).

RESULTS
Expression, Purification, and Characterization of the HMG Box of Sox-4
The HMG box of murine Sox-4 (amino acids 59-135) (15) was produced in a T7-based expression system (32,33). For this purpose a pET-3/Sox-4 HMG box plasmid was constructed. The identity of the inserted Sox-4 HMG box fragment was confirmed by DNA sequencing. This recombinant plasmid was transformed into E. coli BL21(DE3), where the Sox-4 HMG box was overexpressed after isopropyl-1-thio-β-D-galactopyranoside induction (Fig. 1A). The overexpressed HMG box peptide was purified to homogeneity in a single-step cation exchange chromatographic run. A typical elution profile is presented in Fig. 1B. The procedure yielded 1-2 mg of Sox-4 HMG box protein/liter of bacterial culture with a purity greater than 95% as judged from a silver-stained SDS-polyacrylamide gel. The identity of the first 10 amino acids of the isolated peptide was confirmed by protein sequencing. The DNA binding activity of the protein was established in a gel retardation assay (Fig. 1C). Fig. 2 shows the CD spectrum of the HMG box peptide of Sox-4. Deconvolution of the spectrum predicted a secondary structure with 54% α-helix, 11% β-sheet, and 35% random coil. A similarly high α-helical content was observed for the HMG boxes of TCF-1 (17), HMG1B (26,27), HMG-D (28), and Sox-5 (29).

NMR Measurements
Assignment-Unlabeled as well as uniformly ¹⁵N-labeled Sox-4 HMG box samples were used for NMR spectroscopy. The NMR data were collected at pH 6.5 and at temperatures of 293 and 298 K. Conditions with a pH lower than 6.5 resulted in precipitation of the protein, while at temperatures above 298 K the protein starts to unfold. The predominantly α-helical nature of the protein results in a limited chemical shift dispersion. This causes a severe overlap in the amide and fingerprint regions of the NOESY and TOCSY spectra and makes it difficult to assign these spectra completely. However, the ¹⁵N signals of the various residues are well separated in the two-dimensional ¹⁵N-¹H HSQC experiment (Fig. 3). Therefore, we collected three-dimensional ¹⁵N-¹H NOESY-HSQC and three-dimensional ¹⁵N-¹H TOCSY-HSQC data at 293 and 298 K. The majority of sequential assignments of the amino acid spin systems were found by comparison of the amide region of the three-dimensional ¹⁵N-¹H NOESY-HSQC and three-dimensional ¹⁵N-¹H TOCSY-HSQC spectra. In some cases additional information from NOESY, TOCSY, and/or two-dimensional ¹⁵N-¹H HSQC spectra was helpful. First, stretches of spin systems with sequential ¹⁵NH-¹⁵NH and CαH-¹⁵NH NOE contacts were identified in the three-dimensional ¹⁵N-¹H NOESY-HSQC spectrum (Fig. 4). Assignment of these spin systems to specific residues was done by comparison of the amino acid side-chain resonances in the three-dimensional ¹⁵N-¹H NOESY-HSQC, three-dimensional ¹⁵N-¹H TOCSY-HSQC, NOESY, and TOCSY spectra. Two-dimensional spectra were especially helpful for the assignment of the side chains of the aromatic residues. Using this strategy we were able to assign more than 80% of the backbone resonances of the HMG box. The ¹⁵N and ¹H chemical shift data are deposited at the BioMagResBank (University of Wisconsin, Madison) (Table I).
Although we collected data sets at two temperatures (293 and 298 K), some residues at the N and C termini could not be identified, due to flexibility and/or overlap. Also, the assignments of Asn28, Ala29, Lys47, Ile48, and Pro49, which are located in the two loop regions, could not be established.
The observation of stretches of strong dNN(i, i+1) and weak dαN(i, i+1) connectivities in combination with dαN(i, i+3), dαN(i, i+4), and dαβ(i, i+3) contacts (48) in the three-dimensional ¹⁵N-¹H NOESY-HSQC and NOESY spectra provided evidence for the existence of three α-helical regions in the Sox-4 HMG box. The α-helices are formed by residues Val10-Gln22, Glu30-Leu41, and Phe50-Tyr65 (Fig. 5). Based on these NMR data an α-helical content of 53% was calculated for the Sox-4 HMG box. This is consistent with the analysis of the CD spectrum of the Sox-4 HMG box (Fig. 2), which revealed an α-helical content of 54% (see "Circular Dichroism").
NH protons in fast and intermediate exchange with water were identified from the difference between NH sensitivity-enhanced ¹⁵N-HSQC experiments recorded with and without presaturation. Fast exchanging NH protons are mainly found outside the helical regions, with the exception of helix III, which also contains a number of fast and intermediate exchanging NH protons (Fig. 5). A broadly similar distribution of mobile backbone NH protons was observed in a heteronuclear NOE experiment (Figs. 5 and 6A). These observations are in accordance with a less rigid and more exposed character of helix III. The least stable region of helix III is Glu55-Arg56-Leu57-Arg58-Leu59, as is indicated by a patch of fast exchanging and mobile NH backbone protons. However, helix III is not flexible, as indicated by the T₁ relaxation times (Fig. 6B). The different time scales of NH exchange, ¹H-¹⁵N NOE, and NH T₁ relaxation explain the seemingly contradictory results. Possible salt bridges between Arg17 and Glu21 in helix I, between Arg34 and Glu30 in helix II, and between Lys60 and Asp64 in helix III might contribute to helix stabilization (49). Loop regions are located between Ser23 and Ala29 and between Leu42 and Pro49 (Fig. 5). The N-terminal residues Asn6-Met9 as well as the C-terminal amino acids between Pro66 and Pro72 have an extended conformation, as was indicated by the observation of strong sequential dαN and weak dNN contacts and the absence of most medium range NOE contacts (48). Turns involving 4 residues are characterized by a strong dNN(3,4) (2,4) contact in this sequence, since the CαH proton of Pro24 was not assigned. It is noted that this sequence has a type I turn structure in the final model (see later). The unassigned residues Arg3-Asn6 and Arg73-Lys77 at the N and C termini are most probably flexible and unstructured.
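The 53% figure can be reproduced from the helix boundaries given above. The Python sketch below counts the helical residues and divides by the length of the peptide; the total of 77 residues is an assumption inferred from the HMG box spanning amino acids 59-135 and from the residue numbering, which runs to Lys77.

```python
# Helix boundaries (inclusive residue numbers) as given in the text.
HELICES = {"I": (10, 22), "II": (30, 41), "III": (50, 65)}

# Assumption: the construct comprises 77 residues (amino acids 59-135
# of Sox-4, numbered 1-77 in the peptide).
N_RESIDUES = 77


def helical_fraction(helices, n_residues):
    """Fraction of residues that lie inside any of the listed helices."""
    helical = sum(end - start + 1 for start, end in helices.values())
    return helical / n_residues


if __name__ == "__main__":
    frac = helical_fraction(HELICES, N_RESIDUES)
    print(f"alpha-helical content: {100 * frac:.0f}%")
```

The three helices contain 13, 12, and 16 residues, so 41 of 77 residues are helical under this assumption.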
Tertiary Structure-Long distance NOEs were only observed between a limited number of residues, which are located in the hydrophobic core of the HMG box peptide. Ala7, Phe8, Met9, Val10, and Trp11 are located at the N-terminal end of helix I and contact Trp39 and Leu41 at the C-terminal end of helix II. Amino acids Phe50, Glu53, and Ala54, located in the N-terminal end of helix III, show NOEs with Val10 and Trp11 of helix I. All interresidue NOE cross peaks were classified according to their intensities as strong, medium, or weak. The corresponding distance restraints were 1.8-2.75 Å (strong), 1.8-3.75 Å (medium), and 1.8-5.25 Å (weak). The three-dimensional structure was calculated using these experimental restraints in a distance geometry (DG) calculation followed by restrained molecular dynamics and energy minimization calculations. The distribution of the NOE distance restraints against the residue number is shown in Fig. 7. In total 50 DG structures were generated. The 14 structures with the highest values of the DG error function were discarded. The 36 remaining structures were submitted to a three-phase protocol consisting of an energy minimization run (3500 iterations), molecular dynamics (5 ps, 311 K), and a final energy minimization step (3000 iterations). From the resulting structures those with the lowest energy (<3000 kcal/mol) and with ≤6 distance violations of ≤0.1 Å were selected. The stereochemical quality of the structures was evaluated with the program PROCHECK (47). Those structures with D-amino acids and/or cis peptide bonds were also discarded. A final set of 15 structures is presented in Fig. 8. The overall structure of the Sox-4 HMG box is L-shaped (Fig. 8). A similar pattern is observed when the RMSD value of the Cα backbone atoms is plotted against the residue number (Fig. 9). In accordance with the T₁ relaxation times (Fig. 6B), these data indicate that in these computations helix III forms a helical element whose position varies relative to helices I and II. This variation is caused by the absence of long range NOE contacts between helix III and the other parts of the Sox-4 HMG box. Residues Ala7, Phe8, Met9, Val10, Trp11, Trp39, Leu41, and Phe50 form a hydrophobic core and stabilize the structure of the Sox-4 HMG box. Note that these residues, with the exception of Leu41, are conserved within the HMG box family (2).

DISCUSSION
Here, the NMR solution structure of the sequence-specific HMG box of Sox-4 is presented. The overall L-shaped structure compares well with that reported for the non-sequence-specific HMG boxes of HMG1B (26,27) and HMG-D (28), which recognize structural features of DNA (25). As in HMG1B and HMG-D, three α-helical regions dominate the HMG box structure of Sox-4. The sequential positions of helices I and II coincide with the corresponding helices in HMG1B (26,27) and HMG-D (28). Helix III is positioned between prolines 49 and 66 and is 4 residues shorter than helix III of HMG1B (26,27) and HMG-D (28). Apparently, this results from the helix-breaking Pro66, which is unique to the sequence-specific HMG boxes (2) but is replaced by a structurally neutral alanine in HMG1B (26,27) and by lysine in HMG-D (28). Helices I and II are followed by loops that start with type I turns (Ser23-Met26 after helix I and Leu42-Ser45 after helix II). The presence of such turns was not reported for HMG1B (26,27) and HMG-D (28).
The overall HMG box fold is stabilized by a hydrophobic core involving residues Ala7, Phe8, Val10, Trp11, Trp39, Leu41, and Phe50. With the exception of Leu41, these residues are conserved within the HMG box family, irrespective of their binding specificity (2). The structure of this hydrophobic core should be considered as the HMG box "signature." The mechanism of binding to DNA is fundamentally different for the two types of HMG boxes. The non-sequence-specific HMG1B box binds to preexisting structures (25), such as cruciform DNA (23,24) and DNA bent by the cis-platinum -GG-adduct (21,22). The binding of the HMG box proteins to cruciform DNA has not been reported to induce conformational changes in the DNA. Therefore, it is likely that the rigid HMG1B-type box fits directly onto these unusual DNA structures. This is in contrast with the sequence-specific HMG box proteins, which alter the DNA conformation significantly. The binding of a monomeric sequence-specific HMG box to the minor groove of a straight DNA helix (13-16) introduces a sharp bend (on the order of 90°) in the DNA helix as determined in circular permutation assays (14, 16-18). This is supported by the dispersion of the ³¹P resonances in the SRY-DNA complex (31).
Exchange of the N- and C-terminal regions of the sequence-specific HMG box of hLEF-1 with those of non-sequence-specific HMG1B showed that the sequence specificity of hLEF-1 is maintained by the N- and C-terminal residues (51). Mutation of (Fig. 9). Gly36 is located in helix II, and Lys47 is positioned in the loop region between helices II and III (Fig. 9). Mutations in other parts of the HMG box such as F109S (Phe50) in SRY (55) and V316L (Met20) and Y346S (Phe50) in LEF-1 (56) do not influence the binding properties. However, they can still disrupt the biological function of the protein, as is demonstrated by the presence of the F109S (Phe50) mutation in SRY of a sex-reversed XY female (55). Of special interest is mutation M64I (Met5) in SRY, which shows an almost normal DNA-binding affinity but decreases the DNA bending by 20° (55).
The side chain of Ile68 (Met9) in the N-terminal region of the sequence-specific HMG box of SRY intercalates partially from the minor groove side between the two central AT base pairs of its d(AACAATCA)·d(TGATTGTT) heptamer motif (30,31). Note that in murine SRY and Sox-4 this interacting Ile is replaced by Met. With this information a model for the SRY-DNA complex was constructed (31). Here, the concave surface of the HMG box of SRY, whose structure was based on the NMR solution structure of the non-sequence-specific HMG1B box, faces the bent DNA with helix I, which is docked in a widened minor groove.
The effect of the mutations located in the N-terminal region