J. Biol. Chem., Vol. 277, Issue 45, 43463-43473, November 8, 2002
Received for publication, May 10, 2002, and in revised form, August 22, 2002
Sex-specific gene expression in Drosophila
melanogaster is regulated in part by the Doublesex (DSX)
transcription factor. Male- and female-specific splicing isoforms share
a novel DNA-binding domain, designated the DM motif. This domain is
conserved among a newly recognized family of vertebrate transcription
factors involved in developmental patterning and sex determination. The DM motif consists of an N-terminal zinc module and a disordered C-terminal tail, hypothesized to fold on specific DNA binding as a
recognition α-helix.
Sex determination in metazoans is regulated by diverse pathways
(1). Formation of the mammalian testis and subsequent male development
are initiated by Sry (2-5), a single-copy gene on the Y
chromosome that encodes a high mobility group
(HMG)1 box (6, 7). Distinct
genetic mechanisms operate in Drosophila melanogaster
(8-14) and Caenorhabditis elegans (15, 16) wherein sex is
determined by the X:autosome ratio, a process linked to X-dosage
compensation (1, 17). Although the pathways of the fly and worm (Fig.
1A) seem otherwise unrelated,
a cysteine-rich DNA-binding domain (the DM motif; Fig. 1B)
is conserved within downstream transcription factors Doublesex (DSX)
and MAB-3, respectively (18). This domain exhibits a distinctive
pattern of cysteines and histidines (boxed in Fig.
2A) (18-20). These residues
participate in two intertwined Zn2+-binding sites (Fig.
1B) (21). The DM motif defines a newly recognized family of
metazoan transcription factors (22).
DM genes are conserved among vertebrates (23). A subset (designated
Dmrt1, -2, -3, etc.; see Ref. 23) is
expressed in the differentiating gonad and is associated with human sex
reversal (46, XY gonadal dysgenesis; see Ref. 24). Additional DM
homologs of unknown function are encoded in the C. elegans
genome (middle group in Fig. 2A) (25). Although
the biochemical properties of these DM-related proteins have not been
characterized, the DM domains of DSX and MAB-3 exhibit specific
zinc-dependent DNA binding (19, 20). Mutations in these
domains are associated with intersex development (arrowheads
in Fig. 2A) (26-28). Phenotypes correlate with loss of DNA
binding activity (18, 19). DSX mutations in the invariant cysteines or
histidines impair both zinc coordination and DNA recognition (19).
Point mutations in the DM domains of human DMRT1 or
DMRT2 (linked genes on chromosome 9p) are apparently rare in
patients with unexplained sex reversal or intersex phenotypes,
presumably due to functional redundancy (24). Deletions spanning
DMRT1 and DMRT2 are nonetheless associated with
46, XY gonadal dysgenesis, giving rise to intersex phenotypes with high
risk of gonadoblastoma (the 9p syndrome; see Refs. 29-36). Selective
deletion of the murine Dmrt1 gene by homologous
recombination causes XY gonadal dysgenesis and azoospermia without sex
reversal (37). There is no apparent phenotype in the female mouse.
The DSX DM domain contains an ordered moiety and disordered tail (21).
In 1H NMR studies the ordered moiety (spanning DSX residues
40-80, including invariant cysteines and histidines; Fig.
1B) exhibits marked dispersion of chemical shifts, whereas
the tail (residues 81-105) is manifest by an envelope of poorly
resolved spin systems at near-random-coil chemical shifts. The zinc
module consists of tetrahedral CCHC and HCCC metal-binding sites; the
two sites are contiguous within a single hydrophobic core. The tail is
hypothesized to function as a nascent recognition α-helix (21).
This study investigates structure-function relationships in the DSX DM
domain. Deletion of the tail is shown not to alter the folding of the
zinc module but precludes formation of a specific dimeric complex. DNA
binding assays employ the dsxA control element and
corresponding DNA half-site. Selected residues important for specific
DNA binding are identified by site-directed mutagenesis. Two
mutagenesis strategies are employed. The first approach focuses on the
disordered C-terminal tail wherein charged or polar side chains
(arginine or glutamine) are substituted by alanine; site-directed
mutagenesis is effected in Escherichia coli. The second
approach employs chemical protein synthesis by native ligation (43);
arginine or lysine residues in the zinc module are substituted by
norleucine. This strategy is meant to distinguish between the possible
structural role of the aliphatic side chain (retained in norleucine)
and the functional role of the positive charge.
Chemical Protein Synthesis--
Wild-type and norleucine analogs
were prepared by solid-phase peptide synthesis and native fragment
ligation (Fig. 3A) (43). The
N-terminal peptide consisted of DSX residues 35-67; the C-terminal peptide consisted of residues 68-105 (DM domain) or 68-86
(truncated DM fragment).
Bacterial Expression--
The DSX cDNA coding region (female
isoform) was kindly provided by P. Wensink (Brandeis University). A DNA
segment encoding the wild-type DSX DM domain (residues 35-105) was
recloned by polymerase chain reaction into an overexpression plasmid in
which a tac promoter (inducible by
isopropyl-β-thio-galactopyranoside) drives expression of the fusion protein.
Protein Purification--
Recombinant proteins were
purified from E. coli lysates by Co2+ affinity
chromatography (Clontech, Inc.) using an imidazole elution gradient.
Following thrombin digestion using bead-immobilized enzyme, cleaved DM
domains were purified by reverse phase-HPLC (Vydac C4 column, 1 × 25 cm). The yield in each case was ~2 mg/liter of fermentation.
Molecular masses of purified proteins were verified by mass
spectrometry to exclude proteolytic degradation or inadvertent mutation. None of the alanine substitutions significantly altered the
solubility or chromatographic properties of the variant proteins.
DNA Binding Assays--
The sequence of the DNA site (29 bp) was
5'-GTGCACAACTACAATGTTGCAATCAGCGG-3' and complement
(3'-CACGTGTTGATGTTACAACGTTAGTCGCC-5'); this duplex contains the dsxA site. The sequence of a consensus
half-site (10 bp) was 5'-AGCTACATTG-3' and
complement (19, 20). A nonspecific DNA control
site (17 bp) was provided by the λ phage operator site OL1.
DNA Binding Cooperativity--
A formalism is obtained from the
method of Senear and Brenowitz (46) based on classical thermodynamic
models (47, 48). In brief, the monomeric DM domain binds to the
pseudo-palindromic half-sites of the dsxA DNA target
with intrinsic association constants k1 and
k2, respectively, and cooperativity parameter
k12. The concentration of labeled DNA probe is
much less than the range of protein concentrations tested. At a given
protein concentration [P] the fraction of DNA sites that are unbound
(F0), singly occupied
(F1; labeled C1 in the present GMSA studies), or
doubly occupied (F2; labeled C2) is given by
Equations 1-3 (46),
Circular Dichroism--
Zinc-dependent protein
folding of the DM variants was monitored spectroscopically.
Nuclear Magnetic Resonance Spectroscopy--
For NMR
spectroscopy, solutions were purged with N2 and contained 2 mM deuterated dithiothreitol (Cambridge Isotopes, Inc., Woburn, MA). Spectra were recorded in 50 mM deuterated
Tris-HCl (pH 6.5, pD 6.1, or pD 8.0; see Fig. 7 legend) and 5 mM deuterated dithiothreitol at a Zn2+:protein
ratio of 2.5:1 in 90% H2O and 10% D2O at
25 °C.
The DSX DM domain contains an ordered moiety (the zinc module) and
disordered C-terminal tail, proposed to fold on specific DNA binding as
a DNA recognition α-helix (21).
Sex-specific Gene Regulation
THE DOUBLESEX DM MOTIF IS A BIPARTITE DNA-BINDING DOMAIN*
From the Department of Biochemistry, Case Western
Reserve University School of Medicine, Cleveland, Ohio 44106 and
Gryphon Sciences, Inc.,
South San Francisco, California 94080
ABSTRACT
Truncation of the tail does not perturb the
structure of the zinc module but impairs DNA binding and
DNA-dependent dimerization. Chemical protein synthesis and
alanine scanning mutagenesis are employed to test the contributions of
13 side chains to specific DNA binding. Selected arginine or lysine
residues in the zinc module were substituted by norleucine, an
isostere that maintains the aliphatic portion of the side chain
but lacks a positive charge. Arginine or glutamine residues in the tail were substituted by alanine. Evidence is obtained that both the zinc
module and C-terminal tail contribute to a bipartite DNA-binding surface. Conserved arginine and glutamine residues in the tail are
required for high affinity DNA recognition, consistent with its
proposed role as a nascent recognition α-helix.
INTRODUCTION
Fig. 1.
Overview of DM motif. A, sex
determination pathways of the fruit fly (a) and the nematode
C. elegans (b). For clarity, other target genes
of tra factors in respective branched pathways are not
shown. For review see Ref. 1. B, ribbon model (stereo pair)
of DSX zinc module (residues 40-80) based on NMR studies (21). Helical
segments consist of residues 46-50 and 71-79. Cys and His side chains
are shown; two zinc ions are shown as spheres. Dashed
line at the C terminus indicates the beginning of disordered tail.
Coordinates have been deposited in the Protein Data Bank (accession
code 1LPV).

Fig. 2.
DM sequences and DNA binding
studies. A, alignment of metazoan DM sequence motifs.
Cysteines and histidines that coordinate Zn2+ are aligned
as two intertwined binding sites (boxes), site I and site II
(see Fig. 1B). Sites of substitutions employed in the
present study are shown in red (impaired DNA binding),
green (unimpaired DNA binding), or gold
(Arg-46; slightly decreased DNA binding with confounding structural
perturbation). Stable α-helical elements are highlighted by
magenta ribbons above the DSX sequence. Nascent C-terminal α-helix
is indicated by a black dashed extension.
Arrowheads without parentheses indicate
sites of point mutations in dsx or mab-3
associated with intersex development; substitutions in
parentheses indicate variants characterized only by
biochemical assays (19). DSX, DM domain in D. melanogaster (GenBank accession number M25292).
MAB-3a and MAB-3b, the first and second DM
domains in C. elegans protein (Z99278). Other
C. elegans DM sequences: F10C1.5,
cosmid F10C1 (U49831); C34D1.1, cosmid C34D1 (Z78060);
K08B12, cosmid K08B12 (U97001); F13G11, cosmid
F13G11 (Z83317); T22H9.4, cosmid T22H9 (AAC69225);
CAA93739 (Z69883); and CAA21612.1 (AL032637.1).
Vertebrate DM sequences: hDMRT1 and hDMRT2, human
homologs on short arm of chromosome 9 (DMRT1h AF130728; DMRT2h
AF130729). Other homologs: mDmrt1 (murine, AF202778),
pDmrt1 (porcine, AF216651), and cDmrt1 (chicken,
AF123456). TERRA, DM domain of zebrafish terra
(AF080622). Sequences shown include genes (such as TERRA; see Ref. 70)
not known to be involved in sex determination. B, deletion
of C-terminal tail impairs DNA binding and cooperativity. Lane
1, intact DM domain at 16 nM concentration in low
ionic strength assay buffer. Lane 2, free fbe
probe (29 bp). Lanes 3-12, truncated DM fragment at successive
concentrations of
14, 28, 42, 56, 92, 130, 180, 280, 370, and 960 nM.
C, alanine scanning mutagenesis of Arg and Gln residues in
C-terminal tail of DSX DM domain. Wild-type control is provided in
lane b. Alanine substitutions at residues 79, 80, 81, 90, 91, and 93 impair specific DNA binding to dsxA. By
contrast, alanine substitutions at residues 74, 95, 98, and 99 do not
impair specific DNA binding. Composite GMSA gel showing specific
fbe gel shift at DSX domain concentration 140 nM. Percentages of DNA probe shifted to the C1 and C2 forms
were as follows: wild type (wt) (C1 4% and C2 38%); R74A
(3 and 27%); R79A (non-detectable and 10%); Q80A (2 and 3%); R81A (2 and 11%); R90A (< 1 and 4%); R91A (1 and 8%); Q93A (1 and 5%);
Q95A (7 and 60%); Q98A (2 and 55%); and R99A (6 and 30%). Apparent
binding in C is weaker than in prior reports (11, 19, 21)
due to higher KCl and poly(dI-dC) concentrations.
Mutations in either zinc-binding site or tail (R91Q; see Ref. 19) can lead to an intersex phenotype. The DSX DM domain, itself monomeric, binds as a dimer to a specific target site, designated
dsxA, within the fat body enhancer (38, 39), a well
characterized genetic response element with sex-specific regulation
(40, 41). Critical base pairs in a consensus target site (as defined by random binding site selection) are palindromic about a central AT bp
(20, 28). Studies of DNA analogs suggest that the motif binds in the
DNA minor groove. Despite such groove targeting, DSX-induced DNA
bending is negligible as inferred from permutation gel electrophoresis
(21). Absence of sharp DNA bending stands in contrast to the marked
electrophoretic anomalies induced by binding of HMG boxes (including
SRY; see Refs. 7 and 42). The DNA-binding properties of DSX are
proposed to underlie its function in combinatorial gene regulation (21,
38, 39). To our knowledge, the structure of a DM-DNA complex has not
been determined.
(The positive charge resides in the ε-amino group of
lysine or guanidinium group of arginine and is absent in norleucine.) Of the
13 substitutions tested, 8 significantly impair specific DNA binding
and 5 do not. Together, evidence is obtained that protein-DNA contacts
are made by side chains in both the zinc module and tail to define a
bipartite DNA-binding motif. Multiple conserved arginine and glutamine
residues in the tail are required for specific DNA binding, consistent with its proposed role as a nascent DNA-recognition α-helix.
EXPERIMENTAL PROCEDURES
Both ligation schemes are outlined in schematic form in Fig.
3A. Ligation products were purified by reverse phase-HPLC
(Fig. 3, B and C). Peptide fragment synthesis
employed an Applied Biosystems 430A synthesizer. t-Boc amino
acids were used with the following protection: Arg(tosyl), Asn(xanthyl), Asp(OcHxl), Cys(4MeBzl), Glu(OcHxl), His(DNP), Lys(2ClZ), Ser(Bzl), Thr(Bzl) and Tyr(BrZ). N-terminal peptides (DSX residues 35-67 and variants) were synthesized on a C-terminal
thioester-generating resin. C-terminal peptides (DSX residues 68-105
and 68-86) were respectively synthesized on a
Boc-Glu(OcHxl)-OCH2-Pam resin and Boc-Gln-OCH2-Pam resin. Full-length polypeptides were
prepared by native chemical ligation (43). Electrospray mass spectra of
domains with and without addition of 2 eq of Zn2+ (Fig.
4) yielded values differing by 125-126
daltons, consistent with twice the atomic mass of zinc (130.76 Da)
minus six, the number of cysteines from which sulfhydryl protons are
removed on metal binding. The following four DM analogs were prepared: R46Z, K57Z, K60Z (using variant N-terminal fragments; where Z designates norleucine), and R91Q (using a variant C-terminal fragment). A 19-residue C-terminal tail peptide (DSX residues 87-105; sequence TALRRAQAQDEQRALHMHE) was also synthesized. The peptide contained an
N-terminal acetyl group; its observed molecular mass (2303 ± 0.6 Da) is in accord with its calculated mass (2303.6 Da). The residue
numbers refer to native DSX sequence; DSX residue 35 thus corresponds
to residue 1 of DM consensus sequence (Fig. 2A). Use of
native DSX numbering facilitates correspondence with prior genetic
analysis (19).
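The agreement between the observed and calculated mass of the acetylated tail peptide can be checked with a short calculation. The following is a minimal sketch, assuming standard average residue masses and a free C-terminal acid; the function and constant names are illustrative and do not come from the original methods.

```python
# Average residue masses (Da) for the amino acids present in the tail peptide.
# Values are standard average masses of residues within a peptide chain.
RESIDUE_MASS = {
    "A": 71.079, "D": 115.089, "E": 129.116, "H": 137.141, "L": 113.160,
    "M": 131.193, "Q": 128.131, "R": 156.188, "T": 101.105,
}
WATER = 18.015       # H2O accounting for the two chain termini
ACETYL_CAP = 42.037  # net mass added by N-terminal acetylation (C2H2O)


def acetylated_peptide_mass(sequence: str) -> float:
    """Average mass of an N-acetylated peptide with a free C-terminal acid (assumed)."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER + ACETYL_CAP


TAIL_PEPTIDE = "TALRRAQAQDEQRALHMHE"  # DSX residues 87-105, as given in the text
tail_mass = acetylated_peptide_mass(TAIL_PEPTIDE)
print(len(TAIL_PEPTIDE), round(tail_mass, 2))
```

Under these assumptions the result falls within a few hundredths of a dalton of the calculated mass quoted in the text.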

Fig. 3.
Chemical protein synthesis by native
ligation. A, schematic overview of ligation scheme whereby
an N-terminal peptide (33 residues) is joined to native or truncated
C-terminal peptides (38 or 19 residues, respectively) to yield the intact DM
domain or the truncated DM
polypeptide. The cysteine participating in the
ligation reaction is highlighted (filled oval);
the remaining five cysteines are indicated by open
ovals. B, rp-HPLC chromatograms showing purity and
elution positions of DSX DM domain (a) or truncated DM
fragment (b). C, example of ligation reaction in
which purified N-terminal peptide thioester (a) is coupled
to purified C-terminal 19-mer (b) to yield ligated product
(c) (labeled P, asterisk) and a
mixture of free C- and N-terminal peptides (labeled C and
N in c). The ligation product is readily purified
by HPLC (asterisk in d).

Fig. 4.
Electrospray mass spectrometric analysis of
synthetic ligation products: A, intact DM domain;
B, truncated DM
fragment. Spectra were obtained
in the presence (red) or absence (black) of
Zn2+ ions at a stoichiometry of 2 zinc ions per
polypeptide. Mass differences (125 and 126 Da) are in accord with
atomic mass of two zinc ions (130.76 Da) minus six sulfhydryl protons
lost on metal binding. Upper panels contain raw data;
lower panels show inferred molecular masses.
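The expected mass shift on zinc binding follows from simple arithmetic: two zinc ions are gained and six cysteine sulfhydryl protons are lost. The sketch below assumes average atomic masses of 65.38 Da for zinc and 1.008 Da for hydrogen; the names are illustrative.

```python
ZN_AVG_MASS = 65.38  # average atomic mass of zinc in Da (assumed value)
H_AVG_MASS = 1.008   # average atomic mass of hydrogen in Da (assumed value)


def expected_zinc_shift(n_zinc: int, n_protons_lost: int) -> float:
    """Mass gained on metal binding: zinc added minus thiol protons displaced."""
    return n_zinc * ZN_AVG_MASS - n_protons_lost * H_AVG_MASS


# Two Zn2+ ions bound; six cysteines deprotonated (CCHC and HCCC sites).
zinc_shift = expected_zinc_shift(n_zinc=2, n_protons_lost=6)
print(round(zinc_shift, 2))
```

The computed shift lies just below 125 Da, in line with the observed differences of 125 and 126 Da given the precision of the measurement.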
The expressed product is an
N-terminal His6-tagged fusion protein. The fusion protein
contains an N-terminal staphylococcal nuclease domain followed by a
thrombin-sensitive linker and C-terminal DSX domain (44). The purified
DSX domain contains two non-native N-terminal amino acids
(Gly-Ser) derived from the thrombin site. This expression system is a
modification of a plasmid originally designed by Markley and co-workers
(45). Alanine scanning mutations were introduced into the DSX coding
region in phage M13mp19RF by oligonucleotide-directed mutagenesis using the
polymerase chain reaction and recloned into the above expression
plasmid. Fidelity of mutagenesis was verified in each case by DNA
sequencing. None of the mutations altered the efficiency of
overexpression in E. coli.
The sequence of the λ phage operator site
OL1 was 5'-TACCACTGGCGGTGATA-3' and complement.
Immediately prior to DNA binding studies, the proteins were reduced in
50 mM dithiothreitol in 50 mM Tris-HCl (pH
8.0), repurified by rp-HPLC as above, lyophilized, and reconstituted with a 20% excess of ZnCl2 in the assay buffer (below).
Specific DNA binding was monitored by gel mobility shift assay (GMSA;
10% acrylamide with 29:1 bisacrylamide) run in 45 mM Tris
borate (pH 8.0) without EDTA (omitted to avoid competitive chelation of
zinc ions) at 200 V and 4 °C. In studies of alanine scanning mutants (Fig. 2C) and norleucine analogs (Fig. 8), binding reactions
were conducted in 20 mM Tris-HCl (pH 7.4), 150 mM KCl, 5 mM MgCl2, 0.1 mM ZnCl2, 5% glycerol, 33 µg/ml bovine serum
albumin, and either 0.08 (Fig. 8) or 0.10 µg/µl (Fig.
2C) poly(dI-dC) competitor DNA. In studies of the truncated DM
fragment (Fig. 2B) and half-site binding (Fig. 6), binding reactions were conducted in a lower ionic strength buffer consisting of 10 mM Tris-HCl (pH 7.4), 50 mM KCl, 0.1 mM ZnCl2, 5% glycerol,
33 µg/ml bovine serum albumin, and 0.06 µg/µl poly(dI-dC) DNA (unlabeled). The concentration of
33P-labeled DNA was less than 1.5 nM. GMSA
studies of biosynthetic alanine variants included control lanes
containing the wild-type biosynthetic domain; GMSA studies of synthetic
variants included control lanes containing the wild-type synthetic
domain. Quantification of GMSA bands was obtained using the
PhosphorImager software package as described by the vendor (Amersham Biosciences).
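The duplex sequences quoted under DNA Binding Assays can be checked for internal consistency (strand complementarity and stated lengths). This is a minimal sketch; the variable and function names are illustrative.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Base-by-base complement, read in the same left-to-right order."""
    return seq.translate(COMPLEMENT)


# Sequences as quoted in the text.
TOP_STRAND_5_TO_3 = "GTGCACAACTACAATGTTGCAATCAGCGG"     # 29-bp dsxA probe
BOTTOM_STRAND_3_TO_5 = "CACGTGTTGATGTTACAACGTTAGTCGCC"  # quoted complement
HALF_SITE = "AGCTACATTG"                                # consensus half-site
OL1_CONTROL = "TACCACTGGCGGTGATA"                       # nonspecific control

strands_match = complement(TOP_STRAND_5_TO_3) == BOTTOM_STRAND_3_TO_5
lengths = (len(TOP_STRAND_5_TO_3), len(HALF_SITE), len(OL1_CONTROL))
print(strands_match, lengths)
```

The quoted bottom strand is the exact complement of the top strand, and the three lengths agree with the stated 29, 10, and 17 bp.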
(Eq. 1)
(Eq. 2)
The formalism may be simplified by the assumption that
k1 = k2 = k,
i.e. the two half-sites have similar intrinsic affinities. Under these conditions, the cooperativity parameter is given by Equations 4 and 5 (46),
(Eq. 3)
(Eq. 4)
where F1,max is the maximal value of the
singly-occupied complex observed in the GMSA titration. The low values
of F1,max observed in the present studies (see
"Results") indicate a high degree of positive cooperativity
unaffected by the norleucine substitutions. In this limiting case, the
fold change in intrinsic dimer-specific association constants
(Ka = k1k2k12 = k2k12) effected
by the substitution is essentially equal to the ratio of protein
concentrations where F2 = 0.50 or in the case of
the K60Z variant (for which 50% occupancy was not achieved) where
F2 = 0.25. Decrements in the apparent
protein-DNA association constant due to mutation can be ascribed
primarily to perturbation of DNA contacts only if the mutation does not
proportionately perturb the cooperativity factor.
(Eq. 5)
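For orientation, the standard two-site cooperative binding model that is consistent with the definitions above can be written out explicitly. The expressions below are a reconstruction from the stated definitions (k1, k2, k12, [P], F0, F1, F2, and F1,max) and should not be read as a transcription of Equations 1-5; the partition function Z is a symbol introduced here for convenience.

```latex
\begin{align}
Z &= 1 + (k_1 + k_2)[P] + k_1 k_2 k_{12}[P]^2 \\
F_0 &= \frac{1}{Z}, \qquad
F_1 = \frac{(k_1 + k_2)[P]}{Z}, \qquad
F_2 = \frac{k_1 k_2 k_{12}[P]^2}{Z} \\
\intertext{and, for $k_1 = k_2 = k$, maximizing $F_1$ with respect to $[P]$ gives}
[P]_{\max} &= \frac{1}{k\sqrt{k_{12}}}, \qquad
F_{1,\max} = \frac{1}{1 + \sqrt{k_{12}}}, \qquad
k_{12} = \left(\frac{1 - F_{1,\max}}{F_{1,\max}}\right)^{2}
\end{align}
```

In this form a small F1,max corresponds directly to a large cooperativity parameter, which is the limiting case invoked in the text.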
Folding of the truncated DM fragment and norleucine domains was
evaluated by circular dichroism (CD) and proton nuclear magnetic
resonance (NMR) at 600 MHz as described (21). CD spectra were obtained
using an Aviv spectropolarimeter equipped with temperature control. In
CD studies of half-site binding (Fig. 6), the DSX DM and DNA
concentrations were 10 µM in 50 mM Tris-HCl (pH 7.4) and 50 mM KCl. The light path length for the CD
cell was 1 mm. Protein concentrations were determined by UV absorbance and verified by quantitative amino acid analysis. In studies of protein-DNA interactions CD difference
spectra2 (designated 1 and 2) were calculated to probe induced helical structures; mean residue
ellipticities are relative to the protein component.
RESULTS
Binding of the DSX DM domain to
dsxA leads to discrete 1:1 and 2:1 complexes
(labeled C1 and C2 in lane 1 of Fig.
2B). The predominance of the 2:1 complex under conditions
containing appreciable free DNA demonstrates its cooperative assembly
(cooperativity factor k12 >1; see
"Experimental Procedures"). Previous mutagenesis studies have
focused on the importance of the conserved cysteine and histidine
residues involved in zinc coordination (19). Deletion of the tail
(yielding the truncated DM fragment; residues 35-86) leads to
>100-fold reduction in specific DNA binding; only a weak 1:1 band is
observed (labeled C1* in lanes 3-12 in Fig.
2B). Retention of a low affinity 1:1 complex indicates that the truncated fragment
retains a portion of the DNA-binding surface.
Addition of an isolated C-terminal peptide (residues 87-105; 19 residues) at concentrations up to 100 µM does not restore
formation of a specific dimeric complex. No protein-DNA complex
is observed in control GMSA studies of the λ operator site
OL1 (not shown).
The 1H NMR spectrum of the truncated DM fragment is similar to that
of the intact domain (Fig. 5) (21),
demonstrating native folding of the zinc module in the fragment.
Spectra of the intact domain and truncated fragment exhibit
identical chemical shift dispersion but differ by the presence (intact domain) or
absence (truncated fragment) of broad amide resonances near random-coil
frequencies3
(asterisk in a and d of Fig. 5). These
are assigned by elimination to the C-terminal tail. The pattern of
nuclear Overhauser enhancements (NOEs) between side chains in the
metal-binding sites (including contacts between histidine and cysteine
side chains; b and e of Fig. 5) and between amide
protons in α-helices (dNN contacts; c and
f) is essentially identical. An upfield-shifted methyl
resonance is observed in each case, assigned to the
Ile-54-
CH3. This core side chain packs against the
imidazole ring of His-59 (NOEs labeled in b and
e). Differences in chemical shifts between spectra of the intact domain and
truncated fragment are small (< 0.1 ppm) and localized near the
site of truncation.

Fig. 5.
NMR studies of DM domain (A)
and truncated DM fragment
(B). Truncation of C-terminal tail does not
affect folding of zinc module but removes envelope of near-random-coil
resonances from NMR spectrum. a and d,
one-dimensional spectra of exchangeable NH and aromatic resonances at
pH 6.5. Spectra highlight the presence (intact DM domain; a)
or absence (truncated DM fragment; d) of a broad envelope of
amide resonances near 8.1 ppm (asterisks); these unresolved
resonances exhibit prominent water exchange cross-peaks in NOESY
spectra (not shown) and are assigned to nascent helix in C-terminal
tail. Amide resonances in both domains are notable for resolved
downfield shifts of cysteate residues Cys-70, Cys-73, and Cys-47.
b and e, two-dimensional NOESY spectra in
D2O (pD 6.1) of contacts in metal-binding sites between
side chains of histidine and cysteine. Comparison of amide region of
NOESY spectra in H2O is remarkable for similarity of
chemical shifts among side chains in (Cys-44, His-50, His-59, Cys-63,
and Cys-67) or adjoining (Ile-54) the zinc-binding sites. The
methyl resonance of Ile-54 is shifted to high field (0.63 ppm)
presumably due to the aromatic ring current of His-59; prominent NOEs
are observed between these side chains. c and f,
dNN contacts in helical segments (DSX residues 46-50
(red) and 71-79 (green) at pH 6.5; DM consensus
residues 12-16 and 37-45) are outlined. The NOESY mixing
times were in each case 175 ms.
Single amino acid substitutions extrinsic to the core of the metal-binding sites enable the importance of individual side chains in DNA recognition to be tested. Because of the different structural characteristics of the zinc module and disordered tail, we employed two different approaches. The first is alanine scanning mutagenesis of the tail. Such an approach has been widely applied to nascent DNA recognition elements (49-51). Because alanine contains a small side chain of high intrinsic helical propensity (52), such substitutions are unlikely to perturb the bound structure of the tail or introduce steric obstacles to induced fit. The second approach employed chemical protein synthesis to introduce norleucine at sites of conserved lysine or arginine side chains in the zinc module. Use of a non-standard amino acid was motivated by analysis of the NMR structure of the zinc module; the long methylene chains of some basic side chains pack near the metal-binding sites while their terminal positive charges are exposed to solvent. Norleucine (unlike alanine) in principle retains such hydrophobic packing, enabling the specific role of the positive charge to be evaluated.4 Because the DM polypeptide is too long for conventional peptide synthesis, the variant proteins were prepared by native peptide ligation (43) as illustrated in Fig. 3 (see "Experimental Procedures").
Alanine Scanning Mutagenesis of Tail-- Systematic alanine substitution of 10 arginine or glutamine residues in the tail was effected by site-directed mutagenesis (residues Arg-74, Arg-79, Gln-80, Arg-81, Arg-90, Arg-91, Gln-93, Gln-95, Gln-98, and Arg-99; highlighted in color in Fig. 2A). Recombinant proteins were purified to homogeneity, and GMSA studies of each were undertaken as a function of protein concentration. The results are summarized in a composite gel in which representative lanes containing the same protein concentration, extracted from individual gels, are compared (Fig. 2C). Percent shifts of the single bound complex (designated C1) and doubly bound complex (C2) are given in the legend. A striking correlation is observed between critical sites and degree of sequence conservation. Conserved basic or carboxamide side chains in two patches contribute to DNA recognition (positions 79-81 and positions 90, 91, and 93; highlighted in red in Fig. 2A), whereas substitution of non-conserved side chains does not impair specific DNA binding (positions 74, 95, 98, and 99; green in Fig. 2A). Position 91 is a site of intersex substitution R91Q (19); GMSA studies of this variant indicate a decrement in binding more marked than that of the R91A substitution (not shown). Although small variations occur in the relative intensity of the intermediate 1:1 band (C1 in Fig. 2C), deleterious substitutions do not affect this ratio (to the extent that the 1:1 band can be seen), implying that decreased specific DNA binding is not secondary to decreased cooperativity.5 That multiple alanine substitutions in the tail impair specific DNA binding strongly supports the hypothesis that the tail functions as a DNA recognition element.
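The classification of alanine variants can be reproduced from the C2 percentages listed in the legend to Fig. 2C. The sketch below applies a simple cutoff, C2 below half the wild-type value, to call a variant impaired; this cutoff is an assumption made for illustration and is not a criterion stated in the text.

```python
WILD_TYPE_C2 = 38.0  # percent of probe in the doubly bound complex (Fig. 2C legend)

# Percent of probe shifted to the C2 complex for each alanine variant (Fig. 2C legend).
C2_PERCENT = {
    "R74A": 27, "R79A": 10, "Q80A": 3, "R81A": 11, "R90A": 4,
    "R91A": 8, "Q93A": 5, "Q95A": 60, "Q98A": 55, "R99A": 30,
}

IMPAIRED_FRACTION = 0.5  # hypothetical cutoff: below half of wild-type C2


def classify(c2_by_variant: dict, wild_type_c2: float, cutoff: float):
    """Split variants into impaired and unimpaired sets using a relative C2 cutoff."""
    impaired = {v for v, c2 in c2_by_variant.items() if c2 < cutoff * wild_type_c2}
    return impaired, set(c2_by_variant) - impaired


impaired, unimpaired = classify(C2_PERCENT, WILD_TYPE_C2, IMPAIRED_FRACTION)
print(sorted(impaired), sorted(unimpaired))
```

With this cutoff the impaired set corresponds to positions 79-81, 90, 91, and 93, and the unimpaired set to positions 74, 95, 98, and 99, matching the grouping described above.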
Folding of the Tail on a DNA Half-site--
CD spectra of the DM
polypeptide in the presence and absence of ZnCl2 (at a
stoichiometry of two zinc ions per polypeptide) demonstrate the
presence of a metal-dependent folding transition (Fig.
6A). Binding of
Zn2+ accentuates the partial α-helical propensity of the
apodomain in accord with the solution structure of the motif (Fig.
1B) (21). The dimeric protein-DNA complex formed on binding
of the DSX DM domain to dsxA has been investigated
previously by CD. Spectra are remarkable for additional specific
DNA-dependent α-helical difference features and thermal
stabilization of the C-terminal tail as an induced α-helix (21). Such
induced fit can in principle occur at the protein-DNA interface or
secondary to DNA-dependent dimerization. The DSX domain
forms a low affinity 1:1 complex on binding to a
dsxA half-site
(5'-AGCTACATTG-3' and complement; Fig.
6B). Spectra of the free domain, free
dsxA half-site and 1:1 complex at 37 °C are shown
in Fig. 6C. An α-helical CD difference feature (designated
1 in Fig. 6D) is obtained by subtracting the
CD spectrum of the free DNA from that of the complex. The relative
magnitude of this feature (but not its detailed shape) is similar to
that of the 2:1 dsxA complex as shown by comparison
of difference spectra
2 (open squares in Fig.
6E). A DNA-related difference feature is observed in the
near-UV region (250-300 nm), presumably due to widening of the minor
groove. This feature is similar at 4 and 37 °C (dashed line versus open squares in Fig.
6E), unlike the far-UV difference feature, which is
attenuated at low temperature due to base-line folding of the free
tail. Formation of a 1:1 half-site complex is sufficient to confer
thermal stability between 4 and 40 °C as indicated by the slope of
[θ]222 versus temperature (Fig.
6F); the anomalously steep slope of the free domain
(dashed line in Fig. 6F) reflects non-cooperative
thermal fraying of nascent helical structure in the tail.
DNA-dependent folding and stabilization of the tail can
thus occur in the absence of DNA-dependent dimerization. Together with the results of alanine scanning mutagenesis (above), these data strongly support the hypothesis that the tail functions as
an induced recognition α-helix.
Non-standard Mutagenesis--
Like the tail, the DSX zinc module
contains multiple basic residues: seven arginines and five lysines. The
positions of these side chains in the NMR ensemble are shown in Fig.
7. The abundance of basic side chains
suggests that the zinc module may also contact DNA, consistent with the
residual specific DNA binding affinity of the truncated DM
monomer
(above). To test this hypothesis, norleucine substitutions (designated
Z) were introduced at positions Arg-46, Lys-57, and Lys-60. Among DM
sequences residue 46 (DM consensus position 12) is conserved as lysine
or arginine (Fig. 2A) and is a site of intersex mutation
(Arg → Trp) in MAB-3 (18). This side chain is well defined in the
solution structure of the free domain (side chain root mean square
deviation 0.94 Å) and of limited solvent accessibility (24% relative
to an extended peptide) (21). Residue 57 (DM consensus position 23) is
also conserved as lysine or arginine, whereas residue 60 (consensus
position 26) is usually lysine but is not conserved within two
divergent C. elegans sequences in which alanine or serine is
observed (Fig. 2A). The side chains of Lys-57 and Lys-60 are
less well ordered in the solution structure (respective side chain root
mean square deviations 1.77 and 2.17 Å) (21). Lys-57 is largely
exposed (relative solvent accessibility 82%), whereas Lys-60
projects into solution and is disordered.
Native zinc-dependent folding of the norleucine analogs was
verified by CD and 1H NMR spectroscopy. Evidence of a
native-like fold is provided by aromatic and aliphatic NMR spectra,
which in each case exhibit a similar pattern of chemical shifts.
Resolved markers are provided by aromatic spin systems as outlined in
the wild-type TOCSY spectrum (Fig. 7B); these resonances are
not perturbed by truncation of the C-terminal tail (Fig.
7C). Small non-local perturbations are observed in the
spectrum of the R46Z analog (Fig. 7, D and
E, arrow and asterisk) but not other
analogs. Results of GMSA assays and plot of percent shifts are shown in
Fig. 8. The norleucine substitutions do
not significantly affect the relative intensities of the 2:1 and 1:1
shifted bands, indicating that positive charges at these sites are not
required for DNA-dependent dimerization. In each case the
maximal fractional occupancy of the singly bound complex (C1) is small;
values of F1,max (see "Experimental
Procedures") are 5 (wild type), 8 (R46Z), 4 (K57Z), and 7% (K60Z).
Due to errors in integration arising from non-uniform background and
trailing edges of bands, we estimate uncertainties of ±2% in these
values. The cooperativity factor k12 is in each
case greater than 100.
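The link between a small F1,max and a large cooperativity factor can be illustrated numerically. The sketch below assumes a generic two-site model with equivalent half-sites, in which free, singly bound, and doubly bound DNA carry statistical weights 1, 2x, and k12·x²; the singly bound fraction then peaks at 1/(1 + √k12). This closed form is an assumption made here for illustration and is not necessarily the expression given in "Experimental Procedures"; the function name is hypothetical.

```python
def k12_from_f1max(f1_max: float) -> float:
    """Cooperativity factor implied by a peak singly bound fraction.

    Assumes two equivalent half-sites, so that
    F1,max = 1 / (1 + sqrt(k12)), i.e. k12 = (1 / F1,max - 1) ** 2.
    """
    if not 0.0 < f1_max < 1.0:
        raise ValueError("F1,max must be a fraction between 0 and 1")
    return (1.0 / f1_max - 1.0) ** 2


# F1,max values reported in the text, expressed as fractions
reported = {"wild type": 0.05, "R46Z": 0.08, "K57Z": 0.04, "K60Z": 0.07}

for name, f1 in reported.items():
    print(f"{name}: k12 ~ {k12_from_f1max(f1):.0f}")
```

Under this assumed model each of the four reported values implies a k12 above 100, in line with the statement in the text.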
Under these conditions relative dimer-specific association constants
may be estimated by comparison of the protein concentrations at which
50% of the labeled DNA is shifted to the dimeric complex. Inspection
of Fig. 8E thus indicates that substitutions K57Z and R46Z
impair specific DNA binding only to a small extent (decrements of 1.5- and 2-fold, respectively). Substitution K60Z more significantly impairs
specific DNA binding; although 50% occupancy of the dimeric complex
was not achieved in the protein concentration range tested, 10-fold
higher concentrations were required to obtain 25% occupancy. The
similar F1,max values of the wild-type and K60Z
domains indicate that this perturbation is not primarily due to
impaired cooperativity. In light of the native-like NMR spectrum of the
K60Z analog, we speculate that the positive charge of Lys-60
participates in a salt bridge to the DNA backbone. The small decrements
in binding of the R46Z and K57Z analogs may reflect a similar loss
of local contacts, structural perturbations in the zinc module, or (in
the case of R46Z) a subtle effect on cooperativity. In any case,
markedly impaired binding of the K60Z norleucine analog, taken together
with the results of tail deletion and scanning mutagenesis, suggests
that both parts of the DM motif contact DNA.
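The comparison of protein concentrations at a fixed fractional occupancy, as described above, can be sketched as a simple interpolation. The example below is a minimal illustration only: the function name and the titration values are hypothetical and are not data from this study, and the interpolation is linear on a log10 concentration axis as one reasonable choice.

```python
import math
from typing import Sequence


def conc_at_occupancy(conc: Sequence[float],
                      frac: Sequence[float],
                      target: float = 0.5) -> float:
    """Protein concentration at which the dimeric complex reaches `target`.

    `conc` must be ascending and positive; `frac` is the fraction of labeled
    DNA in the dimeric complex at each concentration. Interpolates linearly
    in log10(concentration) between the two bracketing points.
    """
    pairs = list(zip(conc, frac))
    for (c0, f0), (c1, f1) in zip(pairs, pairs[1:]):
        if f0 <= target <= f1 and f1 > f0:
            t = (target - f0) / (f1 - f0)
            log_c = math.log10(c0) + t * (math.log10(c1) - math.log10(c0))
            return 10 ** log_c
    raise ValueError("target occupancy is not bracketed by the titration")


# Hypothetical titrations (nM protein, fraction of DNA in the 2:1 complex)
conc_nm = [1, 2, 4, 8, 16, 32]
wild_type = [0.05, 0.20, 0.50, 0.80, 0.95, 0.99]
variant = [0.01, 0.05, 0.20, 0.50, 0.80, 0.95]

fold = conc_at_occupancy(conc_nm, variant) / conc_at_occupancy(conc_nm, wild_type)
print(f"fold decrement at 50% occupancy: {fold:.1f}")
```

When a variant never reaches 50% occupancy in the tested range, the function raises an error; the same comparison can then be made at a lower target (for example `target=0.25`), as was done in the text for K60Z.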
DISCUSSION
DSX contains an N-terminal DM domain and a C-terminal dimerization
domain (11, 28). This modular organization is reminiscent of phage λ
repressor (53); in each case the N-terminal domain mediates DNA
recognition, whereas the C-terminal domain enhances
dimerization and enables cooperative binding to pairs of DNA sites (12,
54). Mutations in either DSX domain are associated in vivo
with intersex phenotypes (19, 28). The DM domain (but not other regions
of the protein) is conserved among a newly recognized family of
transcription factors. Prototype members of this family are provided by
orthologs doublesex (dsx; see Ref. 26) in
D. melanogaster and mab-3 in C. elegans (55). These sex-determining genes function downstream of
inequivalent "master" regulatory factors to define one branch of a
ramifying pathway (Fig. 1A). Remarkably, dsx and
mab-3 in part regulate analogous dimorphic tissues (40, 56,
57). The present study has investigated by mutagenesis the role of 13 side chains on the surface of the DSX zinc module and in its C-terminal
tail. The results suggest that both structural elements contact DNA and
substantiate the hypothesis that the tail functions as a recognition
α-helix.
The DSX DM domain contains a distinctive zinc module with intertwined
CCHC and HCCC zinc-binding sites (21). CD studies suggest that a
C-terminal tail, disordered in the free domain, is stabilized on DNA
binding. The present results demonstrate that such induced fit occurs
in a monomeric half-site complex and is thus independent of
DNA-dependent dimerization. It is not known whether the
tail forms a single contiguous
α-helix or two or more helical
segments separated by turns or bends. Unlike classical zinc fingers and
zinc modules (58, 59), the DM domain binds in the minor groove of DNA
(21). Unlike SRY and other architectural motifs targeted to the minor
groove (42, 60), the DM motif does not induce sharp DNA bending. Our
results demonstrate that two patches of conserved arginine and
glutamine side chains in the tail are necessary for high affinity DNA
recognition. The first patch (RQR; DSX residues 79-81 at DM consensus
positions 45-47) is precisely conserved among mammalian
Dmrt genes (Fig. 2A). The second patch
(RRAQ; DSX residues 90, 91, and 93 at DM consensus positions 56, 57, and 59) is conserved as
RRQQ among mammalian Dmrt genes.
Such conservation suggests that structure-function relationships in DSX
generalize to the human proteins. It is noteworthy that the center of
the first patch (residue 80) is separated from Arg-90 by 10 residues, which would correspond to approximately three turns of a single contiguous
α-helix.
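The spacing argument can be checked with ideal α-helix geometry. The sketch below assumes canonical parameters (3.6 residues per turn, 1.5 Å rise per residue), which are textbook values rather than measurements on the DSX tail; the function name is hypothetical.

```python
RESIDUES_PER_TURN = 3.6   # ideal alpha-helix (assumed, not measured for DSX)
RISE_PER_RESIDUE = 1.5    # axial rise in angstroms per residue (ideal helix)


def helix_offset(n_residues: int) -> tuple[float, float, float]:
    """Turns, axial rise (angstroms), and angular offset (degrees, 0-360)
    between two positions separated by `n_residues` in an ideal alpha-helix."""
    turns = n_residues / RESIDUES_PER_TURN
    rise = n_residues * RISE_PER_RESIDUE
    angle = (n_residues * 360.0 / RESIDUES_PER_TURN) % 360.0
    return turns, rise, angle


# Residue 80 (center of the first patch) to Arg-90: 10 residues apart
turns, rise, angle = helix_offset(90 - 80)
print(f"{turns:.2f} turns, {rise:.1f} A along the axis, {angle:.0f} deg around it")
```

For an ideal helix, 10 residues span just under three turns, so the two positions would lie on nearly, but not exactly, the same helical face.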
Tail sequences vary among invertebrate DM sequences. Whereas the first
patch is not conserved in some C. elegans DM sequences (Fig.
2A), the second patch is not conserved in one of the two DM
domains in MAB-3 (see Footnote 6). Such
divergence suggests that DM domains can exhibit inequivalent sequence
specificities or atomic mechanisms of recognition. In fact, MAB-3 and
DSX exhibit distinct (but related) DNA binding specificities as defined
by random-binding site selection (20). The biological target sites of
MAB-3 are not well characterized. Interestingly, the male-specific
isoform of DSX functions in a mab-3 mutant XO worm
(ordinarily a chromosomal male with an intersex phenotype; see Ref. 55)
to rescue male features (18). Such complementation is consistent with
overlapping DNA binding specificities and demonstrates that downstream
mechanisms of gene regulation are in part conserved (18). It is not
known whether the divergent tails of MAB-3 domain a or other
DM domains contact DNA and, if so, whether their mode of binding
differs from that of a consensus DSX-like domain.
The present results differ in part from those of a previous analysis of
the DSX DNA-binding domain (10, 12). Wensink and co-workers (10)
expressed a His6-tagged fragment of DSX (residues 39-104;
72 residues including N-terminal tag) in a baculoviral system and
characterized the oligomerization- and DNA-binding properties of a
partially purified preparation. Whereas the present domain (residues
35-105) was found to be monomeric by equilibrium ultracentrifugation
and NMR spectroscopy in the 50 µM to 2 mM concentration range (21), the His6-tagged fragment was
observed to oligomerize at lower concentrations as inferred from
glutaraldehyde cross-linking experiments (12). GMSA studies with a
dsxA probe, conducted in 25 mM HEPES
buffer (pH 7.6) containing 100 mM NaCl, revealed formation
only of a 2:1 complex; a 1:1 complex corresponding to band C1 in the
present studies was not observed. The shape of the DNA-binding isotherm
was sigmoidal, indicating positive cooperativity. Analysis of the
thermodynamic coupling between dimerization of the free domain and
specific DNA binding was performed based on the model shown in Fig.
9A (12). Despite the
difference in appearance between this model and that of cooperative DNA
binding (Fig. 9B), the thermodynamic implications are
similar. In particular, the ratio of the dimerization constant
(K1; 430 nM) to the dimer-specific
dissociation constant (K2; 0.48 nM) is formally equivalent to the cooperativity factor
k12 in the Senear-Brenowitz formalism (46). We
may therefore apply Equation 5, see "Experimental
Procedures," to estimate the corresponding F1,max in a cooperative model (Equation 6),
(Eq. 6)
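The estimate described above can be sketched numerically. The example assumes a generic two-site model with equivalent half-sites, for which the singly bound fraction F1 = 2x/(1 + 2x + k12·x²) peaks at 1/(1 + √k12); this form is an assumption made here for illustration and may differ in detail from Equations 5 and 6 of the original article. The function names are hypothetical; K1 and K2 are the values quoted in the text.

```python
import math


def singly_bound_fraction(x: float, k12: float) -> float:
    """Fraction of DNA with one half-site occupied; x = k * [protein]."""
    return 2.0 * x / (1.0 + 2.0 * x + k12 * x * x)


def f1_max(k12: float) -> float:
    """Peak of the singly bound fraction, reached at x = 1 / sqrt(k12)."""
    return 1.0 / (1.0 + math.sqrt(k12))


K1_NM = 430.0   # dimerization constant of the free domain (from the text)
K2_NM = 0.48    # dimer-specific dissociation constant (from the text)
k12 = K1_NM / K2_NM

# Numerical check of the closed form on a logarithmic grid of x values
grid = [10 ** (e / 1000.0) for e in range(-6000, 1)]
numeric_peak = max(singly_bound_fraction(x, k12) for x in grid)

print(f"k12 ~ {k12:.0f}")
print(f"F1,max (closed form) ~ {100 * f1_max(k12):.1f}%")
print(f"F1,max (grid search) ~ {100 * numeric_peak:.1f}%")
```

Under these assumptions the singly bound species would account for only a few percent of the labeled DNA at its maximum.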
The DM domain provides an unusual example of a minor groove DNA-binding
motif that employs a flexible basic tail. The use of flexible basic
regions to recognize specific DNA sequences is well characterized among
major groove DNA-binding motifs. Such basic regions fold on DNA to form
α-helices (61). Examples include the leucine zipper/bZIP and
helix-loop-helix motifs (62-65). The following similarities and
differences are noteworthy.
(i) The basic arms of bZIP and basic arm/helix-loop-helix domains are positioned by a structured dimerization element that does not itself contact DNA, either a parallel coiled coil (the leucine zipper) or a parallel four-helix bundle (62-65). In contrast, the present studies suggest that the DSX DM zinc module itself contacts DNA.
(ii) Among major groove DNA-binding proteins, conserved arginine and asparagine side chains play critical roles in DNA recognition, often making specific hydrogen bonds at the edge of base pairs or contacting the DNA backbone within an extended network of hydrogen bonds (58). The minor groove of DNA is by contrast more hydrophobic than the major groove and lacks as distinctive a pattern of base-specific functional groups (66).
(iii) We propose that the aliphatic portions of lysine and arginine
side chains in DSX may make van der Waals interactions within a widened
minor groove, whereas their basic groups can interact with the DNA
backbone. Such interactions are seen in the minor groove T domain-DNA
complex (67) and in one segment of the γδ resolvase
structure (68; see Footnote 8). Conserved
glutamine residues in the DSX tail may contact either DNA bases or
backbone. Base-glutamine contacts, although common among major groove
protein-DNA complexes, have seldom been observed in minor groove
complexes (see Footnote 9).
(iv) CD difference features in the near-UV (250-320 nm), ascribed to changes in DNA structure, are observed in both major groove and minor groove protein-DNA complexes but differ in detail. Structural interpretation of such differences will require a more extensive data base of spectra and high resolution crystal structures.
We imagine that the DM-DNA complex, like other minor groove complexes,
is stabilized by electrostatic interactions and hydrogen bonds within
an overall framework of non-polar contacts. It is intriguing that the
DSX tail contains a series of non-polar side chains (VMAL; DM consensus
positions 48-51) between the two basic patches identified herein. It
is not known whether the tail contributes to an induced dimer contact
as well as to the protein-DNA interface. Although truncation of the
tail blocks DNA-dependent dimerization, none of the
residues tested herein are critical to the stability of the induced
dimer interface. It is possible that impaired cooperativity of
the tail-truncated DM domain
is secondary to its very weak DNA affinity.
The present study illustrates the complementary utility of biosynthetic
expression and total protein synthesis by native ligation of peptide
fragments (43). The latter synthetic methods may prove useful in
structural genomics (69). This technology permits introduction of
non-standard amino acids and may enable preparation of proteins or
analogs refractory to biosynthetic expression. Cysteine-rich motifs are
particularly attractive targets, as the cysteines provide convenient
sites of peptide ligation. Application to the DM motif allowed rapid
preparation of the DSX analogs containing norleucine side chains as
isosteric replacements for the aliphatic portions of lysine or
arginine. The K60Z substitution impairs specific DNA binding, whereas
K57Z is well tolerated. The R46Z analog exhibits structural
perturbations, confounding local interpretation of its small change in
DNA binding affinity. In the native NMR structure Arg-46 packs against
His-50 in metal-binding site II and contacts the conserved aromatic
side chain of Phe-65 to seal one edge of the hydrophobic core (21). In
the future it will be of interest to investigate whether the aliphatic
portion of this invariant arginine is integral to the stability of the
metal-binding sites. An integrated understanding of structure-function
relationships in the DM domain will require structural dissection of
determinants of zinc-dependent protein folding, DNA
recognition, and cooperativity.
ACKNOWLEDGEMENTS
We thank S. B. Kent for advice regarding peptide synthesis and native ligation chemistry; N. B. Phillips for advice regarding protein purification; L. Han for assistance with DNA binding assays; Q.-X. Hua and W. Jia for assistance with CD and NMR spectroscopy; H. T. Keutmann for amino acid analysis; G. Reddy for mass spectrometry; R. Singh for gel quantification and assistance with the revised manuscript; E. Collins for preparation of the manuscript; and members of the Weiss laboratory for advice and discussion.
FOOTNOTES
* This work is a contribution from the Cleveland Center for Structural Biology and was supported in part by a grant from the National Institutes of Health (to M. A. W.). The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The atomic coordinates and the structure factors (code 1LPV) have been deposited in the Protein Data Bank, Research Collaboratory for Structural Bioinformatics, Rutgers University, New Brunswick, NJ (http://www.rcsb.org/).
§ Both authors contributed equally to this work.
¶ Present address: Array BioPharma, 3200 Walnut St., Boulder, CO 80301.
** To whom correspondence should be addressed. Tel.: 216-368-5991; Fax: 216-368-3419; E-mail: weiss@biochemistry.cwru.edu.
Published, JBC Papers in Press, August 26, 2002, DOI 10.1074/jbc.M204616200
2 Two different types of difference spectra are shown in Fig. 6. The first, designated 1 (D), is defined as the difference between the CD spectrum of the protein-DNA complex and the CD spectrum of the free DNA. The second, designated 2 (E), is defined as the difference between the spectrum of the complex and the sum of the spectra of the free DNA and free protein.
3 The NMR spectrum of the tail-truncated DM domain contains unusually sharp resonances assigned to its truncated tail (residues 80-86). Motional narrowing of the foreshortened tail contrasts with conformational broadening of the intact tail due to intermediate exchange among nascent helical structures (21).
4 Use of a non-standard amino acid was motivated by our recent experience in studies of insulin wherein alanine scanning mutagenesis (71) was confounded at a key site by non-local structural perturbations (72). Such limitations were overcome by non-standard mutagenesis (73). Although it is not known whether alanine substitutions in the zinc module would likewise perturb its folding, the side chain of Arg-46 appears integral to the structure, and R46A would be predicted to create a crevice.
5 Attenuation of the 1:1 band in GMSA studies of alanine variants could be due either to enhancement of cooperativity or to kinetic instability of the variant 1:1 complexes in the course of electrophoresis. The latter could be an indirect consequence of a mutation at the protein-DNA interface.
6 Whereas DSX contains a single DM domain and binds to its DNA target site as a dimer (10), MAB-3 contains two DM domains and binds as a monomer (20). The MAB-3 DM domains are each required in vivo and are proposed to function coordinately in DNA binding as an "internal dimer" (21). It is not known whether each domain contacts DNA.
7 Although Wensink and co-workers (10) observed no discrete C1 band between the free probe and the dimeric complex, they noted the presence of DNA radiolabel between these bands, with an integrated intensity of less than 5%. It is possible that these counts in part represent background due to rapid dissociation of a singly bound species.
8 The crystal structures of the T domain and γδ resolvase each exhibit docking of an α-helix within a widened minor groove (67, 68). Like the DM domain, the T domain does not induce sharp DNA bending. Neither complex employs glutamine at a minor groove interface.
9 HMG boxes contain glutamines that contact DNA phosphates or deoxyribose moieties (74-77). Sequence-specific HMG boxes contain a conserved asparagine (consensus position 10) whose carboxamide function contacts edges of base pairs in the minor groove. This interaction contributes to sequence specificity (7).
ABBREVIATIONS
The abbreviations used are: HMG, high mobility group; bZIP, basic arm/leucine zipper motif; GMSA, gel mobility-shift assay; NOE, nuclear Overhauser enhancement; NOESY, NOE spectroscopy; rp-HPLC, reverse phase-high performance liquid chromatography; SRY, sex-determining region of the Y chromosome; DSX, Doublesex.
REFERENCES
| 1. | Cline, T. W., and Meyer, B. J. (1996) Annu. Rev. Genet. 30, 637-702 |
| 2. | Berta, P., Hawkins, J. R., Sinclair, A. H., Taylor, A., Griffiths, B. L., Goodfellow, P. N., and Fellous, M. (1990) Nature 348, 448-450 |
| 3. | Gubbay, J., Collignon, J., Koopman, P., Capel, B., Economou, A., Munsterberg, A., Vivian, N., Goodfellow, P., and Lovell-Badge, R. (1990) Nature 346, 245-250 |
| 4. | Koopman, P., Munsterberg, A., Capel, B., Vivian, N., and Lovell-Badge, R. (1990) Nature 348, 450-452 |
| 5. | Koopman, P., Gubbay, J., Vivian, N., Goodfellow, P., and Lovell-Badge, R. (1991) Nature 351, 117-121 |
| 6. | Sinclair, A. H., Berta, P., Palmer, M. S., Hawkins, J. R., Griffiths, B. L., Smith, M. J., Foster, J. W., Frischauf, A. M., Lovell-Badge, R., and Goodfellow, P. N. (1990) Nature 346, 240-244 |
| 7. | Bewley, C. A., Gronenborn, A. M., and Clore, G. M. (1998) Annu. Rev. Biophys. Biomol. Struct. 27, 105-131 |
| 8. | Nagoshi, R. N., McKeown, M., Burtis, K. C., Belote, J. M., and Baker, B. S. (1988) Cell 53, 229-236 |
| 9. | Burtis, K. C., Coschigano, K. T., Baker, B. S., and Wensink, P. C. (1991) EMBO J. 10, 2577-2582 |
| 10. | An, W., Cho, S., Ishii, H., and Wensink, P. C. (1996) Mol. Cell. Biol. 16, 3106-3111 |
| 11. | Cho, S., and Wensink, P. C. (1997) J. Biol. Chem. 272, 3185-3189 |
| 12. | Cho, S., and Wensink, P. C. (1998) Biochemistry 37, 11301-11308 |
| 13. | Li, H., and Baker, B. S. (1998) Development 125, 2641-2651 |
| 14. | Marin, I., and Baker, B. S. |