Structure of the Unphosphorylated STAT5a Dimer

STAT proteins transmit signals from the cell membrane into the nucleus, where they regulate gene transcription. Latent mammalian STAT proteins can form dimers in the cytoplasm even before receptor-mediated activation by specific tyrosine phosphorylation. Here we describe the 3.21-Å crystal structure of an unphosphorylated STAT5a homodimer lacking the N-terminal domain as well as the C-terminal transactivation domain. The overall structure of this fragment is very similar to that of phosphorylated STATs. However, important differences exist in the dimerization mode. Although the interface between phosphorylated STATs is mediated by their Src-homology 2 domains, the unphosphorylated STAT5a fragment dimerizes in a completely different manner via interactions between the β-barrel and four-helix bundle domains. The STAT4 N-terminal domain dimer can be docked onto this STAT5a core fragment dimer based on shape and charge complementarities. The separation of the dimeric arrangement, taking place upon activation and nuclear translocation of STAT5a, is demonstrated by fluorescence resonance energy transfer experiments in living cells.

STAT (signal transducer and activator of transcription) proteins mediate the signaling of cytokines and a number of growth factors from the receptors of these extracellular signaling molecules to the cell nucleus. Depending on the receptor type, STATs are specifically phosphorylated by receptor-associated Janus kinases, receptor tyrosine kinases, or cytoplasmic tyrosine kinases (1). The phosphorylated STAT molecules dimerize by reciprocal binding of their SH2 domains to the phosphotyrosine residues. These dimeric STATs translocate into the nucleus, bind to specific DNA sequences, and regulate the transcription of their target genes.
Seven mammalian STATs have been identified. Their structural organization is known in molecular detail from several crystal structures. At the N terminus they contain a helical domain, which mediates cooperative binding of STATs to sequential DNA binding sites (2). The structures of the phosphorylated STAT1 and STAT3 core fragments, lacking the N-terminal domain as well as the C-terminal transactivation domain, were solved in complex with DNA (3,4). They consist of an N-terminal large four-helix bundle, a central IgG-like domain, which constitutes the actual DNA binding domain, a helical, so-called linker domain, and the SH2 domain. The phosphotyrosine residue is located in all mammalian STATs within the 30 amino acids C-terminal to the SH2 domain. In most mammalian STATs the C-terminal part is constituted by a mostly disordered transactivation domain, which mediates the interactions of STATs with other components of the transcription machinery. Three-dimensional information for this domain so far has only been obtained from the complex of a short fragment of the STAT6 transactivation domain with the PAS-B domain of the nuclear receptor coactivator 1 (5).
STAT3 homodimers and STAT3-STAT1 complexes have been coimmunoprecipitated from untreated cells (6,7). Next, large multiprotein complexes containing STAT3 were identified (8). In addition, FRET measurements with STAT3 (9) demonstrated that at least a part of it should already be dimeric before activation. Bioluminescence resonance energy transfer measurements additionally suggested that STAT3 should undergo strong conformational changes upon phosphorylation (10). Recently the crystal structure of unphosphorylated STAT1, lacking only the transactivation domain, has been solved (11). In this structure the N-terminal domains seem to stabilize the interaction between the core fragments, well in agreement with these biochemical and cell biological data. The connecting region between the N-terminal domain and the core fragment is not visible in the STAT1 structure, resulting in two possible dimerization modes of the two proteins because of their tetrameric arrangement in the crystal. In one mode the two core domains are arranged in an antiparallel fashion, employing the four-helix bundle domain and the β-barrel domain of the monomers to form the interface. The second mode of dimerization involves a parallel arrangement of the STAT1 core fragments, which puts the SH2 domains on the same end of the dimer. Functional relevance has been attributed to both dimerization modes with respect to interaction with the cytoplasmic domains of cytokine receptors and dephosphorylation by phosphatases (12).
Here we present the crystal structure of an unphosphorylated mouse STAT5a core fragment, lacking both the N-terminal and the transactivation domains. The structure of this fragment shows an antiparallel dimerization mode very similar to the one found for STAT1. We present a docking model for the N-terminal domains onto this dimer assuming preservation of the overall 2-fold symmetry. We also performed live cell FRET measurements, which demonstrate the separation of the N-terminal domains following activation and nuclear translocation of STAT5a.
pET32a (Novagen) to express the STAT5a fragment without a tag. This expression construct was transformed into the Escherichia coli strain BL21(DE3). Protein expression was carried out at 23°C after induction with 0.5 mM isopropyl β-D-thiogalactopyranoside. Cells were harvested 8 h after induction, and the protein was purified according to the protocol established for STAT3β (13). Briefly, after cell lysis nucleic acids were first removed by precipitation with polyethyleneimine (0.1% final). Next the STAT5a fragment was precipitated with 35% saturated (NH4)2SO4. Then the protein was sequentially purified by hydrophobic interaction on a phenyl Superose column (Amersham Biosciences) and by affinity chromatography on a heparin column (Amersham Biosciences). Before crystallization a final gel filtration on a Superdex200 column (Amersham Biosciences) was performed.
Crystallization and Data Collection-STAT5a crystallized at 20°C in hanging drops containing equal volumes of well solution (1.2 M (NH4)2SO4, 50 mM HEPES, pH 7.0, 20 mM MgCl2, and 10% glycerol) and protein solution containing 7 mg/ml protein (in 20 mM HEPES, pH 7.0, 200 mM NaCl, 10 mM MgCl2, 5 mM dithiothreitol, 0.5 mM phenylmethylsulfonyl fluoride). Crystals grew over 2 weeks and were harvested and flash-frozen in liquid nitrogen in a freezing buffer containing 30% glycerol, 1.2 M (NH4)2SO4, 50 mM HEPES, pH 7.0, 20 mM MgCl2. Data were collected at BESSY Berlin (BL2, Mar image plate detector) and processed using the program XDS (14). Statistics for data collection and processing are summarized in TABLE ONE.
Structure Determination and Refinement-The STAT5a core fragment crystallizes in the space group C2, with 3 molecules per asymmetric unit. The structure was solved by molecular replacement with PHASER (15) using the STAT3β model (Protein Data Bank entry 1BG1) after removing the DNA and water molecules. Density modification was performed with RESOLVE (16), which yielded good quality maps because of the high solvent content (72%). Refinement took advantage of 3-fold noncrystallographic symmetry and was performed at all stages with REFMAC5 (17) to a final R-factor/R-free of 0.266/0.299 (TABLE ONE). Model building was carried out with Coot (18) and XtalView (19). The final model contains no Ramachandran violations and has 80.2% of dihedral angles in the most favored, 17.6% in the additionally allowed, and 2.2% in the generously allowed regions (20). The molecular figures were created with PyMOL (pymol.sourceforge.net/), and the electrostatic potential surfaces were generated with DELPHI (21).
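The reported solvent content can be related to the Matthews coefficient through the standard Matthews relation. The following Python sketch is an illustration only: it assumes a typical protein partial specific volume of about 0.74 cm³/g (which gives the constant 1.23), and the function names are hypothetical and not part of any program cited above.

```python
# Matthews relation between solvent fraction and Matthews coefficient (Vm).
# Assumption: protein partial specific volume ~0.74 cm^3/g, giving the
# commonly used constant 1.23 (in units of Å^3/Da).
MATTHEWS_CONSTANT = 1.23


def solvent_fraction(vm):
    """Solvent fraction of a crystal from its Matthews coefficient (Å^3/Da)."""
    if vm <= MATTHEWS_CONSTANT:
        raise ValueError("Vm must exceed 1.23 Å^3/Da for a positive solvent fraction")
    return 1.0 - MATTHEWS_CONSTANT / vm


def matthews_coefficient(fraction):
    """Matthews coefficient (Å^3/Da) implied by a given solvent fraction."""
    if not 0.0 <= fraction < 1.0:
        raise ValueError("solvent fraction must lie in [0, 1)")
    return MATTHEWS_CONSTANT / (1.0 - fraction)


if __name__ == "__main__":
    vm = matthews_coefficient(0.72)  # 72% solvent, as reported in the text
    print(f"Vm for 72% solvent: {vm:.2f} Å^3/Da")
    print(f"Vm for 50% solvent: {matthews_coefficient(0.50):.2f} Å^3/Da")
```

Under these assumptions a 72% solvent content corresponds to a Matthews coefficient of roughly 4.4 Å³/Da, compared with about 2.5 Å³/Da for a crystal with 50% solvent.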
Protein-Protein Docking-The docking was done using DeepView/Swiss-PdbViewer 3.7 (www.expasy.org/spdbv) for pre-orienting the two dimers. The final orientation was obtained with the docking function of Hex 4.2 (22) with a search rotation range of 45° for both dimers.
Live Cell FRET Microscopy Experiments-NIH3T3 cells were plated in 6-well plates and cultured in Dulbecco's modified Eagle's medium supplemented with 10% fetal calf serum. After 24 h cells were transfected with plasmids coding for CFP-STAT5a, YFP-STAT5a (23), the prolactin receptor, CFP and YFP, or with the positive control construct pECFP-15AA-EYFP using Lipofectamine according to the manufacturer's recommendations. Cells co-expressing CFP and YFP were used as a negative control. After expression for 16-20 h, FRET measurements were performed using an inverted microscope (Axiovert200, Zeiss) equipped with a 37°C thermostated incubator, a 5% CO2-air mix system, and a 532-nm laser for acceptor photobleaching. CFP was detected using a filter set composed of an excitation filter of 436/20 nm, a dichroic beam splitter of 455 nm, and an emission filter of 480/40 nm; for YFP an excitation filter of 500/20 nm, a dichroic beam splitter of 515 nm, and an emission filter of 535/30 nm were used. The FRET filter set consisted of an excitation filter of 436/20 nm, a dichroic beam splitter of 455 nm, and an emission filter of 535/30 nm. Images were acquired using a cooled charge-coupled device camera and the Metamorph software. The results were analyzed using the program Imspector (provided by Andreas Schönle). For analysis, the images were corrected for background, pixel shift, and CFP bleed-through into the YFP channel. The normalized FRET values ($N_{FRET}$) were calculated according to the following equation (24),

$$N_{FRET} = \frac{I_{FRET} - I_{YFP} \times y - I_{CFP} \times z}{\sqrt{I_{YFP} \times I_{CFP}}}$$

where $I_{FRET}$, $I_{YFP}$, and $I_{CFP}$ correspond to the FRET, the YFP, and the CFP images, respectively, and $y$ and $z$ correspond to YFP and CFP bleed-through emissions into the FRET images. In this way the FRET signal was also normalized for the protein expression levels.
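The normalization described above can be sketched per pixel in a few lines of Python. This is a minimal illustration, not the Imspector workflow used here: the function names are hypothetical, the inputs are assumed to be background-corrected and registered intensities, and the bleed-through coefficients in the example are invented values chosen only to make the arithmetic easy to follow.

```python
import math


def n_fret(i_fret, i_yfp, i_cfp, y, z):
    """Normalized FRET for a single pixel.

    i_fret, i_yfp, i_cfp: background-corrected intensities in the FRET,
    YFP, and CFP channels; y and z: YFP and CFP bleed-through fractions
    into the FRET channel.
    """
    product = i_yfp * i_cfp
    if product <= 0:
        # No usable donor or acceptor signal in this pixel.
        return float("nan")
    corrected = i_fret - i_yfp * y - i_cfp * z
    return corrected / math.sqrt(product)


def n_fret_image(fret_img, yfp_img, cfp_img, y, z):
    """Apply n_fret pixel by pixel to three equally sized 2D lists."""
    return [
        [n_fret(f, a, d, y, z) for f, a, d in zip(f_row, a_row, d_row)]
        for f_row, a_row, d_row in zip(fret_img, yfp_img, cfp_img)
    ]


if __name__ == "__main__":
    # Hypothetical intensities and bleed-through coefficients.
    value = n_fret(i_fret=500.0, i_yfp=1000.0, i_cfp=400.0, y=0.1, z=0.5)
    print(f"N_FRET = {value:.3f}")
```

Dividing by the square root of the product of the donor and acceptor intensities is what makes the value comparable between cells with different expression levels.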
Electrophoretic Mobility Shift Assays-HeLa cells overexpressing STAT5a and the prolactin (PRL) receptor were stimulated with PRL (5 μg/ml) for 10 min or left untreated. Whole cell extract was prepared as described previously (25). A double-stranded oligonucleotide probe containing a single STAT5 binding site from the β-casein promoter was end-labeled using [γ-32P]ATP and T4 polynucleotide kinase. The binding reaction for the whole cell extract was performed as described previously (26) using 8 μg of whole cell extract and 50,000 cpm (corresponding to ~0.5 ng) of labeled probe. In the binding reaction for the recombinant STAT5a core fragment 0.3 μg of the protein was also combined with 8 μg of whole cell extract from unstimulated cells to allow for comparable assay conditions. Competition experiments were performed by adding unlabeled oligonucleotides representing the STAT5 binding site of the β-casein promoter or the STAT3 binding site of the c-fos promoter (SIE).

RESULTS AND DISCUSSION
Overall Structure-The core fragment of unphosphorylated STAT5a (residues 128-712) used for crystallization was chosen from an alignment with the STAT3 fragment that had been crystallized in complex with DNA (4). Crystals of this core STAT5a fragment diffracted to 3.21-Å resolution using synchrotron radiation. In the crystal this unphosphorylated STAT5a fragment packs as dimers with dimensions of 166 × 65 × 45 Å³ (Fig. 1A), whereas in solution, based on sizing by gel filtration chromatography and dynamic light scattering, the unphosphorylated core STAT5a fragment is most likely monomeric (data not shown). A view perpendicular to the 2-fold axis of the dimer (Fig. 1B) shows a boat-like shape, with the linker domain and the SH2 domain of each monomer pointing to the same side of the dimer. Each monomer bears the general architecture already found in the phosphorylated STAT core fragments (Fig. 1A): an N-terminal four-helix bundle (residues 138-331), an eight-stranded β-barrel (residues 332-470), an α-helical linker domain (residues 471-592), and an SH2 domain (residues 593-712). The residues 128-137, 423-432, and 691-712 are not ordered, and they are not included in our model (the integrity of the purified STAT5a fragment was verified by mass spectrometry). The r.m.s. deviation of a superposition of the STAT5a monomer with the monomers of STAT1 and STAT3 is only 2.1 and 2.3 Å, respectively (supplemental Fig. 1a). A significant difference between the STAT5a and STAT1/STAT3 monomers is the bend of the long helices α1 and α2 in STAT5a compared with STAT1/STAT3 (secondary structure elements are denoted according to the STAT3β structure), probably as an effect of the dimerization. These two helices are also considerably longer in STAT5a. Next, in the β-barrel domain of STAT5a the long loop connecting strands βc and βx has been flipped over in the direction of the basal portion of the β-barrel. In its opposite orientation this loop interacts with DNA in the STAT1 and STAT3 structures. In STAT5 it may only have a stabilizing role and might not be essential for DNA recognition. Also the loop connecting strands βa and βb, which is involved in DNA binding in STAT3, is much shorter in STAT5a.
(TABLE ONE footnotes: R free is the same as R, except for a 9% subset of all reflections. Fractions of residues in most favored/allowed/generously allowed/disallowed regions of the Ramachandran plot are according to PROCHECK.)
Dimerization-Despite the limited resolution of the structure, the quality of the electron density (Fig. 2A) is sufficient to identify a number of weak polar interactions and weak hydrogen bonds between the monomers, which are consistent with the hydrophilic nature of the four-helix bundle as well as the β-barrel. Thus the dimer is mostly stabilized by an elaborate network of interactions between the β-barrel domains, which constitutes the central portion of the interface. These contacts are mediated through the side chains and backbones of the residues Asn-361, Val-362, His-363, Met-364, and Asn-365 belonging to the loop joining βb-βc as well as Ser-449 and Ser-452 belonging to the loop joining βf-βg (Fig. 2B). Additional weak H-bonds connect residues located on helices α1 and α3 of the N-terminal four-helix bundle and residues on strand βe of the eight-stranded β-barrel of each monomer. A similar network of polar interactions, most of them between the β-barrel domains, also stabilizes the antiparallel STAT1 dimer. Hydrophobic interactions at the STAT5a dimer interface are not quite as strong as in STAT1. Although the residues Val-389 and Leu-383, which are essential for the hydrophobic interactions at the interface of STAT1, are conserved in STAT5a (Val-402 and Leu-397), Phe-172 of STAT1 is replaced by Ile-174 in STAT5a.
The arrangement of the STAT5a core fragments is very similar to the antiparallel dimerization mode of the STAT1a core fragments (r.m.s. deviation, 3.6 Å) (supplemental Fig. 1b), resulting in an interface of comparable size (1559 Å² for STAT5 and 1504 Å² for STAT1, as calculated with the program MSMS (27)). The third monomer in the asymmetric unit has crystal contacts with one of the monomers of the antiparallel dimer, resulting in an interface of only 500 Å². Although the size of the interface within the antiparallel dimer clearly points to biological relevance (28), such relevance can be ruled out for the alternative interface. The third monomer is related via a 2-fold axis to the homologous monomer of the second asymmetric unit, forming again an antiparallel dimer with the same interface. The occurrence of this interface in the crystal structures of two different STATs strongly indicates that the antiparallel dimerization mode is relevant for the observed aggregates of unphosphorylated STATs (6-9).
Docking of the N-Terminal Domain Dimer-The arrangement of the core fragments in both dimerization modes of STAT1 shows 2-fold symmetry. On the other hand, the arrangement of the N-terminal domain dimers with regard to the overall symmetry of the dimer is symmetric only in the parallel dimerization mode. In the antiparallel dimerization mode one N-terminal domain packs against the SH2 domain of one core domain, whereas the other N-terminal domain packs against the four-helix bundle of the second core domain, thus breaking the overall symmetry of the protein dimer. This breaking of the symmetry in the antiparallel arrangement might be due to crystallization because the intrinsic symmetry of the N-terminal domain, which is also found in the arrangement of the N-terminal domain of STAT4 (29), suggests that these domains should have an identical chemical environment. A symmetric arrangement of the N-terminal domains with regard to the antiparallel core fragment dimer would be possible assuming coincidence of the 2-fold axes of both dimers. Based on this assumption we manually docked the STAT4 N-terminal domain into the groove formed by the helical bundles of the STAT5 core fragment monomers (Fig. 3A). Besides the substantial shape complementarities, the surface charge along the concave side of the STAT5a dimer is positive (Fig. 3B), whereas the surface charge of the STAT4 N-terminal domain around the 2-fold axis side is negative (Fig. 3B). This is also the case for the STAT1 core fragment and the STAT1 N-terminal domain. Thus, the two domains can be fitted exactly on the basis of shape and charge (Fig. 3). The distance between the N termini of the two STAT5a fragments in the dimer (50 Å) is in agreement with the distance between the C termini of the STAT4 N-terminal domains (60 Å). In the resulting model the C termini of the N-terminal domains have a distance of 15 Å from the N termini of the STAT5a core domains.
In all mammalian STATs about 10 residues link the C terminus of the N-terminal domain and the N terminus of the core domain. The distance between the two domains in our docking model can be easily spanned by this number of residues.
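The plausibility of this linker length can be checked with simple arithmetic. The Python sketch below is an illustration under a stated assumption that is not taken from the text: each residue of a flexible linker is taken to contribute at most about 3.8 Å, the spacing between consecutive Cα atoms, which is a generous upper bound for an extended chain. The function names are hypothetical.

```python
# Assumption: ~3.8 Å between consecutive C-alpha atoms, used here as an
# upper bound on the distance a single linker residue can span.
CA_CA_SPACING_A = 3.8


def max_linker_span(n_residues, per_residue=CA_CA_SPACING_A):
    """Upper bound (Å) on the end-to-end distance of a flexible linker."""
    return n_residues * per_residue


def linker_can_span(distance_a, n_residues, per_residue=CA_CA_SPACING_A):
    """True if a linker of n_residues could bridge distance_a (Å)."""
    return distance_a <= max_linker_span(n_residues, per_residue)


if __name__ == "__main__":
    gap = 15.0      # Å, distance between the domains in the docking model
    residues = 10   # approximate linker length given in the text
    print(f"Maximum span: {max_linker_span(residues):.1f} Å")
    print(f"Linker can bridge {gap} Å: {linker_can_span(gap, residues)}")
```

Even with a more conservative per-residue value passed as `per_residue`, about 10 residues leave a comfortable margin over the 15 Å gap, so the linker would not need to be fully extended.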
Live Cell FRET Measurements-Our docking model as well as both dimerization modes found in the structure of unphosphorylated STAT1 imply that during the rearrangement taking place upon activation by phosphorylation both the N-terminal domain and the core domain interface have to be disrupted. To follow this transition we performed FRET measurements in NIH3T3 cells co-expressing full-length STAT5a carrying either YFP (the fluorescence acceptor molecule) or CFP (the fluorescence donor molecule) at its N terminus, which form mixed STAT dimers after expression. The functionality of these N-terminal fusion constructs was verified by a reporter gene assay and an electrophoretic mobility shift assay (supplemental Figs. 2 and 3). Before activation a significant FRET signal (about 18% of the positive control considered as the maximum FRET value in our system) was obtained (Fig. 4a) and confirmed by an increase of donor fluorescence after acceptor bleaching (Fig. 4a). This result supports the structural data suggesting that the N-terminal domain stabilizes the dimeric state of latent, cytoplasmic STATs. Importantly, no or a negligible FRET signal was detectable after STAT5a activation by prolactin (Fig. 4b), indicating separation of the N-terminal domains. Interestingly, FRET data obtained in cells co-expressing STAT3 carrying either YFP or CFP at the C terminus show an increase of the FRET signal after activation (9). This increase in FRET signal was consistent with an increase of SH2 domain-mediated dimeric STAT3 formation upon phosphorylation. Taken together, FRET measurements with both N-terminally and C-terminally labeled STATs are consistent with a major structural rearrangement from a core domain- and N-terminal domain-stabilized unphosphorylated STAT dimer to an SH2 domain-driven dimer after phosphorylation.
Finally it should be noted that our FRET data are compatible with the two possible dimerization modes of STAT1, as both modes require the separation of the N-terminal domains upon the rearrangement following tyrosine phosphorylation.
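The acceptor photobleaching control mentioned above rests on a simple relation: if FRET occurs, destroying the acceptor removes the energy transfer pathway and the donor becomes brighter. The Python sketch below uses the commonly applied expression E = 1 - D_pre/D_post, which is a standard textbook relation and not a formula given in this work. The function name and the example intensities are hypothetical.

```python
def bleaching_fret_efficiency(donor_pre, donor_post):
    """Apparent FRET efficiency from acceptor photobleaching.

    donor_pre:  background-corrected donor (CFP) intensity before bleaching.
    donor_post: donor intensity in the same region after the acceptor (YFP)
                has been bleached.
    Uses the standard relation E = 1 - donor_pre / donor_post.
    """
    if donor_post <= 0:
        raise ValueError("donor_post must be positive")
    return 1.0 - donor_pre / donor_post


if __name__ == "__main__":
    # Hypothetical intensities: the donor brightens after acceptor bleaching.
    e = bleaching_fret_efficiency(donor_pre=900.0, donor_post=1000.0)
    print(f"Apparent FRET efficiency: {e:.2f}")
    # Without FRET the donor intensity is unchanged and the efficiency is zero.
    print(f"No FRET: {bleaching_fret_efficiency(1000.0, 1000.0):.2f}")
```

A value near zero, as would be expected for the negligible signal after prolactin stimulation, corresponds to no measurable donor dequenching.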
Rearrangement of STATs upon Interaction with DNA-Unphosphorylated full-length STATs, except for STAT1 (30), do not bind DNA (31-33). It had been suggested in the past (10) that the N-terminal domain and possibly also the C-terminal transactivation domain might be involved in preventing the STAT core fragment from binding to DNA. Our docking model and the structure of unphosphorylated STAT1 are consistent with this hypothesis, as they allow us to predict that the N-terminal domain stabilizes the latent core domain dimer and thus prevents conformational rearrangement into the DNA binding conformation found in phosphorylated STATs. The monomeric state of the STAT5a and the STAT1 core fragments in solution (11) is in line with the importance of the N-terminal domain for stabilizing the unphosphorylated full-length dimer. Accordingly, in an electrophoretic mobility shift assay the unphosphorylated STAT5a core fragment, uninhibited by the N-terminal domain, interacts with DNA, whereas the full-length STAT5a in unstimulated cell extracts does not (Fig. 5, lanes 1 and 4). A 5-fold higher amount of the unphosphorylated STAT5a core fragment as estimated by Western blot analysis (supplemental Fig. 4) was necessary to obtain a shift comparable with STAT5a from stimulated cells (Fig. 5, lanes 4 and 11). Core domain binding to DNA could be reduced by competition with unlabeled oligonucleotide containing the β-casein promoter recognition site for STAT5a but less efficiently using unlabeled oligonucleotides with a recognition site for STAT3 (SIE), suggesting specific interaction of the core domain with the STAT5a recognition sequence. However, competition of DNA binding with unlabeled oligonucleotides was substantially less efficient with the unphosphorylated STAT5a core fragment as compared with the full-length STAT5a from cell extracts (Fig. 5, compare lanes 4-7 with 11-14). A much higher amount of unlabeled β-casein oligonucleotide was required to compete the binding of the unphosphorylated STAT5a core fragment. This could be suggestive of a higher specificity of the full-length protein for the STAT5a recognition site or differences in binding kinetics.
In contrast to STAT5a and other mammalian STATs, full-length unphosphorylated STAT1 protein was shown to interact with specific DNA binding sites in gel shifts (30). Thus, for STAT1 the stabilization brought about by the N-terminal domain is probably considerably weaker or might be overcome by interaction with other transcription factors like IRF1.
In essence this work confirms that the antiparallel dimerization mode of the STAT5a core fragment, as found also in the crystal structure of STAT1, may be a general feature of unphosphorylated mammalian STATs. Docking of the N-terminal domain dimer onto the unphosphorylated STAT5a core fragment dimer, assuming 2-fold symmetry of the overall dimer, is in good agreement with the perfect shape and charge complementarity of the involved surfaces. In the future further structural investigations will be necessary to test this docking model.