The Nuclear Dot Protein Sp100, Characterization of Domains Necessary for Dimerization, Subcellular Localization, and Modification by Small Ubiquitin-like Modifiers*

The Sp100 and promyelocytic leukemia proteins (PML) are constituents of nuclear domains, known as nuclear dots (NDs) or PML bodies, and are both covalently modified by the small ubiquitin-related protein SUMO-1. NDs play a role in autoimmunity, virus infections, and in the etiology of acute promyelocytic leukemia. To date, little is known about the function of the Sp100 protein. Here we analyzed Sp100 domains that determine its subcellular localization, dimerization, and SUMOylation. A functional nuclear localization signal and an ND-targeting region that coincides with an Sp100 homodimerization domain were mapped. Sequences similar to the Sp100 homodimerization/ND-targeting region occur in several other proteins and constitute a novel protein motif, termed HSR domain. The lysine residue of the Sp100 protein, to which SUMO-1 is covalently linked, was mapped within and may therefore modulate the previously described HP1 protein-binding site. A consensus sequence for SUMOylation of proteins in general is suggested. SUMOylation strictly depended on a functional nuclear localization signal but was not necessary for nuclear import or ND targeting. A three-dimensional structure of Sp100, which supports the mapping data and provides additional information on Sp100 structure/function relationships, was generated by computer modeling. Taken together, our studies indicate the existence of well defined Sp100 domains with functions in ND targeting, nuclear import, nuclear SUMOylation, and protein-protein interaction.

Among the numerous substructures of the cell nucleus (1), nuclear dots containing PML¹ and Sp100 proteins (NDs or PML bodies) have attracted the interest of many researchers in recent years. NDs belong to the heterogeneous group of nuclear bodies (2) and are distinct subnuclear organelles that do not co-localize with any of the other known nuclear substructures (3)(4)(5)(6)(7). Originally discovered as targets of autoantibodies in patients suffering from the autoimmune disease primary biliary cirrhosis (8,9), NDs gained major attention when their disruption to a microgranular form in the hematopoietic malignancy acute promyelocytic leukemia was discovered (6,7,10,11). In this disease, a natural constituent of NDs, the PML protein, is expressed aberrantly as an oncogenic fusion protein with the retinoic acid receptor α (RAR) (12). Expression of PML-RAR has been demonstrated to be sufficient to induce leukemias in vivo (13)(14)(15). Accordingly, a role of ND-associated proteins in cell transformation and growth control or regulation of differentiation (16-20) was postulated. In a recent report, NDs were described to contain nascent RNA (21). On the other hand, NDs were hypothesized to represent transcriptional repressing/regulating complexes similar to the polycomb group complex (22). Interestingly, numerous viral regulatory proteins target NDs and influence ND structure and composition (23)(24)(25)(26)(27)(28)(29), and the expression of both known major ND proteins, PML and Sp100, is induced by interferons (30-33) (for review on NDs, see Ref. 34).
Most recent work on NDs focuses on the PML protein because of its direct involvement in the development of acute promyelocytic leukemia. However, the Sp100 protein was actually the first ND protein to be characterized biochemically and through cloning of its cDNA using sera from patients suffering from primary biliary cirrhosis, an autoimmune liver disease (30, 35-37). Sp100 is an acidic protein with a calculated molecular mass of 54 kDa and exhibits a highly aberrant electrophoretic mobility in SDS-polyacrylamide gel electrophoresis of approximately 100 kDa (35,36). The SP100 gene gives rise to a number of alternatively spliced Sp100 variants (34, 38-41), some of which contain an HNPP box and an HMG box (34,38,40,41), the latter representing a DNA-binding motif. This is of particular interest, as Sp100 has been described to exhibit transcriptional modulatory effects under certain experimental conditions (22,39,40). In recent publications, the heterochromatin protein HP1 (22,40) and hHMG2 (22) were found to directly interact with Sp100. Finally, Sp100 is covalently modified by the SUMO-1 protein (42), another feature shared by Sp100 and PML (42)(43)(44). SUMO-1 is a small ubiquitin-related protein, also known as PIC-1 (45), Sentrin (46), GMP-1 (47), or UBL-1 (48), which is thought to regulate protein targeting (43,49,50) or to inhibit protein degradation (51). Modification of proteins by SUMO-1 resembles the ubiquitinylation pathway in that SUMO is attached to an internal lysine residue but requires a unique set of activating and conjugating enzymes (50,52,53). In the case of PML, modification by SUMO-1 has very recently been reported to be essential for localization of the protein to NDs (54). Interestingly, SUMOylation of PML is highly enhanced after exposure of cells to arsenic trioxide (43), a drug used in the therapy of acute promyelocytic leukemia (55)(56)(57). 
Hyper-SUMOylation of the PML-RAR oncoprotein might be one of the initial events of apoptosis induction in leukemic cells by arsenic.² In this study, we have performed a series of experiments in order to characterize further the Sp100 protein regarding nuclear localization, ND targeting, homomeric interaction, and the role of covalent modification of Sp100 by SUMO-1. We have demonstrated that SUMO modification of Sp100 strictly depends on nuclear import but does not affect or require ND localization. Finally, we have performed molecular modeling in order to generate a tertiary structure prediction for Sp100.
* This work was supported by grants from the Deutsche Forschungsgemeinschaft and the Deutsche Krebshilfe. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
‡ To whom correspondence and reprint requests should be addressed: Tel./Fax: 0049 40 48051221; E-mail: sternsdo@hpi.uni-hamburg.de.
¹ The abbreviations used are: PML, promyelocytic leukemia protein; Ab, antibody; HP1, heterochromatin protein 1; mAb, monoclonal antibody; ND, nuclear dot(s) (nuclear domain(s) containing PML and Sp100 proteins); RAR, retinoic acid receptor-α; SUMO-1, small ubiquitin-related modifier; EGFP, enhanced green fluorescent protein; HSR, homogeneously staining region; CAT, chloramphenicol acetyltransferase; aa, amino acid.
Immunoprecipitation-Cells were harvested from culture dishes using a rubber policeman and washed once in phosphate-buffered saline. The cell pellet was subsequently lysed in RIPA buffer (59) supplemented with COMPLETE proteinase inhibitor mixture (Roche Molecular Biochemicals, Penzberg, Germany) according to the manufacturer's recommendations. DNA and RNA were digested using 0.5 units/µl Benzonase (Merck, Darmstadt, Germany) at room temperature for 10 min. After centrifugation at 10,000 × g, the supernatant was removed, cleared by a second centrifugation at 10,000 × g, and used for immunoprecipitation. This supernatant contained most of the Sp100 proteins expressed from the transfected plasmids as determined by control experiments. Immunoprecipitation was carried out under RIPA conditions as described previously (42). For immunoprecipitation, rabbit anti-Sp100 (anti-SpGH) or anti-Gal4 mAb (Santa Cruz Biotechnology, Santa Cruz, CA) was used.
Mammalian Two-hybrid Interaction Assay-For in vivo interaction studies, the CLONTECH Matchmaker mammalian two-hybrid assay system (CLONTECH, Palo Alto, CA) and HuH7 hepatoma cells were used. The Sp100 cDNA and subfragments thereof were each cloned in-frame into plasmids pM and pVP16 (see above) for expression of the corresponding fusion proteins with the Gal4 DNA binding domain and the VP16 transactivation domain, respectively. Reporter gene assays were performed using plasmid pG5CAT according to the manufacturer's protocol (CLONTECH, Palo Alto, CA). Cells of a 60-mm Petri dish were transfected with 0.2 pmol each of the pM and pVP16 constructs, 0.7 pmol of the reporter construct, and 1 µg of pCMV-β-Gal to standardize transfection efficiencies. For measurement of CAT activity, an equivalent of 2.0 A405 in the β-galactosidase assay was used. CAT activity was determined with a commercial CAT enzyme-linked immunosorbent assay (Roche Molecular Biochemicals, Penzberg, Germany).
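The normalization step described above can be illustrated with a small calculation. The sketch below assumes that the β-galactosidase signal (A405) scales linearly with the volume of lysate assayed, so that the lysate volume carrying "an equivalent of 2.0 A405" can be obtained by simple proportion. The function name, its parameters, and the linearity assumption are illustrative and are not taken from the original protocol.

```python
TARGET_A405 = 2.0  # beta-galactosidase equivalent used per CAT measurement (from the text)


def lysate_volume_for_cat(measured_a405: float,
                          assayed_volume_ul: float,
                          target_a405: float = TARGET_A405) -> float:
    """Return the lysate volume (in microliters) that corresponds to the
    target beta-galactosidase signal.

    Assumption (not from the source): A405 is proportional to the
    volume of lysate assayed.
    """
    if measured_a405 <= 0 or assayed_volume_ul <= 0:
        raise ValueError("absorbance and volume must be positive")
    # Proportion: target / measured = required volume / assayed volume
    return target_a405 * assayed_volume_ul / measured_a405


# Hypothetical reading: 10 microliters of lysate gave A405 = 0.5
volume_ul = lysate_volume_for_cat(measured_a405=0.5, assayed_volume_ul=10.0)
```

Under these assumptions, a dish with a low transfection efficiency (low A405 per microliter) contributes a proportionally larger lysate volume to the CAT assay, which is the purpose of co-transfecting pCMV-β-Gal.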
Cells, Cytokines, Transfections, and Indirect Immunofluorescence Microscopy-Rat-1, HeLa S3, and HuH7 cells were maintained in Dulbecco's modified Eagle's medium (Life Technologies, Inc.) supplemented with 10% fetal calf serum. Plasmid DNAs were introduced into cells using the calcium phosphate precipitation procedure (59) or with FuGENE 6 (Roche Molecular Biochemicals, Penzberg, Germany) according to the manufacturer's protocol. One to five micrograms of the expression plasmids were precipitated per 6-cm dish. For indirect immunofluorescence staining, cells were grown on coverslips and treated as indicated. Cells were fixed at -20°C for 5 min in methanol and for 20 s in acetone. Rabbit or rat anti-Sp100 (Sp26) and rat anti-PML (anti-PML-N) antibodies were diluted 1:400. The anti-FLAG mAb was diluted 1:200. Cells were incubated with Abs for 30 min at room temperature. For detection, dichlorotriazinylfluorescein- or lissamine rhodamine-conjugated donkey anti-mouse, goat anti-rabbit, or donkey anti-rat IgG Abs (Dianova, Hamburg, Germany) were diluted 1:200 in phosphate-buffered saline.
Cell imaging was performed with a Zeiss Axiophot Microscope (Zeiss, Oberkochen, Germany) and the Seescan video system by Intas (Heidelberg, Germany). Images were processed on a 6500/275 Power Macintosh computer (Apple, Cupertino, CA) using Adobe Photoshop 4.01 (Adobe Systems Inc.).
Molecular Modeling-Molecular modeling was performed using the SYBYL molecular modeling software (Tripos Inc., St. Louis) on an Indigo workstation (Silicon Graphics, Mountain View, CA).
The Sp100 modeling was initiated by alignment of the Sp100 amino acid sequence against 977 Protein Data Bank files of resolved three-dimensional structures within the SYBYL database, using a cutoff of 19% sequence homology and a gap open penalty of 8 amino acids. Three chains matching with sufficient homology (glycogen phosphorylase b, trimethylamine dehydrogenase, and transketolase) were selected for modeling.
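To make the template-selection criterion concrete, the following sketch applies a 19% cutoff to a pair of pre-aligned sequences. It uses plain percent identity over the aligned, non-gap columns as a stand-in for "sequence homology"; how SYBYL actually scores homology is not stated in the text, so the scoring rule, the function names, and the toy sequences are all assumptions made for illustration only.

```python
CUTOFF_PERCENT = 19.0  # homology cutoff quoted in the text


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap ('-').

    Assumption: identity is used here as a simple proxy for the
    homology score; the original software may use a different measure.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def passes_cutoff(aligned_a: str, aligned_b: str,
                  cutoff: float = CUTOFF_PERCENT) -> bool:
    """True if the pair reaches the cutoff and would be kept as a template candidate."""
    return percent_identity(aligned_a, aligned_b) >= cutoff


# Toy aligned sequences (made up, not Sp100 or template sequences)
close_pair = ("ACDE-F", "ACDQGF")
distant_pair = ("AAAAAAAAAA", "ACCCCCCCCC")
```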
Twenty-four putative loops were detected and inserted into the model. Disulfide bonds were found between cysteines 16-96, 209-238, 248-266, 270-289, and 309-373. Secondary structure prediction, debumping, and energy minimization of the raw model were done using SYBYL's standard settings, followed by pair-wise root mean square fits of the homologous areas of the model against the template chains. Final refinement of the structure was done by subsequent steps of debumping and energy minimization throughout the whole model, using 100 cycles per step, until no forbidden values for bond length, angle, and torsion were observed, and no further changes in the three-dimensional structure could be detected.

RESULTS
Domains and Subfragments of the Sp100 Protein-In order to characterize functional domains of the Sp100 protein and to obtain specific mutants for further functional studies, we have generated a series of truncated versions of Sp100 by inserting fragments of the Sp100 cDNA into the eukaryotic expression vectors pSG5-FLAG or pSG5-LINK for localization studies and into vectors pM or pVP16 for interaction studies (see "Experimental Procedures"). An overview of the type of expressed Sp100 fragments, together with a schematic representation of some putative Sp100 domains as determined by sequence comparison or functional studies, is given in Fig. 1. At the N terminus, Sp100 contains a short stretch of amino acids with significant sequence similarity to MHC class I molecules (amino acids 9-49) (36) followed by a region, which is the only Sp100 domain conserved in a truncated Sp100 protein predicted to be expressed in mice (amino acids . This murine Sp100 homologue or related protein is encoded in a highly amplified gene, Sp100-rs (61,62). Since this amplified gene forms a homogeneously staining region (HSR) on murine chromosome 1 in several wild mouse strains (63,64), the corresponding Sp100 protein domain was denominated HSR domain (34) (see below). An Sp100 domain with transcriptional transactivating properties is depicted according to Xie et al. (39) as well as a region reported to mediate interaction with the HP1 protein (22,40).
Identification of the Sp100 Nuclear Localization Sequence-To identify and characterize the sequence responsible for nuclear import of Sp100 (NLS), the Sp100 fragments Sp100-(1-334) and Sp100-(326-480) were expressed transiently in Rat-1 cells with an N-terminal FLAG epitope tag and detected by immunostaining using a monoclonal antibody specific for the FLAG epitope. Rat-1 cells were chosen in this experiment because none of our polyclonal anti-Sp100 antisera cross-reacts with the rat Sp100 protein. Therefore, the subcellular localization of untagged Sp100 proteins expressed from the transfected plasmids could also be unequivocally determined using these anti-Sp100 antisera. As evident from Fig. 2A, Sp100-(1-334) exhibited a cytoplasmic localization, whereas Sp100-(326-480) was detected mainly in the nucleus (Fig. 2B), suggesting that the Sp100 NLS resides between amino acids 334 and 480. In addition, the diffuse nuclear distribution of Sp100-(326-480) indicates that this Sp100 region lacks the sequences necessary for ND targeting (see below). Additional experiments using shorter N-terminal constructs confirmed that the Sp100 NLS is located in the C terminus of the protein (data not shown). A search for amino acid motifs in the NLS-containing region that are similar to known NLS sequences revealed a single bona fide Sp100 NLS between amino acids 444 and 450 (Fig. 1, top, motif PSRKRRF). To test whether this sequence is essential and sufficient for nuclear transport of Sp100, a single amino acid substitution (K447E) was introduced into the untagged wild-type Sp100. This changes the basic stretch of the NLS from PSRKRRF into PSRERRF (the substitution is at the fourth position of the motif). Transient expression of this Sp100 mutant (Sp100(447E)) in Rat-1 cells revealed an almost complete abrogation of nuclear transport (Fig. 2D), as evident by immunostaining with polyclonal anti-Sp100 antibodies. 
To exclude the possibility that this K447E mutation has a nonspecific effect on subcellular localization of Sp100 due to a change of the overall conformation of the protein, we have also inserted the cDNA encoding Sp100(447E) in frame into expression vector pSG5-LINK. This vector contains an artificial start codon followed by SV40 large T-NLS and FLAG epitope encoding sequences. The immunostaining pattern of the transiently expressed NLS-FLAG-Sp100(447E) protein was indistinguishable from the pattern obtained with transiently expressed wild-type Sp100 (Fig. 2, E and C, respectively). These data demonstrate that nuclear localization of Sp100 can be almost completely abolished by the K447E amino acid substitution and imply that Sp100 has a single NLS located between amino acids 444 and 450 that is essential and sufficient for nuclear import.
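The effect of the K447E substitution on the NLS motif can be reproduced with a few lines of code. The sketch below is illustrative only: the helper names `apply_substitution` and `basic_residue_count` are hypothetical, and counting lysine and arginine residues is used merely as a rough indicator of the basic character of the motif. The motif PSRKRRF, its position (amino acids 444-450), and the K447E change are taken from the text.

```python
import re


def apply_substitution(motif: str, first_residue: int, mutation: str) -> str:
    """Apply a point mutation written as e.g. 'K447E' to a motif whose
    first residue has the sequence number `first_residue`.

    Hypothetical helper for illustration; it checks that the wild-type
    residue named in the mutation is actually present at that position.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation}")
    old, position, new = match.group(1), int(match.group(2)), match.group(3)
    index = position - first_residue
    if not 0 <= index < len(motif):
        raise ValueError("position lies outside the motif")
    if motif[index] != old:
        raise ValueError(f"expected {old} at {position}, found {motif[index]}")
    return motif[:index] + new + motif[index + 1:]


def basic_residue_count(motif: str) -> int:
    """Count lysine (K) and arginine (R) residues in the motif."""
    return sum(residue in "KR" for residue in motif)


wild_type = "PSRKRRF"  # Sp100 amino acids 444-450
mutant = apply_substitution(wild_type, 444, "K447E")
```

The substitution replaces one of the four basic residues in the stretch with an acidic one, which is consistent with the loss of nuclear import reported for Sp100(447E).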
Determination of Sp100 Sequences Necessary for ND Targeting-For analysis of the ND targeting domain of Sp100, the sequences encoding amino acids 1-253, 1-182, 1-75, 33-149, 208-480, 326-480, and 208-334 (compare Fig. 1) were inserted into pSG5-LINK. Thus, the various Sp100 polypeptides were expressed with a FLAG epitope tag and an additional NLS sequence to ensure nuclear localization. After transfection of cells with the respective Sp100 constructs, the transiently expressed Sp100 polypeptides were visualized using the anti-FLAG mAb (Fig. 3, red fluorescence). Preformed nuclear dots were visualized by immunostaining of the PML protein expressed from the endogenous PML gene by using a polyclonal rat anti-PML antiserum (Fig. 3, green fluorescence) on the same coverslips. HeLa S3 cells were used for these transient transfection experiments because of their high level of endogenous PML and Sp100 protein expression. The results of these experiments are summarized in Fig. 3. All C-terminal Sp100 polypeptides showed a diffuse nuclear distribution pattern and were not enriched at nuclear dots stained by anti-PML antibodies (shown for the largest C-terminal fragment Sp100-(208-480), Fig. 3, C and D, respectively). In contrast, N-terminal Sp100 fragments (red label) co-localized with the PML protein in NDs (green label) (shown representatively for Sp100-(1-253) in A and B or for Sp100- , in E and F, respectively). As the shortest N-terminal fragment Sp100-(1-75) was not detectable by immunofluorescence, Sp100-(33-149) was the smallest Sp100 polypeptide that was targeted to NDs. Interestingly, expression of Sp100-(208-480) led to a significant redistribution of NDs resulting in fewer and more brightly staining dots (Fig. 3, C and D). This is surprising because of its diffuse distribution, which suggests a lack of direct interaction of this Sp100 polypeptide with NDs. 
In contrast, the N-terminal fragments Sp100-(1-253) and Sp100-(33-149) did not show such an effect when expressed at moderate levels (Fig. 3, A, B, E, and F). A similar redistribution of NDs was also observed with the Sp100 polypeptides Sp100-(326-480) and Sp100-(208-334), which were likewise expressed with a diffuse pattern (data not shown). This may be due to competition of the diffusely distributed C-terminal fragment for cellular factors involved in regulation of ND distribution and morphology, thereby indirectly affecting ND structures. A control construct, expressing enhanced green fluorescent protein (EGFP) fused to the same FLAG epitope tag and NLS sequences as the corresponding Sp100 constructs, did not trigger such an effect (data not shown). Taken together, these data suggest that the Sp100 nuclear dot targeting signal is located between amino acids 33 and 149 and that amino acids 208-480 might be involved in regulatory processes.
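The reasoning behind this deletion mapping can be sketched in code: the region shared by all ND-targeted fragments is the candidate targeting domain, and none of the diffusely distributed fragments should overlap it. The fragment coordinates below are those given in the text (Sp100-(1-75) is left out because it was not detectable); the function names and the interval representation are illustrative assumptions, not part of the original analysis.

```python
from typing import List, Optional, Tuple

Fragment = Tuple[int, int]  # (first, last) amino acid, inclusive


def common_region(fragments: List[Fragment]) -> Optional[Fragment]:
    """Return the region shared by all fragments, or None if there is none."""
    start = max(first for first, _ in fragments)
    end = min(last for _, last in fragments)
    return (start, end) if start <= end else None


def overlaps(a: Fragment, b: Fragment) -> bool:
    """True if two inclusive intervals share at least one residue."""
    return a[0] <= b[1] and b[0] <= a[1]


# Fragments that co-localized with PML in NDs
nd_targeted = [(1, 253), (1, 182), (33, 149)]
# Fragments that showed a diffuse nuclear distribution
diffuse = [(208, 480), (326, 480), (208, 334)]

minimal_region = common_region(nd_targeted)
consistent = minimal_region is not None and not any(
    overlaps(minimal_region, fragment) for fragment in diffuse
)
```

With these inputs the shared region is amino acids 33-149, and no diffuse fragment overlaps it, in line with the conclusion drawn above.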
Mapping of a Dimerization Domain in the N terminus of the Sp100 Protein-Investigation of the dimerization or multimerization potential of the Sp100 protein is an important prerequisite for understanding how NDs can be formed, as well as for construction of dominant negative mutants of Sp100 needed for functional studies. To test the homomeric interaction potential of Sp100, we have used the two-hybrid assay in mammalian HuH7 hepatoma cells and have performed co-immunoprecipitation experiments. Fig. 4 summarizes the results of the two-hybrid studies and gives an overview of the Sp100 fragments, including the full-length Sp100 (aa 1-480), which were expressed as fusion proteins with the Gal4 DNA binding domain or the VP16 transactivating domain, respectively (Fig. 4A). The amount of CAT protein expressed from the co-transfected reporter plasmid was measured. By using this assay, a strong signal similar to that of the positive control (Gal4-p53/VP16-SV40 large T) was detected with all N-terminal subconstructs containing the region between amino acids 33 and 149 (Fig. 4A, shaded), indicating that this region can mediate homomeric interaction. The shortest Sp100 polypeptide Sp100-(1-75), in contrast, completely lacked dimerization activity according to this assay. Expression of all Sp100 fusion proteins was confirmed by immunoblotting (data not shown). Surprisingly, Gal4 and VP16 fusion proteins containing the region between amino acids 334 and 480 resulted in weaker or no CAT expression (see combinations Gal4-(1-334) and VP16-(1-480) or Gal4-(1-480) and VP16-(1-480), respectively). This may be due to inhibitory sequences in the C-terminal part of the protein or to steric hindrance in formation of the transcriptional activating complex driving the expression of the reporter plasmid.
FIG. 4 (legend fragment). The heteromeric interaction between murine p53 and SV40 large T antigen and the homomeric interaction of the PML protein as described previously (73) were measured using the corresponding Gal4 and VP16 constructs. Three representative experimental results are given as picogram of CAT protein per 200-µl assay volume.
In order to confirm the data obtained by the two-hybrid analysis by an independent assay and to investigate whether different splice variants of Sp100 also heterodimerize, co-immunoprecipitation experiments were performed (Fig. 5). HuH7 cells were transiently transfected with plasmids pM-Sp100-(1-182) and pSG5-Sp100-(1-480) or pM-Sp100-(1-480) and pSG5-SpAlt-C. The latter plasmid expresses the Sp100 splice variant SpAlt-C (41), which differs from Sp100 by the lack of the C-terminal 32 amino acids, which are replaced by 24 amino acids. For co-immunoprecipitation, a monoclonal antibody specific for the Gal4 domain of the Sp100 fusion proteins expressed from the pM constructs was used. Precipitates and supernatants were subjected to immunoblotting using polyclonal anti-Sp100 antibodies for detection (rabbit anti-SpAB, specific for aa 1-253 of the Sp100 protein). As evident from Fig. 5, left panel, co-expression of Gal4-Sp100-(1-182) and wild-type Sp100-(1-480) followed by immunoprecipitation with anti-Gal4 mAb resulted in efficient co-precipitation of the untagged full-length Sp100- [...] evidence for multimerization of both Sp100 polypeptides. However, the SpAB antiserum used for detection in these experiments was raised against Sp100-(1-253), implying that there are probably more potential epitopes present on the Sp100-(1-480) protein than on the Sp100-(1-182) fragment. Therefore, the signal intensities of the two bands do not necessarily reflect the relative molar amounts of the two polypeptides. 
The reciprocal immunoprecipitation experiment using an antiserum raised against the C-terminal region of Sp100 (anti-SpGH, corresponding to Sp100 amino acids 383-474) confirmed these data in that Sp100-(1-182) was co-precipitated with Sp100-(1-480) using anti-SpGH (data not shown).
To be able to compare directly the relative amounts of co-precipitated proteins and in order to test for heterodimerization of two variant Sp100 proteins, we have co-expressed Gal4-Sp100-(1-480) and the Sp100 variant SpAlt-C. When immunoprecipitated with anti-Gal4 mAb, SpAlt-C was co-precipitated with Gal4-Sp100-(1-480) (lane 6) but remained in the supernatant when expressed alone (lanes 7 and 10, respectively). Gal4-Sp100-(1-480), when expressed alone, was precipitated as expected (lane 5). Since in this experiment both proteins contain the complete Sp100-(1-253) region, which is recognized by the SpAB antiserum, the signal intensities directly correlate with the relative amounts of the co-precipitated proteins. In this experiment, equivalent amounts of SpAlt-C and Gal4-Sp100-(1-480) were precipitated (lane 6). A comparison of the signal intensities of the SpAlt-C and Gal4-Sp100-(1-480) bands in the corresponding supernatants (lane 9) demonstrates that most of the Gal4-Sp100-(1-480) was precipitated, whereas an excess of SpAlt-C remained in the supernatant. If the two proteins formed multimers, the relative amounts of the proteins in the precipitates should reflect the relative total amounts of the expressed proteins. Accordingly, in this experiment an excess of SpAlt-C over Gal4-Sp100-(1-480) would be expected in the precipitate, which is not the case. Therefore, these data provide indirect evidence that Sp100 might form dimers rather than multimers. As, however, Sp100 multimers might be disassembled under the buffer conditions used, additional experiments are required to clarify whether Sp100 is capable of multimer formation. Consistent with our study, homomeric interaction of Sp100 has also been reported very recently by another group (40).
Taken together, our data imply that Sp100 can form homodimers and that splice variants of Sp100 can heterodimerize. The region mediating this interaction resides within the same domain responsible for ND targeting. It covers Sp100 amino acids 33-149, which are essentially encoded by the Sp100 exons three and four as we have shown previously by analysis of the gene organization (62).
The Sp100 ND-targeting and Dimerization Domain Defines a Protein Family-The Sp100 region between amino acids 33 and 149, which is important for ND targeting as well as for dimerization, is conserved in a truncated murine Sp100 homologue, which is expressed from a highly amplified gene, designated Sp100-rs (61)(62)(63). In some mouse strains, the amplification has progressed to such extremely high copy numbers that the amplified gene locus is visible as a homogeneously staining region (HSR) on the corresponding murine chromosome (64,65). In addition, the same region is also conserved in a human Sp100-related protein, the leukocyte-specific LYSP100/Sp140 (20,38). We compared the sequences of these proteins and performed similarity searches in the GenBank™ database in order to examine whether the Sp100 dimerization and ND-targeting domain represents a motif also found in other proteins. Fig. 6 gives an alignment of the corresponding Sp100 domain, as well as the corresponding domains of the murine HSR-encoded Sp100-rs protein and the LYSP100/Sp140 protein. In addition, we found significant homology to this region in two additional proteins, the AIRE autoimmune regulator protein and a putative human protein (AA431918, WashU-Merck EST Project). Amino acid residues at several positions are fully conserved (Ile-33, Ala-36, Phe-41, Pro-42, Leu-48, Asp-50, Pro-72, Leu-82, Leu-97, Phe-98, Asn-102, Tyr-106, Leu-109, and Phe-116) between the different proteins. Therefore, the region corresponding to Sp100 amino acids 40-147 indeed defines a novel protein motif. As this domain comprises most of the sequence of the truncated murine Sp100-rs protein, it was termed HSR domain (34).
FIG. 5. Analysis of Sp100 homomeric interaction using immunoprecipitation. HuH7 cells were transiently transfected with plasmids pM-Sp100(1-182) and pSG5-Sp100 as indicated (left panel) or with pM-Sp100 and pSG5-SpAlt-C as indicated (right panel). Immunoprecipitation (IP) was carried out using an anti-Gal4 mAb. Pellets (pell.) and supernatants (s.n.) of the precipitation reaction were used for immunoblotting. The immunoprecipitated proteins were detected with the polyclonal rabbit anti-Sp100 antiserum (produced with Sp100 amino acids 1-253). Schematic representations of the expressed proteins are given below the blots. Indicated is the Gal4-domain recognized by the anti-Gal4 mAb used for immunoprecipitation. Positions of the corresponding proteins are marked by arrows.
Analysis of Sequences Necessary for Sp100 SUMOylation-As we have shown previously, the Sp100 protein undergoes post-translational modification by covalent attachment of the small ubiquitin-related protein SUMO-1 (42). For PML, it has been reported that modification by SUMO-1 regulates targeting of PML to NDs (43) and that PML lacking the functional SUMOylation sites fails to localize to NDs (54). It is likely that only one SUMO molecule is attached to Sp100 (42), in contrast to PML, which is modified by at least three molecules of SUMO-1 (54). In order to understand the role of SUMO-1 modification of Sp100, we have performed a series of experiments to map the corresponding lysine residue where SUMO-1 is attached in vivo and to examine the effects of mutating this site. To ensure correct nuclear localization irrespective of the presence of the Sp100 NLS, we have expressed a number of Sp100 fragments fused to the Gal4-domain (Fig. 7A). For analysis of SUMOylation, we have developed a co-transfection assay in HuH7 cells. HuH7 cells were used because endogenous SUMOylation of transiently expressed Sp100 is low as determined by immunoblotting (shown representatively for Sp100 expressed without Gal4 sequences, Fig. 7B, left panel). Sp100, expressed transiently from the transfected plasmid pSG5-Sp100, was detected by Sp100 antibodies as a broad band at approximately 100 kDa (lane 1), which, at lower exposure times, resolves into at least four distinct bands probably representing different phosphorylated forms of Sp100 (not shown). A weak additional band reactive with anti-Sp100 antibodies is visible at 100 kDa (Fig. 7B, asterisk). As we have shown previously, this band represents Sp100 modified by SUMO (42). In order to increase SUMOylation, we co-expressed SUMO-1 by co-transfection of 1 or 5 µg of the respective expression vector pSG5-SUMO. This resulted in a strong increase of the intensity of the upper band corresponding to increased modification of the transiently expressed Sp100 (Fig. 7B, lanes 2 and 3, respectively). Interestingly, no difference in modification was observed between the co-transfection of 1 or 5 µg of pSG5-SUMO, indicating saturation of the modifying enzymatic machinery. In further experiments, co-transfection of 1 µg of pSG5-SUMO was performed with 5 µg of Sp100 expression vectors and compared with transient transfection with the corresponding Sp100 expression vector alone. An increase in intensity of an additional higher migrating band after co-transfection was taken as evidence for SUMOylation of the corresponding Sp100 polypeptide. The results of the corresponding co-transfection experiments with the Gal4-Sp100 subfragments are summarized in Fig. 7A. Only the full-length Gal4-Sp100-(1-480) and the large C-terminal fragment Gal4-Sp100-(208-480) showed a strongly increased signal intensity for the higher migrating band after co-expression of SUMO-1. 
These data suggest that sequences necessary for SUMOylation are located C-terminally of amino acid 253 and N-terminally of amino acid 326.
Since SUMO modification occurs at lysine residues of proteins (49,50,66), we have changed three of the four lysines in the corresponding Sp100 region into arginine (K297R, K298R, and K300R). In order to exclude any influence of heterologous sequences on the efficiency of the modification, these constructs express Sp100 without such sequences. We have also included the Sp100(447E) mutant in this analysis in order to investigate the role of nuclear transport for SUMOylation. In this assay, the Sp100 mutant Sp100(297R) was completely deficient in SUMO modification (Fig. 7, lanes 4 and 5), whereas the mutants Sp100(298R) and Sp100(300R) (Fig. 7, lanes 6-9) were efficiently modified after co-expression of SUMO-1 (Fig. 7, lanes 7 and 9, respectively) and also exhibited some modification without co-expression of SUMO-1 (Fig. 7, lanes 6 and 8).
FIG. 7 (legend fragment). Emergence of an additional band indicating modification by SUMO-1 is indicated for each construct on the right side of the graph. The minimal region required for SUMOylation as determined in this experiment is shaded and the corresponding amino acid sequence is given below. Amino acid changes introduced by mutagenesis are indicated by arrows and numbers. B, HuH7 cells were transiently transfected with plasmids pSG5-Sp100 (left panel, lanes 1-3), pSG5-Sp100(297R) (lanes 4 and 5), pSG5-Sp100(298R) (lanes 6 and 7), pSG5-Sp100(300R) (lanes 8 and 9), pSG5-Sp100(447E) lacking a functional NLS (lanes 10 and 11) and pSG5-LINK-Sp100(447E), re-adding an NLS to the 447E mutant (lanes 12 and 13) without (lanes 1, 4, 6, 8, 10, and 12) or with co-expression of SUMO-1 (lanes 2, 3, 7, 9, 11, and 13).
According to these data, lysine 297 is most likely the only residue where SUMO-1 is attached in vivo. The Sp100(447E) mutant with the non-functional NLS was also severely compromised in SUMO modification in this assay (Fig. 7, lanes 10 and 11). Only a very faint band was visible after co-expression of SUMO-1 (Fig. 7, lane 11), and without co-expression this band was completely absent (Fig. 7, lane 10). As this could be due either to lack of SUMO modification in the cytoplasm or to direct involvement in the modification process of the lysine that is mutated in the Sp100 K447E polypeptide, we have inserted the Sp100(447E) cDNA into plasmid pSG5-LINK, thereby providing the mutant with an active heterologous NLS. As evident from Fig. 7, lanes 12 and 13, Sp100(447E) when expressed with a functional NLS again becomes efficiently modified by SUMO-1, arguing against lysine 447 as being directly involved in the SUMOylation reaction. The altered electrophoretic mobility of this protein is caused by 25 heterologous amino acids derived from the pSG5-LINK vector. These data indicate that modification of Sp100 by SUMO-1 is tightly linked to nuclear localization. 
To determine whether the weak SUMO-modified Sp100(447E) band visible after cotransfection reflects a weak modifying activity also present in the cytoplasm, or the small amount of Sp100 that is present in the nucleus after transient expression and is then modified normally, we co-expressed Sp100(447E) and SUMO-1 in Rat-1 cells. Rat-1 cells were chosen for this experiment because the anti-Sp100 antisera do not react with the rat Sp100 homologue. The result of this co-expression experiment is documented in Fig. 8, left panels. Cells were double-labeled with polyclonal rabbit anti-Sp100 (green) and the anti-SUMO-1 mAb (red). Again, Sp100(447E) is expressed predominantly in the cytoplasm with some minor amount still localizing in the nucleus where it is targeted to NDs (Fig. 8, panel Sp100(447E)). The same cells, when stained with the anti-SUMO mAb, showed predominantly diffuse nuclear distribution with some additional dot-like staining (Fig. 8, SUMO-1). This pattern was typical for transient expression of SUMO-1 in all cell lines tested so far (data not shown). An overlay of both pictures clearly demonstrates that the dot-like structures in the nucleus show almost perfect co-localization. The large amount of cytoplasmic Sp100, in contrast, is not stained by the anti-SUMO mAb, indicating that cytoplasmic Sp100 is not modified. The co-immunostaining of nuclear Sp100 with SUMO-1 is in agreement with the interpretation of the immunoblot data, indicating that the minor amount of Sp100(447E) localized in the nucleus becomes efficiently modified by SUMO-1.
For PML, it has been reported that mutagenesis of SUMO modification sites resulted in poor nuclear dot formation with diffuse distribution of the transiently expressed protein. In order to analyze whether SUMOylation of Sp100 has a similar role for ND targeting, we have performed immunofluorescence analyses using Rat-1 cells transiently transfected with plasmids pSG5-Sp100(297R), pSG5-Sp100(298R), and pSG5-Sp100(300R). As evident from Fig. 8, right panels, the Sp100(297R) mutant exhibited a nuclear speckled distribution indistinguishable from the mutants Sp100(298R) (not shown), Sp100(300R), and wild-type Sp100 (Fig. 8, Sp100(297R), Sp100(300R), and Sp100, respectively). Therefore, SUMOylation of Sp100 is not necessary for ND targeting and accumulation of Sp100 at NDs. In summary, these data provide strong evidence that Sp100 becomes SUMOylated at lysine residue 297 and, furthermore, indicate that SUMOylation of Sp100 depends on a functional nuclear import signal but is not required for ND targeting.
FIG. 8. Left panel, co-expression of Sp100(447E) and SUMO-1 in Rat-1 cells. Cells were transiently transfected with pSG5-Sp100(447E) and pSG5-SUMO followed by double immunostaining using polyclonal rat anti-Sp100 (green fluorescence) and a monoclonal anti-SUMO Ab (red fluorescence). For better co-localization, the images were electronically merged (merge). Colocalization of the signals is observed only in the nucleus. Right panel, expression of Sp100 mutants in Rat-1 cells. Rat-1 cells were transiently transfected with plasmids pSG5-Sp100, pSG5-Sp100(297R), or pSG5-Sp100(300R), followed by immunostaining using polyclonal rabbit anti-Sp100 (red fluorescence). The intracellular distribution pattern of all three Sp100 proteins is practically indistinguishable.

Using our data on the specific modification site of Sp100, we compared the lysine motifs of all SUMOylation sites known so far to find similarities that would allow us to predict a consensus sequence for SUMOylation motifs. This comparison shows that the SUMOylation sites of Sp100 (this report), IκBα (51), RanGAP (49,50), and two of the three lysines described for PML (54) fit very well into a consensus sequence for SUMOylation (Fig. 9A), similar to the prediction made by Desterro et al. (51) from a comparison of the SUMOylation sites in the human RanGAP-1 protein and IκBα. The core motif of this consensus sequence is (I/L)KXE, but a contribution of additional amino acids, such as (R/H) at position −4 to −5 or (K/R) at position +1 to +2, is certainly conceivable. In Fig. 9B, we have used this consensus sequence to predict additional SUMOylation sites in other ND proteins or proteins related to NDs. A potential SUMOylation site is present in the non-amplified murine homologue of Sp100 (63), in the LYSP100/Sp140 protein (20,38), as well as in a human nuclear phosphoprotein (HNPP) (67), which contains a protein motif also present in Sp100 splice variants (34). Finally, in an Sp100 splice variant (40,41), there is a second bona fide SUMOylation site in the HMG domain of the protein, which is therefore also present in HMG-1. This consensus sequence is a useful basis for studies on SUMOylation of additional proteins.
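As an illustration of how the core motif (I/L)KXE could be applied to other proteins, the following minimal Python sketch scans an amino acid sequence for the motif and reports the position of the candidate lysine. The function name, the `offset` parameter, and the example sequence are hypothetical and are not taken from this study. The sketch checks only the core motif and ignores the possible contributions of residues at positions −4 to −5 and +1 to +2.

```python
import re

# Core consensus from the text: (I/L)KXE, where X is any residue.
# The lookahead makes the scan report overlapping matches as well.
CORE_MOTIF = re.compile(r"(?=([IL]K[A-Z]E))")


def find_sumo_core_sites(sequence, offset=1):
    """Return (lysine_position, motif) for every core-motif match.

    `offset` is the residue number of the first character of `sequence`
    (hypothetical convenience parameter for scanning protein fragments).
    """
    sequence = sequence.upper()
    hits = []
    for match in CORE_MOTIF.finditer(sequence):
        # The lysine is the second residue of the four-residue motif.
        lysine_position = match.start() + 1 + offset
        hits.append((lysine_position, match.group(1)))
    return hits


if __name__ == "__main__":
    # Made-up toy sequence, not a real protein.
    for position, motif in find_sumo_core_sites("MAGIKKEPLLKSEQ"):
        print(f"candidate lysine at position {position} in motif {motif}")
```

A match indicates only agreement with the core motif; whether the lysine is actually modified has to be determined experimentally, as done here for lysine 297 of Sp100.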
A Three-dimensional Structure Modeling for Sp100 and Localization of the Functional Domains-Functional domains of proteins that are involved in protein-protein interaction, post-translational modification, and subcellular localization should reside on readily accessible regions on the surface of the protein molecule. In order to evaluate whether the experimentally defined functional domains of Sp100 fulfill this criterion, we generated a model of the Sp100 protein using the SYBYL® molecular modeling software (for parameters see "Experimental Procedures"). The result of this molecular modeling is depicted in Fig. 10. Three spatial views of the molecule are given at the top of Fig. 10. Functional domains and residues are colored as indicated. In this model, the Sp100 molecule has an asymmetric shape consisting of two subdomains. One domain comprises the lower protruding part, which also contains the dimerization and ND-targeting domain (HSR domain, colored in cyan). Within the HSR domain, there is an exposed α-helix (amino acids 40-56), which could be speculated to mediate protein-protein interaction (Figs. 1 and 10, Helix 1). However, this predicted helix is also present in the Sp100-(1-75) construct, which did not show any dimerization in our experiments. Therefore, it cannot be sufficient for dimerization of Sp100. The upper part of Sp100 consists of a large, wing-shaped domain, which contains the Sp100 NLS exposed on a flexible arm (colored in green) and is therefore predicted to be readily accessible to the nuclear import machinery. A subdomain of the upper Sp100 domain with a high helical content is in part equivalent to the region of Sp100 with transactivating properties. Lysine 297 is likewise located at the surface of the molecule (depicted in red) and is therefore accessible to the modifying enzymes. We have attached the model for ubiquitin to lysine 297 as an illustration of the relative size of the Sp100 protein and the SUMO-1 molecule. Despite its limited sequence homology, the structure of SUMO-1, as determined by NMR, has been reported to resemble that of ubiquitin (68). Taken together, the experimentally defined functional regions of Sp100 and the computer-generated structure prediction are in excellent agreement and together can be used for the construction of specific mutants for the elucidation of the cellular function of Sp100.

DISCUSSION

In our study, we have performed a detailed characterization of structural and functional sequence features of the Sp100 protein regarding subcellular localization, dimerization, and covalent modification by SUMO-1. We have mapped the Sp100 NLS, as well as an ND-targeting and dimerization sequence which defines a new protein motif, termed HSR domain, also present in other proteins. Furthermore, we have identified the lysine residue necessary for modification of Sp100 by SUMO-1 and demonstrated that a functional NLS is essential for modification of Sp100 by SUMO-1. We have shown that SUMOylation of Sp100 is not essential for ND targeting of Sp100 and, vice versa, that ND targeting is not necessary for SUMOylation. Finally, we propose a consensus sequence for SUMOylation of proteins based on the comparison of the known SUMOylation motifs on other proteins. A three-dimensional structural prediction of the Sp100 protein was generated by molecular modeling. Consistent with our mapping data, all functional sites of the Sp100 protein were found in surface-exposed regions in this model.
So far, separation of the ND-targeting and dimerization motif was not possible, because the shortest Sp100 fragment, Sp100-(1-75), could not be detected by immunofluorescence staining. Therefore, we cannot exclude that the ND targeting observed for the transiently expressed Sp100 polypeptides may be in part or even in toto due to dimerization of the protein with endogenous Sp100 located at NDs. However, since our data provide indirect evidence that Sp100 can dimerize, but probably not multimerize, the ND targeting has to be, at least in part, due to a targeting toward the Sp100 docking site at NDs, independently of homodimerization. Another possibility is that the Sp100 multimerization capability is restricted to the fraction of Sp100 localized at NDs, possibly due to a changed conformation when compared with Sp100 in solution, as it is present in the cytoplasm of cells (42) and in the cellular extracts used for our immunoprecipitation experiments. The fact that Sp100 with a mutated NLS exhibits a diffuse cytoplasmic distribution also argues against multimer formation of Sp100 alone. Accordingly, using Sp100 from solubilized cellular protein extracts, only the dimeric form would be observed. However, a definite answer as to whether Sp100 is capable of multimer formation requires additional experiments, such as sucrose gradient analysis. Analysis of further Sp100 protein subfragments is required to separate homomeric interaction from ND targeting and thereby to narrow down the ND-targeting signal of Sp100. As this signal would be expected to interact with novel ND components, the identification of the necessary sequences is of major interest for the isolation of these components. One has to keep in mind, however, that homodimerization of the corresponding Sp100 domain may well be required in order to generate a functional ND-targeting signal.
An interesting finding in these studies was the discovery of a novel protein motif, the HSR domain, within this ND-targeting/dimerization domain. It will be interesting to investigate whether the corresponding domains in the other members of the HSR domain family proteins have the same function. In fact, at least the LYSP100/Sp140 protein shows partial (38) or even complete (20) co-localization with NDs in lymphocytes, and also for the murine Sp100-rs protein an ND localization is likely (62). It could also be hypothesized that the AIRE protein and the AA431918 protein are novel ND proteins. Remarkably, three members of the HSR family share more extensive sequence similarity. The Sp100 splice variant Sp100-B (SpAlt-212), the LYSP100/Sp140 protein, and the AIRE protein share the N-terminal HSR domain and an additional protein motif toward the C terminus called the HNPP box (34,41), also recently termed the SAND domain (69). Recent data from our laboratory argue that the HNPP box/SAND domain might direct proteins to nuclear domains different from NDs and therefore represent a targeting signal antagonistic to the HSR domain on the same molecule (41). One could speculate that the genes encoding these proteins represent a family derived from a common ancestor gene and might fulfill similar functions in the cell. The AIRE protein is of special interest in this regard, as it is mutated in a monogenic hereditary autoimmune disease and appears to play a major role in control of the immune response (70).
Another interesting observation in our studies is the redistribution of NDs caused by expression of the C-terminal Sp100 fragments Sp100-(208-480), Sp100-(326-480), and Sp100-(208-334). As these Sp100 fragments are diffusely distributed in the nucleoplasm, redistribution of NDs is most likely due to an indirect effect, such as competition of the overexpressed Sp100 fragment for cellular factors. As this redistribution was also observed when an Sp100-(208-480) K297R mutant was expressed, it is most likely not due to competition for cellular SUMO-1 (data not shown). However, it is also conceivable that the Sp100 fragments compete for the corresponding SUMO-modifying enzyme, thereby leading to hypomodification of ND-located cellular Sp100 and/or PML, which may result in redistribution of NDs. Alternatively, competition for other cellular factors, such as kinases, could play a role. In fact, the region present on all three Sp100 constructs tested (aa 326-334) contains a consensus casein kinase 2 phosphorylation site.
FIG. 10. Three-dimensional structural prediction of the Sp100 protein as obtained by molecular modeling. At the top of the panel, three spatial views of the model are given. At the bottom of the panel, to demonstrate the relative sizes of the proteins, a model of ubiquitin, which is reported to be almost identical in structure to SUMO-1 (68), was graphically attached to Lys-297 of the Sp100 protein (depicted in red as space-fill representation). Experimentally defined regions are colored as indicated. Note the helical stretch (Helix 1) at the outside of the protein.

SUMO modification of proteins has recently been shown to be a novel cellular mechanism of general importance for a multitude of biological processes (reviewed in Ref. 66). Therefore, the identification and mutational analysis of the Sp100 SUMOylation site as shown here is of general relevance. The strict dependence of SUMO modification of Sp100 on nuclear targeting raises the question whether SUMO modification is restricted to the nuclear compartment in general. In fact, most of the proteins for which covalent SUMOylation has been demonstrated so far, i.e. Sp100, PML, and IκBα, can be found, at least in part, inside the nucleus. For IκBα, translocation of the newly synthesized protein into the nucleus was demonstrated (71). For RanGAP-1, staining at the mitotic spindle was reported (47). Taking this into account, one could hypothesize that SUMOylation of proteins could involve nuclear import as an obligatory step. Whether the modification reaction itself occurs in the nucleoplasm, during nuclear import, or after docking at the cytoplasmic face of the nuclear pore remains to be shown. Since in our experiments the Sp100-(208-480) polypeptide with a diffuse nuclear localization still becomes SUMOylated, ND targeting is most likely not a prerequisite for SUMOylation.
Another open question is whether the proteins described in previous work to interact with SUMO-1, such as the tumor necrosis factor receptor, Fas, Rad51, and Rad52, are indeed modified by SUMO-1 or only interact with the SUMO-1 modification of another protein, which in turn could be modified in the nucleus. If so, SUMO-1 modification could very well be part of a signaling cascade.
In our study, we found no effect of removal of SUMO modification of Sp100 by mutagenesis on nuclear import or ND targeting of the protein. In this regard, SUMO modification of Sp100 obviously differs from that of the PML protein, which was reported to be severely compromised in ND targeting when SUMO modification was abolished by mutagenesis (54). However, since one of the known functions of SUMO modification is protein targeting, as shown for the RanGAP1 protein, which is bound to the nuclear pore when modified by SUMO-1 (49,50), one may hypothesize that SUMOylation regulates protein-protein interaction also in the case of Sp100. A good candidate for such a SUMO-regulated association is the HP1 protein, which has been reported to interact directly with Sp100 (22,40). In these experiments, the interaction domain was mapped to amino acids 287-333. Given the relative size of the SUMO-1 modification, which is similar to that of ubiquitin (shown in Fig. 10), one would expect that SUMO modification of Sp100 at amino acid position 297 will affect HP1 binding. It could either enhance binding, as in the case of RanGAP1, or prevent HP1 association through steric hindrance. Co-precipitation experiments to elucidate a possible role of SUMO-1 modification in regulation of HP1 binding to Sp100 are currently underway. On the other hand, SUMOylation was reported to act as a stabilizing factor by counteracting ubiquitin-mediated degradation (51,72). However, we currently have no evidence that Sp100 lacking the SUMOylation site is less stable than SUMO-modified Sp100 when transiently expressed. We also do not know whether the unmodified form of Sp100 is more rapidly degraded than the modified form. Another hypothesis is that SUMO modification plays a role in certain cell cycle stages, as we have provided evidence that during mitosis PML becomes demodified and, by this mechanism, might regulate ND disassembly, which occurs in mitosis (42). Likewise, SUMOylation of Sp100 might regulate a cell cycle-specific function of Sp100.
Collectively, our data provide important information and new tools for the analysis of the function of Sp100 and NDs. Mutants consisting only of the HSR domain may act in a dominant negative manner, as they are capable of dimerization with Sp100 but lack the entire C terminus including the SUMOylation site and the HP1 binding domain. Mutants lacking the N-terminal part of the protein, which induced changes in ND distribution when expressed transiently, likewise might act in a transdominant fashion. Moreover, expression of Sp100 without the SUMOylation site in a stable system should give new clues to the role of this modification during the cell cycle, differentiation, or virus infection.