Complex nuclear localization signals in the matrix protein of vesicular stomatitis virus.

The matrix (M) protein of vesicular stomatitis virus (VSV) functions from within the nucleus to inhibit bi-directional nucleocytoplasmic transport. Here, we show that M protein can be imported into the nucleus by an active transport mechanism, even though it is small enough (approximately 27 kDa) to diffuse through nuclear pore complexes. We map two distinct nuclear localization signal (NLS)-containing regions of M protein, each of which is capable of directing the nuclear localization of a heterologous protein. One of these regions, comprising amino acids 47-229, is also sufficient to inhibit nucleocytoplasmic transport. Two amino acids that are conserved among the matrix proteins of vesiculoviruses are important for nuclear localization, but are not essential for the inhibitory activity of M protein. Thus, different regions of M protein function for nuclear localization and for inhibitory activity.

In eukaryotic cells, molecular transport between the nucleus and cytoplasm occurs through nuclear pore complexes (NPCs), large proteinaceous structures that span the nuclear envelope (for reviews, see Refs. 1 and 2). Small molecules can diffuse through NPCs, but larger molecules must be actively transported. Active transport through NPCs is mediated by receptor proteins (also known as importins, exportins and transportins, or karyopherins), which interact with localization signals on cargo molecules, RanGTP, and proteins of the NPC (nucleoporins or Nups) (for reviews, see Refs. 3-5). Nuclear localization signals (NLSs) are amino acid sequences that promote the active nuclear import of proteins, even when these proteins are small enough to diffuse through the NPC (for example, see Refs. 6 and 7). A wide variety of sequences have been identified that can function as NLSs, the best characterized of which are the highly basic mono- and bi-partite NLSs of SV40 T antigen and nucleoplasmin, respectively (for review, see Ref. 8). In addition, many other NLSs have been identified that differ from these sequences with respect to size and/or highly basic character (for example, see Refs. 9-12). Thus, it is difficult to predict the identity of an NLS without empirical evidence.
Previously, we and others have shown that the matrix (M) protein of vesicular stomatitis virus (VSV) inhibits active nucleocytoplasmic transport (13-15). In the absence of other viral components, M protein inhibits nuclear export of mRNAs, snRNAs, and rRNAs, but not tRNAs, and slows nuclear import of proteins containing highly basic mono- and bi-partite NLSs (13,14). The M proteins from two other vesiculoviruses, chandipura virus and spring viremia carp virus, also have inhibitory activity (16).
Earlier work from our laboratory demonstrated that M protein must be present in the nucleus to inhibit nucleocytoplasmic transport and that this inhibitory activity correlates with the ability of M protein to associate with NPCs (14). These observations are consistent with the findings that M protein associates with the intranuclear nucleoporin Nup98, and that this association is important for the inhibitory activity of M protein (15,17). The mechanism by which M protein gains access to Nup98 or other potential nuclear targets remains to be determined. Even though the size of M protein (~27 kDa) is below the diffusion limit of the NPC (4), nuclear entry might occur by active import.
Here, we show that M protein can localize to the nucleus by an active import mechanism. We identify two regions of M protein that are each sufficient to direct the nuclear localization of a heterologous protein. These regions share a common sequence of 10 amino acids. We show that the region spanning amino acids 47-229 contains an NLS and is sufficient for the inhibitory activity of M protein. Finally, we identify two amino acids within M-(47-229) that are important for nuclear localization, but are not necessary for inhibitory activity. Thus, the interactions between M protein and cellular protein(s) that occur during nuclear localization are likely to be different from interactions involved in the inhibitory activity of M protein.

EXPERIMENTAL PROCEDURES
Construction of GFP3-M Protein DNA Plasmids-The pEGFP-C3 vector encoding three tandem copies of GFP (pEGFP3-C3) was kindly provided by Y. Lazebnik (Cold Spring Harbor Laboratory). The reading frame within the multiple cloning site of pEGFP3-C3 was shifted by inserting into the SacII site a double-stranded DNA fragment made from the following complementary oligonucleotides: 5'-GGGCTGCAGAGATCTCCGC-3' and 5'-GGAGATCTCTGCAGCCCGC-3'. Oligonucleotides were gel-purified, phosphorylated using T4 DNA kinase (Promega), annealed, and ligated into pEGFP3-C3 vector that had been digested with SacII. Correct orientation of the insert was confirmed by DNA sequencing. The resulting plasmid, pEGFP3-C1, was used as the vector for all constructs encoding GFP3-M fusion proteins.
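The way the two insert oligonucleotides anneal can be checked with a short script. The following is a minimal sketch, not part of the original methods: the function names are illustrative, and it assumes that the duplex forms by pairing the 5' portions of the two strands so that each strand keeps an unpaired 3' tail.

```python
# Sketch: how do the two insert oligonucleotides anneal?
# Function names are hypothetical; only the two sequences come from the text.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

OLIGO_TOP = "GGGCTGCAGAGATCTCCGC"
OLIGO_BOTTOM = "GGAGATCTCTGCAGCCCGC"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def three_prime_overhang(strand: str, partner: str) -> str:
    """Return the unpaired 3' tail of `strand` when annealed to `partner`.

    Assumption: the longest 5' prefix of `strand` whose reverse complement
    equals a 5' prefix of `partner` forms the duplex; the rest is overhang.
    """
    for tail in range(len(strand)):
        paired = strand[: len(strand) - tail]
        if reverse_complement(paired) == partner[: len(paired)]:
            return strand[len(paired):]
    raise ValueError("strands do not anneal under the stated assumption")


top_overhang = three_prime_overhang(OLIGO_TOP, OLIGO_BOTTOM)
bottom_overhang = three_prime_overhang(OLIGO_BOTTOM, OLIGO_TOP)
duplex_length = len(OLIGO_TOP) - len(top_overhang)

# A single copy of the 19-nt insert changes the downstream frame by 19 mod 3.
frame_shift = len(OLIGO_TOP) % 3

print("3' overhangs:", top_overhang, bottom_overhang)
print("paired region (bp):", duplex_length)
print("frame shift (nt):", frame_shift)
```

Under these assumptions the two strands pair over 17 bp and each carries a two-nucleotide 3' GC tail, which matches the 3' overhang expected at a SacII-cut site; one inserted copy adds 19 nucleotides and therefore moves the downstream reading frame by one position.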
To make pEGFP3-M-(1-229), a DNA fragment encoding M protein was released from pEGFP-C1-OM (16) by BamHI digestion. This fragment was ligated into pEGFP3-C1 that had also been digested with BamHI. All truncations of M protein for ligation into pEGFP3-C1 were made by PCR using pGEX-2T-OM (14) as template (see Table I for oligonucleotides). PCR products were digested with BamHI and ligated into pEGFP3-C1 that had also been digested with BamHI. Correct orientation and sequence of all clones were confirmed by DNA sequencing.
Construction of GST-HA-M Protein DNA Plasmids-To make a vector encoding GST with an HA epitope tag fused to the C terminus, PCR was done using the vector pGEX-2T (Amersham Biosciences) as template, and the following primers (5' and 3', respectively): 5'-GTCTATGGCCATCATACGTTA-3' and 5'-CGGGATCCAAGAGCGTAATCTGGAACATCGTATGGGTAACGCGGAACCAGATCCG-3'. The resulting PCR product, encoding a carboxyl-terminal portion of GST with the HA epitope tag, was digested with BalI and BamHI. The pGEX-2T vector was also digested with BalI and BamHI. The gel-purified vector fragment of pGEX-2T was ligated to the PCR product to generate the vector pGEX-2T-HA. The presence and orientation of the insert were confirmed by DNA sequencing.
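As a sanity check on the 3' primer, its reverse complement can be translated to see whether it encodes the HA tag. This is a minimal sketch under stated assumptions: the epitope sequence YPYDVPDYA is the commonly used HA tag and is not given in the text, the standard genetic code is assumed, and all function names are illustrative.

```python
# Sketch: does the 3' primer encode an HA epitope on the coding strand?
# Assumptions (not from the text): HA epitope = YPYDVPDYA, standard genetic code.

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

PRIMER_3 = "CGGGATCCAAGAGCGTAATCTGGAACATCGTATGGGTAACGCGGAACCAGATCCG"
HA_EPITOPE = "YPYDVPDYA"  # assumed sequence of the HA tag


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str, frame: int) -> str:
    """Translate complete codons of `seq` starting at offset `frame` (0, 1, 2)."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


def find_epitope_frame(coding: str, epitope: str):
    """Return the first frame (0, 1, 2) whose translation contains `epitope`."""
    for frame in range(3):
        if epitope in translate(coding, frame):
            return frame
    return None


coding_strand = reverse_complement(PRIMER_3)
frame = find_epitope_frame(coding_strand, HA_EPITOPE)
print("frame:", frame)
if frame is not None:
    print("peptide:", translate(coding_strand, frame))
```

In the frame that contains the epitope, the translated peptide places the HA sequence between residues encoded by the GST end of the primer and the residues encoded at the BamHI site (GGATCC), which is consistent with a tag fused to the C terminus of GST.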
To make a plasmid encoding GST-HA-M-(1-229), a DNA fragment encoding the M protein was released from pEGFP-C1-OM (16) by digestion with BamHI. This fragment was ligated into the vector pGEX-2T-HA that had also been digested with BamHI. A construct encoding GST-HA-M-(47-229) was made by ligating the BamHI-digested PCR product coding for M-(47-229) (see above) into BamHI-digested pGEX-2T-HA vector. DNA sequencing was done to confirm the orientation and sequence of inserts. Activity assays (described below) confirmed that the presence of the HA epitope tag had no detectable effect on the inhibitory activity of the M protein.
GST-HA-M Protein Purification-For production of recombinant proteins, all plasmids were transformed into E. coli BL21 cells. Cells were grown overnight at 37°C in LB medium containing ampicillin (50 µg/ml). Overnight cultures were used to inoculate fresh LB-amp to an OD600 of 0.04. Cultures were grown at room temperature to an OD600 of ~0.6 and then induced for 8 h with 1 mM isopropyl-1-thio-β-D-galactopyranoside. Cells were harvested and protein was affinity-purified as previously described (14).
Analysis of RNA Export in Xenopus laevis Oocytes-Preparation and injection of stage VI X. laevis oocytes were as described (14). Purified GST-HA-M proteins (~100 µg/ml) were injected into the nucleus (12 nl) or into the cytoplasm (24 nl) 1 h prior to injection of RNA export substrates. A mixture of in vitro-synthesized (18) 32P-labeled RNAs, which contained ~5 fmol of each species of RNA, was injected into the oocyte nuclei (12 nl). To control for the accuracy of injection and dissection, all injected samples included blue dextran, and the RNA mixture contained U3 snoRNA, which is not exported from the nucleus (19). At indicated time points, oocytes were manually dissected into cytoplasmic and nuclear fractions. Total RNAs were isolated from each fraction and analyzed by denaturing PAGE and autoradiography as previously described (20).
Transfections-One day prior to performing transient transfections, a 6-well tissue culture plate containing coverslips was seeded with 4 × 10^5 HeLa cells per well. Transfections were done according to the Invitrogen protocol, using 1 µg of DNA and 8 µl of LipofectAMINE™ reagent (Invitrogen Life Technologies).
Fluorescence Microscopy-Cells were processed for fluorescence microscopy 24 h after transfection by fixation with 3% paraformaldehyde in phosphate-buffered saline for 20 min. To assay for NPC association of GFP3-M fusion proteins (data not shown, but see "Discussion"), cells were extracted first with 0.5% Triton X-100 for 3 min and then fixed with paraformaldehyde for 20 min (14). Fluorescent proteins were visualized using the ×100 objective of an Axioplan 2 fluorescence microscope (Zeiss).
To score the nuclear localization of GFP3-M fusion proteins, levels of fluorescence were quantified using Labworks Imaging Software (UVP, Inc). The ratio of average fluorescence in the nucleus to average fluorescence in the cytoplasm over a defined region of three representative cells was calculated and averaged for each protein. Values (N_avg/C_avg)_avg < 1 were scored as (-) for nuclear localization. Values 1 < (N_avg/C_avg)_avg ≤ 1.2 and values (N_avg/C_avg)_avg > 1.2 were scored as (+) and (++), respectively.
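The scoring rule can be written out as a short function. This is a minimal sketch rather than the software actually used: the function name and the example intensities are hypothetical, and because the text does not say how a ratio of exactly 1 was scored, the sketch assigns it to (-) as an assumption.

```python
# Sketch of the nuclear localization scoring rule described above.
# Function name and example intensities are hypothetical.
from statistics import mean


def localization_score(cells):
    """Score nuclear localization from per-cell average intensities.

    `cells` is a sequence of (nuclear_avg, cytoplasmic_avg) pairs, one per
    representative cell. The per-cell N/C ratios are averaged and binned.
    """
    ratio = mean(nuclear / cytoplasmic for nuclear, cytoplasmic in cells)
    if ratio > 1.2:
        return "++"
    if ratio > 1.0:
        return "+"
    # Ratios below 1 are (-) per the text; a ratio of exactly 1 is not
    # specified in the text and is treated as (-) here by assumption.
    return "-"


# Hypothetical measurements for three cells per construct.
strong = [(120.0, 100.0), (130.0, 100.0), (125.0, 100.0)]
weak = [(110.0, 100.0), (105.0, 100.0), (115.0, 100.0)]
absent = [(80.0, 100.0), (90.0, 100.0), (85.0, 100.0)]

for label, cells in (("strong", strong), ("weak", weak), ("absent", absent)):
    print(label, localization_score(cells))
```

Averaging the per-cell ratios, rather than taking the ratio of pooled intensities, follows the wording of the procedure and keeps one bright cell from dominating the score.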

RESULTS
M Protein Has an NLS within Amino Acids 47-229-Since nuclear localization is essential for the inhibitory activity of VSV M protein, it is important to understand how this protein enters the nucleus. To analyze the localization properties of M protein, we generated a fusion protein that contains M protein and three tandem copies of GFP (GFP3) and thus is much larger (~108 kDa) than the size limit for diffusion (~60 kDa) through the NPC (4). Fusion proteins were expressed in HeLa cells by transient transfection, and protein localization was visualized in fixed cells by fluorescence microscopy.
The expressed GFP3 protein was stable in HeLa cells, as indicated by the predominance of full-length protein (86% of total protein detected) in cell extracts analyzed by Western blotting (Fig. 1B, lane a). Localization of GFP3 was almost exclusively cytoplasmic (Fig. 1A, panel a). In contrast, the fusion protein containing M protein and GFP3 (GFP3-M-(1-229)), which was also stable when expressed in HeLa cells (Fig. 1B, lane b), accumulated strongly in cell nuclei (Fig. 1A, panel b). In addition, GFP3-M-(1-229) was visible at the nuclear rim, consistent with previous reports (14,15). The ability of M protein to direct import of a cytoplasmic protein into the nucleus demonstrates that M protein contains at least one NLS that is capable of mediating active transport.
To identify sequence(s) within M protein that function as an NLS, we examined the localization of GFP3-M fusion proteins containing truncated versions of M protein (diagrammed in Fig. 1C). An amino-terminal truncation was made to generate GFP3-M-(47-229), based on previous reports of a stable carboxyl-terminal fragment of M protein produced by trypsin digestion (21,22). Expressed GFP3-M-(47-229) was stable (Fig. 1B, lane c), and this protein accumulated in the nucleus (Fig. 1A, panel c), demonstrating that amino acids 47-229 of M protein are sufficient for nuclear localization.
The NLS within M-(47-229) was defined further by making truncations based on a computer-generated prediction of the secondary structure of M protein (16). Sequence was deleted from either the amino- or carboxyl-terminal ends of M-(47-229) [...] (Fig. 2A, numbered residues in boldface), reasoning that amino acids important for protein function are likely to be conserved. Single alanine substitutions were also made at the positions of three carboxyl-terminal residues that are not identically conserved (Fig. 2A, residues denoted by asterisks), but which were previously implicated as being important for the inhibition of cellular gene expression in VSV-infected cells (23). For each mutant protein, levels of nuclear and cytoplasmic fluorescence were quantified and scored (Table II). Representative cells that were scored as (-) (Fig. 2B, panels a and b), (+) (panel c), and (++) (panel d) for nuclear localization are shown. The stabilities of all mutant proteins were confirmed by Western blotting (data not shown).
FIG. 1 (legend fragment). [...] panel b), probably due to inhibition of nucleocytoplasmic transport (13) and/or transcription (32,33). C, schematic diagram of full-length and truncated GFP3-M fusion proteins. Dark boxes represent the M protein sequences and hatched boxes represent the three tandem GFP sequences. The GFP3 region (~90 kDa) is not drawn to scale. The nuclear localization of each protein was scored as described under "Experimental Procedures" and in the legend for Table II, and scores ((++) or (-)) are shown on the right.
Of the 22 single alanine substitutions made in GFP3-M-(47-229), 20 had little or no effect on nuclear localization (Table II). However, when alanine substitutions were made at positions 91 (W91A) or 105 (Y105A) in GFP3-M-(47-229), the resulting mutant proteins accumulated in the cytoplasm (Fig. 2B, panels a and b), suggesting that these residues are important for the function of NLS-C. We asked if phosphorylation of Tyr-105 contributes to the function of NLS-C, since it had previously been shown that M protein can be phosphorylated at several Ser, Thr, and Tyr residues (24,25). When Tyr-105 was replaced by glutamic acid, to introduce a negative charge (mimicking constitutive phosphorylation), the resulting protein (Y105E) accumulated in the cytoplasm like Y105A (Fig. 2C, panel a, compared with Fig. 2B, panel b); conversely, replacement of Tyr-105 with Phe, which cannot be phosphorylated, resulted in a protein (Y105F) that accumulated in the nucleus (Fig. 2C, panel b) like wild-type GFP3-M-(47-229) (Fig. 1A, panel c). These results suggest that phosphorylation of Tyr-105 is not important for the nuclear localization of M-(47-229), but that the presence of an aromatic residue at position 105 is important.
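The reasoning about position 105 can be laid out as a small tabulation. This is a minimal sketch: the three outcomes are those reported in the paragraph above, while the residue classes (Phe, Trp, and Tyr as aromatic; Ser, Thr, and Tyr as phosphorylatable) and all names in the code are simplifying assumptions made for illustration.

```python
# Sketch: which residue property at position 105 tracks nuclear localization?
# Residue classes and names are assumptions; outcomes are from the text.
import re

AROMATIC = set("FWY")          # assumption: His is not counted as aromatic
PHOSPHORYLATABLE = set("STY")  # residues with a hydroxyl that can be modified

# Localization of GFP3-M-(47-229) variants as reported in the text.
OBSERVED = {
    "Y105A": "cytoplasm",
    "Y105E": "cytoplasm",
    "Y105F": "nucleus",
}


def parse_substitution(name: str):
    """Split a name such as 'Y105A' into (wild type, position, replacement)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", name)
    if match is None:
        raise ValueError(f"not a single-residue substitution: {name!r}")
    wild_type, position, replacement = match.groups()
    return wild_type, int(position), replacement


def summarize(observed):
    """Annotate each substitution with the properties of the new residue."""
    rows = []
    for name, location in observed.items():
        _, _, replacement = parse_substitution(name)
        rows.append({
            "name": name,
            "nuclear": location == "nucleus",
            "aromatic": replacement in AROMATIC,
            "phosphorylatable": replacement in PHOSPHORYLATABLE,
        })
    return rows


rows = summarize(OBSERVED)
aromatic_tracks = all(r["aromatic"] == r["nuclear"] for r in rows)
phospho_tracks = all(r["phosphorylatable"] == r["nuclear"] for r in rows)

print("aromaticity tracks localization:", aromatic_tracks)
print("phosphorylatability tracks localization:", phospho_tracks)
```

With only three variants the tabulation is illustrative rather than statistical, but it makes the logic explicit: Phe restores nuclear accumulation without being phosphorylatable, so aromatic character, not phosphorylation, is the property that matches the observed outcomes.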
Amino Acids 47-229 Are Sufficient for the Inhibitory Activity of M Protein-Previous work by us (14,16) and others (15,17) established a correlation between the abilities of M protein to associate with NPCs and to inhibit nucleocytoplasmic transport. Since GFP3-M-(47-229) was visible at the nuclear rim as well as within the nucleoplasm (Fig. 1A, panel c), we asked if M-(47-229) was active as an inhibitor of nucleocytoplasmic transport. The inhibitory activity of M-(47-229) was assayed by testing the ability of a chimeric protein (GST-HA-M-(47-229)) to inhibit RNA export in X. laevis oocytes (14). The GST-HA-M-(47-229) protein inhibited export of snRNA and mRNA when injected into oocyte nuclei (Fig. 3, compare [...]
Thus, the effect of the W91A mutation on the function of NLS-C is suppressed in the context of full-length M protein, suggesting that amino acids 1-46 could comprise an alternative NLS.
M Protein Has a Second, Novel NLS within Amino Acids 23-57-To determine if the amino-terminal region of M protein functions autonomously as an NLS, we examined the localization of several GFP3-M fusion proteins (diagrammed in Fig. 5A), all of which were stable when expressed in HeLa cells (data not shown). Although some GFP3-M-(1-47) protein was observed in nuclei, it did not accumulate there (Fig. 5B, panel a). However, the slightly larger fusion protein, GFP3-M-(1-57), did accumulate in the nucleus (Fig. 5B, panel b), indicating that a second NLS is contained within amino acids 1-57 of the M protein.
In the context of GFP3-M-(23-57), single alanine substitutions (and one Ala to Leu substitution) were made at positions that are conserved among the vesiculoviral M proteins (Fig. 5C, numbered residues in boldface). Three of the six single amino acid substitutions reduced, but none abolished, nuclear accumulation of GFP3-M-(23-57) (Table III). Representative cells that were scored as (+) (Fig. 5C, panel a) and (++) (panel b) for nuclear localization are shown. The sequence in M-(23-57) displays no striking homology to previously reported NLSs, suggesting that NLS-N is a novel NLS.

DISCUSSION
We have shown that the M protein of VSV can be actively imported into the nucleus, since it can direct nuclear accumulation of a large cytoplasmic protein, GFP3, in transiently transfected HeLa cells. Whereas it has previously been shown that M protein distributes between the nucleus and the cytoplasm during VSV infection (26), the mechanism by which M protein enters the nucleus was unclear. This work demonstrates that, even though it is smaller than the diffusion limits of the NPC, M protein is able to exploit cellular mechanisms for active nuclear import. Perhaps active import allows for nuclear localization that is more rapid and efficient than nuclear localization via simple diffusion.
The region of M protein containing NLS-N was mapped to amino acids 23-57. Interestingly, the highly basic region (amino acids 1-22) of M protein, which resembles the classical NLSs of SV40 T antigen (PKKKRKVE) and nucleoplasmin (KRPAATKKAGQAKKKKLD), was not required for nuclear localization, nor did it function as an NLS when fused to GFP3 (data not shown). While M-(23-57) has no apparent similarity to published NLSs, it does contain two motifs, PPXY (amino acids 24-27 of M protein) and P(T/S)XP (amino acids 37-40 of M protein), that can each bind WW domain-containing proteins, including members of the Nedd4 family of E3 ubiquitin ligases (27-29). Since it has recently been shown that Nedd4 is a nucleocytoplasmic-shuttling protein (30), it is plausible that NLS-N could promote nuclear import via a piggyback mechanism, bound to a WW domain-containing protein, such as a Nedd4 family member. This model is consistent with our inability to abolish the function of NLS-N with a single alanine substitution, since each individual motif would be capable of independently binding a WW domain-containing protein.
A second NLS was mapped to amino acids 47-229. NLS-C appears to be rather complex, in that amino acids located at its amino- and carboxyl-terminal ends are necessary for nuclear import. These sequences could promote nuclear localization either directly, by interacting with one or more cellular proteins, or indirectly, by contributing to the proper folding of M protein. In addition to elements at the amino- and carboxyl-terminal ends of M-(47-229), conserved amino acids at positions 91 and 105 are also important for the function of NLS-C, since the amino acid substitutions W91A and Y105A abolished nuclear accumulation of GFP3-M-(47-229). Importantly, these substitutions did not destroy the ability of M protein to inhibit transport, indicating that there is no gross misfolding of the mutant proteins.
Some insight into the organization of NLS-C can be gained from the recently solved crystal structure of a fragment of M protein containing amino acids 48-229 (31). From this work, it is clear that both Trp-91 and Tyr-105 are buried residues that are not exposed at the surface of the protein, so they are likely to contribute to the maintenance of a specific structural motif important for the function of NLS-C. The contributions of the amino- and carboxyl-terminal regions of M-(47-229) to the organization of NLS-C are less obvious. While the amino-terminal region of the crystal structure (amino acids 48-58) is disordered, the carboxyl terminus is exposed and lies along the surface of the protein. It is unclear whether the impaired nuclear localization of GFP3-M-(47-194) (Fig. 1A, panel e) arises from structural changes induced by deletion of the carboxyl-terminal 35 residues or from the absence of one or more specific residues in this region that are recognized by a cellular protein required for nuclear localization. Further mutational analysis of this region, based on information from the crystal structure, may be helpful in distinguishing between these possibilities.
In addition to containing NLS-C, M-(47-229) is also sufficient for the inhibition of nucleocytoplasmic transport. Moreover, consistent with previous findings that transport inhibition activity correlates with NPC association (14,15,17), M-(47-229) is sufficient for association with NPCs (nuclear rim staining visible in Fig. 1A, panel c). Within M-(47-229), two amino acids, Trp-91 and Tyr-105, were shown to be important for nuclear localization (Fig. 2), but not for inhibitory activity (Fig. 3) or association with NPCs (data not shown). Conversely, the alanine substitution of Met-51, a residue previously shown to be essential for inhibitory activity (14,15), reduced, but did not abolish, nuclear localization in the context of either M-(47-229) or M-(23-57) (Tables II and III). Moreover, in the context of M-(1-229), the M51A substitution had no detectable effect on nuclear localization (data not shown), as was previously shown for an M51L substitution (16). Thus, Met-51 is an amino acid in M protein that is essential for inhibitory activity (14, 15) but is not required for nuclear localization. We conclude that distinct amino acids in M protein are required for nuclear localization and for inhibitory activity.
In the context of full-length M protein, it is unclear whether NLS-N and NLS-C can function independently or whether they work together. Both NLSs require a common region of M protein (M-(47-57)) for their function, but this region is not sufficient for nuclear localization, since neither GFP3-M-(32-57) (Fig. 5B, panel d) nor GFP3-M-(47-194) (Fig. 1A, panel e) accumulated in the nucleus. Therefore, additional sequences unique to each NLS are essential for function. Curiously, even though our data indicate that M-(23-57) contains an NLS that can function autonomously and that can overcome the deleterious effects of W91A on nuclear localization, we observed a lack of nuclear accumulation of GFP3-M-(1-229) (Y105A). Thus, in the context of full-length M protein, Tyr-105 could be important for both NLSs to work together efficiently.
Why might M protein contain multiple NLSs? The presence of multiple NLSs has been observed in several viral, as well as cellular, proteins (35-37). Perhaps more than one NLS is present in M protein to ensure efficient entry into the nucleus by active import. Different cellular proteins bound to both NLSs could work cooperatively during nuclear import of the full-length M protein. It will be interesting to learn which cellular proteins recognize the NLSs of VSV M protein to mediate its nuclear localization.
TABLE III (caption fragment). The effects of single amino acid substitutions on nuclear localization of GFP3-M-(23-57). Values of (+) and (++) are assigned as described under "Experimental Procedures" and in the legend for Table II. Nuclear localization of the wild-type version of GFP3-M-(23-57) was also quantified, and the score is shown in Fig. 5A. Column headings: Substitution; Nuclear localization.