An alternative domain containing a leucine-rich sequence regulates nuclear cytoplasmic localization of protein 4.1R.

In red blood cells, protein 4.1 (4.1R) is an 80-kDa protein that stabilizes the spectrin-actin network and anchors it to the plasma membrane. The picture is more complex in nucleated cells, in which many 4.1R isoforms, varying in size and intracellular location, have been identified. To contribute to the characterization of signals involved in differential intracellular localization of 4.1R, we have analyzed the role the exon 5-encoded sequence plays in 4.1R distribution. We show that exon 5 encodes a leucine-rich sequence that shares key features with nuclear export signals (NESs). This sequence adopts the topology employed for NESs of other proteins and conserves two hydrophobic residues that are shown to be critical for NES function. A 4.1R isoform expressing the leucine-rich sequence binds to the export receptor CRM1 in a RanGTP-dependent fashion, whereas this does not occur in a mutant whose two conserved hydrophobic residues are substituted. These two residues are also essential for 4.1R intracellular distribution, because the 4.1R protein containing the leucine-rich sequence localizes in the cytoplasm, whereas the mutant protein predominantly accumulates in the nucleus. We hypothesize that the leucine-rich sequence in 4.1R controls distribution and concomitantly function of a specific set of 4.1R isoforms.

Red blood cell protein 4.1, 4.1R, or 4.1R 80 was identified in human erythrocytes as an 80-kDa multifunctional protein that stabilizes the spectrin-actin network. Protein 4.1R anchors this network to the overlying lipid bilayer through interactions with cytoplasmic domains of transmembrane proteins (1). Formation of the spectrin-actin-4.1R ternary complex is essential for maintaining normal erythrocyte morphology and the mechanical strength of the membrane; alterations in the spectrin-actin-binding site of 4.1R, located in the C-terminal region of the molecule (2)(3)(4)(5), are associated with congenital hemolytic anemias (6).
In nucleated cells, multiple isoforms of 4.1R are expressed as a result of extensive alternative splicing of the 4.1R pre-mRNA (7)(8)(9)(10). This splicing is cell- and tissue-specific and also depends on the growth and differentiation stage of the cell (10-13). Immunological studies have detected 4.1R epitopes at different intracellular sites (14-20). Concomitantly, the association of 4.1R with proteins localized at different intracellular sites has been reported (21)(22)(23)(24)(25)(26)(27), suggesting that 4.1R may be involved in many processes in nucleated cells. A possible role for 4.1R in organizing the nuclear and microtubule architecture and the mitotic spindle poles has been suggested. 4.1R is known to interact with nuclear components of the splicing machinery (19,22); pICln (28), a regulator of a chloride channel recently shown also to associate with spliceosomal proteins (29); interphase microtubules (30); a novel centrosomal protein termed CPAP (31); and the nuclear mitotic apparatus protein (32). Transfection studies using 4.1R cDNAs isolated from different sources have allowed the identification of specific nuclear isoforms of 4.1R (32)(33)(34)(35) and of the signals involved in 4.1R nuclear targeting (33,34,36) and in the abrogation of its nuclear accumulation (34,37).
Nuclear transport in eukaryotic cells is triggered by specific transport signals that are recognized by soluble receptors that interact with the nuclear pore complexes. Nuclear import is mediated by nuclear localization signals present in nuclear proteins. In the case of classical basic nuclear localization signals, this process involves the importin α and importin β proteins, whose function is regulated by Ran GTPase (38,39). Like import, nuclear export is mediated by specific signals known as nuclear export signals (NESs). There are two relatively well defined exportins that transport proteins: CAS and CRM1. These receptors associate in the nucleus with their export substrates in the presence of RanGTP, forming trimeric export complexes that are then transferred to the cytoplasm (40). Human CRM1 binds leucine-rich NESs found in different proteins such as Rev, mitogen-activated protein kinase kinase 1, cAMP-dependent protein kinase inhibitor, and cyclin B (41)(42)(43)(44).
In our previous studies, which focused on the identification of signals involved in the differential intracellular localization of protein 4.1R (33,34,37), we showed that a constitutive region of the 4.1R molecule, which is present in all 4.1R isoforms and is therefore designated the "core region," is responsible for nuclear targeting of 4.1R (34). Because 4.1R isoforms expressing exon 5 are predominantly localized in the cytoplasm (34), in this study we aimed to identify the amino acid sequence responsible for this effect. We show here that exon 5 encodes a leucine-rich sequence resembling a NES. This sequence and, more specifically, two leucine residues that are also conserved in NESs are necessary for 4.1R cytoplasmic localization and for 4.1R binding to the export receptor CRM1.

EXPERIMENTAL PROCEDURES
Cell Culture and Transfection-COS-7 cells were used for transient cDNA expression, immunofluorescence, and biochemical analyses. The cells were grown as described (33). Transient transfections were performed by electroporation using the Electro Cell Manipulator 600 (BTX, San Diego, CA). The cells were always processed 48 h after transfection. For each cDNA construct tested, more than 300 cells from at least five independent replicates were counted.
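The counting scheme described above (cells scored per construct across independent replicates) can be illustrated with a minimal sketch. The category names, the reading of "more than 300 cells" as a pooled total, and the example counts are all assumptions made for illustration; they are not data from this study.

```python
from collections import Counter

# Scoring categories are an assumption; the text only states that cells
# were counted for each construct.
CATEGORIES = ("nuclear", "cytoplasmic", "nuclear+cytoplasmic")


def localization_percentages(replicates, min_cells=300, min_replicates=5):
    """Pool per-replicate tallies and return the percentage per category."""
    if len(replicates) < min_replicates:
        raise ValueError("not enough independent replicates")
    pooled = Counter()
    for tally in replicates:
        pooled.update(tally)  # adds the counts of each category
    total = sum(pooled[c] for c in CATEGORIES)
    if total <= min_cells:
        raise ValueError("more than %d cells are required" % min_cells)
    return {c: 100.0 * pooled[c] / total for c in CATEGORIES}


# Hypothetical counts for one construct (NOT values from Table I).
EXAMPLE_REPLICATES = [
    {"nuclear": 6, "cytoplasmic": 57, "nuclear+cytoplasmic": 7},
    {"nuclear": 8, "cytoplasmic": 55, "nuclear+cytoplasmic": 7},
    {"nuclear": 7, "cytoplasmic": 56, "nuclear+cytoplasmic": 7},
    {"nuclear": 7, "cytoplasmic": 56, "nuclear+cytoplasmic": 7},
    {"nuclear": 7, "cytoplasmic": 56, "nuclear+cytoplasmic": 7},
]

if __name__ == "__main__":
    print(localization_percentages(EXAMPLE_REPLICATES))
```

Pooling before computing percentages weights each cell equally; averaging per-replicate percentages would be an equally defensible choice.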
cDNA  (34), respectively, and appropriate sense and antisense primers complementary to both ends of their respective open reading frames. E5-GFP was constructed by PCR using pSRα 4.1R 80 Δ16 as a template and sense and antisense primers complementary to both ends of the 4.1R exon 5 coding sequence. The PCR-amplified sequences were cloned into the vector pcDNA3.1/CT-GFP-TOPO (Invitrogen) following the manufacturer's instructions. GST-4.1R 80 Δ16 was prepared as described (30). The constructs zz-CRM1, GST-RevNES, GST-M10, and RanGTP were prepared as reported (45).
Antibodies-Anti-c-Myc (9E10) monoclonal antibody (46) was obtained from the American Type Culture Collection. Goat anti-mouse IgG secondary antibodies conjugated with horseradish peroxidase or fluorescein isothiocyanate were obtained from Southern Biotechnology Associates, Inc. (Birmingham, AL). Anti-GST monoclonal antibody was purchased from GeneTex (San Antonio, TX).
Immunofluorescence Microscopy-Cells grown on glass coverslips were fixed, permeabilized, and blocked as described (33). The cells were incubated with the appropriate antibodies and processed as reported (16). The preparations were examined using a Zeiss epifluorescence microscope. Controls to assess the specificity and lack of cross-labeling included incubations with nonimmune rabbit serum and control monoclonal antibodies or omission of either of the primary antibodies.
Protein Expression and Western Blot Analysis-For binding assays, GST fusion proteins were expressed and purified using standard procedures (47). To verify their size, as well as those of the proteins expressed in transfection experiments using COS-7 cells, total protein extracts were obtained, and the protein fractions were analyzed by SDS-PAGE (48) and Western blot. The membranes were processed and developed as described elsewhere (16).
Solution Binding Assays-For binding assays with zz-tagged CRM1 (45, 49), 10 μg of zz-CRM1 per binding reaction were incubated with IgG-Sepharose beads for 30 min at 4°C in 100 μl of binding buffer (100 mM potassium acetate, 30 mM HEPES-KOH, pH 7.5, 2 mM magnesium acetate, and 0.001% Triton X-100). The beads were then recovered by gentle centrifugation and washed with binding buffer. The beads were subsequently incubated with candidate proteins in a final volume of 100 μl for 60 min at 4°C. The beads were recovered by gentle centrifugation and washed four times with 1 ml of binding buffer. The bound proteins were eluted by the addition of 1 ml of 1 M MgCl2, and the eluted proteins were precipitated with isopropanol. The protein pellets were resuspended in Laemmli (48) sample buffer and analyzed by SDS-PAGE and Western blot.
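As a worked illustration of the binding buffer composition given above, the following sketch computes the volume of each stock solution needed for a batch using C1·V1 = C2·V2. Only the final concentrations come from the text; the stock concentrations and the 50-ml batch volume are hypothetical values chosen for the example.

```python
# Final concentrations as stated in the text.
FINAL = {
    "potassium acetate (mM)": 100.0,
    "HEPES-KOH, pH 7.5 (mM)": 30.0,
    "magnesium acetate (mM)": 2.0,
    "Triton X-100 (%)": 0.001,
}

# Hypothetical stock concentrations (assumptions, same units as FINAL).
STOCK = {
    "potassium acetate (mM)": 1000.0,
    "HEPES-KOH, pH 7.5 (mM)": 1000.0,
    "magnesium acetate (mM)": 1000.0,
    "Triton X-100 (%)": 10.0,
}


def stock_volumes(final, stock, total_ml):
    """Return the ml of each stock, plus water, from C1*V1 = C2*V2."""
    volumes = {}
    for name, c_final in final.items():
        c_stock = stock[name]
        if c_final > c_stock:
            raise ValueError("stock too dilute for " + name)
        volumes[name] = c_final * total_ml / c_stock
    # Water makes up the remainder of the batch volume.
    volumes["water"] = total_ml - sum(volumes.values())
    return volumes


if __name__ == "__main__":
    for name, ml in stock_volumes(FINAL, STOCK, total_ml=50.0).items():
        print(f"{name}: {ml:.3f} ml")
```

The sketch ignores pH adjustment and assumes the Triton X-100 stock is expressed in the same percentage convention as the final concentration.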

Exon 5-encoded Sequence Alters the Subcellular Localization of Nuclear GFP-To study the role that the exon 5-encoded sequence plays in 4.1R intracellular distribution, we prepared different cDNA fusion constructs in which the sequence coding for GFP was appended at the 3′ end of 4.1R coding sequences (Fig. 1). Transfection experiments were performed in COS-7 cells, and the distribution patterns of the expressed proteins were analyzed 48 h post-transfection by fluorescence microscopy. Cells expressing only GFP had intense nuclear staining with some fluorescence also detected in the cytoplasm (Fig. 2A and Table I). Cells expressing a fusion protein containing the complete amino acid sequence of a nuclear 4.1R isoform that lacked expression of exon 5 (4.1R 60 Δ16,18-GFP) also showed predominant nuclear staining (Fig. 2B and Table I). By contrast, cells expressing a 4.1R isoform containing the sequence encoded by exon 5 (4.1R 80 Δ16-GFP) presented predominantly cytoplasmic staining (Fig. 2C and Table I). Consistently, appending the 35 amino acids encoded by exon 5 to GFP (E5-GFP) also resulted in an increase in the number of cells showing cytoplasmic fluorescence (Fig. 2D and Table I).
[Figure legend fragment] The nucleotide sequence encoding the c-Myc-epitope tag (myc) was added at the 3′ end of cloned cDNAs. GFP protein is expressed at the C terminus, whereas GST is at the N terminus of the expressed 4.1R chimeras. Mutations introduced in different 4.1R cDNAs and constructs are indicated. In 4.1R 80 Δ16 NESmut and GST-4.1R 80 Δ16 NESmut, residues Leu 34 and Leu 36 within the leucine-rich sequence were replaced by Ala and Gln, respectively. In 4.1R 80 Δ16 LEEDYmut, the sequence L 37 EEDY was replaced by SRAGN (see "Experimental Procedures" for details).
A Leucine-rich Sequence Resembling a NES in 4.1R Isoforms Expressing the Alternative Exon 5-NES sequences are short sequence motifs that are necessary and sufficient to mediate the nuclear export of large proteins (38,39,50). Important for their function is a characteristic spacing of hydrophobic residues, mainly leucine or isoleucine (38,39). Analysis of the amino acid sequence coded by exon 5 allowed the identification of a hydrophobic region, L 26 LKRVCEHLNLL, which is significantly similar to leucine-rich NES sequences (Fig. 3A). The key hydrophobic residues shown to be important for NES function in other proteins (51,52) are also found in the putative NES in 4.1R and correspond to Leu 34 and Leu 36 (numbered as in the erythroid 4.1R sequence reported in Ref. 10). The crystal structure of the N-terminal 30-kDa domain of erythroid 4.1R comprising the exon 5-encoded sequence has been determined (53). From the tertiary structure it may be inferred that the 4.1R hydrophobic sequence, L 26 LKRVCEHLNLL, adopts an α-helix conformation (Fig. 3D). Leu 36 is exposed on one side of the α-helix, whereas Leu 34 and the hydrophobic residues Leu 26 and Val 30 are exposed on the opposite side (Fig. 3B). A similar topology has been described for other NESs (50, 54, 55), such as that of protein p53, in which Leu 350 is located on one side of the α-helix, whereas Leu 348 and the hydrophobic residues Met 340 and Leu 344 appear on the opposite side (Fig. 3C).
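The "characteristic spacing of hydrophobic residues" can be illustrated with a simple pattern search. The sketch below assumes a commonly cited leucine-rich NES consensus, Φ-x(2,3)-Φ-x(2,3)-Φ-x-Φ with Φ taken as L, I, V, F, or M; this consensus is an assumption brought in for illustration and is not stated in the text. The 12-residue sequence and the Leu 34 to Ala and Leu 36 to Gln substitutions are taken from the text; the helper names are hypothetical.

```python
import re

# Assumed consensus (not from the paper): phi-x(2,3)-phi-x(2,3)-phi-x-phi.
NES_PATTERN = re.compile(r"[LIVFM].{2,3}[LIVFM].{2,3}[LIVFM].[LIVFM]")

WILD_TYPE = "LLKRVCEHLNLL"  # residues 26-37, as given in the text
FIRST_RESIDUE = 26


def mutate(seq, substitutions, first_residue=FIRST_RESIDUE):
    """Apply {position: (old, new)} substitutions using protein numbering."""
    residues = list(seq)
    for position, (old, new) in substitutions.items():
        index = position - first_residue
        if residues[index] != old:
            raise ValueError(f"expected {old} at position {position}")
        residues[index] = new
    return "".join(residues)


def find_nes(seq):
    """Return the first stretch matching the assumed consensus, or None."""
    match = NES_PATTERN.search(seq)
    return match.group(0) if match else None


# The NESmut substitutions described in the text: Leu34->Ala, Leu36->Gln.
NES_MUTANT = mutate(WILD_TYPE, {34: ("L", "A"), 36: ("L", "Q")})

if __name__ == "__main__":
    print("wild type:", find_nes(WILD_TYPE))
    print("mutant:   ", find_nes(NES_MUTANT))
```

A pattern match of this kind is at most suggestive: it says nothing about solvent exposure or helix geometry, which is why the structural comparison with p53 and the CRM1 binding assays are needed to support the assignment.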

[Figure legend fragment] The conserved hydrophobic amino acids are blue; the two most highly conserved residues crucial to the proper function of NES are marked with two dots. B, three-dimensional structure of the leucine-rich sequence present in protein 4.1R. Amino acids are color-coded as follows: yellow for hydrophobic, red for acidic, blue for basic, and green for polar residues. C, three-dimensional structure of the NES present in p53 protein. D, three-dimensional structure of the FERM domain of 4.1R (53) showing the position of the leucine-rich sequence in an exposed α-helix. The two conserved residues, Leu 34 and Leu 36, are shown. E, predicted three-dimensional structure of the mutated FERM domain of 4.1R 80 Δ16 NESmut showing the position of the two mutated residues, Ala 34 and Gln 36. A was prepared using the ClustalX alignment program (66). B-E were modeled using the Swiss PDB Viewer program (67), based on the crystallographic coordinates of the 4.1R 80 FERM (53) and p53 tetramerization (55) domains deposited in the Protein Data Bank with accession numbers 1GG3 and 1AE1, respectively. In E, the FOLD-X program (56), available at fold-x.embl-heidelberg.de, was used to predict the coordinates of the mutated FERM domain. Note that the overall structure of the domain is not altered and that the α-helix comprising the mutated leucine-rich sequence remains exposed to the solvent.
Mutation of Two Essential Residues within the Leucine-rich Sequence Alters the Cytoplasmic Localization of Protein 4.1R-Mutation of two critical hydrophobic amino acids within NES sequences has been shown to affect the function of NES in many proteins (51,52). We next investigated whether mutations of the two conserved and presumably critical hydrophobic amino acids within the 4.1R leucine-rich sequence also affect 4.1R subcellular distribution. COS-7 cells were transfected with either a cDNA encoding the wild-type protein containing the leucine-rich sequence (4.1R 80 Δ16) or a cDNA encoding a protein with Leu 34 and Leu 36 replaced by Ala 34 and Gln 36 within the leucine-rich sequence (4.1R 80 Δ16 NESmut). The subcellular distribution of the expressed proteins, tagged with c-Myc sequences at the C terminus, was determined by staining with antibody 9E10.
Wild-type 4.1R 80 Δ16 protein had a predominantly cytoplasmic distribution (Fig. 4A and Table I), with a small percentage of the transfected cells containing the expressed protein in the nucleus and cytoplasm (Table I). By contrast, protein 4.1R 80 Δ16 NESmut was predominantly distributed in the nucleus (Fig. 4B and Table I). An explanation for the results obtained for the mutated 4.1R protein could be that it is first directed to the nucleus via its core region (34) and that once there it cannot be exported to the cytoplasm because the two mutated hydrophobic amino acids, Leu 34 and Leu 36, are essential for nuclear export. The predicted folding of the mutant was determined using as template the crystallographic coordinates of the 4.1R 80 FERM domain (53) deposited in the Protein Data Bank (accession number 1GG3) and the FOLD-X computer algorithm (56) (Fig. 3E). Substitution of Leu 34 and Leu 36 by Ala 34 and Gln 36 is not predicted to cause perturbation of folding and is slightly favorable to protein stability (the free energy of folding is 52.32 kcal·mol−1 for the wild type versus 52.14 kcal·mol−1 for the mutant). It is very unlikely that the results described above for the mutant are due to perturbation of folding. Isoform 4.1R 80 Δ5,16, which is similar to 4.1R 80 Δ16 except for lacking exon 5-encoded sequences, also accumulates in the nucleus (Fig. 4C and Table I). All of these results indicate that although wild-type 4.1R 80 Δ16 is found predominantly in the cytoplasm, either the mutation in the leucine-rich sequence or the deletion of exon 5 clearly resulted in the accumulation of the protein in the nucleus. Thus, the leucine-rich sequence and, more specifically, amino acids Leu 34 and Leu 36 play a pivotal role in 4.1R cytoplasmic distribution in COS-7 cells.
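The stability comparison quoted above reduces to a difference of two predicted folding free energies. The sketch below makes that arithmetic explicit, using the two values from the text and assuming the convention that a lower predicted value corresponds to a more stable fold (consistent with the statement that the substitution is slightly favorable). The function name is hypothetical.

```python
# Predicted free energies of folding quoted in the text (kcal/mol).
WILD_TYPE_DG = 52.32
MUTANT_DG = 52.14


def delta_delta_g(mutant_dg, wild_type_dg):
    """Return mutant minus wild type; negative means the mutant is
    predicted to be slightly more stable (assumed sign convention)."""
    return mutant_dg - wild_type_dg


DDG = delta_delta_g(MUTANT_DG, WILD_TYPE_DG)

if __name__ == "__main__":
    print(f"ddG = {DDG:.2f} kcal/mol")
```

A difference of this size is small relative to the absolute values, which fits the conclusion that the two substitutions do not perturb folding.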
The band 3-binding site in 4.1R consists of the L 37 EEDY sequence, which is adjacent to the NES (57). Mutation of L 37 EEDY to S 37 RAGN (protein 4.1R 80 Δ16 LEEDYmut), unlike the mutation in 4.1R 80 Δ16 NESmut, gave rise to predominantly cytoplasmic staining (Fig. 4D and Table I). The capacity of protein 4.1R 80 Δ16 to be exported from the nucleus was analyzed by injecting recombinant protein GST-4.1R 80 Δ16 into Xenopus laevis oocyte nuclei and processing the samples as described (58). At 3 h post-injection, most of the GST-4.1R 80 Δ16 protein was detected in the cytoplasmic fraction, supporting nuclear export of 4.1R 80 Δ16 (data not shown). The mutant protein GST-4.1R 80 Δ16 NESmut was also detected in the cytoplasmic fraction. This unexpected result suggests that 4.1R contains additional nuclear export signal(s) that appear(s) to be functional in the X. laevis oocyte system.
Protein 4.1R 80 Δ16, but Not 4.1R 80 Δ16 NESmut, Associates with the Nuclear Export Protein CRM1 in a RanGTP-dependent Manner-Proteins containing a leucine-rich NES are recognized in the nucleus by the exportin CRM1 and form a trimeric complex with RanGTP, which is exported from the nucleus to the cytoplasm (40). We investigated whether 4.1R 80 Δ16 associates with CRM1 in the RanGTP-dependent manner characteristic of export substrates. zz-tagged CRM1 was bound to IgG-Sepharose and incubated with the different substrates used in the binding assays. Bead-bound material was eluted and processed as indicated under "Experimental Procedures." A functional export substrate containing the NES of Rev fused to the C terminus of GST (GST-RevNES) (45,49) and an export-deficient control substrate containing a disrupted NES of Rev (GST-M10) were used as positive and negative controls, respectively. All of the binding assays were performed in the absence and presence of RanGTP. GST-RevNES bound efficiently to zz-CRM1 in a RanGTP-dependent manner (Fig. 5, compare lanes 1 and 2). Similarly, GST-4.1R 80 Δ16 bound to CRM1 in a RanGTP-dependent manner (Fig. 5, compare lanes 5 and 6). By contrast, the export-deficient control (GST-M10) and the 4.1R mutant in the leucine-rich sequence (GST-4.1R 80 Δ16 NESmut) did not bind to CRM1 even in the presence of RanGTP (Fig. 5, lanes 3 and 4 and lanes 7 and 8, respectively). The RanGTP-dependent binding of CRM1 to GST-4.1R 80 Δ16 but not to GST-4.1R 80 Δ16 NESmut was more clearly revealed by Western blot analysis (Fig. 5, lanes 9-12).
[Figure legend fragment] [...] 1-8), or blotted to PVDF and detected by incubation with anti-GST antibody (lanes 9-12). GST-RevNES (RevNES) and GST-M10 (M10) were used as positive and negative controls, respectively. Note that GST-4.1R 80 Δ16 protein (4.1R 80 Δ16) binds zz-CRM1 in the presence, but not in the absence, of RanGTP, whereas the mutated GST-4.1R 80 Δ16 NESmut protein (4.1R 80 Δ16 NESmut) does not. Arrowheads mark the positions of GST-RevNES, RanGTP, and GST-4.1R 80 Δ16.

DISCUSSION
In recent years, our view of strict compartmentalization of nuclear and non-nuclear components has been challenged as plasma membrane (59), tight junction (60), endocytic (61), and cytoskeletal (50) proteins have been described in the nucleus. Moreover, proteins that interact with 4.1R and have a predominantly extranuclear localization and function have recently been detected in the nucleus. One of these proteins is actin, a ubiquitous, essential cytoskeletal protein that, like 4.1R 80 Δ16, has a predominantly cytoplasmic localization at steady state. Actin contains two functional NESs that mediate its CRM1-dependent nuclear export. However, the physiological relevance of the shuttling of actin between the nucleus and cytoplasm is not known (50). ZO-1 interacts with 4.1R in cell tight junctions (24), and ZO-1 has been shown to accumulate in the nucleus in a cell density-dependent fashion (60). hCASK interacts with 4.1R, establishing a link between the extracellular matrix and the cortical actin cytoskeleton (21). It has been reported that hCASK directly enters the nucleus and thereby regulates gene expression (62). Conversely, a 4.1R-interacting protein, the nuclear splicing factor U2AF35 (22), has recently been shown to shuttle from the nucleus to the cytoplasm, but its role in the cytoplasm remains to be elucidated (63). This study shows that 4.1R isoforms expressing exon 5 contain a leucine-rich sequence that shares key features with NESs known to trigger the rapid, active delivery of proteins and RNA-protein complexes from the nucleus to the cytoplasm (39,40,64).
Thus, the key hydrophobic residues important for NES function in other proteins are also found to be essential in 4.1R, and the topology adopted for the NES of proteins such as p53 (55) or actin (50) is similar to that adopted for 4.1R (53). Moreover, the leucine-rich sequence is required for 4.1R binding to CRM1 in a RanGTP-dependent fashion.
Our previous studies on signals involved in 4.1R differential intracellular localization showed that all 4.1R molecules have a common region, designated the core region, involved in their nuclear targeting (34). However, in this study we show that 4.1R isoforms expressing exon 5 contain a leucine-rich sequence that predominantly excludes them from the cell nuclei. A third signal, comprising basic amino acids coded by the alternative exon 16, has also been implicated in 4.1R nuclear targeting (33,36). Finally, a fourth region mediating 4.1R intracellular localization is the 209-amino acid head piece (HP) of high molecular weight 4.1R 135 isoforms that inhibits nuclear targeting of 4.1R (37). The complexity of signals and regions identified to date as being involved in 4.1R nuclear and cytoplasmic distribution and the hierarchical fashion in which they regulate the intracellular localization of 4.1R are summarized in Fig. 6A. Fig. 6B schematically represents possible models for the intracellular traffic experienced by different 4.1R isoforms. A first group of 4.1R 80 isoforms containing the constitutive core region but lacking the alternative leucine-rich sequence is directed to the nucleus by a mechanism that has yet to be elucidated. This group would comprise the nuclear set of 4.1R isoforms (Fig. 6B, panel a). A second group of 4.1R 80 isoforms contains the leucine-rich sequence, and these enter the nucleus via the core region and might bind to CRM1-RanGTP to be exported to the cytoplasm (Fig. 6B, panel b). A third group of 4.1R 80 isoforms expresses exon 16, and they enter the nucleus by binding to human importin α2, Rch1 (36) (Fig. 6B, panel c). Finally, a fourth group of 4.1R 135 isoforms contains the HP domain that inhibits nuclear targeting of 4.1R (37). The HP domain might adopt a conformation masking the regions involved in 4.1R nuclear targeting. However, an association between the HP domain and a cytoplasmic partner could also account for this effect (Fig. 6B, panel d). Thus, there must be orchestrated regulatory mechanisms controlling the traffic of specific 4.1R isoforms to play distinct roles in different compartments.
FIG. 6. Hierarchical organization of 4.1R targeting signals. A, the 4.1R cDNA has been divided into four regions represented by white boxes. Region 1, the 209-amino acid sequence, or HP, specific for 4.1R isoforms translated from ATG-1, which spans from ATG-1 to ATG-2; region 2, the sequence including the alternative exon 5 encoding the leucine-rich sequence (NES) and spanning from ATG-2 to ATG-3; region 3, the core region encompassing the sequence encoded from ATG-3 to the in-frame ATG located at exon 17 (ATG-E17); region 4, the C-terminal region encoded from ATG-E17 to the 4.1R stop codon at exon 21 (TGA). The striped boxes indicate the alternative exons 5 and 16. Arrows show the sequences involved in 4.1R differential targeting. The directions of the arrows represent the predominant localization determined by the expression of every block of sequence: nuclear for the core region and for exon 16 (in the presence of exon 5) and cytoplasmic for exon 5 (in the absence of exon 16) and for the HP domain (33,34,36,37). The hierarchical effect of these regions on 4.1R intracellular localization is as follows: the constitutive core region determines nuclear localization, which is blocked by expression of exon 5 (the leucine-rich sequence) and restored by the simultaneous expression of exons 5 and 16. Finally, the HP domain specific to 4.1R isoforms translated from ATG-1 is dominant over all the signals, being sufficient to abrogate nuclear entry (37). B, schematic representation of possible models indicating the intracellular traffic experienced by different sets of 4.1R isoforms (for explanation, see "Discussion").
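The hierarchy of targeting signals summarized for Fig. 6 can be written as a small decision rule. The sketch below is only a restatement of that summary under simplifying assumptions: each signal is treated as a present/absent flag, the output is the predominant steady-state localization, and the class and function names are hypothetical. It does not model the NES point mutations or the additional export signal(s) suggested by the oocyte experiments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Isoform:
    """Presence or absence of the signals discussed in the text.

    The constitutive core region is present in all isoforms, so it is
    not represented as a flag.
    """
    name: str
    has_head_piece: bool  # HP, isoforms translated from ATG-1
    has_exon5: bool       # leucine-rich sequence (NES)
    has_exon16: bool      # basic nuclear targeting signal


def predominant_localization(isoform):
    """Apply the hierarchy: HP > (exon 5 without exon 16) > core region."""
    if isoform.has_head_piece:
        return "cytoplasmic"  # HP is dominant and abrogates nuclear entry
    if isoform.has_exon5 and not isoform.has_exon16:
        return "cytoplasmic"  # leucine-rich sequence blocks nuclear localization
    return "nuclear"          # core region alone, or exons 5 and 16 together


# Isoforms named in the text, plus two illustrative cases.
R80_D16 = Isoform("4.1R 80 Δ16", False, True, False)
R80_D5_16 = Isoform("4.1R 80 Δ5,16", False, False, False)
WITH_EXONS_5_AND_16 = Isoform("exon 5 + exon 16 (illustrative)", False, True, True)
WITH_HP = Isoform("4.1R 135 (illustrative)", True, True, True)

if __name__ == "__main__":
    for iso in (R80_D16, R80_D5_16, WITH_EXONS_5_AND_16, WITH_HP):
        print(iso.name, "->", predominant_localization(iso))
```

Ordering the checks from the dominant signal downward mirrors the statement that the HP domain overrides all other signals.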
We do not know the physiological role that 4.1R isoforms containing the leucine-rich sequence play in nucleated cells, and it is rather speculative, therefore, to propose possible cytoplasmic or nuclear roles. If the presence of 4.1R 80 Δ16 in the cell nucleus has deleterious consequences for the cell, it is possible that the function of the leucine-rich sequence of 4.1R is to control the access of 4.1R to nuclear components. Alternatively, the sequence may be important for maintaining the cytoplasmic localization of a putative binding partner that would otherwise gain access to the nucleus to carry out specific functions. Lastly, 4.1R 80 Δ16 may transit to the nucleus to deliver a nuclear component, either protein or RNA, to the cytoplasm. Protein 4.1R 80 Δ16, and other 4.1R isoforms expressing exon 5, may be candidates for participation in the relay of information between the cytoplasm/plasma membrane and the nucleus. Investigation of the functions of 4.1R isoforms containing the leucine-rich sequence will provide new insights into this multifunctional protein.