Structure-activity relationship of the leucine-based sorting motifs in the cytosolic tail of the major histocompatibility complex-associated invariant chain.

The cytosolic tail of the major histocompatibility complex-associated invariant chain protein contains two Leu-based motifs that both mediate efficient sorting to the endocytic pathway. Nuclear magnetic resonance data on a peptide of 27 residues corresponding to the cytosolic tail of human invariant chain indicate that in water at pH 7.4 the membrane distal motif Leu7-Ile8 lies within a nascent helix, while the membrane proximal motif Met16-Leu17 is part of a turn. The presence of a small amount of methanol stabilizes an α helix from Gln4 to Leu17 with a kink on Pro15. Point mutations of the cytosolic tail of the protein suggest that amino-terminal residues located in spatial proximity to the Leu motifs contribute to efficient internalization and targeting to endosomes in transfected COS cells. Residues on the spatially opposite side of the Leu motifs were, on the other hand, mutated with no measurable effect on targeting. Structural and biological data thus suggest that the signals are not continuous but consist of “signal patches” formed by the three-dimensional structure of the cytosolic tail of invariant chain.

In transmembrane proteins, specific signals for endosomal or lysosomal sorting have been identified within the cytosolic tails. In a number of proteins sorted to the endosomal pathway, a Tyr-containing motif seems to mediate internalization from the plasma membrane (for review, see Refs. 1 and 2) and form a tight β turn functionally comparable to the internalization signal found in cell surface receptors (3-6). However, recent NMR analysis of a synthetic peptide corresponding to the extreme 21 carboxyl-terminal amino acid residues of the cytosolic domain of the trans-Golgi network (TGN) protein TGN38/41 (which is routed from the plasma membrane and back to the TGN) shows that this Tyr-containing internalization signal lies within a nascent helix (7).
Tyrosine signals are not universal for sorting of membrane proteins to the endosomal/lysosomal pathway. In fact, the cytosolic tail of the lysosomal membrane protein LIMP II contains no Tyr but a Leu-Ile signal, located two residues from the carboxyl-terminal end, which mediates efficient sorting (8, 9). Letourneur and Klausner (10) reported that a Leu-Leu motif could mediate lysosomal targeting of the CD3-γ and -δ chains of the T-cell receptor complex. This signal was functional even when located at the carboxyl-terminal end, and it and a Tyr-based motif were each individually sufficient to induce endocytosis and delivery to lysosomes. Two similar signals were also reported in the cytosolic tail of the mannose 6-phosphate receptor by Johnson and Kornfeld (11). The above studies may suggest that the Tyr signal primarily mediates sorting to the endocytic pathway via internalization from the plasma membrane, whereas the Leu signal alone, or in combination with the Tyr signal, mediates direct sorting from the TGN to the endocytic pathway. However, both signals efficiently internalize plasma membrane proteins, and further information is needed to clarify the requirements for selecting the pathway from the TGN to endosomes (for further discussion, see Ref. 2).
The invariant chain (Ii), which efficiently targets the associated major histocompatibility complex class II molecules to the endosomal/lysosomal pathway, is another example of a protein with cytosolic Leu-sorting motifs (12-14). Invariant chains from different species contain in their cytosolic tails pairs of Leu-Ile and Met-Leu (Ile-Leu), and mutational analyses of the cytosolic Ii tail fused to reporter molecules have shown that these signals can independently mediate endosomal targeting (15, 16). Distribution and internalization studies also show that a significant fraction of the Ii molecules, alone or in complex with major histocompatibility complex class II, is transported to endosomes via the plasma membrane (17). Furthermore, the Leu-Ile and Met-Leu motifs individually mediate rapid internalization of a chimeric protein (INA) obtained by fusing neuraminidase (NA) to the cytosolic tail of Ii (15). In another study the Ii tail was fused to the transferrin receptor, but the measured sorting via the plasma membrane was not sufficient to account for the amount of protein synthesized, indicating that sorting also occurs directly from the TGN (16). The accumulated data thus suggest that Leu-based motifs are actively engaged in the sorting of membrane proteins to the endosomal/lysosomal pathway, both directly from the TGN and via the plasma membrane.
Here we report structural and mutational studies on the 27-amino acid synthetic peptide Ii-(1-27) (Scheme I), corresponding to the amino-terminal region of human Ii. NMR results demonstrate, for the first time, that in water the Leu7-Ile8 and Met16-Leu17 internalization motifs lie within a nascent helix and a turn, respectively, but can take up a kinked helix in the presence of a small amount of methanol. Site-directed mutagenesis has been performed to determine in which context the two internalization signals are functional. By comparing the biological data with the three-dimensional structure we conclude that the spatial arrangement of the leucine motif is essential for a functional signal and that it is not a continuous signal but a "signal patch" that depends upon the secondary structure of the protein.

EXPERIMENTAL PROCEDURES
Peptide Synthesis-The peptide was synthesized on polyoxyethylene-polystyrene graft resin in a continuous flow instrument constructed and operated as described by Frank and Gausepohl (18). Peptide chain assembly was performed using Fmoc (fluorenylmethoxycarbonyl) chemistry (19) and in situ activation of amino acid building blocks by PyBOP (20). The peptide was purified by reversed-phase high performance liquid chromatography and characterized by laser desorption mass spectrometry.
NMR Data Collection-Samples were prepared by dissolving lyophilized Ii-(1-27). Spectra were acquired with a Bruker AMX 500-MHz spectrometer interfaced to an Aspect X32 computer and referenced to 3-(trimethylsilyl)propionic acid sodium salt. Two-dimensional NMR spectra, namely double-quantum-filtered correlated spectroscopy (DQF-COSY) (21), nuclear Overhauser enhancement (NOE) spectroscopy (NOESY) (22), and total correlation spectroscopy (TOCSY) (23), were recorded in the phase-sensitive mode with quadrature detection in ω1 provided by the time-proportional phase incrementation scheme (24). Usually, 16-32 transients of 2048 points were collected for each of the 512 t1 increments, with a spectral width of 6024 Hz. Time-domain data matrices were all zero-filled to 2048 points in the ω1 dimension and to 4096 points in the ω2 dimension, thus yielding a digital resolution of 5.88 and 2.94 Hz/point, respectively. Lorentz-to-Gauss resolution enhancement was used as the weighting function in both dimensions before transformation. NOESY experiments were recorded with different mixing times with no random variation. TOCSY experiments, modified to be clean (25), were recorded with a spin-lock period of 0.064 s, achieved with the MLEV-17 pulse sequence (26). Irradiation of the strong solvent signal was achieved in the coherent mode (27) during the relaxation time and, in the case of the NOESY, during the mixing time. Slowly exchanging protons at pH (uncorrected meter reading) 7.0 and 283 K were identified by recording a NOESY spectrum (0.20 s mixing time) of the peptide immediately after dissolution in C²H₃O²H-²H₂O. The ³J(HNα) coupling constants were measured from one-dimensional experiments acquired with 131,072 points and application of strong Lorentz-to-Gauss resolution enhancement, or estimated, as apparent values, by measuring antiphase peak separations in the DQF-COSY experiment (28).
Calculation of Structures-The input used for structure calculations of the peptide consisted of 186 NOE upper distance constraints, which were derived from NOESY spectra recorded at mixing times of 0.060, 0.10, and 0.20 s. A semiquantitative correlation between NOESY cross-peaks and proton-proton distances was obtained by comparing the effects observed at different mixing times with the standard distances observed in protein structures (29). Strong, medium, and weak intraresidual and sequential NOEs observed at 0.060 s were translated into three distance ranges: 0.26-0.28 nm, 0.29-0.31 nm, and 0.33-0.36 nm, respectively. Looser upper limits of 0.40 and 0.46 nm were set for medium range NOEs observed only in the 0.10-s and the 0.20-s NOESY spectra, respectively. The β-methylene groups were stereospecifically assigned with the program HABAS (30), which uses intraresidual and sequential NOEs and the vicinal coupling constants ³J(HNα) and ³J(αβ). Five β-methylene groups (namely, those of residues 4, 11, 12, 13, and 25) were stereospecifically assigned. When no stereospecific assignment was possible for methylene and methyl protons, distance constraints were corrected for the pseudo atom representation (31). For methyl groups, an additional correction of 0.05 nm was added for the highest apparent intensity of methyl resonances (32). Distance-geometry calculations were performed with the DIANA program (33). Twenty dihedral angle constraints based on experimental ³J(HNα) coupling constants were also used as input data. The variable target function was changed according to the standard strategy from level 1 to 27. A total of 100 starting structures were generated from random choices of dihedral angles. The best nine structures, in terms of distance violations, were chosen for further refinement.
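The translation of NOE classes into distance limits described above can be written out as a small lookup. The following is only an illustrative sketch: the function and table names are hypothetical, and taking the upper end of each quoted range as the upper bound is an assumption, since the text does not say how the ranges were passed to DIANA.

```python
# Distance ranges (nm) quoted in the text for intraresidual and sequential
# NOEs already visible at the 60-ms mixing time.
SHORT_MIX_RANGES_NM = {
    "strong": (0.26, 0.28),
    "medium": (0.29, 0.31),
    "weak": (0.33, 0.36),
}

# Looser upper limits (nm) quoted for medium range NOEs that appear only
# at the 100-ms or 200-ms mixing time.
LONG_MIX_UPPER_NM = {100: 0.40, 200: 0.46}


def upper_bound_nm(first_seen_ms, intensity=None):
    """Return an upper distance limit in nm for one NOE cross-peak.

    first_seen_ms: shortest mixing time (60, 100, or 200 ms) at which the
        cross-peak is observed.
    intensity: "strong", "medium", or "weak"; only used for 60-ms peaks.
    Assumption: the upper end of the quoted range is used as the bound.
    """
    if first_seen_ms == 60:
        return SHORT_MIX_RANGES_NM[intensity][1]
    return LONG_MIX_UPPER_NM[first_seen_ms]
```

For example, a strong sequential NOE seen at 60 ms would receive an upper limit of 0.28 nm under this reading, whereas a medium range NOE first seen at 200 ms would receive 0.46 nm.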
Energy minimization calculations were performed with the GROMOS software package (34); bond length constraints were applied with the SHAKE method (35, 36). In order to release the strain caused by bad van der Waals contacts, while retaining the features of the original distance-geometry structures, 800 steps of steepest descent restrained energy minimization were applied. The list of nonbonded neighboring atom pairs was updated every 10 cycles of energy minimization. A cut-off radius of 0.08 nm was used beyond which no nonbonded interactions were evaluated. Distance restraints obtained from NMR measurements were incorporated into calculations as a semiharmonic potential function with a force constant of 1000 kJ·nm⁻²·mol⁻¹. Graphical representation and root mean square deviation analysis between energy-minimized structure pairs were carried out with the program SYBYL (Tripos, Inc.).
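A semiharmonic distance restraint penalizes only distances that exceed the upper limit. A minimal sketch of such a term is given below. The function name is hypothetical, and the factor of one-half in front of the force constant is an assumed convention, not taken from the text; this is not the GROMOS implementation itself.

```python
FORCE_CONSTANT = 1000.0  # kJ nm^-2 mol^-1, the value quoted in the text


def restraint_energy(distance_nm, upper_bound_nm, k=FORCE_CONSTANT):
    """Semiharmonic restraint energy in kJ/mol for one proton pair.

    The energy is zero while the distance stays at or below the upper
    bound and rises quadratically with the violation beyond it.
    Assumption: E = 0.5 * k * violation**2.
    """
    violation = distance_nm - upper_bound_nm
    if violation <= 0.0:
        return 0.0
    return 0.5 * k * violation ** 2
```

Under these assumptions, a violation of 0.05 nm, the largest reported for the refined structures, would cost 1.25 kJ/mol for that restraint.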
Expression Vector and Plasmid Constructions-pSV51L is a late replacement vector with a short polylinker behind the SV40 promoter, and this vector has been shown to give high transient expression of several proteins in simian cells (12, 37). The cDNA for neuraminidase is from human influenza strain A7/Victoria3/75 (38), and the various INA constructs containing the cytosolic tail of Ii have been described earlier (15, 39). The deletion mutant Δ11INA lacks amino acids 2-11 of the Ii tail, and Δ12-29INA lacks residues 12-29. Point mutations in the cytosolic tail of Ii were introduced by site-directed mutagenesis in a single-stranded M13mp19 vector by the method of Kunkel (40). The mutagenized tail regions were confirmed by DNA sequencing.
Cell Culture and Transient Expression in COS Cells-The COS cells are derived from CV1 cells transformed with an origin-defective mutant of SV40 coding for the wild type T-antigen (41). The cell lines were maintained in Dulbecco's modified Eagle's medium supplemented with 10% fetal calf serum (DMEM-FCS). The cells were seeded into 35-mm wells the day before transfection at 50% confluence. For transfection, 0.5 μg of plasmid DNA was dissolved in 0.1 ml of DMEM with 10% NuSerum (Collaborative Research). The solution was mixed with 1 ml of DMEM-NuSerum containing 400 μg/ml DEAE-dextran and 0.1 mM chloroquine. The cells were washed twice in phosphate-buffered saline (pH 7.4) before the DNA was added (1 ml/35-mm well). After 3-4 h at 310 K and 6% CO₂, the cells were given a Me₂SO shock (10% dimethyl sulfoxide, 2-3 min) and grown for a further 2 days in DMEM-FCS for expression of the proteins.
Antibodies and Immunofluorescence-The mouse monoclonal antibody NC71, which recognizes the luminal domain of neuraminidase, was a gift of Dr. A. Douglas, Mill Hill, London. Two days after transfection, the cells were labeled for immunofluorescence with fluorescein isothiocyanate- and Texas Red-conjugated goat anti-rabbit Ig and goat anti-mouse Ig antibodies (Dianova, Hamburg, Federal Republic of Germany), as described earlier (12).
Purification and ¹²⁵I Labeling of Antibodies-Mouse ascites NC71 (IgG1) was precipitated with 0.18 g/ml Na₂SO₄ and purified on a protein A-Sepharose (Pharmacia) column at pH 8.0 before iodination by IODO-GEN™ as described by the manufacturer (Pierce). The amount of acid-soluble and precipitable material was determined by trichloroacetic acid precipitation and counting in a Cobra Auto-Gamma γ counter. The amount of soluble radioactivity in the samples was usually as low as 1-2%.
Internalization of ¹²⁵I-Labeled Antibodies-Transiently transfected COS cells in 35-mm wells were incubated with ¹²⁵I-labeled NC71 in DMEM-FCS (approximately 1 μg/ml) on ice for 2 h. The cells were then washed six times in ice-cold phosphate-buffered saline with 2% FCS. The chase was performed in DMEM-FCS at 310 K for different periods of time. Cells were then cooled on ice and treated twice with 0.5 M acetic acid in 0.15 M NaCl (pH 2.5) for 7 min. This step removed 95-98% of the surface bound antibody. The cells were removed from the wells by lysis in 1 M NaOH. The acid wash and the lysed samples were counted in the γ counter. Each time point was performed in duplicate. Internalized antibody was calculated as the antibody resistant to the low pH wash relative to the antibody bound before the chase period. Typically, when cells were sham-transfected with the vector without insert and incubated with ¹²⁵I-NC71 on ice and washed, the cell-bound activity was less than 1% of the activity bound to cells transfected with INA variants.

Met1-Asp-Asp-Gln-Arg5-Asp-Leu-Ile-Ser-Asn10-Asn-Glu-Gln-Leu-Pro15-Met-Leu-Gly-Arg-Arg20-Pro-Gly-Ala-Pro-Glu25-Ser-Lys
SCHEME I. Amino acid sequence of the Ii-(1-27) peptide corresponding to the cytosolic tail of human invariant chain. The di-leucine-like signals are in italics.

RESULTS
NMR Analysis-Ii-(1-27) peptide was studied at pH 7.4 in phosphate buffer at 283 K in water and in a water/methanol mixture. Assignment of proton spin systems was obtained with the sequential methodology outlined by Wüthrich (29). From the amide protons, TOCSY experiments allowed identification of the α and the β protons of almost all amino acids. Residues with long side chains were identified by a combination of TOCSY and NOESY experiments. Individual spin systems were placed in the primary structure by identification of characteristic short and medium range NOE connectivities. Fig. 1 reports the summary of NOE information for Ii-(1-27) in water at 283 K and pH 7.4. Together with the strong αCH(i)-NH(i+1) connectivities, a number of NOEs were observed between the NH resonances of sequential amino acids in the regions before and after Pro15, the strongest being observed in the Asp6-Asn11 region and between Met16 and Leu17. Proximity of NHs of adjacent residues requires a kink in the backbone, a conformation associated with a β turn or with an α helix (42). Accordingly, the NOEs between pairs of NHs in the region Arg5-Glu12 suggest that consecutive turns are present in this part of the peptide and that Met16-Leu17 is part of a turn. An ensemble of consecutive turns resembles a helix-like conformation, since distance constraints in tight turns are similar to those in helical segments (29). This is confirmed by αCH(i)-NH(i+2) connectivities (Arg5-Leu7, Asp6-Ile8, Leu7-Ser9, Ile8-Asn10, and Ser9-Asn11), and by βCH(i)-NH(i+1) cross-peaks (Gln4-Arg5, Leu7-Ile8, Ser9-Asn10, and Glu12-Gln13). However, the simultaneous presence of αCH(i)-NH(i+1) and NH(i)-NH(i+1) NOEs (Fig. 1) indicates that the local helix-like structure in the region at the amino-terminal side of Pro15 dynamically transforms into extended conformations.
The absence of a stable helical structure is confirmed by the lack of characteristic NOEs and by the ³J(HNα) coupling constant values. We did not observe loop to loop interproton NOEs (e.g. αCH of residue i to the NHs of residues i+3 and i+4), although several residues show the NH to NH interaction. On the other hand, only the amides of residues Leu7 and Ser9 have coupling constants (5.9 Hz) similar to that expected for a stable helical structure (43), while the ³J(HNα) coupling constants for the remaining residues in that region are all > 7 Hz. These results (a stretch of sequential and short range NOEs, and the absence of helix-defining NOEs within a conformation ensemble giving rise to largely averaged coupling constants) argue for the presence of a "nascent helix" structure (44), including the Leu7-Ile8 sorting signal of invariant chain. The term nascent helix refers to an ensemble of interconverting extended chain and turn-like structures existing over a peptide sequence at the earliest stages of helix initiation.
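The link between ³J(HNα) and backbone conformation can be illustrated with a Karplus-type relation. The sketch below is not taken from this work: the coefficients (6.4, -1.4, 1.9 Hz) and the 60° phase offset are one commonly used parameterization and are an assumption here, as are the idealized φ angles used in the example.

```python
import math


def j_hn_alpha(phi_deg, a=6.4, b=-1.4, c=1.9):
    """Karplus-type estimate of 3J(HN-alpha) in Hz from the backbone angle phi.

    Assumed form: J = a*cos^2(theta) + b*cos(theta) + c, theta = phi - 60 deg.
    The default coefficients are an assumed, commonly used calibration.
    """
    theta = math.radians(phi_deg - 60.0)
    return a * math.cos(theta) ** 2 + b * math.cos(theta) + c


# Idealized phi values (assumptions): about -57 deg for an alpha helix and
# about -139 deg for an extended beta strand.
helix_j = j_hn_alpha(-57.0)
extended_j = j_hn_alpha(-139.0)
```

With these assumed parameters the helical value falls near 4 Hz and the extended value near 9 Hz, so conformationally averaged values around or above 7 Hz are consistent with an ensemble that is not a stable helix.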
From Pro15 onward, the peptide also assumes turn-like conformations up to Ala23, as suggested by NH(i)-NH(i+1), βCH(i)-NH(i+1), and αCH(i)-NH(i+2) NOE connectivities (Fig. 1). In particular, the presence of strong NOEs between the amide protons of Met16 and Leu17 and between the αCH of Pro15 and the NH of Leu17 suggests that the second signal Met16-Leu17 is part of a turn.
A nascent helix can be stabilized by small amounts of organic co-solvent (44). Upon addition of methanol at a concentration of 20% (v/v) to the Ii-(1-27) aqueous solution, all resonances needed to be reassigned because one-to-one comparison between the two solvents was not possible. Complete sequential assignment was made as described above. Stabilization of the helical structure was confirmed by the sequential (αCH(i)-NH(i+1) and NH(i)-NH(i+1)) and medium-range (αCH(i)-NH(i+n), n ≥ 2, and αCH(i)-βCH(i+3)) NOEs (45), slowly exchanging amide protons (29), and ³J(HNα) coupling constants (43). Fig. 2 summarizes the observed NOEs, the relative exchange rates of amide protons, and the apparent ³J(HNα) coupling constants for Ii-(1-27) at 283 K and pH 7.4 in the presence of 20% methanol. The fact that in the region Gln4-Leu14 the NH(i)-NH(i+1) NOEs are intense, while the αCH(i)-NH(i+1) NOEs are much weaker, implies a generally helical structure (42). The observation of several unambiguous αCH(i)-NH(i+3) and αCH(i)-βCH(i+3) cross-peaks and a single αCH(i)-NH(i+4) cross-peak (Fig. 2) supports the presence of a helix.
Further corroborative data come from slowly exchanging amides; except for Gln13, all the amide protons in the Gln4-Leu14 region are in slow exchange. The slow exchange most likely indicates hydrogen bonding, since it is unlikely that a slowly exchanging proton is buried in the interior of a biomolecule as small as the cytosolic tail of Ii. ³J(HNα) < 6 Hz in the Gln4 to Leu14 region also supports the presence of a helix (43). Furthermore, αCH(i)-NH(i+2) cross-peaks, suggestive of a 3₁₀ [...] and we conclude that the Gln4-Leu14 region of Ii-(1-27) forms an α helix.
From Pro15 onward, we observed strong αCH(i)-NH(i+1) NOE connectivities, meaning that the conformation is essentially extended. However, the presence of NH(i)-NH(i+1), βCH(i)-NH(i+1), two αCH(i)-NH(i+2) (Pro15-Leu17 and Met16-Gly18), and a single small αCH(i)-NH(i+3) (Leu14-Leu17) NOE connectivities (Fig. 2) is indicative of local structures and short range order. βCH(i)-NH(i+1) connectivities are commonly observed in type I or type III turns (42), but while type III turns are very similar to type I, the presence of a Pro in the trans isomer is compatible with both type I and type II classes of turns (46). In principle, they can be distinguished on the basis of ³J(HNα) coupling constants at positions 2 and 3 of the turns, and NOE connections (42, 45). Except for Met16 and Lys27, for which we measured a ³J(HNα) > 7 Hz, and Gly22 and Ser26 (both with a ³J(HNα) < 6 Hz), all the residues of the carboxyl-terminal region showed coupling constants of approximately 6.5 Hz. Furthermore, the amides of Leu17, Ala23, and Ser26 are slowly exchanging, thus suggesting the formation of hydrogen bonds (45) and the possible formation of consecutive turns in the regions Leu14 to Leu17, Arg20 to Ala23, and Ala23 to Ser26. The NMR data slightly favor a type I and two type II turns, respectively, for the above segments. In fact, a δCH-NH NOE connectivity between Pro15 and Met16 suggested that the region Leu14 to Leu17 forms a type I β turn, since the δCH-NH distance is between 0.19 and 0.35 nm (47). The presence of an αCH(i)-NH(i+3) NOE between Leu14 and Leu17 suggests that the type I turn can actually prolong the helix up to Leu17. For Arg20 to Ala23 and Ala23 to Ser26, the observation of αCH-δCH NOE connectivities between Arg20 and Pro21, and between Ala23 and Pro24, for the trans isomer of both regions suggests the presence of two type II β turns.
Methanol is known to induce structure in peptides (see, for example, Ref. 48). However, at the ratio used, 80% water/20% methanol (v/v), the mixture has a helix-promoting ability only slightly higher than that of pure water (48), indicating that the structure is only stabilized, rather than induced, by the presence of methanol. The possibility that the secondary structure arises through aggregation was ruled out by investigating a 10-fold diluted sample of Ii-(1-27). No differences in chemical shift and line width of the NH resonances were observed in one-dimensional spectra, and NOESY experiments confirmed all the connectivities, and thus the structure, described above.
From NMR data in water/methanol, 100 randomly selected starting conformations were generated by means of distance-geometry calculations. The best nine structures, in terms of smallest target function values, were subjected to restrained energy minimization. Before minimization, they fulfilled the whole set of NMR restraints quite well, with no violations of the upper bounds of the distance restraints greater than 0.05 nm and no violations of dihedral angle restraints greater than 5°. The energies of the refined structures were all in the narrow range from −1877 to −2080 kJ·mol⁻¹; the maximal distance constraint violation was 0.050 ± 0.001 nm, and the average sum of distance constraint violations was 0.293 ± 0.001 nm. None of the selected structures presented additional short interproton distances not experimentally observed. Fig. 3A shows a superposition of all nine structures for the region covering residues Met1-Pro15, all compatible with the NOE data, since calculation of structures from NOEs is a means to assess possible and favored conformational states and not single structures. The convergence achieved over the well defined α helix from Gln4 to Leu14 was good, with a root mean square deviation for its backbone atoms of 0.090 nm. The region Pro15-Glu25 (Fig. 3B) did not converge to a consensus structure, reflecting the absence of structurally significant NOEs for this region of the polypeptide and the variability in its relative position with respect to the helix. The average root mean square deviation for backbone atoms of residues 16-25 is 0.293 nm. No unique conformation could be determined for the Met1-Asp2-Asp3 and the Ser26-Lys27 segments.
Effect of Point Mutations on the Two Internalization Signals-To elucidate the requirements for a structural context of the Leu7-Ile8 and Met16-Leu17 signals, we performed a set of point mutations on the INA fusion protein, changing several residues to alanine (Fig. 4). To monitor internalization from the plasma membrane, the transfected COS cells were incubated with ¹²⁵I-labeled antibodies on ice and chased at 310 K. In order to study one signal at a time, the region of the other signal was either deleted or destroyed by mutating Leu7 or Leu17 to Ala, and the time response for internalization of the bound antibody was measured as described earlier (15). The time to reach 25% internalization is noted in Fig. 4. Constructs were considered not actively internalized if they did not reach 15% internalization after 10 min, in analogy with INA constructs with both internalization signals deleted (15).
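The quantities used to score each construct (fraction internalized, time to reach 25% internalization, and the 15%-at-10-min criterion) can be sketched as follows. The function names are hypothetical, and the linear interpolation between chase time points is an assumption for illustration; the text does not state how the 25% time was read from the time course.

```python
def internalized_fraction(acid_resistant_cpm, bound_before_chase_cpm):
    """Antibody resistant to the low pH wash relative to antibody bound
    before the chase, as defined under Experimental Procedures."""
    return acid_resistant_cpm / bound_before_chase_cpm


def time_to_fraction(times_min, fractions, target=0.25):
    """Chase time (min) at which the target fraction is first reached.

    Assumption: linear interpolation between neighboring time points.
    Returns None if the target is never reached.
    """
    if fractions and fractions[0] >= target:
        return times_min[0]
    points = list(zip(times_min, fractions))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 < target <= f1:
            return t0 + (target - f0) * (t1 - t0) / (f1 - f0)
    return None


def actively_internalized(times_min, fractions, threshold=0.15, cutoff_min=10.0):
    """True if at least 15% is internalized within the first 10 min of chase."""
    return any(f >= threshold
               for t, f in zip(times_min, fractions) if t <= cutoff_min)
```

With a made-up time course of 0, 10, 40, and 60% internalized at 0, 2, 5, and 10 min, the interpolated time to 25% would be 3.5 min, and the construct would count as actively internalized.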
For the Leu7-Ile8 signal, point mutation of Arg5 and Asp6 to Ala (construct 6, Fig. 4) caused no detectable change in the internalization rate, whereas mutation of either Gln4 (construct 5) or Asp3 (construct 4) prevented internalization. Amino acids on the carboxyl-terminal side of the signal can be changed to alanine without affecting the internalization signal; mutation of the potentially phosphorylatable Ser9 (construct 9), and of Asn10 and Asn11 (construct 10), did not change the rate of internalization, in line with earlier studies (15).
For the second signal, Met16-Leu17, mutation of Pro15 to Ala (construct 21) abolished internalization, whereas mutation of neither Gln13 nor Leu14 (constructs 19 and 20, respectively) altered the internalization efficiency. Mutation of the negatively charged residue Glu12 (construct 15), however, reduced internalization to background level. For all the INA mutations, immunofluorescence studies showed a strong plasma membrane staining for the constructs that were not internalized and vesicular staining (V, localization column in Fig. 4) corresponding to endosomes for the constructs that were actively internalized. When native Ii harboring the identical cytosolic tail mutations as INA was expressed in COS cells, the corresponding Ii construct accumulated on the plasma membrane or in endosomes like the INA molecule (data not shown), verifying that the tetrameric NA is a reporter molecule that reflects the endosomal sorting properties of the trimeric native Ii molecule (15).
The construct lacking the first 11 residues (construct 18) reached 25% internalization in 4.0 min, whereas the construct in which the first signal was eliminated by point mutations (construct 14) reached 25% internalization in 2.0 min. This may indicate that amino-terminal residues in addition to those provided in construct 17 modulate the efficiency of the Met-Leu signal. Our biological data thus confirm that the cytosolic tail of Ii comprises two autonomous endosomal sorting signals that function in internalization (15, 16, 39) and in addition point out that a functional Leu-based sorting signal requires specific geometrically neighboring residues.

DISCUSSION
We have applied NMR spectroscopy to study the solution structure of a synthetic peptide corresponding to the cytosolic region of the protein Ii and containing the two internalization signals.
In aqueous solution at pH 7.4 we detected NOEs characteristic of a nascent helix involving the membrane distal motif Leu7-Ile8, while the membrane proximal motif Met16-Leu17 is part of a turn. In the presence of 20% methanol, the nascent helix was stabilized. We observed a regular α helix in the region Gln4-Leu14 and a segment of consecutive β turns between Leu14 and Ser26, namely a type I (Leu14-Leu17) and two type II (Arg20-Ala23 and Ala23-Ser26) β turns. The presence of a small NOE between the αCH of Leu14 and the NH of Leu17 suggests that the turn is actually part of the helix. Inspection of the calculated backbone and dihedral angles of the segment Leu14-Pro15-Met16-Leu17 shows that, except for minor variation in Leu14, all residues retain values appropriate for an α helix. The calculations thus indicate that the type I turn can be accommodated in the helix with minor conformational changes on Leu14 without the rest of the helical residues being perturbed. The resulting structure is an α helix from Gln4 to Leu17 with a bend at Pro15. Previous calculations on model helical polypeptides containing proline also confirm that it is sufficient to introduce a conformational change of only one residue in order to accommodate proline in a distorted helix (49, 50). Kinked proline α helices with minor conformational changes and minimal disruption of the helix hydrogen bonding have also been observed in crystal structures of proteins (51).
Comparison of the cytosolic tails of Ii from human, mouse, and rat shows that Pro15 is conserved (12), while in the chicken a Pro is present at site 17. This suggests that the proline conserved in that area might have a definite structural/functional role. It is noteworthy that mutation of Pro15 to Ala (construct 21, Fig. 4) abolishes internalization. Considering that Ala is the most helix-favoring of the 20 commonly occurring amino acids (52, 53), such a substitution is expected to preserve the helix while avoiding the bend. In fact, energy minimization calculations (not shown) on the Ala15 Ii-(1-27) peptide have found a regular helix from Gln4 to Leu17. Accordingly, it is tempting to speculate that a kinked helix is required for internalization of Ii.
FIG. 4. Cellular distribution and internalization rates of INA and its constructs. Mutations of amino acids into alanine are indicated by A, while dashes refer to conserved residues. Empty space indicates deletion within each construct. The peptide studied by NMR and corresponding to Ii-(1-27) is indicated by a continuous line under the sequence numbers. Vesicular staining and plasma membrane localization are indicated by V and PM, respectively. For the internalization time, NI means not internalized, indicating that less than 15% of construct molecules were internalized after a 10-min chase at 310 K. Each value of internalization rate is the mean of at least three independent experiments.
Our biological data show that the leucine motifs are influenced by residues located on their amino-terminal side, whereas Ser9, Asn10, and Asn11, which follow the first motif, can be changed to alanine without affecting the internalization signal. This is in line with other studies showing that the signals work independently of the carboxyl-terminal residues (8-10). In addition, amino acids spatially close to the signals are fundamental for their correct functioning, and this can be rationalized by referring to the three-dimensional structure of Ii-(1-27). The presence of Ala instead of Gln4 hampers efficient internalization (Fig. 4). The helical wheel diagram (Fig. 5) indicates that Gln4 is positioned on the same side of the helix as Leu7 and Ile8, so that its mutation alters the surroundings of the signal. Accordingly, if residues pointing away from the signal are changed, the mutation is expected to be irrelevant. In fact, mutation of Arg5 and Asp6, which are found on the opposite side of the helix (Fig. 5), does not alter the sorting capabilities of the molecule. Mutation of the negatively charged Asp3 into Ala also abolishes internalization. This may be related to the need for a negatively charged residue on the amino-terminal side of the signal (see below) and/or to the specific ability of Asp to stabilize α-helical structures when flanking the amino terminus (54, 55). Unlike the Asp side chain, the side chain of Ala at site 3 cannot form a hydrogen bond with the free NH groups at the amino terminus of the helix, resulting in a strong decrease of helicity, which may prevent internalization. An indirect confirmation of the relevance of the second hypothesis would be the finding that Asp3 becomes part of the helix when the concentration of Ii-(1-27) is increased.³ Altogether, our results suggest that for correct functioning, the first signal requires the specific structural context generated by the helix and the amino-terminal residues with a specific spatial arrangement around Leu7-Ile8.
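The helical wheel argument can be illustrated with a short calculation. The following Python sketch is only an idealized approximation, not the procedure used to draw Fig. 5: it assumes a perfectly regular α helix with 100° of rotation per residue (3.6 residues per turn), whereas the peptide in water forms only a nascent helix. The function name `offset_from_motif` and the choice of the Leu7-Ile8 midpoint as the reference direction are assumptions made for illustration.

```python
# Idealized helical wheel: a regular alpha helix turns 100 degrees per residue
# (3.6 residues per turn). This is an assumption; the real peptide is less regular.
DEGREES_PER_RESIDUE = 100.0

# Residues of the helix around the first signal, as named in the text.
RESIDUES = {4: "Gln", 5: "Arg", 6: "Asp", 7: "Leu", 8: "Ile"}


def offset_from_motif(position, motif_start=7):
    """Angular distance (0-180 degrees) between a residue and the midpoint
    of the adjacent pair (motif_start, motif_start + 1) on the wheel."""
    # With motif_start placed at 0 degrees, the next residue sits at 100 degrees,
    # so the midpoint of the pair lies at 50 degrees.
    midpoint = DEGREES_PER_RESIDUE / 2.0
    angle = ((position - motif_start) * DEGREES_PER_RESIDUE) % 360.0
    diff = abs(angle - midpoint)
    return min(diff, 360.0 - diff)


if __name__ == "__main__":
    for pos in sorted(RESIDUES):
        print(f"{RESIDUES[pos]}{pos}: {offset_from_motif(pos):.0f} degrees from the Leu7-Ile8 midpoint")
```

Under these assumptions Gln4 lies within 10° of the Leu7-Ile8 midpoint, while Arg5 and Asp6 lie 110° and 150° away, which is consistent with the side-of-helix assignments described above.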
The segment Leu14-Pro15-Met16-Leu17 takes up a turn and forms the kinked part of the helix. Point mutations reveal that, except for Leu14 (construct 20 in Fig. 4), all the other residues must remain unaltered for a functional second signal. When Pro15, Met16, or Leu17 was mutated (constructs 21, 22, and 23, respectively), no internalization was observed. It must be noted that substitution of residues within the helix with Ala does not modify the helix; rather, it alters the chemical nature of the environment of the signal. In fact, only the substitution of Ala for Pro has a structural explanation, since it destabilizes the turn (56) and/or removes the kink from the helix (see above), while both Ala and the native residues Met and Leu at sites 16 and 17 have low preferences for turns (56). As a partial confirmation of the relevance of the chemical properties of the side chains, we observed that the substitution of Glu12 with Ala reduced internalization through the second signal to background level. This finding points to the necessity of a negatively charged residue on the amino-terminal side of the second signal, suggesting as a putative consensus signal for Ii two hydrophobic residues and a negatively charged amino-terminal residue: Asp3(Glu12)-Xaa-Xaa-Xaa-Leu7(Met16)-Ile8(Leu17). The two signals might work differently at various stages of the intracellular pathway, requiring specific structural contributions. For Ii and the fusion protein INA, it is not clear whether they are sorted to the endosomal pathway solely via the plasma membrane and/or directly to endosomes from the TGN. The most likely interpretation of the accumulated data is that both pathways are functional (2). Since endosomal localization is not detected when active internalization is destroyed, this would lead to the conclusion that both pathways are affected by the same point mutations. The sorting machinery at the TGN and the plasma membrane might thus see elements of the same sorting signal or require common structural elements.
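The putative consensus Asp3(Glu12)-Xaa-Xaa-Xaa-Leu7(Met16)-Ile8(Leu17) can be written as a simple sequence pattern. The Python sketch below is a literal reading of that consensus in one-letter code, not a validated predictor of sorting signals. The names `CONSENSUS`, `find_candidate_signals`, and `TAIL_FRAGMENT` are hypothetical, and the fragment contains only the residues named in this section, with X as a placeholder for positions 1, 2, and 13, which are not specified here.

```python
import re

# Putative consensus: (Asp|Glu)-Xaa-Xaa-Xaa-(Leu|Met)-(Ile|Leu).
# The lookahead makes overlapping candidates visible as separate matches.
CONSENSUS = re.compile(r"(?=([DE]...[LM][IL]))")

# Residues 1-17 as named in the text; X marks positions not specified
# in this section (an illustrative placeholder, not sequence data).
TAIL_FRAGMENT = "XXDQRDLISNNEXLPML"


def find_candidate_signals(sequence):
    """Return (1-based start position, matched hexapeptide) for each match."""
    return [(m.start() + 1, m.group(1)) for m in CONSENSUS.finditer(sequence)]


if __name__ == "__main__":
    for start, hexapeptide in find_candidate_signals(TAIL_FRAGMENT):
        print(f"candidate signal starting at residue {start}: {hexapeptide}")
```

On this fragment the pattern picks out the two stretches starting at Asp3 and Glu12. Replacing either acidic residue with A in the string removes the corresponding match, which mirrors the Asp3 and Glu12 point mutations only at the level of the pattern.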
The Ii protein assembles as a trimer (57), and the luminal domain encoded by exon 6 is essential for this process both in vitro (58) and in vivo.⁴ Recently, Arneson and Miller (59) suggested that a single Ii tail is needed for internalization from the plasma membrane, whereas at least two intact tails in the complex are needed for efficient TGN to endosomal targeting. The NMR structure reported above is based on the monomeric form of Ii and appears to be biologically relevant, although sorting at some stages of the intracellular pathway may require more than one tail. Preliminary NMR results⁵ on more concentrated samples of Ii-(1-27) indicate that the peptide starts to self-associate at a concentration 30-fold higher than that used in the present study and that the overall structure of the monomeric Ii-(1-27) described above is preserved. Further studies are thus required to elucidate whether the tail forms multimers and whether this influences the structure of the single molecule; however, the study presented here is certainly of interest for future modeling of the oligomer. Similar to our results, NMR data on a 22-residue peptide corresponding to the amino-terminal cytosolic tail of the native low density lipoprotein receptor (4), containing a Tyr in the internalization signal, indicate a nascent helix upstream of a turn containing a Pro at position 2. Comparing our results with those of Bansal and Gierasch (4) and Wilde et al. (7) on Tyr-containing signals, we suggest that Tyr and Leu signals might share a common structural model. Although the nature of the specific molecules that are able to recognize leucine signals is not known, the structural similarity of the helix suggests that the site of interaction may be similar for the two signals. The sorting signals can be found in both type I and type II membrane proteins and can satisfactorily be transplanted from one molecule to another (for reviews, see Refs. 1, 6, and 8), suggesting that they are possibly recognized by cytosolic proteins that are not membrane bound or otherwise have a distinct direction with regard to the plasma membrane.