Lassa Virus Glycoprotein Signal Peptide Displays a Novel Topology with an Extended Endoplasmic Reticulum Luminal Region

Lassa virus glycoprotein C (GP-C) is translated as a precursor (preGP-C) into the lumen of the endoplasmic reticulum (ER) and cotranslationally cleaved into the signal peptide and immature GP-C before GP-C is proteolytically processed into its subunits, GP-1 and GP-2, which form the mature virion spikes. The signal peptide of preGP-C comprises 58 amino acids and contains two distinct hydrophobic domains. Here, we show that each hydrophobic domain alone can insert preGP-C into the ER membrane. Furthermore, we demonstrate that the native signal peptide uses only the N-terminal hydrophobic domain for membrane insertion. It thereby exhibits a novel type of signal peptide topology with an extended ER luminal part, which is essential for proteolytic processing of GP-C into GP-1 and GP-2.

Secretory proteins use N-terminal signal peptides to interact with the translocon at the ER membrane (1-3). Signal peptides are essential for the translocation and membrane insertion of secretory and membrane resident proteins. They are cleaved by ER resident signal peptidase or remain uncleaved and serve as membrane anchors. Signal peptides display a tripartite structure as follows: (i) an N-terminal region of various lengths, usually comprising positively charged amino acids; (ii) a hydrophobic region with 7-15 amino acid residues; and (iii) a short polar C-terminal stretch with small uncharged residues in positions −3 and −1, determining the signal peptide cleavage site (4).
The mechanism by which a signal sequence adopts a particular topology in the ER membrane still remains largely unknown. Usually, signal peptides have an N_cyt-C_lum orientation to the ER membrane (5), but it is also possible that the N terminus of the signal peptide is translocated into the lumen of the ER (6). Analysis of many transmembrane domains suggested that the residues flanking the hydrophobic region of the N-terminal signal anchor determine which topology is realized (6-8). It was proposed that the net charges of the segments flanking the transmembrane sequence dictate its orientation, leaving the more positive segment cytoplasmically oriented (9). This "positive inside rule" is consistent with the topology of most proteins integrated into bacterial membranes, which is, in part, explainable by the negative inside electrical charge across the bacterial cytoplasmic membrane (10). However, this rule has not been successful in determining the topology of eukaryotic secretory proteins (11-13). Based on statistical analysis, Hartmann et al. (7) proposed the "charge difference hypothesis," stating that the orientation of the signal anchor at the N terminus is defined by the net charge in the 15 amino acids on either side of the hydrophobic core of the sequence; this hypothesis remains controversial, however (14).
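The flank-charge comparison underlying the charge difference hypothesis can be illustrated with a short calculation. The following is a minimal sketch under stated assumptions, not the procedure of Hartmann et al. (7): Lys and Arg are counted as +1, Asp and Glu as −1, histidine and the terminal groups are ignored, and the function names and the toy sequence are hypothetical.

```python
# Assumption: a simple +1/-1 scheme; His and the terminal groups are ignored.
POSITIVE = set("KR")   # Lys, Arg
NEGATIVE = set("DE")   # Asp, Glu


def net_charge(segment: str) -> int:
    """Net charge of a peptide segment under the +1/-1 scheme above."""
    return sum((aa in POSITIVE) - (aa in NEGATIVE) for aa in segment.upper())


def flank_charges(seq: str, core_start: int, core_end: int, flank: int = 15):
    """Return (N-flank charge, C-flank charge, C minus N).

    core_start and core_end are 1-based, inclusive positions of the
    hydrophobic core; up to `flank` residues on either side are scored.
    """
    n_flank = seq[max(0, core_start - 1 - flank):core_start - 1]
    c_flank = seq[core_end:core_end + flank]
    n, c = net_charge(n_flank), net_charge(c_flank)
    return n, c, c - n


if __name__ == "__main__":
    # Hypothetical toy sequence: 7 flanking residues, a 12-residue
    # hydrophobic core (positions 8-19), and 7 more flanking residues.
    toy = "MKRSAEE" + "L" * 12 + "GDEKKKR"
    print(flank_charges(toy, 8, 19))
```

The sign of the third value indicates which flank is more positive; as noted above, such a difference alone has not reliably predicted the orientation of eukaryotic signal sequences.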
The current view is that it is not only the net charges flanking the hydrophobic anchor that control the orientation in which a hydrophobic region is inserted into the ER membrane. Rather, the balance of sequence and flanking region properties such as hydrophobicity, charge or conformation of the hydrophobic region, and adjacent sequences dictate the orientation of the polypeptide at the translocon (13,15,16).
The Lassa virus is a member of the Arenaviridae, a family including the less pathogenic lymphocytic choriomeningitis virus (LCMV) as well as highly pathogenic members like the Junin virus. Lassa virus is endemic in West Africa, with ~100,000-500,000 infections occurring annually, of which around one-third result in illnesses ranging from flu-like symptoms to fulminant hemorrhagic fever with mortality rates of up to 30% (17). To date, no vaccine exists, and only an insufficient ribavirin therapy is available for treatment.
Lassa virions consist of a nucleocapsid surrounded by a lipid envelope in which viral glycoprotein spikes are embedded. The glycoprotein is synthesized as a 76-kDa precursor (preGP-C). The N-terminal portion of the Lassa virus preGP-C contains a signal peptide of highly extended length comprising 58 residues (18). We have shown recently that the signal peptide is essential for further proteolytic processing of the glycoprotein (19). After cotranslational cleavage of the signal peptide, GP-C is posttranslationally cleaved after nonbasic residues into the distal N-terminal subunit GP-1 and the C-terminal membrane-anchored GP-2 by the subtilase SKI-1/S1P (20,21) (Fig. 1A).
In this study, we demonstrate that the GP-C signal peptide contains two independent hydrophobic domains that can be used for ER translocation of preGP-C. However, topological studies show that only the N-terminal hydrophobic domain of the signal peptide of preGP-C is inserted in the ER membrane, whereas the extended C region is essential for the maturation cleavage of GP-C into GP-1 and GP-2.
FIG. 1. A, schematic overview of the Lassa virus glycoprotein. The primary translation product preGP-C (aa 1-491), the signal peptide SP (aa 1-58), the precursor glycoprotein GP-C (aa 59-491), the distal subunit GP-1 (aa 59-259), and GP-2 (aa 260-491) containing the membrane anchor (aa 427-450; stripes) are shown. The antiserum binding sites, Rb-α-SP (aa 2-18) and Rb-α-GP2-N (aa 259-277), the signal peptidase (SPase) cleavage site between threonine residues 58 and 59 (arrow), the SKI-1/S1P cleavage site C-terminal of leucine 259 (arrow), and putative N-glycosylation sites (Y-like symbols) are indicated. B, the total SP sequence is shown in one-letter amino acid code. Hydrophobic domains (h1) and (h2) within the signal peptide are underlined. Net charges and hydrophobicity of signal peptide domains are illustrated. C, deletion of hydrophobic domains within the Lassa virus GP-C signal peptide. Vero cells were transfected with mutant Δ18-32, mutant Δ43-52, and mutant Δ18-32/43-52 of recombinant Lassa virus glycoprotein preGP-C and the vector pCAGGS for mock transfection (M). WT, wild type. Protein samples were treated with PNGase F if indicated, separated on 10% (upper panel) or 12% (lower panel) acrylamide gels by electrophoresis, blotted onto polyvinylidene difluoride membrane, and immunostained using the antiserum Rb-α-GP2-N. The deglycosylated form of GP-C is marked with an asterisk (GPC*).

EXPERIMENTAL PROCEDURES
Vectorial Expression of Lassa Virus Glycoprotein and Mutagenesis-The genes of the full-length glycoprotein (wild type/preGP-C) and of the signal peptide of GP-C (Lassa virus strain Josiah) were expressed using the β-actin promoter-driven pCAGGS vector (18-22). Lassa virus preGP-C mutants and the signal peptide, or the mutants thereof, were generated by recombinant PCR techniques (23). A list of the respective oligonucleotides will be made available on request. Sequences were confirmed by DNA sequencing. Vero cells were transfected with wild type and mutated recombinant DNA using LipofectAMINE 2000 (Invitrogen).
Antibodies-Antiserum Rb-α-GP2-N was raised by immunization of a rabbit with a chemically synthesized peptide homologous to the N terminus of GP-2 (amino acid positions 259-279), which was covalently cross-linked to keyhole limpet hemocyanin (KLH; Pierce) as a carrier protein, as described previously for antibodies Rb-α-SP and Rb-α-GP2 (18-20). A monoclonal (FLAG M2) and a polyclonal antiserum directed against the FLAG epitope were purchased from Sigma. A polyclonal calnexin antiserum was purchased from StressGen Bioreagents.
Pulse-Chase Experiments and Immunoprecipitation-Plasmid-transfected Vero cells were starved 20 h post transfection for 1 h with Dulbecco's modified Eagle's medium lacking methionine and cysteine before cells were labeled with 10 μCi [35S]methionine and [35S]cysteine Premix (Amersham Biosciences) for 30 min. The radioactive medium was then replaced by Dulbecco's modified Eagle's medium during a 2-h chase or various chase times as indicated. The labeled cells were lysed in radioimmune precipitation assay buffer containing 1% Triton X-100, 1% deoxycholate, 0.1% SDS, 5% Trasylol, 150 mM sodium chloride, 20 mM Tris, and 10 mM EDTA, pH 8.5, and sonicated (40 watt; Branson sonifier). Non-soluble material was removed by centrifugation (20,000 × g for 30 min), and supernatants of the cell lysates were incubated overnight with protein A-Sepharose coupled to the desired antibodies. Immunoprecipitated proteins were analyzed by SDS-PAGE followed by autoradiography on BioMax films (Kodak) (18,19).
Selective Membrane Permeabilization and Immunocytochemistry-Vero cells were grown on coverslips and LipofectAMINE-transfected with appropriate plasmid constructs for protein expression. Twenty hours after transfection, cells were washed and either incubated for 5 min at 4°C with intracellular buffer (20 mM HEPES buffer, 110 mM potassium acetate, 5 mM sodium acetate, 2 mM magnesium acetate, and 5 mM EGTA, pH 7.3) containing digitonin (2 μg/ml) for selective permeabilization of the plasma membrane or incubated for 5 min at room temperature with acetone/methanol (1:1) (v/v) for total permeabilization of membranes. Cells were subsequently washed and incubated for 1 h with 1:200 diluted primary rabbit antibodies followed by incubation for 45 min with 1:400 diluted anti-rabbit antibody from goat coupled to rhodamine (Dianova, Hamburg, Germany). Protein expression of cells was examined using an immunofluorescence microscope (Axiophot, Zeiss).

RESULTS

Topology of the Signal Peptide-The hydropathy plot method of Kyte and Doolittle (26) predicts two hydrophobic transmembrane domains, designated h1 and h2, within the signal peptide ranging from positions 18 to 32 and 43 to 52, respectively (Fig. 1B). To determine which of the hydrophobic domains has an ER translocation capacity, each was deleted individually, and both were deleted together (Fig. 1C). Deletion of only one hydrophobic domain, mutant Δ18-32 or Δ43-52, still led to the N-glycosylated viral glycoprotein GP-C as confirmed by deglycosylation (Fig. 1C, lanes 3, 4, 6, and 7), whereas deletion of both domains (mutant Δ18-32/Δ43-52) resulted in the nonglycosylated form of preGP-C (Fig. 1C, lane 5). The absence of N-glycans indicates that the mutated preGP-C, which lacks both hydrophobic domains, is not translocated into the ER lumen, whereas deletion of only one hydrophobic domain did not prevent translocation. Accordingly, mutant Δ18-32/Δ43-52 migrates slower than mutant Δ18-32 and wild type preGP-C, because signal peptide cleavage does not occur within this mutant (Fig. 1C, lanes 5, 6, and 8). Mutant Δ43-52 shows incomplete signal peptide cleavage (lane 7), most likely because the hydrophobic domain that is required in proximity to the signal peptide cleavage site for correct processing is missing (27).
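A sliding-window hydropathy scan of the kind referred to above can be sketched as follows. The per-residue values are the published Kyte-Doolittle indices; the window length, the threshold, and the function names are illustrative assumptions, because the settings used for Fig. 1B are not stated. The Pichinde virus sequence from Table I serves as example input.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Mean hydropathy of every window of `window` consecutive residues."""
    scores = [KD[aa] for aa in seq.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


def hydrophobic_stretches(seq, window=9, threshold=1.6):
    """1-based inclusive (start, end) ranges covered by windows whose
    mean hydropathy exceeds `threshold`; overlapping windows are merged."""
    stretches = []
    for i, mean in enumerate(hydropathy_profile(seq, window)):
        if mean <= threshold:
            continue
        start, end = i + 1, i + window
        if stretches and start <= stretches[-1][1] + 1:
            stretches[-1] = (stretches[-1][0], end)   # extend previous stretch
        else:
            stretches.append((start, end))
    return stretches


if __name__ == "__main__":
    # Pichinde virus signal peptide region as listed in Table I.
    pichinde = ("MGQVVTLIQSIPEVLQEVFNVALIIVSTLCIIKGFVNLMRCGLFQLITFLILAGRSCDGM")
    # Window and threshold are assumptions chosen for illustration only.
    print(hydrophobic_stretches(pichinde, window=9, threshold=1.6))
```

Shorter windows resolve closely spaced hydrophobic segments at the cost of a noisier profile, so the reported boundaries depend on the chosen settings.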
As a result, the following three models are conceivable for showing how both hydrophobic domains could interact with the ER membrane (Fig. 2): (i) a double spanning transmembrane model in which the N and C termini of the signal peptide are within the ER lumen, and both hydrophobic domains (h1 and h2) serve as membrane anchors; (ii) a model with a long N-terminal cytoplasmic peptide segment that includes the N-terminal hydrophobic domain, h1, with the hydrophobic domain h2 as membrane anchor; and (iii) a model with a short N-terminal region in which the hydrophobic domain h1 serves as transmembrane anchor and the second hydrophobic domain is present within the ER lumen.
Several different experimental approaches were performed to determine the topology of the signal peptide in the ER membrane. First, we applied an immunocytochemical method that was successfully used previously for determination of the topology of the hepatitis C envelope protein (28). We inserted an HA tag (YPYDVPDYA) as an antigenic epitope at different sites of the Lassa virus glycoprotein signal peptide (Fig. 3A). Cell cultures were transfected with the three different HA-tagged signal peptide constructs. All signal peptide mutants were detected after permeabilization of all cellular membranes by acetone-methanol treatment using immunofluorescence. In contrast, treatment of the cells with digitonin, which selectively permeabilizes plasma membranes but not ER membranes, resulted in fluorescence of only the N-terminally HA-tagged construct (Fig. 3A). This method allows us to distinguish between the HA epitopes of the Lassa preGP-C signal peptide, which are present on the cytosolic or the luminal side of the ER membrane. Mutant SP-HA-N indicates that the HA-epitope is oriented toward the cytoplasm, suggesting that only one hydrophobic domain is membrane-anchored in the native signal peptide. The other HA-tagged mutants, mutant SP-HA-M, carrying the HA-epitope between both hydrophobic domains, and mutant SP-HA-C, with the HA tag after the second hydrophobic region, indicate that their antigenic epitopes are not accessible for immunostaining in digitonin-permeabilized cells and, therefore, face the ER lumen. Using another antigenic epitope (FLAG tag) instead of the HA tag, we obtained the same results (data not shown). As controls for correct permeabilization, cells expressing the ER marker calnexin show, as expected, no fluorescence signal in digitonin-permeabilized cells, but a fluorescence signal is shown in acetone/methanol-treated cells using antiserum directed against the ER luminal part of calnexin.
Secondly, N-glycosylation of a potential N-glycosylation site inserted into a predicted ER luminal amino acid sequence was investigated to substantiate the topology of the signal peptide. As N-glycosylation only occurs in a hydrophilic environment but not in the proximity of hydrophobic regions, attachment sites for N-glycosylation were inserted between the hydrophobic domains h1 and h2 (mutant SP-Glyc-M) flanked by an HA tag and a FLAG tag as spacers. Indeed, the signal peptide of mutant SP-Glyc-M shifted to a band near 14 kDa, which disappears after PNGase F treatment, indicating that the region between both hydrophobic domains is exposed to the ER lumen and can thus be glycosylated (Fig. 3B, lanes 2 and 3). To exclude the possibility that the insertion of the charged residues of the HA- or FLAG-tagged epitope causes a switch in the topology, the charge of the wild type signal peptide was maintained by amino acid exchange in mutant SP-Glyc-M2. This mutant shows the same glycosylation pattern as mutant SP-Glyc-M, indicating that the topology of the signal peptide was not affected by the insertion of charges in this region (Fig. 3B, lanes 4 and 6). As negative controls, two mutants were generated with an N-glycosylation site inserted at the N terminus flanked by a FLAG tag (mutant SP-Glyc-N) or simply with an N-glycosylation site near the N terminus of the signal peptide (mutant SP-Glyc-N2), respectively (Fig. 3B). Both control mutants, SP-Glyc-N and SP-Glyc-N2, do not show any band shift in SDS-PAGE before or after PNGase F treatment, confirming that the N-terminal N-glycosylation site is cytoplasmically orientated and, therefore, not accessible for N-glycan attachment (Fig. 3B, lanes 6-9). The differences in electrophoretic mobility between all mutants (Fig. 3B, lanes 3, 5, 7, and 9) are due to the differing lengths of their respective insertions.
Interestingly, insertion of the charged epitopes did not alter the topology of any of the signal peptide mutants. Alternative attempts to control the topology of the signal peptide by exchanging the charges surrounding the first hydrophobic region did not result in a switch of the topology either (data not shown).
The Lassa virus signal peptide forms dimers when expressed in eukaryotic cells. It contains three cysteine residues, which could play a role in dimer formation. Two are located at the C terminus of the signal peptide (residues Cys-53 and Cys-57), whereas one is located between the h1 and h2 domains (residue Cys-41). To prove, by using a third method, that the signal peptide is inserted via the h1 domain leaving a long C-terminal part facing the ER lumen, we analyzed the role of cysteine 41 for dimerization. For this purpose, a signal peptide mutant was constructed in which the cysteine residues Cys-53 and Cys-57 were mutated to serine (SP C53S/C57S-FLAG-N). Fig. 3C shows that this mutant still forms covalently linked dimers under non-reducing conditions, indicating that the region between both hydrophobic regions is ER luminal and, thus, accessible for disulfide linkage. To avoid disulfide formation caused by artificial oxidation of free sulfhydryl groups by air, N-ethylmaleimide was added to the lysis buffer at a final concentration of 20 mM. Furthermore, a mutant with all cysteine residues changed to serine (SP C41S/C53S/C57S-FLAG-N) does not form covalently linked dimers, indicating that dimerization is strictly disulfide-dependent. Finally, a mutant with a cysteine residue introduced at the N terminus and all other cysteine residues changed to serine (SP G2C/C41S/C53S/C57S-FLAG-N) also shows no disulfide formation.
Taking our topological data together, we show that the Lassa virus glycoprotein signal peptide possesses a novel topology for signal peptides, as depicted in Fig. 2C, consisting of an N-terminal cytoplasmic peptide segment (17 amino acid residues) followed by a transmembrane domain (h1) comprising the amino acids from 18 to 32 and an exceptionally extended C-terminal, ER luminal region comprising amino acids 33 to 58, including the second hydrophobic domain (h2).
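The way in which the localization data discriminate between the three conceivable models can be written out as a small consistency check. This is only a sketch: the region names, the dictionary layout, and the function name are hypothetical, and the cytosolic placement of the h1-h2 linker in model (i) is inferred from the requirement that both termini be luminal.

```python
CYTOSOL, MEMBRANE, LUMEN = "cytosol", "membrane", "lumen"

# Regions: N (aa 1-17), h1 (aa 18-32), linker (aa 33-42),
# h2 (aa 43-52), C (aa 53-58). Compartment per region for each model.
MODELS = {
    "i":   {"N": LUMEN,   "h1": MEMBRANE, "linker": CYTOSOL,
            "h2": MEMBRANE, "C": LUMEN},
    "ii":  {"N": CYTOSOL, "h1": CYTOSOL,  "linker": CYTOSOL,
            "h2": MEMBRANE, "C": LUMEN},
    "iii": {"N": CYTOSOL, "h1": MEMBRANE, "linker": LUMEN,
            "h2": LUMEN,    "C": LUMEN},
}

# Observed compartments: the N-terminal tag is stained after digitonin
# treatment (cytosolic); the tags between h1 and h2 and at the C terminus
# are not (luminal); the linker is also glycosylated and disulfide-linked.
OBSERVATIONS = {"N": CYTOSOL, "linker": LUMEN, "C": LUMEN}


def consistent_models(models, observations):
    """Names of the models that agree with every observed region."""
    return [name for name, layout in models.items()
            if all(layout[region] == compartment
                   for region, compartment in observations.items())]


if __name__ == "__main__":
    print(consistent_models(MODELS, OBSERVATIONS))   # prints ['iii']
    # The C-terminal tag alone cannot discriminate between the models.
    print(consistent_models(MODELS, {"C": LUMEN}))   # prints ['i', 'ii', 'iii']
```

The second call illustrates why probes at the N terminus and in the h1-h2 linker are the informative ones: all three models place the C terminus in the lumen.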
Role of the Topology for GP-C Cleavage-We have shown recently that the signal peptide of the Lassa virus glycoprotein GP-C is essential for the maturation cleavage of the glycoprotein as a trans-acting maturation factor (19). The signal peptide topology suggests that its C-terminal portion facing the ER lumen is responsible for the interaction with GP-C. Therefore, the N-terminal region of the signal peptide up to the transmembrane segment, which is cytoplasmically orientated, may well be dispensable for the cleavage of GP-C into GP-1 and GP-2. To prove this assumption, three preGP-C deletion mutants (mutant Δ2-17, truncated from the N terminus to the transmembrane domain; mutant Δ2-30; and mutant Δ2-41, truncated until position 41, leaving the second hydrophobic domain intact) were constructed and vectorially expressed in Vero cells (Fig. 4). Western blot analysis of wild type and expressed truncated preGP-C mutants shows that deletion of the N terminus of the signal peptide up to the transmembrane segment, h1, has no effect on GP-C maturation (Fig. 4, lanes 2 and 3). Further truncations of up to 41 residues abolished cleavage of GP-C by SKI-1/S1P, whereas translocation was not affected (Fig. 4, lanes 4 and 5). These results strengthen the topology model and further demonstrate that the interaction between the signal peptide and GP-C occurs within the ER lumen. The N-terminal cytoplasmic region of the signal peptide is dispensable for GP-C cleavage into GP-1 and GP-2.
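Which parts of the two hydrophobic domains survive each N-terminal truncation follows directly from the residue numbering given above. The short sketch below tabulates this; the helper name and the data layout are hypothetical, whereas the domain boundaries (h1, aa 18-32; h2, aa 43-52) and the deletions are those stated in the text.

```python
H1 = range(18, 33)   # h1, aa 18-32 (inclusive)
H2 = range(43, 53)   # h2, aa 43-52 (inclusive)

# Deletion mutants as 1-based, inclusive (first, last) positions.
TRUNCATIONS = {"Δ2-17": (2, 17), "Δ2-30": (2, 30), "Δ2-41": (2, 41)}


def remaining(domain, deletion):
    """Positions of `domain` that are not removed by `deletion`."""
    first, last = deletion
    return [pos for pos in domain if not (first <= pos <= last)]


if __name__ == "__main__":
    for name, deletion in TRUNCATIONS.items():
        print(name, len(remaining(H1, deletion)), len(remaining(H2, deletion)))
    # Δ2-17 15 10
    # Δ2-30 2 10
    # Δ2-41 0 10
```

Only Δ2-17 leaves h1 complete, whereas h2 is untouched in all three mutants, in line with translocation being retained throughout.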

DISCUSSION
The data presented in this study show that the signal peptide of the Lassa virus glycoprotein possesses two independent hydrophobic domains, h1 and h2. Both have the potential to mediate membrane insertion and translocate GP-C into the ER lumen. However, only the N-terminal hydrophobic region (h1) is actually used as a membrane anchor and directs the glycoprotein into the exocytic pathway at the ER translocon. The data presented here provide evidence for the signal peptide topology model of Fig. 2C with a 17-amino acid-long N-terminal cytoplasmic peptide segment, the transmembrane domain h1, and an extended C-terminal region comprising the second hydrophobic region h2. This signal peptide topology is based on three independent analyses. (i) Only the N-terminal part of the recombinant signal peptide is immunocytochemically detectable on the cytoplasmic side of the ER membrane. (ii) Only the C-terminal peptide segment (aa 33-58) is glycosylated if an N-glycosylation motif is inserted. (iii) The cysteine residue Cys-41 between both hydrophobic regions forms covalently linked dimers.
The extended C-terminal segment of the signal peptide can be subdivided into three smaller regions: (i) the C-terminal hydrophobic domain h2; (ii) an h1-h2 connecting sequence (aa 33-42), which flanks h2 at its N terminus; and (iii) a short C-terminal peptide segment. The hydrophobic domain h2 may be either freely available in the ER lumen or attached to the inner leaflet of the ER membrane. Membrane association of h2 seems plausible because of its hydrophobic nature and because recent findings concerning the LCMV GP-C signal peptide, which is homologous to the Lassa GP-C signal peptide, suggest that the peptide is highly resistant to protease digestion, implying a very compact structure (29). The hydrophobic region h2 is probably necessary for correct signal peptidase processing, as cleavage occurs only 2-9 residues after a hydrophobic stretch (30). Although the hydrophobic domain h2 might be inserted into the inner leaflet of the ER membrane after signal peptide cleavage from the glycoprotein, the backward folding of the signal peptide via two hydrophobic membrane-spanning domains with the N and C termini facing the cytoplasm is excluded by the fact that the C terminus contains the signal peptide cleavage site and is thus located in the ER lumen. This is also in accordance with our results showing that a C-terminally HA-tagged signal peptide is immunocytochemically inaccessible in cells with non-permeabilized ER membranes. This proposed topology model appears plausible, because signal peptides are directed to the translocon in a signal recognition particle (SRP)-dependent manner. The SRP binds directly to the hydrophobic region of the nascent chain emerging from the ribosome. Translation is then halted, the SRP docks to its receptor at the ER translocon, the hydrophobic region is inserted into the membrane, and the elongation of the nascent chain continues into the ER lumen.
As the h1 region was shown to be sufficient for ER translocation of the glycoprotein in this study, it seems highly unlikely that the SRP would dissociate from the previously recognized h1 region to bind to the h2 region.
To confirm that insertion of the charged epitopes does not cause a switch in the topology of the signal peptide that would lead to an artificial insertion, several controls were implemented. First of all, the HA epitope was replaced by other epitopes of variable charge, showing that the topology was not affected by simple charge alteration. Additionally, although the net charge in the respective regions was altered by the negatively charged HA epitope, the charge relation was still consistent with the wild type situation, leaving the N region more acidic than the C region for all mutants. Furthermore, in the case of mutants SP-HA-N and SP-FLAG-N, the insertion is 16 residues away from the h1 region and, as a result, not within a distance to influence membrane insertion (7). Finally, these mutants are still functional in regard to the proteolytic processing of Lassa GP-C into GP-1 and GP-2, which would be impossible with an altered topology. The effect of charge alterations between both hydrophobic regions was examined with the glycosylation mutants SP-Glyc-M and SP-Glyc-M2. Neutralization of all charges inserted with the epitopes still leads to a glycosylated signal peptide, indicating that the introduced charges did not affect the topology of the signal peptide.
The observation that the residue Cys-41 in the h1-h2 linker region is involved in disulfide formation is consistent with the previous results concerning the proposed topology model. However, the necessity of disulfide linkage of the signal peptide for glycoprotein maturation remains unclear. Table I shows that this cysteine residue is not conserved among the signal peptides of arenavirus glycoproteins in contrast to cysteine residue Cys-57, which is conserved. As a result, disulfide-dependent dimer formation of the signal peptide does not strictly have to be dependent on the cysteine residue Cys-41 but might also be accomplished with residue Cys-57. Studies to elucidate whether dimerization of the Lassa virus glycoprotein signal peptide is a prerequisite for further maturation of the glycoprotein are in progress. In conclusion, three independent analyses of the signal peptide topology show that the region between both hydrophobic regions is already ER luminal, which clearly points to the topology model of Fig. 2C, strictly excluding the other models.
The amino acid sequence of the Lassa virus glycoprotein signal peptide is highly conserved among the glycoproteins of all known arenaviruses including that of LCMV (Table I). To date, the signal peptide of the LCMV glycoprotein has been considered to be a classical signal peptide with an extended N-terminal region (31). However, because the signal peptides of the Lassa virus and LCMV glycoproteins are homologous and functionally interchangeable (19), it is conceivable that the signal peptides of all arenaviral glycoproteins, including LCMV GP-C, display the same topology as we found for the Lassa virus glycoprotein, exhibiting an extended C-region.
TABLE I. Alignment of arenavirus glycoprotein signal peptide sequences (partially recovered; the label of the first row is missing)
(label missing) MGQFISFMQEIPTFLQEALNIALVAVSLIAIIKGIVNLYKSGLFQFFVFLVLAGRSCTEE 60
Pichinde        MGQVVTLIQSIPEVLQEVFNVALIIVSTLCIIKGFVNLMRCGLFQLITFLILAGRSCDGM 60
Consensus       ***..::: :* .:.*.:*:.:: : : :*.. *. .*:. :. ** *.****
The glycoprotein signal peptide topology of arenaviruses is unique among glycoproteins of other virus families. Extended signal peptides have also evolved among envelope glycoproteins of other viruses, e.g. foamy viruses, HIV-1, and the Sindbis virus, as well as among non-viral proteins such as prolactin (32). The extended foamy virus glycoprotein signal peptide is proposed to comprise ~148 amino acids. The N-terminal 66 residues form a hydrophilic cytoplasmic domain, 24 hydrophobic amino acids span the ER membrane, and ~58 hydrophilic amino acids are located within the lumen of the ER containing potential N-glycosylation sites (33). In contrast, the Lassa virus preGP-C signal peptide has two hydrophobic domains, although only h1 is membrane-inserted. This characteristic clearly distinguishes it from the foamy virus signal peptide, which is involved in virus particle formation. It is also different from the Sindbis virus signal peptide. Here, the 6K protein connecting E2 and E1 in the viral precursor protein can also be regarded as a signal peptide. It spans the ER membrane twice, and its C and N termini face the lumen of the ER (34).
In this study, we describe a novel type of signal peptide with an extended ER luminal region. The unique topology of the Lassa virus glycoprotein signal peptide clearly differs from that of known classical N-terminal signal peptides by enabling a second function. The extended ER luminal part of the signal peptide is necessary for proteolytic activation of the glycoprotein into GP-1 and GP-2. Because of its unusual topology, it cannot be excluded that the signal peptide might even engage in additional functions after proteolytic activation of the glycoprotein. Therefore, it will be interesting to determine further potential functional aspects in the life cycle of this unusual signal peptide.