Topological rules for membrane protein assembly in eukaryotic cells.

Insertion into the endoplasmic reticulum membrane of model proteins with one, two, and four transmembrane segments and different distributions of positively charged residues in the N-terminal tail and the polar loops has been studied both in vitro and in vivo. Membrane insertion of these same constructs has previously been analyzed in Escherichia coli, thus making possible a detailed comparison between the topological rules for membrane protein assembly in prokaryotic and eukaryotic cells. In general, we find that positively charged residues have similar effects on the membrane topology in both systems when they are placed in the N-terminal tail but that the effects of charged residues in internal loops clearly differ. Our results rule out a sequential start-stop transfer model where successive hydrophobic segments insert with alternating orientations starting from the most N-terminal one as the only mechanism for membrane protein insertion in eukaryotic cells.
What features of the amino acid sequence of integral membrane proteins control their insertion and orientation in the membrane? A number of recent studies in Escherichia coli have suggested that hydrophobic segments both target bacterial inner membrane proteins to the membrane and drive their insertion into the lipid bilayer, whereas flanking positively charged amino acids determine the final orientation of the hydrophobic transmembrane segments (1).
Statistical analyses suggest that a similar "positive inside" rule is applicable also to eukaryotic membrane proteins (2-7), although the tendency for positively charged residues to be absent from extracellularly exposed parts is weaker in eukaryotic than in prokaryotic membrane proteins. Nevertheless, a handful of eukaryotic membrane proteins have been successfully expressed in E. coli with what appears to be the correct topology (8-10), and at least one example of an E. coli inner membrane protein that inserts with the correct topology into mammalian microsomes in vitro has been described (11,12).
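As a rough illustration of the positive inside rule for a single transmembrane segment, the sketch below compares the number of Lys and Arg residues in the two flanking regions and assigns the more positively charged flank to the cytoplasm. The function names and the example sequences are hypothetical, and the simple counting scheme is an assumption made for illustration; it is not the statistical method used in the cited analyses and it ignores the other factors (negative charges, tail length, segment length) mentioned in the text.

```python
# Minimal sketch of the "positive inside" rule for one transmembrane segment.
# Assumption: only Lys (K) and Arg (R) are counted, and the flank with more
# positive charges is taken to stay on the cytoplasmic side.

POSITIVE_RESIDUES = set("KR")


def positive_count(segment: str) -> int:
    """Count Lys and Arg residues in a one-letter amino acid string."""
    return sum(1 for aa in segment.upper() if aa in POSITIVE_RESIDUES)


def predict_orientation(n_flank: str, c_flank: str) -> str:
    """Return 'N cyt', 'N ext', or 'undetermined' for a single segment.

    n_flank / c_flank are the polar regions before and after the
    hydrophobic segment (hypothetical inputs chosen by the caller).
    """
    n_pos = positive_count(n_flank)
    c_pos = positive_count(c_flank)
    if n_pos > c_pos:
        return "N cyt"   # charged N-tail stays in the cytoplasm
    if c_pos > n_pos:
        return "N ext"   # charged C-flank stays in the cytoplasm
    return "undetermined"


if __name__ == "__main__":
    # Hypothetical flanks: a lysine-rich N-tail versus an uncharged C-flank.
    print(predict_orientation("MKKK", "GSA"))
    # Hypothetical flanks: an uncharged N-tail versus a charged C-flank.
    print(predict_orientation("MANMF", "RRKD"))
```

A tie is reported as undetermined rather than forced to one orientation, since the text notes that charge is not the only determinant.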
Mutagenesis experiments on a number of bitopic eukaryotic plasma membrane proteins with a single N-terminal transmembrane segment have revealed a clear effect of arginine and lysine residues on membrane orientation, although other factors such as the presence of negatively charged residues, the length and folding properties of the polar N-terminal tail preceding the transmembrane segment, and the length of the transmembrane segment itself have also been shown to be important (13-23). However, the effects of charged residues on the topology of polytopic eukaryotic membrane proteins have not been systematically studied so far, and it is not clear whether the topological rules relating amino acid sequence to membrane topology are the same in prokaryotic and eukaryotic cells.
Building on our earlier work on membrane protein assembly in E. coli, we now report a systematic comparative study of the insertion into the endoplasmic reticulum (ER) membrane, both in vitro and in vivo, of a range of model proteins with one to four transmembrane segments and different distributions of positively charged residues in the flanking regions. Most of these proteins have previously been characterized in E. coli, allowing a detailed comparison between membrane protein insertion in prokaryotic and eukaryotic systems.

MATERIALS AND METHODS
Enzymes and Chemicals-Unless otherwise stated, all enzymes were from Promega. T7 DNA polymerase was from Pharmacia Biotech Inc. Endo H was from Boehringer Mannheim. [35S]Met was from Amersham Corp. Ribonucleotides, deoxyribonucleotides, dideoxyribonucleotides, protein A-Sepharose, and the cap analog m7G(5′)ppp(5′)G were from Pharmacia. Plasmid pGEM1, transcription buffer, and rabbit reticulocyte lysate were from Promega. BHK 21 cells were grown in medium from Life Technologies, Inc. The glycosylation acceptor peptide N-benzoyl-Asn-Leu-Thr-N-methylamide and the nonacceptor peptide N-benzoyl-Asn-Leu-(allo)Thr-N-methylamide were synthesized according to Ref. 24. Oligonucleotides were from Kebo Lab (Stockholm, Sweden).
DNA Techniques-Site-specific mutagenesis was performed according to the method of Kunkel (25) as modified by Geisselsoder et al. (26). All mutants were confirmed by DNA sequencing of single-stranded M13 DNA using T7 DNA polymerase. For cloning into and expression from the pGEM1 plasmid, the 5′ end of the lep gene was modified first by the introduction of an XbaI site and second by changing the context 5′ to the initiator ATG codon to a "Kozak consensus" sequence (27). Thus, the 5′ region of the gene was modified to: ATAACCCTCTAGAGCCACCATGGCGAAT (XbaI site TCTAGA followed by the initiator codon ATG). Mutants with four transmembrane segments were made by fusing the H1-P1-H2 region (from the upstream XbaI site to a KpnI site introduced at the codon corresponding to Asp 99 in the wild type sequence) of one Lep construct to a KpnI site introduced just upstream of the start codon of a second Lep construct (see Fig. 1). The KpnI site introduced three new amino acid residues (Glu-Val-Pro) at the fusion site. The XbaI-SmaI fragment carrying the constructs was cloned into pGEM1 behind the SP6 promoter. For cloning into the pSFV1 in vivo expression vector (28), constructs were polymerase chain reaction-amplified from pGEM1 using polymerase chain reaction primers introducing BamHI and SmaI sites flanking the relevant gene. The BamHI-SmaI fragment carrying the constructs was then cloned into the pSFV1 polylinker region.
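The modified 5′ region quoted above can be checked mechanically. The following sketch locates the XbaI recognition sequence (TCTAGA) and the first downstream ATG, then inspects the two positions usually considered most important in a Kozak context (a purine at -3 and G at +4). The hyphen inside the printed sequence is assumed here to be a line-break artifact and has been removed; the function and variable names are hypothetical.

```python
# Sketch: sanity-check the modified 5' region of the lep gene.
# Assumption: the hyphen in the printed sequence is a typesetting artifact,
# so the sequence is used as one uninterrupted string.

XBAI_SITE = "TCTAGA"  # XbaI recognition sequence
FIVE_PRIME_REGION = "ATAACCCTCTAGAGCCACCATGGCGAAT"


def describe_five_prime(seq: str) -> dict:
    """Locate the XbaI site and the first downstream ATG (0-based indices)."""
    seq = seq.upper()
    xbai_start = seq.find(XBAI_SITE)
    if xbai_start < 0:
        raise ValueError("no XbaI site found")
    atg_start = seq.find("ATG", xbai_start + len(XBAI_SITE))
    if atg_start < 3 or atg_start + 3 >= len(seq):
        raise ValueError("no usable ATG downstream of the XbaI site")
    minus3 = seq[atg_start - 3]  # position -3 relative to the A of ATG
    plus4 = seq[atg_start + 3]   # position +4 (first base after ATG)
    return {
        "xbai_start": xbai_start,
        "atg_start": atg_start,
        "minus3": minus3,
        "plus4": plus4,
        # Purine at -3 and G at +4 are the key features of a Kozak context.
        "kozak_like": minus3 in "AG" and plus4 == "G",
    }


if __name__ == "__main__":
    print(describe_five_prime(FIVE_PRIME_REGION))
```

For the sequence as given, the XbaI site starts at index 7 and the ATG at index 19, with A at -3 and G at +4.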
Assay of Membrane Topology in E. coli-E. coli strain MC1061 (29)
* This work was supported by grants from the Swedish Cancer Society, the Swedish Natural Sciences Research Council, the Swedish Technical Sciences Research Council, and the Göran Gustafsson Foundation (to G. v. H.) and a grant from the Naito Foundation (to M. S.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
In Vitro Transcription and Translation in Reticulocyte Lysate-The pGEM1 plasmids carrying the relevant constructs were linearized at a SmaI site immediately downstream of the coding region prior to in vitro transcription. Synthesis of RNA by SP6 RNA polymerase and translation in reticulocyte lysate in the presence of dog pancreas microsomes was performed as described (32) or according to the protocol provided by Promega. Translocation of polypeptides to the lumenal side of the microsomes was assayed by prevention of N-linked glycosylation through competitive inhibition by the addition of a glycosylation acceptor tripeptide but not by a nonacceptor tripeptide and by proteinase K treatment of the microsomes (12). Alkaline extraction of microsomes was carried out as described (33). SDS-polyacrylamide gel electrophoresis gels were scanned in a FUJIX Bas 1000 PhosphorImaging Plate Scanner and analyzed using the MacBAS software (version 2.1).
In Vivo Expression in BHK Cells-The pSFV1 vectors containing the relevant constructs were linearized at a unique SphI site downstream of the cloned gene. The linearized plasmid was used as template for in vitro transcription, and mRNA was transfected into cells by electroporation as described earlier (34). At 7 h post-transfection, cells were pulse-labeled with [35S]Met for 15 min, chased for 30 min, and solubilized in Nonidet P-40 (1%) lysate buffer (35). The lysate was centrifuged at 4000 × g for 5 min. The supernatant was used for protein A-mediated immunoprecipitation using a Lep polyclonal antiserum. The precipitate was solubilized in Endo H buffer (1% SDS, 50 mM sodium citrate, pH 6.0) and incubated for 5 min at 70°C. The sample was then divided in two and either mock-treated or Endo H-treated (5 milliunits) for 6 h at 37°C and analyzed by SDS-polyacrylamide gel electrophoresis under nonreducing conditions.

RESULTS
Construction and Expression of Model Proteins with One, Two, and Four Transmembrane Segments-In order to be able to make direct comparisons between the topological effects of positively charged amino acids in membrane proteins expressed in prokaryotic and eukaryotic systems, we chose the well characterized E. coli inner membrane protein leader peptidase (Lep) as a model protein. Lep has been used extensively in studies of membrane protein assembly in E. coli (1) and has also been shown to insert efficiently into ER-derived dog pancreas microsomes (11,12).
Lep constructs with one and two transmembrane segments and various point mutations, additions, or deletions (Fig. 1) were made from the wild type gene by site-directed mutagenesis. Molecules with four transmembrane segments were constructed by fusing the H1-L1-H2 region from one construct with the entire coding region from another (36). To map the cytoplasmic or extra-cytoplasmic location of the N-terminal tail (N cyt and N ext ) and the C-terminal P2 domain (C cyt and C ext ), acceptor sites for N-linked glycosylation that can only be modified if translocated to the lumen of the ER were introduced into these regions by site-directed mutagenesis (note that all constructs with an N-terminal glycosylation acceptor site have an extra N-terminal extension of 14 residues, except in the wild type-derived (WT*) construct and construct ΔH2 (Fig. 2), where the extension is 16 residues). All genes were cloned behind the SP6 promoter in the pGEM1 vector for expression in vitro and into the Semliki Forest virus vector pSFV1 (28) for expression in vivo.
A Single N-terminal Transmembrane Segment Is Sufficient for Targeting and Insertion of a Bacterial Protein into the ER Membrane-In E. coli, Lep adopts a somewhat unusual membrane topology with both the N and C termini facing the periplasm. When expressed in vitro in the presence of microsomes, both termini are again translocated, whereas the P1 loop remains on the cytoplasmic side of the membrane (12). To check whether both the H1 and H2 transmembrane segments are necessary for targeting and insertion into the ER membrane, transmembrane segment H2 was deleted from WT* with an N-terminal extension including a glycosylation acceptor site and a mutation (Asn 214 → Gln) that removes the only acceptor site in the P2 domain, and the protein was expressed in the absence and the presence of microsomes. As seen in Fig. 2, the ΔH2 mutant was glycosylated on the N-terminal acceptor site to the same extent as WT* (lanes 2 and 5). When the microsomes were treated with proteinase K, a protease-resistant fragment representing the H2-P2 domain (12) was produced from WT* (Fig. 2, lane 3), whereas mutant ΔH2 was completely degraded (Fig. 2, lane 6). After disruption of the microsomes with the detergent Triton X-100, no protease-resistant material remained for either construct (data not shown). Together, these results demonstrate that H1 has an intrinsic ability to target to the ER membrane and to insert in the same N ext -C cyt orientation as adopted in E. coli.
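Acceptor sites for N-linked glycosylation follow the general Asn-X-Ser/Thr pattern, where X is any residue except Pro (the Asn-Leu-Thr acceptor peptide and the Asn-Ser-Thr site used in this work both fit it). The sketch below scans a one-letter protein sequence for such sequons; the function name and the test sequences are hypothetical, and the scan says nothing about whether a site is actually modified, which depends on its translocation to the ER lumen and on its distance from the membrane.

```python
import re

# Sketch: find potential N-linked glycosylation acceptor sites (sequons).
# Pattern: Asn, then any residue except Pro, then Ser or Thr.
# A lookahead is used so that overlapping sequons are all reported.
_SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> list:
    """Return 1-based positions of the Asn in each Asn-X-Ser/Thr sequon."""
    return [m.start() + 1 for m in _SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    # Hypothetical sequences, for illustration only.
    print(find_sequons("MANSTSQ"))  # Asn-Ser-Thr, as in the N-tail site
    print(find_sequons("ANLT"))     # Asn-Leu-Thr, as in the acceptor peptide
    print(find_sequons("ANPTK"))    # Pro at X: not an acceptor site
```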
N-terminal Positively Charged Residues Control the Topology of a Protein with Two Transmembrane Segments-Lep has a highly charged cytoplasmic loop (P1) that cannot be translocated across the inner E. coli membrane, presumably because the positively charged amino acids immediately downstream of the first hydrophobic domain (H1) prevent translocation (37,38). Thus, even though the addition of one or more lysines to the N-tail preceding H1 can cause certain Lep constructs lacking most of the charged residues in the P1 loop to insert into the inner membrane in an inverted N cyt -C cyt orientation with P1 exposed to the periplasm (31,39), no translocation of the P1 loop was observed even when as many as three lysines were added to the N-tail in wild type Lep (mutant WT-3K). This mutant was expressed in E. coli, and its topology was probed by trypsin treatment of spheroplasts followed by immunoprecipitation with a Lep polyclonal antiserum (Fig. 3A). Very little protein remained after protease treatment (Fig. 3A, lane 2), and there was no protease-protected fragment diagnostic for translocation of the P1 domain (31). Parallel analysis of the protease-sensitivity of the mature and precursor forms of the outer membrane protein OmpA was performed to ensure that the spheroplasts were intact. The protease sensitivity of the periplasmically exposed mature OmpA demonstrates that no intact cells remained, whereas the protease-resistance of cytoplasmic, nontranslocated precursor OmpA demonstrates that the spheroplasts were intact (31). Thus, the P2 domain in WT-3K is exposed to the periplasmic space, whereas the P1 domain is cytoplasmic.
It has been shown that translocation of the P2 domain across the microsomal membrane is much less sensitive to the presence of positively charged amino acids immediately downstream of H2 than when expression is carried out in E. coli (11), suggesting that the addition of one or more lysines to the normally uncharged N-tail of Lep might result in an inverted orientation in microsomes with the highly charged P1 loop in the lumen, in contrast to what is seen in E. coli. Wild type Lep expressed in vitro in the presence of dog pancreas microsomes was efficiently glycosylated on its only potential acceptor site for N-linked glycosylation (Asn 214 ) (Fig. 3B, top panel, lanes 1 and 2), demonstrating the lumenal disposition of the P2 domain. The addition of a single N-terminal lysine between Met 4 and Phe 5 resulted in a marked reduction in glycosylation efficiency (Fig. 3B, top panel, lanes 3 and 4), and with two and four N-terminal lysines the glycosylation efficiency was only ~20%, consistent with a predominantly N cyt -C cyt topology. The decrease in glycosylation was paralleled by a concomitant decrease in the amount of protease-resistant H2-P2 fragment (data not shown). In all cases, both the glycosylated and nonglycosylated forms were properly integrated into the microsomal membrane as shown by their retention in the pellet after alkaline extraction (Ref. 33 and data shown for the 0K and 2K mutants in Fig. 3B, bottom panel). We conclude that positively charged residues in the N-terminal tail have a strong influence on the topology of Lep expressed both in the E. coli and microsomal systems, whereas positively charged residues in the P1 loop between the two transmembrane segments block translocation only in E. coli.
The Length of the Loop between Two Transmembrane Segments Does Not Affect Topology-Another variable that has been shown to influence the topology of Lep-derived constructs in E. coli is the length of the P1 loop; although this 39-residue-long loop cannot be translocated (Fig. 3A), translocation is possible when its length is increased to more than ~60 residues (40). It has been speculated that this may be related to the participation of the Sec translocase in the translocation of long loops across the inner membrane; only a small number of positively charged residues would be allowed in short loops (≤60 residues) that cannot use the Sec translocase for translocation, whereas longer loops that use the more permissive Sec pathway would have less stringent sequence requirements (41).
To test if the length of the P1 loop affects topology also in the ER membrane, a series of constructs with zero, one, two, and four N-terminal lysines and P1 lengths ranging from 19 to 88 residues was analyzed in the microsomal in vitro system. As shown in Fig. 4, the length of the P1 loop has at best a minor and not very consistent effect on the topology in the 1K series of mutants and has no effect for the 0K, 2K, and 4K series, in contrast to the situation in E. coli.
The construct with 88 residues in the P1 loop was glycosylated somewhat more efficiently than the other constructs when two and four lysines were present in the N-tail (Fig. 4B); however, in this particular case there is an additional glycosylation site in the P1 loop, ~15 residues downstream of H1. The higher glycosylation efficiency may thus result from partial glycosylation of this site rather than from translocation of the P2 domain. To ascertain that this was indeed the case, a second glycosylation site was inserted into the middle of the P1 loop, and the site in P2 was simultaneously removed. The resulting construct 2K(88*) with two N-terminal lysines and with two glycosylation sites in P1 and none in P2 was efficiently glycosylated (Fig. 4A, lanes 11 and 12); 20% of the chains were glycosylated on both sites, and 70% were glycosylated only on one, demonstrating that no more than ~10% of the molecules have the wild type orientation (or fail altogether to integrate into the membrane). N-terminal positively charged residues thus efficiently promote an inverted, N cyt -C cyt topology in the ER membrane irrespective of the length of the P1 loop.
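The arithmetic behind the 10% bound for 2K(88*) can be written out explicitly: every chain carrying at least one sugar on a P1 site must have that loop in the lumen, so the remaining, nonglycosylated fraction is an upper limit on molecules with the wild type orientation or no membrane integration. The sketch below normalizes band intensities to fractions and computes that limit. The band labels, function names, and the intensity values (chosen to reproduce the 20% and 70% figures quoted above) are assumptions for illustration and do not describe the MacBAS quantification itself.

```python
# Sketch: convert gel band intensities to fractions and bound the share of
# molecules that did not translocate the glycosylated region.
# Assumption: band intensity is proportional to the number of chains.


def band_fractions(intensities: dict) -> dict:
    """Normalize band intensities so that they sum to 1."""
    total = sum(intensities.values())
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return {band: value / total for band, value in intensities.items()}


def max_untranslocated(fractions: dict, glycosylated=("single", "double")) -> float:
    """Upper bound on chains whose acceptor sites never reached the lumen."""
    return 1.0 - sum(fractions[band] for band in glycosylated)


if __name__ == "__main__":
    # Hypothetical intensities matching the 2K(88*) result in the text.
    bands = {"none": 10.0, "single": 70.0, "double": 20.0}
    fractions = band_fractions(bands)
    print(fractions)
    print(round(max_untranslocated(fractions), 3))
```

The bound is an upper limit only, because a translocated acceptor site is not necessarily modified in every chain.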
FIG. 2. [...] Fig. 1, an additional pair of residues (Val-Ala) were inserted between Pro 12 and Gly 13 ). ΔH2 is the same as WT* but with transmembrane segment H2 (residues 62-76) deleted. Nonglycosylated and glycosylated products are indicated by black and white circles, respectively. The proteinase K (PK)-resistant fragment in WT* marked by a black square represents the lumenally located H2-P2 domain.

The Topology of "Nonfrustrated" Proteins Follows the Positive Inside Rule-Previous topological analyses in E. coli of Lep-derived constructs with four transmembrane segments and different distributions of positively charged residues in the N-tail and the loops connecting the transmembrane segments have demonstrated that "frustrated" constructs, i.e. molecules where the distribution of positively charged residues is such that neither the N ext -C ext nor the N cyt -C cyt topology will have all the highly charged loops in the cytoplasm, often adopt a "leave one out" topology with only three of the hydrophobic segments inserted across the membrane (36).
To test if the influence of positively charged residues is equally strong in the ER, a subset of the constructs previously characterized in E. coli was expressed in the microsome system. First, two nonfrustrated constructs that adopt, respectively, the N ext -C ext (i.e. N and C terminus in the periplasm) and the N cyt -C cyt topology in E. coli were analyzed. As shown in Fig. 5A, construct 0K/3K/0K/3K with three lysines in each of the loops between the first and second and between the third and fourth transmembrane segments was efficiently glycosylated both in the C-terminal P2 domain (Fig. 5A, middle panel,  lane 2) and on an Asn-Ser-Thr acceptor site introduced into the N-tail (Fig. 5A, top panel, lane 2) and thus adopts the N ext -C ext topology just as in E. coli. The nonglycosylated full-length product of construct *0K/3K/0K/3K (Fig. 5A, top panel, lane 7) has the same size as the corresponding 0K/3K/0K/3K construct expressed in E. coli (36) (Fig. 5A, top panel, lane 8).
Proteinase K treatment of the microsomes (Fig. 5A, top panel, lane 5) produced a protected fragment that runs at the approximate location of the weak band in lane 1 of the top panel of Fig. 5A marked by a black square in the middle and bottom panels and considerably below this band in the top panel. This suggests that the protected fragment is glycosylated when the P2 domain carries a glycosylation site (Fig. 5A, middle and bottom panels) but not when the glycosylation site is in the N-tail (Fig. 5A, top panel). Indeed, the protected fragment in the middle and bottom panels of Fig. 5A has the same size as a glycosylated, protease-protected fragment derived from microsome-integrated wild type Lep that was previously shown to correspond to the glycosylated H2-P2 domain (Ref. 12 and data not shown), and we conclude that the protected fragment in Fig. 5A results from cleavage in the L1′ loop.
With glycosylation sites simultaneously present in both the N-tail and the P2 domain, a majority of the molecules were glycosylated on both sites (Fig. 5A, bottom panel, lane 2), suggesting that all membrane-inserted molecules are glycosylated in the P2 domain and that most of these are also glycosylated on the N-tail. To rule out that the rather small number of residues between the N-terminal glycosylation site and H1 causes the somewhat inefficient glycosylation of the N-tail (12), we also made constructs where this distance was increased to 19 and 25 residues. Both were glycosylated with the same efficiency as the original construct (data not shown), suggesting either that the N-terminal glycosylation site is intrinsically less easily modified than the site in the P2 domain or that the N-tail is somewhat less efficiently translocated than P2. In any case, these data show that the majority of the molecules have the expected N ext -C ext orientation with the L1′ loop (and most likely also the L1 loop) exposed on the cytoplasmic side of the microsomal membrane.
In contrast, construct 3K/0K/3K/0K with three lysines in the N-tail and in the loop between transmembrane segments 2 and 3 was found to be essentially unglycosylated on both the N- and C-terminal acceptor sites (Fig. 5B, top panel), suggesting the same N cyt -C cyt topology as observed in E. coli. The trivial explanation that this particular construct is not inserted into the membrane was ruled out by alkaline extraction of the microsomes (33); as seen in Fig. 5B (bottom panel), a large fraction of the molecules partitioned with the wash supernatant when translation was carried out in the absence of microsomes (or when microsomes were added post-translationally; data not shown), whereas essentially no molecules were found in this fraction when translation was carried out in the presence of microsomes, demonstrating proper membrane integration. Further, the long L1 loop in the related construct 3K/88/3K/0K (see below) is efficiently glycosylated, again consistent with proper membrane assembly and an N cyt -C cyt topology.

FIG. 3. Addition of positively charged residues to the N-tail causes wild type Lep to insert with an "inverted" N cyt -C cyt topology into microsomes but does not cause translocation of the P1 loop across the E. coli inner membrane. A, mutant WT-3K with three lysines in the N-terminal tail was expressed in E. coli, and its topology was probed by trypsin treatment of spheroplasts followed by immunoprecipitation with a Lep polyclonal antiserum (top panel). As a control, the outer membrane protein OmpA was immunoprecipitated in parallel (bottom panel). mOmpA, mature OmpA; pOmpA, precursor OmpA. In lane 1, spheroplasts were incubated with a mixture of trypsin, trypsin inhibitor, and phenylmethylsulfonyl fluoride. B, top panel, mutants with zero, one, two, and four N-terminal lysines were expressed in the absence (−) or the presence (+) of rough microsomes (RM). The number of N-terminal lysines is shown above the lanes. Nonglycosylated and glycosylated products are indicated by black and white circles, respectively. Bottom panel, alkaline extraction of mutants with zero and two N-terminal lysines expressed either in the presence of rough microsomes (+) or with rough microsomes added posttranslationally to the reaction mixture (−). T, total reaction; P, pellet; S, supernatant. The predominant topology is shown for the 0K and 2K mutants. In the cartoon, the nonmodified glycosylation site in mutant 2K is circled.
Both nonfrustrated constructs with a glycosylation acceptor site in the P2 domain were also expressed in vivo in BHK cells. As in vitro, construct 3K/0K/3K/0K was not glycosylated, whereas construct 0K/3K/0K/3K was fully glycosylated (Fig.  5C), demonstrating efficient targeting and insertion into the ER membrane (we cannot completely rule out that some molecules may fail to integrate and are rapidly degraded). The two lower molecular weight bands marked with black and white squares are also faintly seen in most of the in vitro experiments. These bands run at roughly the same position as nonglycosylated and glycosylated wild type Lep, respectively (data not shown), and may result either from endogenous proteolysis or from internal initiation at one or other of the four methionine codons present in the L2 loop (see Fig. 1).
We conclude that nonfrustrated constructs with four transmembrane segments based on a bacterial inner membrane protein can be targeted and inserted into the ER membrane both in vitro and in vivo and that they adopt the same topology as in E. coli.
Membrane Insertion of Frustrated Proteins-Two frustrated constructs, previously shown to adopt leave one out topologies in E. coli, were also tested in the ER system. Construct 0K/3K/3K/0K was glycosylated on the N terminus and in the P2 domain both in vitro and in vivo and was doubly glycosylated when acceptor sites were simultaneously present in these two regions (Fig. 6A, bottom), suggesting that the topology is predominantly N ext -C ext with four transmembrane segments (Fig. 6B, bottom panels). This is in contrast to E. coli, where this construct adopts a leave one out topology with only three transmembrane segments (36).
On the other hand, the N-tail in construct 3K/0K/0K/3K was not glycosylated, whereas the P2 domain was efficiently glycosylated (Fig. 6A, left top panels). When expressed in vivo, this construct was again fully glycosylated in the P2 domain (Fig. 6A, right top panel). As in E. coli, the dominating topology is thus N cyt -C ext , with only three transmembrane segments (Fig. 6B). It is formally possible that the second hydrophobic segment is translocated into the lumen with the first, third, and fourth segments spanning the membrane; however, given that there is no precedent for the translocation of hydrophobic segments across the ER membrane, we consider this unlikely. Because this construct is slightly less efficiently glycosylated in the P2 domain in vitro than the nonfrustrated construct 0K/3K/0K/3K (65% versus 75%) and because most of the nonglycosylated molecules are found in the membrane fraction after alkaline extraction in both cases (data not shown), a minor proportion of the molecules (~10%) may have the P2 domain in the cytoplasm. This is not the case in vivo, however. Which of the two uncharged loops is exposed to the lumenal side cannot be determined with the glycosylation assay. Protease treatment of the microsomes did not give interpretable results for this mutant, because no unique protease-resistant fragment was produced.
In order to address this point further, two constructs where the L1 loop between the first and second transmembrane segments is 88 rather than 25 residues long were also tested. A glycosylation acceptor site placed in this loop is far enough from transmembrane segment H1 (~15 residues) to be accessible to the lumenally disposed glycosyltransferase, and the location of the loop can thus be mapped. In the nonfrustrated construct 3K/88/3K/0K, the long loop was quite efficiently glycosylated (Fig. 7A, lane 3), and no doubly glycosylated molecules were apparent when the acceptor site in the P2 domain was also present (Fig. 7A, lane 2), demonstrating that the dominating topology is N cyt -C cyt , as expected. The slight increase in glycosylation efficiency (from 45 to 55%; compare lanes 2 and 3, [...] both one and two glycosyl moieties were apparent (Fig. 7A, lane 4); with an additional acceptor site present in the P2 domain, the molecules were glycosylated either once or on all three sites (Fig. 7A, lane 5). The absence of twice glycosylated molecules modified only in the P2 domain in the latter construct demonstrates that no molecules have a topology with the P2 domain and the short L2 loop between the second and third transmembrane segments in the lumen. Thus, the majority of the molecules have both the long loop and the P2 domain in the lumen (N cyt -C ext , leave one out topology) (Fig. 7B), although the singly glycosylated molecules in lane 5 of Fig. 7A suggest the additional presence of molecules with four transmembrane segments and N cyt -C cyt orientation. Similar proportions of nonglycosylated and glycosylated molecules were found in the membrane pellet after alkaline extraction (data not shown).

FIG. 4. [...] 1-10) were expressed in the absence (−) or the presence (+) of rough microsomes. All loops were derived from the 39-residue-long P1 loop in wild type Lep by deletions (ΔAsp 23 -Arg 42 to make the 19-residue loop; ΔAsp 23 -Arg 32 to make the 29-residue loop) or insertions (see Ref. 40 for sequences). Mutant 2K(88*) (lanes 11 and 12) has two lysines in the N-tail, two glycosylation acceptor sites in the P1 loop, and none in P2. The number of lysines in the N-tail and the length of the P1 loop are shown above the lanes. Nonglycosylated products are indicated by a black circle, singly glycosylated products are indicated by a white circle, and doubly glycosylated products are indicated by two white circles. The predominant topology of the 2K(88*) mutant is shown. B, fraction of molecules glycosylated in the P2 domain for the mutants shown in panel A (one lysine in the N-tail), as well as for mutants with zero, two, and four lysines in the N-tail. Note that the 4K(48) construct has not been made.

FIG. 5. [...] 3 and 4) of microsomes. P, pellet; S, supernatant. C, topology of constructs 3K/0K/3K/0K and 0K/3K/0K/3K expressed in BHK cells. Proteins in the cell lysate were immunoprecipitated by a Lep polyclonal antiserum and either treated (+) or not treated (−) with Endo H to remove attached oligosaccharides. In both constructs, a single glycosylation acceptor site was present in P2. Nonglycosylated products are indicated by a black circle, and singly glycosylated products are indicated by a white circle. The black and white squares indicate, respectively, a nonglycosylated and a glycosylated product resulting either from endogenous proteolysis or from initiation at a methionine in the L2 loop (see text). In the cartoons, nonmodified glycosylation sites are circled.

DISCUSSION
We have constructed a number of model proteins with either one, two, or four hydrophobic transmembrane segments and have analyzed their membrane assembly both in vitro in dog pancreas microsomes and in vivo in BHK cells. Most of these constructs have previously been analyzed in E. coli (36,39,40), and we can thus make a detailed comparison between the sequence determinants of membrane protein topology in prokaryotic and eukaryotic cells.
In general, we find that positively charged residues placed in the N-tail have similar effects on the topology of model proteins in both the prokaryotic and eukaryotic systems, i.e. that such residues prevent translocation of the N-tail across the membrane. However, there are also some obvious differences.
In constructs with two transmembrane segments, the length of the connecting loop does not seem to be important for its ability to be translocated across the ER membrane (Fig. 4), whereas in E. coli short and long loops apparently use different mechanisms for translocation and react differently to the presence of positively charged residues (40 -43). Thus, when two lysines are added to the N terminus of wild type Lep, most of the molecules insert into the ER membrane in an inverted N cyt -C cyt orientation (Fig. 3), whereas addition of three lysines to the N-tail has no effect on the orientation when the protein is expressed in E. coli. Our data thus suggest that positively charged residues placed in the N-tail are critical topological determinants in both E. coli and mammalian ER, whereas charged residues in the following P1 loop have a major effect on the topology only in E. coli.
Three of the four constructs with four transmembrane segments also behave similarly in the two systems. The nonfrustrated constructs 3K/0K/3K/0K and 0K/3K/0K/3K insert according to the positive inside rule both in E. coli (36) and in the ER. However, of the two frustrated constructs, one (3K/0K/0K/3K) adopts a leave-one-out Ncyt-Cext topology in both E. coli and the ER, whereas the other (0K/3K/3K/0K) behaves differently in the two systems. Whereas in E. coli this construct has only three transmembrane segments and Next-Ccyt topology, both the N and C termini are quite efficiently translocated across the ER membrane. This suggests that the L2 loop between the second and third hydrophobic segments is lumenal in these molecules, despite its high charge.
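The distinction between nonfrustrated and frustrated constructs can be made concrete with a small sketch. It is an illustration only, not the authors' method: it assumes that the four fields of a construct name give the lysine counts in the N-tail, L1, L2, and L3 in that order, and that every hydrophobic segment spans the membrane so that successive polar regions alternate sides. The function names `parse_construct` and `classify` are hypothetical.

```python
from typing import List, Optional, Tuple


def parse_construct(name: str) -> List[int]:
    """Parse a name such as '3K/0K/3K/0K' into lysine counts per polar region.

    Assumption: fields are ordered N-tail, L1, L2, L3.
    """
    counts = []
    for field in name.replace(" ", "").split("/"):
        if not field.endswith("K"):
            # Names with a loop-length field (e.g. '88') are not handled here.
            raise ValueError(f"unsupported field {field!r}")
        counts.append(int(field[:-1]))
    return counts


def classify(name: str) -> Tuple[str, Optional[str]]:
    """Label a construct as nonfrustrated or frustrated.

    If all segments span the membrane, regions at even positions lie on the
    same side as the N-tail and regions at odd positions on the opposite side.
    The positive inside rule can be satisfied only if all lysines sit on one
    of these two sets of regions.
    """
    counts = parse_construct(name)
    same_side_as_n_tail = sum(counts[0::2])
    opposite_side = sum(counts[1::2])
    if same_side_as_n_tail and opposite_side:
        return ("frustrated", None)
    if same_side_as_n_tail:
        return ("nonfrustrated", "Ncyt")
    if opposite_side:
        return ("nonfrustrated", "Next")
    return ("undetermined", None)


if __name__ == "__main__":
    for construct in ("3K/0K/3K/0K", "0K/3K/0K/3K", "3K/0K/0K/3K", "0K/3K/3K/0K"):
        print(construct, classify(construct))
```

The sketch deliberately returns no predicted orientation for frustrated constructs, since the data discussed here show that their topology depends on the host system.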
In summary, in all constructs with a highly charged N-tail the first transmembrane segment is oriented Ncyt, and in constructs with no positively charged residue in the N-tail the orientation is Next. This is consistent with earlier statistical (2-6, 44) and experimental (17, 19, 21) studies. There is also a clear tendency for highly charged internal loops to remain on the cytoplasmic side, although this tendency is stronger in E. coli than in the ER system. The apparently greater ease with which highly charged loops are translocated in the eukaryotic system may be related to the absence of a membrane potential in the ER (cf. Ref. 45) and may explain the earlier observation that extra-cytoplasmic loops in eukaryotic plasma membrane proteins have on average a higher content of positively charged residues than found in bacterial inner membrane proteins (2-4).
We have previously suggested (36, 46) that the so-called Sec-independent mechanism of membrane protein assembly in E. coli is based on helical hairpins, composed of two neighboring hydrophobic segments connected by a short loop lacking positively charged residues, that insert en bloc. A purely sequential insertion process, on the other hand, where the orientation of all downstream transmembrane segments is dictated by the orientation of the most N-terminal one (47), is inconsistent with the leave-one-out topologies found in E. coli and with the Ncyt-Cext topology found for constructs 3K/0K/0K/3K and 3K/88/0K/3K in the ER, as well as with the results of a recent study on the topology of P-glycoprotein mutants in the ER (48). Possibly, proteins with widely spaced hydrophobic segments insert in a sequential fashion (49, 50), whereas regions with closely spaced hydrophobic segments insert according to a helical hairpin mechanism.
Finally, we note that the observation that the topological information present in certain of our constructs is interpreted differently in prokaryotic and eukaryotic cells suggests that problems encountered when trying to overexpress eukaryotic membrane proteins in prokaryotic hosts may in some cases be related to incorrect topologies. If so, redesigning the protein to conform better to the stricter prokaryotic version of the positive inside rule may be one way to obtain higher expression levels (51).

FIG. 7. Topology of constructs with one long internal loop. A, constructs 3K/88/3K/0K (lanes 1-3) and 3K/88/0K/3K (lanes 4 and 5) were expressed either in the absence (−) or the presence (+) of rough microsomes. Glycosylation acceptor sites in the long loop and P2 are indicated by an asterisk. Nonglycosylated products are indicated by a black circle, singly glycosylated products are indicated by a white circle, doubly glycosylated products are indicated by two white circles, and triply glycosylated products are indicated by three white circles. B, dominating topologies suggested by the glycosylation data in panel A. Nonmodified glycosylation sites are circled.