The topogenic contribution of uncharged amino acids on signal sequence orientation in the endoplasmic reticulum.

Signal sequences for insertion of proteins into the endoplasmic reticulum induce translocation of either the C- or the N-terminal sequence across the membrane. The end that is translocated is primarily determined by the flanking charges and the hydrophobic domain of the signal. To characterize the hydrophobic contribution to topogenesis, we have challenged the translocation machinery in vivo in transfected COS cells with model proteins differing exclusively in the apolar segment of the signal. Homo-oligomers of hydrophobic amino acids as different in size and shape as Val(19), Trp(19), and Tyr(22) generated functional signal sequences with similar topologies in the membrane. The longer a homo-oligomeric sequence of a given residue, the more N-terminal translocation was obtained. To determine the topogenic contribution of all uncharged amino acids in the context of a hydrophobic signal sequence, two residues in a generic oligoleucine signal were exchanged for all uncharged amino acids. The resulting scale resembles a hydrophobicity scale with the more hydrophobic residues promoting N-terminal translocation. In addition, the helix breakers glycine and proline showed a position-dependent effect, which raises the possibility of a conformational contribution to topogenesis.

Proteins destined for the endoplasmic reticulum (ER) are synthesized with a hydrophobic signal sequence of typically 10-20 uncharged, mainly apolar amino acids. This sequence is recognized by the signal recognition particle, which targets the nascent chain-ribosome complex via the signal recognition particle receptor to the ER membrane (1). The ribosome binds to the translocon, a gated pore made of several copies of the heterotrimeric Sec61 complex (2-4). The signal sequence inserts into the translocon, specifically contacting Sec61α, in a manner that leads to translocation of either the C terminus or the N terminus across the membrane (5). Cleaved signals of secretory and type I membrane proteins (e.g. glycophorin) and signal anchor sequences of type II membrane proteins (e.g. transferrin receptor) translocate the C-terminal sequence, whereas the reverse signal anchors of cytochrome P-450, microsomal epoxide hydrolase, and opsin, for example, translocate the N-terminal sequence. The end of the signal that is translocated is determined by several factors. Charged residues flanking the apolar segment of the signal influence the insertion process in a manner that induces the more positive end to stay on the cytoplasmic side (6,7). However, the charge distribution is not generally sufficient to determine the orientation and to generate a unique topology (8,9). Hydrophilic sequences N-terminal of the signal may inhibit their translocation if they fold in the cytosol before targeting is completed (10). Similarly, we have recently observed that glycosylation at sites near the signal sequence can influence topogenesis by glycan attachment to polypeptide segments that are transiently exposed to the ER lumen (11).
In addition, the apolar segment of the signal itself makes a significant contribution to orienting the signal within the translocon and the membrane. In diagnostic mutant constructs, an increased fraction of N-terminal translocation was obtained with increasing length and hydrophobicity of this segment (12-15). The influence of oligoleucine signals of different lengths on the topology was additive with the effects of flanking charges and of the N-terminal hydrophilic sequence (14). The topogenic contribution of the hydrophobic sequence was also shown to be important for natural proteins, since the correct and unique insertion of the signals of the vasopressin precursor (Ncyt/Cexo) and of microsomal epoxide hydrolase (Nexo/Ccyt) was compromised upon extending or shortening the apolar sequence, respectively (16).
In most constructs, both the length of the hydrophobic domain and its total hydrophobicity were altered simultaneously, hampering the distinction between these two factors with respect to their influence on topogenesis. Summing up the hydropathy indices for different sequences did not yield a good correlation between total hydrophobicity and the resulting topologies (14,15). The analysis suggested that neither the length of the apolar segment nor its hydrophobicity alone is responsible for the observed effects on insertion behavior. In fact, it was observed that orientation was also strongly affected by the distribution of hydrophobicity within the apolar segment, since the more hydrophobic terminus appeared to be preferentially translocated (15). Furthermore, it is obvious that two different sequences of the same length also differ in properties other than just hydrophobicity, such as the shape of the molecule and the propensity to assume α-helical conformation.
To identify the features within the hydrophobic segment of the signal that influence topogenesis and favor N- or C-terminal translocation, we have tested the effect of homo-oligomeric sequences of hydrophobic residues other than leucine. In addition, we tested the effect of individual residues of all uncharged amino acids within the context of a generic oligoleucine sequence. The resulting ranking of amino acids with respect to their effect on N-terminal translocation resembles a hydrophobicity scale yet differs from all existing major scales. This insertion scale may be useful to improve topology prediction and to characterize the functional properties of the inside of the translocation pore.

EXPERIMENTAL PROCEDURES
cDNA Constructs-The plasmids encoding H1ΔQL16 and H1ΔQL19 have been described previously (14). H1ΔQ derivatives with hydrophobic sequences Ala16, Ile16, Val16, Phe16, (Leu-Val)8, (Leu-Ala)8, Tyr22, Met16, and Met22 were generated by annealing two complementary oligonucleotides encoding the sequence MGPQ and the respective hydrophobic sequences with 5′ and 3′ sticky ends for ligation into a KpnI and a BamHI site, respectively. For example, to construct H1ΔQV16, the oligonucleotides CATGGGACCGCAGGTAGTTGTCGTGGTGGTCGTAGTTGTAGTTGTCGTGGTAGTTGTCGTGG and GATCCCACGACAACTACCACGACAACTACAACTACGACCACCACGACAACTACCTGCGGTCCCATGGTAC were used. The annealed oligonucleotides were ligated into the plasmid pE-BE, digested with KpnI and BamHI. pE-BE corresponds to the expression vector pECE (17) with a modified polylinker and containing the H1 cDNA from BamHI to EcoRI, encoding the C-terminal domain (residues 60-291) immediately following the transmembrane anchor of H1. Annealing of oligonucleotides designed to produce H1ΔQW22 yielded constructs that upon sequencing turned out to represent H1ΔQW19 and H1ΔQW21.
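The design of such an oligonucleotide pair can be checked computationally. The following is a minimal sketch using the two H1ΔQV16 oligonucleotides quoted above; the helper names are illustrative and not part of the original methods. It tests whether the sense strand, read from its ATG, encodes MGPQ followed by 16 valines, and whether the antisense strand equals the reverse complement of the sense strand extended by four nucleotides at each end, which are assumed here to be the single-stranded ends used for ligation.

```python
# Sketch: sanity-check the H1ΔQV16 oligonucleotide pair (helper names are illustrative).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG codon ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def translate(seq):
    """Translate complete codons from the start of seq; trailing bases are ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


SENSE = "CATGGGACCGCAGGTAGTTGTCGTGGTGGTCGTAGTTGTAGTTGTCGTGGTAGTTGTCGTGG"
ANTISENSE = "GATCCCACGACAACTACCACGACAACTACAACTACGACCACCACGACAACTACCTGCGGTCCCATGGTAC"

# The reading frame starts at the ATG, one nucleotide into the sense strand.
peptide = translate(SENSE[1:])
# Assumption: the extra GATC and GTAC on the antisense strand are the ligation overhangs.
is_paired = ANTISENSE == "GATC" + reverse_complement(SENSE) + "GTAC"
print(peptide, is_paired)
```

Both checks hold for the sequences as quoted. The same two functions could be applied to any of the other annealed pairs once their sequences are known.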
Constructs encoding extended homo-oligopeptide sequences were generated by polymerase chain reaction with Vent polymerase (Roche Molecular Biochemicals) using an antisense oligonucleotide corresponding to the 3′-end of the oligopeptide sequence of an existing construct, extending it by three codons, and providing again a BamHI sequence, in combination with a 5′-primer corresponding to a sequence in the plasmid. For example, H1ΔQV19, H1ΔQV22, and H1ΔQV25 were produced using the mutagenic antisense oligonucleotides CGCGGATCCCACCACCACCACGACAACTACCACG, CGCGGATCCGACGACGACCACCACCACCACGACA, and CGCGGATCCTACTACAACGACGACGACCACCACC with the templates H1ΔQV16, H1ΔQV19, and H1ΔQV22, respectively. The polymerase chain reaction products were digested with KpnI and BamHI and ligated into pE-BE.
To introduce different amino acids within the oligoleucine sequence of H1ΔQL16 and H1ΔQL19 for the constructs H1ΔQX2/16 and H1ΔQX2/19, polymerase chain reaction was performed using an antisense oligonucleotide corresponding to the 3′-end of the oligoleucine sequence of H1ΔQL16 or H1ΔQL19 including the BamHI site and containing mutations to alter the sequence of Leu-8 and Leu-13 or the sequence of Leu-11 and Leu-16, respectively, to codons of the desired amino acid X. To introduce the amino acids in the N-terminal half of the oligoleucine sequence of H1ΔQL19 for the constructs H1ΔQX2/19N, polymerase chain reaction was performed using a sense oligonucleotide corresponding to the 5′-end of the sequence including the KpnI site and containing mutations to alter the sequence of Leu-4 and Leu-9 to the desired codons. All constructs were verified by DNA sequencing.
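The substitution scheme can be made concrete with a short sketch that writes out the hydrophobic segment of each construct series from the positions given above. The function and dictionary names are hypothetical and serve only as illustration; residue positions are 1-based and counted from the N-terminal end of the oligoleucine sequence.

```python
def oligoleucine_with_substitutions(length, positions, residue):
    """Return an oligoleucine segment with `residue` at the given 1-based positions."""
    segment = ["L"] * length
    for position in positions:
        segment[position - 1] = residue
    return "".join(segment)


# (segment length, substituted positions) as stated in the construct descriptions.
SERIES = {
    "X2/16": (16, (8, 13)),
    "X2/19": (19, (11, 16)),
    "X2/19N": (19, (4, 9)),
}


def build_series(residue):
    """Build the hydrophobic segment of all three series for one amino acid X."""
    return {
        name: oligoleucine_with_substitutions(length, positions, residue)
        for name, (length, positions) in SERIES.items()
    }


for name, segment in build_series("P").items():
    print(name, segment)
```

Written this way, it is easy to see that the substituted residues in X2/16 and X2/19 lie at the fourth and ninth positions counted from the C-terminal end, whereas in X2/19N they lie at the fourth and ninth positions from the N-terminal end.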
In Vivo Expression and Analysis of Receptor Constructs-Cell culture reagents were purchased from Life Technologies, Inc. COS-7 cells were grown in Eagle's minimal essential medium with 10% fetal calf serum at 37°C with 7.5% CO2. The media were supplemented with 2 mM L-glutamine, 100 units/ml penicillin, and 100 μg/ml streptomycin. Transfection of COS-7 cells was performed using Lipofectin (Life Technologies, Inc.) in six-well clusters, and the cells were processed the second day after transfection. For in vivo labeling, transfected cells were incubated for 30 min in methionine-free medium, labeled for 30 min at 37°C with 100 μCi/ml [35S]methionine (NEN Life Science Products) in starvation medium, transferred to 4°C, and washed twice with phosphate-buffered saline. To extract cytoplasmic proteins, the cells were incubated with 500 μl of 0.1% saponin in phosphate-buffered saline for 30 min at 4°C. The saponin extract was removed, and the cells were lysed. Both fractions were immunoprecipitated using a rabbit antiserum directed against a synthetic peptide corresponding to residues 277-287 near the carboxyl terminus of the ASGP receptor H1 (anti-H1C). The immune complexes were isolated with protein A-Sepharose (Amersham Pharmacia Biotech) and analyzed by SDS-polyacrylamide gel electrophoresis.
For analysis with endo-β-N-acetylglucosaminidase H (Roche Molecular Biochemicals), the immune complexes were isolated with protein A-Sepharose and boiled in 50 μl of 50 mM sodium citrate, pH 6, 1% SDS. Aliquots were incubated with 1 milliunit of endo-β-N-acetylglucosaminidase H for 2 h at 37°C. Finally, samples were boiled in SDS-sample buffer and analyzed by SDS-polyacrylamide gel electrophoresis. The gels were fixed, soaked in 1 M sodium salicylate containing 1% glycerol, dried, and fluorographed on Kodak BioMax MR films. Quantitation was performed using a PhosphorImager (Molecular Dynamics, Inc.).

RESULTS
Topogenesis of Hydrophobic Homo-oligomeric Signals-As a starting point to test the topogenic properties of different hydrophobic sequences, we used protein H1ΔQ, a derivative of subunit H1 of the human asialoglycoprotein receptor. Wild-type H1 is a typical type II membrane protein with a 40-residue N-terminal, cytosolic domain. Truncation of the 36 N-terminal amino acids in H1Δ (Fig. 1A) did not affect its orientation in the membrane (14). In H1ΔQ, the arginine preceding the transmembrane domain was in addition mutated to glutamine, reducing the N-terminal positive charge. The charge distribution for the signal of H1ΔQ is still typical of a type II membrane protein: +1 at the N terminus due to the α-amino group of Met-1 versus −1 in the C-terminal flanking region (calculated according to Ref. 7). Nevertheless, ~20% of the polypeptides inserted with an Nexo/Ccyt orientation (14). With a Leu16 sequence in place of the wild-type 19-residue hydrophobic sequence, H1ΔQL16 inserted with an increased fraction of ~60% N-terminal translocation (14). This construct was therefore a good model to test the insertion behavior of a variety of derivatives with different hydrophobic sequences.
Constructs were transiently expressed in COS-7 cells, labeled with [35S]methionine for 30 min, immunoprecipitated, and analyzed by SDS-gel electrophoresis and fluorography. Their topologies with respect to the membrane (schematically shown in Fig. 1B) were derived from the glycosylation state of the proteins and from their extractability with 0.1% saponin. Translocation of the C terminus is apparent by glycosylation of the products. Translocation of the N terminus results in unglycosylated polypeptides, which can be distinguished from molecules that failed to insert by extraction with 0.1% saponin. This treatment allows soluble polypeptides to be released into the medium, whereas the membranes remain sufficiently intact to retain integrated proteins. As shown in Fig. 1C (lanes 1-4), both the endo-β-N-acetylglucosaminidase H-sensitive, glycosylated form and the unglycosylated form of H1ΔQL16 were completely resistant to extraction into the saponin supernatant (S) and were recovered with the remainder of the cells (C). This indicates that they were integral membrane proteins. In contrast, H1ΔQA16 was entirely extracted with saponin (Fig. 1C, lanes 5-8). Thus, a sequence of 16 alanines is not functional as a signal sequence for targeting and insertion into the ER membrane.
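The assay logic described above can be summarized as a small decision rule. The sketch below is an illustration under stated assumptions, not the analysis procedure actually used: the function names and the band intensities are hypothetical, and the normalization (N-terminal translocation as a percentage of all inserted products, i.e. everything except unglycosylated, extractable polypeptides) follows the definition given in the legend to Fig. 3. Glycosylated products count as C-terminally translocated even when they are saponin-extractable, as is the case for signal-cleaved forms.

```python
def classify(glycosylated, saponin_extractable):
    """Assign a product to a topology class from its two assay read-outs."""
    if glycosylated:
        # Glycosylation requires exposure of the C-terminal domain to the ER lumen.
        return "C terminus translocated"
    if saponin_extractable:
        # Unglycosylated and released by saponin: never integrated into the membrane.
        return "not inserted"
    return "N terminus translocated"


def percent_n_translocation(intensities):
    """Percentage of inserted products with a translocated N terminus.

    `intensities` maps (glycosylated, saponin_extractable) to a band intensity.
    """
    totals = {}
    for (glycosylated, extractable), value in intensities.items():
        label = classify(glycosylated, extractable)
        totals[label] = totals.get(label, 0.0) + value
    n_exo = totals.get("N terminus translocated", 0.0)
    inserted = n_exo + totals.get("C terminus translocated", 0.0)
    if inserted == 0:
        raise ValueError("no inserted products to normalize against")
    return 100.0 * n_exo / inserted


# Hypothetical band intensities in arbitrary units, for illustration only.
example = {
    (True, False): 30.0,   # glycosylated, retained with the cells
    (True, True): 10.0,    # glycosylated, extracted (e.g. signal-cleaved)
    (False, False): 60.0,  # unglycosylated, retained with the cells
    (False, True): 25.0,   # unglycosylated, extracted
}
print(percent_n_translocation(example))
```

The sketch does not model the correction applied to the partially extractable glycosylated Tyr22 product.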
H1ΔQ with the wild-type transmembrane segment (lanes 9-12) produced a small fraction of unglycosylated, inextractable polypeptides corresponding to proteins spanning the membrane with an exoplasmic N terminus. The majority of the polypeptides were glycosylated. However, a significant fraction was slightly smaller and saponin-extractable. This material is the product of signal peptidase cleavage, as has previously been observed for H1Δ translated in vitro (18). There, the cleavage site had been identified by radiosequencing to be between glycine and serine at the end of the apolar segment. Depending upon the presence or absence of the N-terminal tail, the signal sequence is most likely positioned differently within the translocon or the membrane, exposing a cryptic site to signal peptidase. Interestingly, H1Δ was not significantly cleaved in vivo (14,19), possibly because the signal leaves the range of signal peptidase much faster in vivo. Mutation of the N-terminal flanking arginine to glutamine in H1ΔQ appeared to further enhance accessibility of the cleavage site, resulting in more than 50% cleavage in vivo.
The other hexadecamer sequences tested in the context of H1ΔQ, i.e. Ile16, Phe16, Met16, and Val16, produced functional signal sequences giving rise to very few polypeptides that were both unglycosylated and extractable (Fig. 2). In comparison with Leu16, only Ile16 generated a larger fraction of polypeptides with a translocated N terminus (86%; Fig. 2, lanes 3 and 4). All others exclusively translocated their C terminus (lanes 7-12). An alternating sequence of leucines and valines, (LV)8, yielded an intermediate distribution with 40% N-terminal translocation. The signal anchors of H1ΔQV16 and H1ΔQ(LV)8 were almost completely cleaved.
In order to determine the relative topogenic preference of homo-oligomers of valine, phenylalanine, and methionine, we generated constructs with extended sequences (Val16 to Val25, Phe16 to Phe25, Met22). In addition, we also analyzed oligomers of tryptophan (Trp19 and Trp21) and tyrosine (Tyr22), i.e. of the amino acids with the most bulky side chains. Residues that are less hydrophobic than alanine were not tested, since they are unlikely to yield functional signal sequences. All of the sequences tested were capable of targeting and inserting the protein into the ER membrane, although not always completely (Fig. 2, lanes 19-42). As observed before for oligoleucine sequences (14) (Fig. 3), longer oligomers of valine, tryptophan, or phenylalanine produced increasing fractions of polypeptides with N-terminal translocation (Figs. 2 and 3). The length dependence of orientation for the different amino acids is approximately parallel but shifted relative to each other (Fig. 3). Fifty percent N-terminal translocation is obtained with approximate lengths of 14 (leucine), 19 (valine and tryptophan), and 23 residues (phenylalanine).
In the oligovaline series, an increasing population of products was found not to be targeted (i.e. was not glycosylated but was extractable), indicating that the functionality of the signal sequence was reduced with increasing length. Yet, long hydrophobic sequences per se did not cause poor functionality, since even 25-mers of leucine (14) and phenylalanine (Fig. 2) were completely integrated. Incomplete insertion was also obtained for constructs with Trp19, Trp21, and Tyr22. In the case of Tyr22, however, even glycosylated polypeptides without apparent signal cleavage were partially extracted, suggesting that an oligotyrosine sequence may not be capable of efficiently anchoring a protein in the membrane. From the quantitation of the results (Fig. 3), the following ranking of the amino acids with respect to their ability as a homo-oligomer to translocate the N terminus can be derived:
Signal Cleavage Does Not Influence Topogenesis-We have recently observed that glycosylation can influence topogenesis of proteins with ambiguous topogenic determinants (11). This indicated that, within the translocon, the nascent polypeptides undergo dynamic reorientation, which can be influenced by protein modifications occurring simultaneously. Since signal cleavage, like glycosylation, occurs cotranslationally, it might also affect topogenesis. To test this possibility, we compared the insertion behavior of the signals Val16, (Leu-Val)8, Leu16, and Ile16 with sequences carrying individual mutations designed to prevent or to allow signal cleavage.
The most probable reason why Val16, (Leu-Val)8, and wild type, but not Leu16 and Ile16, allow signal cleavage is the presence of valines near the C terminus that fit into position −3 or −1 relative to the cleavage site where signal peptidase requires small, uncharged residues (20). Indeed, mutation of the last two valines of Val16 to leucines in V14LL blocked cleavage entirely (Fig. 4, lanes 3 and 4), as did mutation of the last valine of (Leu-Val)8 to leucine in (LV)7LL (lanes 7 and 8). Conversely, exchanging the penultimate residue of Leu16 to valine in L14VL (lanes 11 and 12) induced efficient cleavage of the glycosylated products. However, mutations to valine in the terminal positions of Ile16 (I13VII, I14VI, and I15V; lanes 15-20) were ineffective. Most importantly, the mutations did not significantly affect the ratio of N- to C-terminal translocation, as is evident from quantitation of the products (values indicated in Fig. 4). Thus, signal cleavage did not detectably influence topogenesis. Therefore, cleavage of a construct does not interfere with the analysis of the topogenic properties of different sequences.

Topogenic Contribution of Uncharged Amino Acids as Part of a Hydrophobic Signal-The hydrophobic segment of natural signal sequences is not exclusively composed of hydrophobic amino acids; it may also contain polar, uncharged residues. To determine the contribution of all uncharged amino acids to the orientation of a hydrophobic signal sequence, two residues in the generic oligoleucine signal of H1ΔQL16 or H1ΔQL19 were exchanged for all uncharged residues (H1ΔQX2/16 or H1ΔQX2/19; Fig. 5). The nonleucine residues were placed at positions n and n + 5, which places them on opposite sides of an α-helix. The expression products are shown in Fig. 5 (A and B), and the resulting topologies are quantified in Fig. 6. The residues that were tested in both H1ΔQX2/16 and H1ΔQX2/19 showed the same relative behavior: Leu ≈ Trp > Tyr > Cys > Ala > Thr. Only isoleucine and valine increased the fraction of polypeptides with a translocated N terminus in comparison with an oligoleucine sequence. Tryptophan had no effect, whereas all the other amino acids reduced N-terminal translocation to various degrees. The amino acids that most effectively reduced N-terminal translocation are the most polar uncharged residues histidine, glutamine, and asparagine and the helix breakers proline and glycine.
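The geometric statement about positions n and n + 5 can be illustrated with a short calculation. The sketch below assumes an ideal α-helix with 3.6 residues per turn (100° of rotation per residue); whether the hydrophobic segment adopts such an ideal geometry within the translocon is not established here, and the function name is illustrative.

```python
def angular_separation(spacing, residues_per_turn=3.6):
    """Angle in degrees between two side chains `spacing` residues apart,
    viewed down the helix axis and folded into the range 0-180."""
    angle = (spacing * 360.0 / residues_per_turn) % 360.0
    return min(angle, 360.0 - angle)


for spacing in (3, 4, 5, 7):
    print(spacing, round(angular_separation(spacing)))
```

Under this assumption, residues five positions apart are separated by about 140°, i.e. they point to roughly opposite faces of the helix, whereas spacings of 4 or 7 residues would place them on nearly the same face (about 40° and 20°, respectively).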
In the initial series H1ΔQX2/16 and H1ΔQX2/19, the nonleucine residues were placed in positions 4 and 9 from the C-terminal end of the hydrophobic sequence. To test whether the position in the second half of the hydrophobic sequence is important for topology, constructs were made in which Thr, Ser, Gly, Asn, Gln, His, and Pro, i.e. the residues most strongly affecting the topogenic behavior of the apolar sequence, were placed in positions 4 and 9 from the N-terminal end (H1ΔQX2/19N; Fig. 5C and Fig. 6, circles). Moving these residues from the C-terminal to the N-terminal half of the signal inverts the hydrophobicity gradient along the hydrophobic sequence, which was expected to result in a slight reduction of N-terminal translocation (15). This effect was observed for asparagine, whereas for threonine, serine, glutamine, and histidine, no statistically significant difference could be observed. In the case of proline and glycine, however, N-terminal translocation was significantly increased in H1ΔQX2/19N versus H1ΔQX2/19. Since proline and glycine are the two strongest helix breakers, this result suggests that the conformation of the hydrophobic sequence also influences topogenesis.
FIG. 3. The results shown in Fig. 2 were quantified using a PhosphorImager. The fraction of polypeptides with a translocated N terminus (i.e. unglycosylated and inextractable) as a percentage of the total of products that were inserted into the ER membrane (i.e. all products except unglycosylated, extractable polypeptides) is plotted versus the length of the hydrophobic homo-oligopeptide sequence. Since glycosylated H1ΔQY22 was partially extractable in the absence of signal cleavage (Fig. 2, lanes 37 and 38), the fraction of unglycosylated, membrane-integrated protein was corrected accordingly (Y). The mean and S.D. of 3-6 determinations are shown.

DISCUSSION
Signal sequences have two distinct functions: to target the nascent chain-ribosome complex to the ER membrane and to initiate translocation of the sequence on either one of its ends. The first function is to recruit the signal recognition particle in an interaction that requires a sufficiently hydrophobic core of the signal (1,21). After targeting to the ER membrane, the signal is recognized by the Sec61 translocation complex, leading to a tight junction between the ribosome-nascent chain complex and the translocon (22). Photocross-linking analysis revealed that upon complete insertion into the channel, signal sequences are precisely positioned with respect to the protein components of the channel and lipids, suggesting a specific binding site of protein-protein interactions at the interface between the channel and the surrounding lipids (5,23). The hydrophobic segment of the signal was found to specifically contact the transmembrane helices 2 and 7 of Sec61p in yeast (24). An additional component, TRAM (translocating chain-associating membrane protein), was shown by in vitro crosslinking to be in contact with signal sequences (2,25,26) and in particular with the N-terminal hydrophilic segments (27). In reconstitution assays, TRAM was found to be particularly important for signals with relatively short N-terminal hydrophilic segments (25). The TRAP (translocon-associated protein) complex has also been shown to be in proximity to translocating signal sequences (28,29), and although it is not essential for translocation, its true role is yet to be elucidated. Recognition of the signal sequence by a binding site in or closely associated with the translocon is likely to be a key event in defining the orientation of the signal and thus whether the N terminus or the C terminus of the protein is translocated across the membrane.
The flanking charges of the signal are most likely to exert their effect on orienting the peptide by electrostatic interactions with the translocon. It is less clear how the apolar segment affects topogenesis. The effect of a hydrophobicity gradient along the apolar sequence, as observed by Harley et al. (15), could be explained by a similar gradient in the signal binding site of the translocon. A more hydrophobic surface near the luminal side would result in preferential orientation of the signal with its more hydrophobic end pointing toward the ER lumen. How longer and/or more hydrophobic sequences without hydropathy gradients induce increased N-terminal translocation is not obvious. To analyze the topogenic properties of the apolar signal sequence, we challenged the insertion machinery in vivo, using transfected COS-7 cells, with a variety of diagnostic constructs differing exclusively in the hydrophobic segment of the signal. This segment was either a homo-oligomer of a hydrophobic amino acid or oligoleucine sequences with two interspersed uncharged amino acids. The behavior of homo-oligomer signals showed that the machinery can cope with sequences that differ dramatically in size and shape, e.g. oligovaline versus oligotryptophan. Oligotryptophan and oligotyrosine sequences were sufficiently hydrophobic to be recognized by the signal recognition particle and by the translocon. This is not trivial, since various hydrophobicity scales differ considerably in the classification of these two residues (Table I). According to some criteria, tryptophan is considered the most hydrophobic amino acid, whereas according to others it is ranked below serine and threonine. Similarly, tyrosine is regarded as a very hydrophilic residue by some scales. 
Although the oligotyrosine sequence was functional for targeting and insertion, it appeared not to be an efficient membrane anchor, since glycosylated (and therefore initially integrated) polypeptides were partially extractable by saponin. Furthermore, our results confirm as a general rule what was previously observed for oligoleucine sequences of 7-25 residues (14); the longer a homo-oligomeric sequence of a given residue, the higher the proportion of proteins with a translocated N terminus.
Leucine is the most abundant amino acid in transmembrane and signal sequences (6,30). It can thus serve as a generic hydrophobic matrix to assess the relative effect of interspersed nonleucine residues, even if they are polar and cannot possibly function as a targeting signal by themselves. The ranking order of residues with respect to promoting or allowing N-terminal translocation in an oligoleucine context can be seen to resemble a hydrophobicity scale. However, existing hydrophobicity scales, which are based on different theoretical calculations and/or various experimental measurements, differ considerably from each other (Table I). Our topogenic scale resembles most closely the hydropathy scale by Kyte and Doolittle (31). The main disagreements are the rankings of tryptophan, tyrosine, glycine, and proline.
One observation suggests, however, that the hydrophobicity of the side chains is not the only criterion influencing topogenesis. The helix breakers glycine and proline showed a position-dependent effect. Proline is a helix breaker because it cannot form a hydrogen bond between its amide nitrogen and the backbone carbonyl of a residue at position −4 in an α-helix. For this reason, a proline is tolerated in positions 1-4 of a helix. In H1ΔQP2/19 and H1ΔQP2/19N, there is one proline near the center of the hydrophobic sequence and a second one at position 4 either from the C terminus, where it destabilizes or kinks the α-helix, or from the N terminus, where it does not. In H1ΔQP2/19N, the effect of the prolines on the topologies is smaller than in H1ΔQP2/19, correlating with the conformational effect of the second proline. Similarly, the effect of glycine, which destabilizes an α-helix by increasing the conformational freedom, is reduced near the N terminus of the helix. The conformation of the hydrophobic core of the signal may therefore play a role in topogenesis. There is clearly no correlation between the topogenic effects and the α-helix propensities of the amino acids as determined from the structures of soluble proteins (32) (Table I, column CF) or from circular dichroism analysis of peptides in aqueous solution (33). However, it has been shown that the helix propensities are dramatically different in a hydrophobic environment (33,34). Testing the effect of different amino acids X on the helix content of peptides with the sequence KKAAAXAAAAAXAAWAAXAAA-KKKK-amide in n-butanol yielded a helicity scale that is quite similar to hydrophobicity scales (33) (Table I, column LD). This scale is also quite similar to the topogenic scale determined here. The major differences concern glycine, whose properties are position-dependent in our system, and tryptophan.
FIG. 6. The results shown in Fig. 5 were quantified using a PhosphorImager. The fraction of polypeptides with a translocated N terminus (i.e. unglycosylated and inextractable) as a percentage of the total inextractable products is plotted for each amino acid X in H1ΔQX2/16 (A, bars), H1ΔQX2/19 (B, bars), and H1ΔQX2/19N (B, open circles). The mean and S.D. of three or four determinations are shown.

TABLE I Topogenic preference of amino acids for N-terminal translocation versus hydrophobicity and helix propensity scales
The ranking of amino acids (one-letter codes) with respect to topogenic preference for N-terminal translocation as determined by homo-oligopeptide signal sequences in constructs H1ΔQXn (homo; from (40)) or to α-helix propensity in an apolar environment (LD, Liu and Deber (33)) or as determined by statistical analysis of the structures of globular proteins (CF, Chou and Fasman (32)).
Column groups: Topogenesis; Hydrophobicity scales; α-Helix propensity.
A recent study on the insertion of polypeptides with closely spaced conflicting signal sequences demonstrated that dynamic reorientation of polypeptides can occur within the translocon (11). This is also likely to happen for the signals analyzed here, since the topogenic determinants are somewhat ambiguous, resulting in a mixture of orientations. Signals that form a short, unstable, or kinked hydrophobic helix may reorient more easily within the translocon than those with a long and stable helix. Since this correlates with preferential translocation of the C and the N terminus, respectively, it might indicate that the signal initially inserts into the translocon with its free N terminus and subsequently, depending on its flanking charges, tends to invert its orientation as the nascent polypeptide grows at the C-terminal end. A sizable N-terminal hydrophilic extension would be expected to inhibit initial insertion with the N terminus and reduce N-terminal translocation. This is indeed the case, since N-terminal translocation, which for the tailless protein H1ΔLeu25 amounted to ~80%, was entirely blocked for H1Leu25, which carries the 40-residue N-terminal extension of the wild-type receptor (14). Only in combination with inverted flanking charges was the effect of the long hydrophobic sequence restored (H1-4ΔLeu19-22).
In this study, we have presented a new scale ranking amino acids with respect to their tendency to induce N-terminal translocation of a signal sequence. The precise mechanism by which components of the ER membrane determine the final orientation of a protein remains unclear. However, this scale, determined in vivo, will be useful in giving a more accurate method to predict the behavior of amino acids in protein topogenesis.