Identification of Sequence Determinants That Direct Different Intracellular Folding Pathways for Aquaporin-1 and Aquaporin-4*

Homologous aquaporin water channels utilize different folding pathways to acquire their transmembrane (TM) topology in the endoplasmic reticulum (ER). AQP4 acquires each of its six TM segments via cotranslational translocation events, whereas AQP1 is initially synthesized with four TM segments and subsequently converted into a six membrane-spanning topology. To identify sequence determinants responsible for these pathways, peptide segments from AQP1 and AQP4 were systematically exchanged. Chimeric proteins were then truncated, fused to a C-terminal translocation reporter, and topology was analyzed by protease accessibility. In each chimeric context, TM1 initiated ER targeting and translocation. However, AQP4-TM2 cotranslationally terminated translocation, while AQP1-TM2 failed to terminate translocation and passed into the ER lumen. This difference in stop transfer activity was due to two residues that altered both the length and hydrophobicity of TM2 (Asn49 and Lys51 in AQP1 versus Met48 and Leu50 in AQP4). A second peptide region was identified within the TM3–4 peptide loop that enabled AQP4-TM3 but not AQP1-TM3 to reinitiate translocation and cotranslationally span the membrane. Based on these findings, it was possible to convert AQP1 into a cotranslational biogenesis mode similar to that of AQP4 by substituting just two peptide regions at the N terminus of TM2 and the C terminus of TM3. Interestingly, each of these substitutions disrupted water channel activity. These data thus establish the structural basis for different AQP folding pathways and provide evidence that variations in cotranslational folding enable polytopic proteins to acquire and/or maintain primary sequence determinants necessary for function.

Aquaporins comprise a conserved family of membrane proteins that form water- and/or solute-selective channels in cellular membranes (1,2). Hydropathy analysis, topologic studies, and cryoelectron diffraction experiments indicate that aquaporins contain a hydrophobic core region with six transmembrane (TM) helices (3-6). Ten mammalian aquaporins have been identified to date. They exhibit distinct cellular and subcellular expression patterns and play important roles in regulating water homeostasis under a variety of normal and pathologic conditions (7-12). In cell membranes, aquaporins exist primarily as higher ordered structures. AQP1 forms stable tetramers, with each monomer comprising a functional channel (13-15). A closely related aquaporin, AQP4, also forms tetramers, which are organized into large macromolecular arrays (16,17). Both AQP1 and AQP4 form high capacity water-selective channels, suggesting that they share similar overall structural features (18).
Eukaryotic aquaporins are synthesized and assembled in the rough endoplasmic reticulum (ER). During this process TM segments are oriented with respect to the ER membrane and integrated into the lipid bilayer (19,20). These early biogenesis events are likely mediated via specific interactions between the Sec61 ER translocation machinery and topogenic sequence determinants (e.g. signal and stop transfer sequences) encoded within the nascent polypeptide (21-25). Consistent with this, topogenic determinants have been identified in AQP1 and AQP4 and shown to direct specific translocation and integration events (19,20).
Despite their homology and similar function, AQP1 and AQP4 encode topogenic determinants with very different translocation properties. AQP4 contains three internal signal anchor sequences and three stop transfer sequences that sequentially initiate and terminate polypeptide translocation as the nascent chain emerges from the ribosome (20). These determinants thus establish six TM segments in a vectorial and cotranslational manner. AQP1, in contrast, encodes only two distinct signal sequences (TM1 and TM5) and two stop transfer sequences (TM3 and TM6) that cotranslationally establish four TM segments (19). AQP1-TM2 and TM4 are initially directed into the ER lumen and cytosol, respectively. Recent studies have revealed that this four membrane-spanning structure represents a folding intermediate that undergoes a topological reorientation during and following the completion of AQP1 synthesis (26). Thus, whereas mature AQP1 and AQP4 both exhibit a similar six membrane-spanning topology, the translocation, integration, and folding events that establish this topology are quite different.
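The sequential model described above can be sketched as a small state machine. This is only an illustrative simplification, not a method used in this study: the labels `SIGNAL`, `STOP`, and `NONE` and the function name are hypothetical, and each TM segment is assumed to act independently as it emerges from the ribosome, with the N terminus starting in the cytosol.

```python
from typing import List, Tuple

# Hypothetical labels for the cotranslational activity of each TM segment.
SIGNAL = "signal"         # initiates translocation into the ER lumen
STOP = "stop_transfer"    # terminates translocation; chain returns to cytosol
NONE = "none"             # no independent activity; segment follows the chain

# Activities for TM1..TM6 as summarized in the text (refs. 19 and 20).
AQP4_ACTIVITIES = [SIGNAL, STOP, SIGNAL, STOP, SIGNAL, STOP]
AQP1_ACTIVITIES = [SIGNAL, NONE, STOP, NONE, SIGNAL, STOP]


def cotranslational_topology(activities: List[str]) -> Tuple[List[str], List[int]]:
    """Return the compartment reached after each TM segment and the
    numbers of the TM segments that span the membrane cotranslationally."""
    side = "cytosol"  # the N terminus is cytosolic
    loop_sides: List[str] = []
    spanning: List[int] = []
    for tm_number, activity in enumerate(activities, start=1):
        if activity == SIGNAL and side == "cytosol":
            side = "lumen"
            spanning.append(tm_number)
        elif activity == STOP and side == "lumen":
            side = "cytosol"
            spanning.append(tm_number)
        # A segment with no activity stays on whichever side the chain is on.
        loop_sides.append(side)
    return loop_sides, spanning


aqp4_sides, aqp4_spans = cotranslational_topology(AQP4_ACTIVITIES)
aqp1_sides, aqp1_spans = cotranslational_topology(AQP1_ACTIVITIES)
print("AQP4 spans:", aqp4_spans)
print("AQP1 spans:", aqp1_spans)
```

Under these assumptions the sketch yields six spanning segments for AQP4 and four for AQP1, with the residues following AQP1-TM2 in the lumen and those following TM4 in the cytosol, matching the intermediate topology described in the text.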
To determine why closely related proteins utilize different folding pathways, we used a series of chimeric AQP1-AQP4 proteins (diagrammed in Fig. 1) and compared the sequence determinants responsible for directing specific translocation events. Topologic analysis identified differences in two peptide regions. Hydrophilic residues flanking the N terminus of AQP1-TM2 (Asn49 and Lys51) prevented TM2 from functioning as a stop transfer sequence and terminating translocation. Similarly, residues within the TM3-4 peptide loop enabled AQP4-TM3 but not AQP1-TM3 to reinitiate translocation and cotranslationally span the membrane. Exchanging only these two regions in AQP1 with corresponding residues from AQP4 converted AQP1 to a cotranslational biogenesis pathway. However, the resulting chimeras lacked water channel activity. Taken together, our results demonstrate how minor sequence variations can significantly affect polytopic protein folding and provide an example in which a novel posttranslational biogenesis pathway is utilized to conserve structural determinants that are critical for function.

MATERIALS AND METHODS
cDNA Construction-AQP chimeras were engineered using the polymerase chain reaction (PCR) overlap extension method (27) (Vent DNA polymerase, New England Biolabs, Beverly, MA). Complementary oligonucleotides were designed to span the desired fusion site between AQP1 and AQP4 proteins (Table I). These oligonucleotides together with flanking 5′ and 3′ primers were used to perform two initial PCR reactions using template plasmids pSP64.CHIP28 (AQP1) and SP64T-MIWC (AQP4) described previously (19,20). The overlapping PCR fragments were then hybridized and used as templates for a third PCR amplification, thus "ligating" AQP1 and AQP4 sequences at the fusion site. Products of PCR 3 were digested with restriction enzymes (NcoI and AvaI for AQP1, and HindIII and BstXI for AQP4) and ligated into appropriately digested and phosphatase-treated pSP64.CHIP28 and SP64T-MIWC vectors to regenerate full-length proteins. For constructs where internal peptide regions were mutated or exchanged, this process was repeated using wild type or mutant AQP chimeras as the DNA template. Specific residues exchanged are indicated in Table I, and locations of each fusion site are shown schematically in Fig. 1.
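The joining step of overlap extension can be pictured with a short string-based sketch. It is a simplification that treats both first-round products as sense-strand text and ignores strand complementarity and polymerase extension. The function name and the two example sequences are hypothetical and are not the oligonucleotides used here.

```python
def overlap_join(upstream: str, downstream: str, min_overlap: int = 6) -> str:
    """Join two fragments that share an identical overlap at the fusion site.

    The longest suffix of `upstream` that equals a prefix of `downstream`
    is treated as the overlap introduced by the fusion-site primers.
    """
    longest_possible = min(len(upstream), len(downstream))
    for k in range(longest_possible, min_overlap - 1, -1):
        if upstream[-k:] == downstream[:k]:
            # Keep one copy of the shared region.
            return upstream + downstream[k:]
    raise ValueError("fragments do not share an overlap of the required length")


# Hypothetical first-round products sharing the 6-nucleotide overlap GAATTC.
product_1 = "ATGGCCAGCGAATTC"
product_2 = "GAATTCAAGCTGTAA"
chimera = overlap_join(product_1, product_2)
print(chimera)
```

In practice the overlap is set by the complementary primers listed in Table I, so the fusion site is fixed by primer design rather than searched for as in this sketch.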
Fusion proteins containing the C-terminal translocation reporter were generated by amplifying wild type, mutant, or chimeric AQP coding sequences (10-15 cycles of PCR) using a sense oligonucleotide (SP6 promoter) and antisense oligonucleotides encoding a BstEII restriction site at AQP4 codons Val46, Gly72, Lys92, Val120, and Val140, or AQP1 codons Val52, Pro77, Arg93, Val107, and Leu139. Antisense oligonucleotides were identical to those described previously (19,20), and fusion sites are indicated in Fig. 1B. PCR fragments were digested with NcoI and BstEII (AQP1 constructs) or HindIII and BstEII (AQP4 constructs) and ligated 5′ to a translocation reporter (P), which encodes 142 C-terminal residues from the secretory protein bovine prolactin. This reporter contains no intrinsic topogenic information and faithfully follows topogenic determinants in a wide variety of contexts (19,28). The resulting plasmids thus encode AQP1 or AQP4 extending from residue 1 to the engineered BstEII truncation site followed by the P reporter. Wild type, mutant, or chimeric TM3 constructs were generated by PCR amplification using sense oligonucleotides TCAACCCCATGGTCACACTGGGGCTGCTG (AQP1) and TCAACCTCATGATCACAGTGGCCATGGTGTG (AQP4) that encode ATG start codons at residue Val79 and Ala77, respectively, and antisense oligonucleotides encoding a BstEII site at residue Leu139 (AQP1) or Val140 (AQP4). Fragments were digested with NcoI and BstEII and ligated into an NcoI/BstEII-digested vector containing the P reporter. All regions of cDNA generated by the PCR overlap extension were verified by DNA sequencing. Truncated constructs were verified by extensive restriction digestion and/or sequencing.

TABLE I Chimeric constructs
Chimeric constructs are referred to by the region of AQP exchanged followed by the parent AQP protein. Fusion sites are indicated by /. Shown in parentheses are peptide regions that have been exchanged for internal segments. For example, AQP4-TM1/AQP1 indicates that AQP1 residues from the N terminus through TM1 have been replaced with corresponding residues from AQP4; AQP4(TM2)/AQP1 indicates that AQP1-TM2 has been replaced with AQP4-TM2. Specific fusion sites designed in PCR oligonucleotides are shown at right. ECL-1 refers to the first extracellular loop as diagrammed in Fig. 1.
(Table body not recovered; the only surviving entry is the oligonucleotide sequence GGACAGCTCACTGCAGGCCATGGGCTCCTGGTG.)
In Vitro Transcription, Translation, and Protease Digestion-mRNA was transcribed with SP6 RNA polymerase (Epicentre Technologies, Madison, WI) using 2 μg of plasmid DNA in a 10-μl volume at 40°C for 1 h as described (19). Transcription mixture was added directly to translation mixture containing [35S]methionine, 40% rabbit reticulocyte lysate (29), and canine pancreas rough microsomal membranes prepared as described (30). Translation was carried out for 1 h at 24°C (19,29). Final concentration of membranes was A280 = 8.0, which resulted in >90% translocation and ~80% glycosylation of control proteins (31). For protease digestion, the translation mixture was aliquoted on ice, and CaCl2 was added to 10 mM final concentration (19). Proteinase K (PK) was then added (0.2 mg/ml) in the presence or absence of 1% Triton X-100. Samples were incubated on ice for 1 h, and residual protease was inactivated by rapid mixing with phenylmethylsulfonyl fluoride (10 mM) and heating to 100°C in 10 volumes of 1% SDS, 0.1 M Tris, pH 8.0, for 5 min. Samples were then added directly to SDS loading buffer and analyzed by SDS-PAGE.
Oocyte Water Permeability-Stage VI Xenopus laevis oocytes were harvested and maintained as described previously (32). cRNA was transcribed from linearized plasmids using SP6 polymerase and microinjected into oocytes. Water permeability was determined 24-48 h following injection using a hypotonic swelling assay (32). Oocyte swelling was measured in response to a 20-fold dilution of the extracellular Barth's buffer with distilled water. Oocyte volume was measured in 1-s intervals by quantitative imaging. Temperature control was maintained by a circulating water bath. Oocyte Pf was calculated from the initial rate of swelling, d(…
Autoradiography and Quantitation-SDS-PAGE gels were prepared using EN3HANCE (PerkinElmer Life Sciences) fluorography and autoradiography according to the manufacturer's instructions. Autoradiograms were scanned using a Pharmacia LKB Image Master DTS densitometer and quantitated using Image Master 1D software version 1.0 (Amersham Pharmacia Biotech). Prior to analysis, the densitometer was pre-calibrated with a Kodak photographic step tablet, and linearity was confirmed by serial dilution over a 40-fold concentration range and by phosphorimaging analysis (Bio-Rad Personal Molecular Imager Fx, Quant-1 software). Band intensities of autoradiograms and/or phosphorimaged gels were calculated as volume averaged pixel intensity (OD × mm2). Translocation efficiency was determined by correcting for the fractional methionine content of protease-protected peptide fragments relative to full-length polypeptides. Figures were prepared using an Agfa Studio Scan II transmission scanner and Adobe Photoshop software.
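The expression used for Pf under Oocyte Water Permeability is not recoverable from the text above. As a hedged illustration only, the sketch below uses the relation commonly applied to oocyte swelling assays in the literature, Pf = V0 · d(V/V0)/dt / (S · Vw · Δosm), which is an assumption and not necessarily the authors' exact formula. The oocyte surface area, initial volume, buffer osmolality, and swelling rate are hypothetical example values.

```python
def osmotic_water_permeability(
    relative_swelling_rate_per_s: float,   # d(V/V0)/dt, in 1/s
    surface_area_cm2: float,               # S, oocyte surface area
    initial_volume_cm3: float,             # V0, initial oocyte volume
    osm_in_mosm: float,                    # osmolality inside the oocyte
    osm_out_mosm: float,                   # osmolality of the diluted bath
    molar_volume_water_cm3_per_mol: float = 18.0,  # Vw
) -> float:
    """Return Pf in cm/s from the initial relative swelling rate."""
    # 1 mOsm = 1e-3 mol/L = 1e-6 mol/cm^3
    gradient_mol_per_cm3 = (osm_in_mosm - osm_out_mosm) * 1e-6
    if gradient_mol_per_cm3 <= 0:
        raise ValueError("a hypotonic bath requires osm_in > osm_out")
    return (initial_volume_cm3 * relative_swelling_rate_per_s) / (
        surface_area_cm2 * molar_volume_water_cm3_per_mol * gradient_mol_per_cm3
    )


# Hypothetical example: 200 mOsm buffer diluted 20-fold to 10 mOsm.
pf_example = osmotic_water_permeability(
    relative_swelling_rate_per_s=1e-3,
    surface_area_cm2=0.045,
    initial_volume_cm3=9e-4,
    osm_in_mosm=200.0,
    osm_out_mosm=200.0 / 20,
)
print(f"Pf = {pf_example:.2e} cm/s")
```

Because Pf is linear in the swelling rate, relative activities such as a mutant expressed as a percentage of wild type do not depend on the assumed geometric constants.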

RESULTS
AQP1 and AQP4 Topogenic Determinants Direct Different Cotranslational Topology-To compare the cotranslational translocation events that generate AQP1 and AQP4 topology, polypeptides were truncated at homologous sites following TM1, TM2, or TM3 and ligated to a C-terminal translocation reporter (P) derived from bovine prolactin (~15 kDa in size). Fusion sites and resulting constructs are diagrammed in Figs. 1 and 2A. Topology of the reporter was then determined by protease accessibility following expression in rabbit reticulocyte lysate supplemented with canine pancreas rough microsomes (CRM). Under these conditions, the reporter is protected from exogenous protease only when it resides in the ER lumen and only in the absence of detergent (28,33). Because the reporter domain encodes no intrinsic topogenic information and faithfully follows topogenic signals (19,28), it reflects the cotranslational topology of each fusion site at a particular stage of protein synthesis. Fig. 2 shows topology of the reporter engineered at five sequential fusion sites in AQP4 (gray) and AQP1 (black). Consistent with previous studies, each construct truncated after TM1 (residues Val46 (AQP4) or Val52 (AQP1)) generated two protease-protected fragments (Fig. 2, B and C, lanes 1-3, downward arrows). For AQP4, we previously showed that the larger fragment represents the full-length construct, while the smaller fragment resulted from an internal cleavage event, likely at a cryptic signal peptidase recognition site (20). Similarly, the three bands generated from plasmid AQP1.52.P represent full-length glycosylated, full-length non-glycosylated, and signal peptidase-cleaved polypeptides (Fig. 2C, lane 1, top, middle, and bottom bands, respectively) (19). Based on the efficiency of PK protection, TM1 C-terminal flanking residues resided in the ER lumen in greater than 70% of AQP1 and AQP4 nascent chains. Non-glycosylated AQP1 chains were not protected and represent polypeptides that either failed to target to the ER or failed to effectively translocate the reporter. Previous experiments have confirmed that the N termini of both AQP1 and AQP4 reside in the cytosol, although they are inaccessible to protease (19,20).
AQP4 fusion sites at residues Gly72 and Lys92 resided in the cytosol and were protease accessible (Fig. 2B, lanes 4-9). Residue Val120, at the C terminus of TM3, was also initially cytosolic (Fig. 2B, lanes 10-12). However, after synthesis of 20 additional residues, the fusion site at Val140 was translocated into the ER lumen in >50% of nascent chains as demonstrated by N-linked glycosylation at residue Asn131 and generation of a protease-protected, 28-kDa glycosylated polypeptide fragment (Fig. 2B, lanes 13 and 14, downward arrows). Glycosylation was confirmed by translation in the absence of CRM and/or in the presence of a tripeptide inhibitor of oligosaccharyltransferase (Ref. 20 and data not shown). Note that PK cleavage within the TM2-3 connecting loop would be expected to generate a fragment containing the 15-kDa P reporter, 3 kDa of N-linked carbohydrates, and 7 kDa of AQP4 polypeptide. Thus the protected fragment migrates slightly slower than its predicted size. The intensity of the fragment reflects that it contains only 44% of initial methionine residues. Taken together, these results support a cotranslational biogenesis model in which AQP4-TM1 targets the nascent chain-ribosome complex to the ER membrane and initiates translocation; TM2 terminates translocation, and TM3 (together with its C-terminal flanking residues) reinitiates translocation and spans the membrane in a type II topology.
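The arithmetic behind the predicted fragment size and the methionine correction mentioned above can be written out as a brief sketch. The correction shown is one plausible reading of "correcting for the fractional methionine content" and is an assumption; the function names and the band intensities in the example are hypothetical, not measured values.

```python
def predicted_fragment_kda(reporter_kda: float = 15.0,
                           carbohydrate_kda: float = 3.0,
                           aqp_peptide_kda: float = 7.0) -> float:
    """Sum the components of the protease-protected fragment named in the text."""
    return reporter_kda + carbohydrate_kda + aqp_peptide_kda


def translocation_efficiency(protected_intensity: float,
                             full_length_intensity: float,
                             methionine_fraction: float) -> float:
    """Scale the protected band up by the fraction of [35S]methionine it
    retains, then express it relative to the full-length band."""
    if not 0 < methionine_fraction <= 1:
        raise ValueError("methionine_fraction must be in (0, 1]")
    corrected = protected_intensity / methionine_fraction
    return corrected / full_length_intensity


predicted = predicted_fragment_kda()           # 15 + 3 + 7 kDa
# Hypothetical densitometry values (OD x mm2) for illustration only.
efficiency = translocation_efficiency(
    protected_intensity=22.0,
    full_length_intensity=100.0,
    methionine_fraction=0.44,                  # 44% of methionines retained
)
print(predicted, round(efficiency, 2))
```

The predicted size of 25 kDa compared with the observed 28 kDa is the discrepancy the text describes as slightly slower migration.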
In contrast to AQP4, AQP1 fusion sites within the TM2-3 peptide loop (residues Pro77 and Arg93) were directed into a protease-protected environment (Fig. 2C, lanes 4-9). Both uncleaved as well as signal peptidase-cleaved nascent chains remained protected from PK digestion, indicating that AQP1-TM2 failed to terminate translocation. Moreover, following the synthesis of TM3, residues Val107 and Leu139 both remained accessible to protease, indicating that TM3 spanned the membrane with its N terminus in the ER lumen and C terminus in the cytosol (Fig. 2C, lanes 10-15). Thus as AQP1 topogenic determinants emerge from the ribosome, they direct the nascent chain into a very different transmembrane orientation than corresponding determinants encoded within AQP4. Rather than spanning the membrane in a type I topology, AQP1-TM2 passes into the ER lumen. AQP1-TM3 then terminates translocation and initially spans the membrane in a type I topology (see diagrams in Fig. 2, B and C).
Effect of AQP4 Residues on AQP1 Topology-Previous studies of AQP1 and AQP4 have demonstrated no difference in the cotranslational topology of peptide loops C-terminal to TMs 4, 5, and 6 (19,20). We therefore reasoned that topogenic information responsible for different translocation events should reside within TM1-TM3. To test this hypothesis, a series of chimeric proteins was generated by systematically exchanging AQP1 and AQP4 TM segments and their flanking residues (diagrammed in Fig. 1). Each chimera was then truncated and fused to the P reporter, and topology at each fusion site was determined by protease protection.
Replacing AQP1-TM1 with corresponding residues from AQP4 produced no detectable change in AQP1 topology (Fig. 3A). Residues C-terminal to TM2 (Pro77) were translocated into the ER lumen, and residues Val107 and Leu139 were directed toward the cytosol. The latter two constructs generated a 16-kDa protease-protected fragment (Fig. 3A, lanes 5 and 8, downward arrows) that was derived from the N terminus of AQP1 and contained TM1, TM2, and TM3 as described previously (19). Similar fragments were also observed in Fig. 2C upon longer exposure and did not contain the P reporter (data not shown). Exchanging the N terminus through the first extracellular loop (ECL1) of AQP1 also did not alter AQP1 topology, although signal peptidase cleaved at two different sites in the Pro77 truncations (Fig. 3B). When the N terminus of AQP1 through TM2 was exchanged, the Pro77 fusion site became accessible to protease, indicating that TM2 terminated translocation and its C-terminal residues were oriented toward the cytosol (Fig. 3C, lanes 1-3). Surprisingly, however, this did not enable AQP1-TM3 to achieve its proper type II topology. Fusion sites N-terminal and C-terminal to TM3 all remained accessible to protease (<20% protection of the P reporter; Fig. 3C, lanes 4-9).
Effect of AQP1 Residues on AQP4 Topology-We next tested the reciprocal effects of replacing AQP4 residues with corresponding residues from AQP1. Exchanging TM1 or TM1-ECL1 had no effect on AQP4 topology (Fig. 4, A and B). In each case TM2 terminated translocation, and TM3 reinitiated translocation. When TM1 and TM2 were exchanged together, however, TM2 translocated into the ER lumen (Fig. 4C, lanes 1-3). In this context, residue Val120 was initially oriented in the cytosol, indicating that AQP4-TM3 could also terminate translocation (Fig. 4C, lanes 4-6). However, after synthesis through residue Val140, approximately 33% of nascent chains became doubly glycosylated, indicating that both Asn42 and Asn131 resided in the ER (Fig. 4C, lanes 7-9). In addition, PK digestion of these latter chains generated the 28-kDa protected fragment, demonstrating that TM3 spanned the membrane in a type II topology (Fig. 4C, lane 8). Thus, TM3 not only translocated its C-terminal residues into the ER lumen, it also reoriented N-terminal flanking residues from the ER lumen back into the cytosol. Of note, translocation was less efficient (<50%) than in wild type AQP4 (Fig. 2B), suggesting that TM3 topology was at least partially dependent on whether or not TM2 terminated translocation.
Role of TM2 in Directing AQP Topology-Results from Figs. 3 and 4 suggest that the topogenic behavior of TM2 plays a key role in AQP biogenesis. Consistent with this, the hydrophobic core of AQP4-TM2 efficiently terminated translocation when it was engineered into AQP1 (Fig. 5, A and B), and AQP1-TM2 failed to terminate translocation when it was inserted into AQP4 (Fig. 5C). The ECL1 loop had no effect on AQP topology. TM2 stop transfer activity is thus independent of the AQP context in which it is presented. We noted that TM2 sequences are highly homologous with the exception of two residues at the N terminus of the predicted membrane-spanning segment, Asn49 and Lys51 in AQP1 versus Met48 and Leu50 in AQP4 (Fig. 6A). As a result, the core of AQP1-TM2 is less hydrophobic and three residues shorter than AQP4-TM2. To test whether these residues were responsible for differences in TM2 stop transfer activity, Asn49/Lys51 in AQP1 were mutated to Met48/Leu50 (referred to as ML). Topologic analysis of fusion proteins confirmed that AQP1-TM2(ML) efficiently terminated translocation and exhibited a transmembrane topology identical to that observed for AQP4-TM2 (compare Figs. 6B and 2C, lanes 1-3). Interestingly, in all contexts where TM2 terminated translocation, residue Asn42, which is 9 residues from TM2, was inaccessible to oligosaccharyltransferase (Figs. 4B, 5B, and 6B). This finding suggests that transient translocation of TM2 into the ER lumen is required for Asn42 glycosylation, and is consistent with previous studies demonstrating that utilized N-linked consensus sites are usually located more than 12 residues from the end of TM segments (34-36).
FIG. 3. Effect of AQP4 residues on AQP1 topology. N-terminal residues of AQP1 (black) were replaced with corresponding residues from AQP4 (gray). Specific residues exchanged are listed in Table I.
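To give a rough sense of how much the two TM2 substitutions change hydrophobicity, the sketch below compares the residue pairs on the Kyte-Doolittle hydropathy scale. The choice of scale is an assumption made for illustration; the text does not state which scale, if any, was used. Only the four residues named above are included, and the variable names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the four residues discussed in the text.
# The use of this particular scale is an assumption for illustration.
KYTE_DOOLITTLE = {
    "N": -3.5,  # Asn
    "K": -3.9,  # Lys
    "M": 1.9,   # Met
    "L": 3.8,   # Leu
}


def summed_hydropathy(residues: str) -> float:
    """Sum the Kyte-Doolittle values for a string of one-letter residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in residues)


aqp1_pair = summed_hydropathy("NK")   # Asn49 and Lys51 in AQP1
aqp4_pair = summed_hydropathy("ML")   # Met48 and Leu50 in AQP4
hydropathy_gain = aqp4_pair - aqp1_pair
print(round(aqp1_pair, 1), round(aqp4_pair, 1), round(hydropathy_gain, 1))
```

On this scale the ML substitution moves both positions from strongly hydrophilic to hydrophobic, in line with the statement that the core of AQP1-TM2 is less hydrophobic than that of AQP4-TM2.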
AQP4-TM3 but Not AQP1-TM3 Functions as a Signal Sequence-Two separate topogenic events thus appear to be responsible for different AQP biogenesis pathways: translocation termination by TM2 and translocation reinitiation by TM3. We therefore compared TM3 signal sequence activities by placing AQP1- and AQP4-TM3 coding sequences immediately N-terminal of the P reporter. In this context, AQP4-TM3 functioned as an efficient signal sequence, whereas no translocation was observed for AQP1-TM3 (Fig. 7B, lanes 4-6 and 1-3, respectively). TM3 and its N-terminal flanking residues are highly homologous with the exception of Cys85, Arg86, and Lys87 in AQP4, which correspond to Leu85, Cys87, and Gln88 in AQP1 (Fig. 7A). Only weak homology is present in the TM3-4 peptide loop region (residues Thr116-Ile141 versus Thr115-Val140). As shown in Fig. 7B, introduction of L85C/C87R/Q88K (CRK) mutations into AQP1 had a minor effect on TM3 signal sequence activity (30% translocation efficiency) (lanes 7-9). Exchange of the TM3 C-terminal flanking residues increased translocation efficiency to 50% (lanes 10-12), while exchanging both N- and C-terminal flanking residues increased translocation efficiency to 85%, similar to that observed for AQP4 (lanes 13-15). For unclear reasons, hybrid TM3 constructs were cleaved to various extents by signal peptidase, but cleavage did not correlate with translocation efficiency. We also noted that AQP4 encodes a glycosylation site 17 residues downstream of TM3, whereas AQP1 does not. Removal of the glycosylation consensus site from AQP4 (mutation N131Q) had only a minor inhibitory effect on TM3 signal sequence activity, and introduction of a consensus site at the corresponding location in AQP1 (G132N, N134T) had essentially no effect (lanes 16-21). Thus structural determinants within the TM3-4 peptide loop rather than glycosylation per se appear to be responsible for differences in TM3 topology.
TM3 C-terminal Flanking Residues Play a Key Role in AQP1 Biogenesis-To confirm the importance of TM3 flanking residues in AQP1 biogenesis, full-length AQP1 constructs containing the CRK mutations and/or the TM3-4 peptide loop substitution were used to generate fusion proteins. As expected, the CRK mutations alone did not affect TM3 topology (Fig. 8A). In the presence of the ML substitution, where AQP1-TM2 terminates translocation, CRK-TM3 reinitiated translocation in only ~25% of nascent chains (Fig. 8B). Remarkably, exchange of the TM3-4 peptide loop resulted in glycosylation of residue Asn131 and increased protease protection of the P reporter (60%) both in the absence and presence of the CRK mutations (Fig. 8C, lanes 1-3 and 4-6, respectively). In combination with ML substitutions, insertion of AQP4 residues Thr115-Val140 again resulted in glycosylation and translocation of the TM3-4 peptide loop, and translocation was slightly enhanced by the CRK substitutions (80% glycosylation and 70% protection of the reporter; compare Fig. 8D with Fig. 6B). For unclear reasons, the TM2-3 peptide loop was not accessible to protease in these latter constructs even though TM2 spanned the membrane. As was observed for TM3 alone, the N-linked glycosylation site had essentially no effect (Fig. 8E). Taken together, these results indicate that TM3 flanking residues, particularly within the TM3-4 peptide loop, play an important role in directing the cotranslational topology of TM3 during AQP1 synthesis.
FIG. 4. Effect of AQP1 residues on AQP4 topology. N-terminal residues of AQP4 (gray) were replaced with corresponding residues from AQP1 (black) at sites listed in Table I. Upward arrows indicate truncation sites as in Fig. 3. In panel C (lanes 1-3), the fusion site was to Ala73 rather than Gly72 as a result of the chimeric exchange. Downward arrows in lane 7 (each panel) indicate glycosylated polypeptides. Downward arrows in lanes 2 and 8 indicate protease-protected fragments. Diagrams beneath autoradiograms indicate topology of each fusion site. Glycosylation consensus sites are indicated as in Fig. 3.
Functional Consequences of AQP1 Biogenesis Pathway-Sequence variations at the N terminus of TM2 and the C terminus of TM3 play a major role in directing AQP translocation events. We therefore tested whether these peptide regions might also be involved in other aspects of AQP1 physiology. Water channel activity of full-length, chimeric, and mutant AQP proteins was examined in microinjected X. laevis oocytes using an osmotically induced swelling assay (see "Materials and Methods"). Results shown in Fig. 9 demonstrate that all substitutions affecting AQP1 biogenesis events (e.g. substituting TM2, introducing ML mutations, or exchanging the TM3-4 peptide loop) reduced oocyte water permeability to base-line level. In contrast, CRK substitutions or exchange of TM1-ECL1, neither of which significantly affected AQP biogenesis, had only a partial effect on activity, reducing the Pf to roughly 40% of wild type. These results demonstrate that sequence variations responsible for different biogenesis pathways also play important roles in the overall acquisition of water channel function.

DISCUSSION
Previous studies have demonstrated that AQP1 and AQP4 exhibit different initial topologic conformations in the ER membrane (19,20). We now define the translocation events and identify specific sequence variations that give rise to these alternate structures. During AQP synthesis, the first TM segment targets the nascent chain-ribosome complex to the ER membrane and initiates polypeptide translocation. For AQP4, topology of TM2 and TM3 is then acquired sequentially via independent stop transfer and signal anchor activities. For AQP1, however, TM2 transiently passes into the ER lumen and TM3 terminates translocation and initially spans the membrane in a Type I topology. As a result, AQP1 cotranslationally acquires only four transmembrane segments, and TM2 and TM3 must be posttranslationally reoriented during subsequent folding events (26). Analysis of AQP1/AQP4 chimeras identified specific residues at the N terminus of TM2 and the C terminus of TM3 that directly influenced TM2 stop transfer activity and TM3 signal sequence activity. Exchanging only these two peptide regions enabled AQP1 to cotranslationally acquire its mature, six membrane-spanning topology. However, substitutions that altered AQP1 biogenesis events also disrupted water channel function. This suggests that early events of polytopic protein topogenesis are, in part, constrained by structural features needed for later aspects of protein maturation and/or function.
In the endoplasmic reticulum, initial protein topology is established through interactions between topogenic determinants, the ribosome, and ER translocation machinery (21,23-25). Failure of TM2 to terminate translocation during AQP1 biogenesis suggests that TM2 is unable to disrupt the ribosome-membrane junction and direct the elongating nascent chain into the cytosol (23,24,37,38). Consistent with this, AQP1-TM2 was completely translocated into the ER lumen when it was independently engineered into an otherwise secretory protein (19). One possibility is that hydrophilic residues Asn49 and/or Lys51 at the N terminus of AQP1-TM2 interfere with receptor-mediated interactions that are involved in translocation termination (39-41). Alternatively, Asn49/Lys51 might simply decrease TM2 hydrophobicity below a critical threshold needed for recognition by the ribosome and/or translocation channel (31,37,41,42). Because AQP1-TM2 fails to terminate translocation, TM3 likely emerges from the ribosome into an open translocation channel where it functions as a stop transfer sequence. This behavior is consistent with other topogenic determinants whose function may be influenced by their mode of presentation to ER translocation machinery (31,33,43,44). However, AQP1-TM3 also failed to achieve its expected type II topology when it was preceded by a TM segment with stop transfer activity (e.g. AQP4-TM2). The inability of TM3 to function as a signal (anchor) sequence, together with the lack of TM2 stop transfer activity, therefore explains why neither TM segment cotranslationally achieves its proper topology during AQP1 biogenesis.
Surprisingly, when AQP4-TM3 was engineered downstream of AQP1-TM2, it not only translocated its C-terminal flanking residues into the ER lumen but also oriented its N terminus toward the cytosol (Fig. 4C). Thus under appropriate conditions, a strong signal anchor sequence can both initiate polypeptide translocation into the ER lumen and also direct a translocated peptide loop back into the cytosol. This finding is similar to that recently observed by Goder et al. (45) and provides further evidence that the translocation channel can simultaneously accommodate and integrate topogenic information encoded within multiple TM helices (25). Further studies are needed to determine how these unusual topogenic determinants coordinate the gating of both ends of the translocation channel without mixing cytosolic and ER lumenal contents (37,46).
For both AQP1 and AQP4, TM1 efficiently targets the nascent chain to the ER membrane and initiates translocation. Because TM1 sequences did not influence the subsequent topogenic behavior of TM2 and TM3, initial membrane targeting events do not appear to contribute significantly to different biogenesis pathways. Moreover, once the ribosome is targeted to the ER membrane it remains docked during the synthesis of short cytosolic loops (23,44,47). It is therefore likely that TM3 flanking residues facilitate translocation by augmenting posttargeting events such as Sec61 gating (37,48-51) and/or lateral exit of the TM segment from Sec61 into the lipid bilayer (52,53). For several chimeric constructs we found that the P reporter was incompletely protected from protease even in chains where the adjacent N-linked consensus site, Asn131, was glycosylated (Figs. 4 and 5). This observation raises the possibility that AQP1-TM3, and to a lesser extent AQP4-TM3, may fall back into the cytosol because it is unable to stably span the membrane in the absence of TM4-TM6. A similar finding has been observed in experimentally engineered polytopic proteins (43,45) and may partially explain why glycosylation scanning and protease protection analysis may occasionally yield different topologic results. Interestingly, insertion or removal of the Asn131 glycosylation site had little effect on TM3 topology. Thus in contrast to previous studies where glycosylation may favor certain transmembrane orientations by inhibiting retrograde translocation (45,54), TM3 C-terminal flanking residues appear to influence topology through a direct effect on the ER translocation machinery.
FIG. 8. Role of TM3 flanking residues in directing AQP1 topology. Diagrams above autoradiograms indicate AQP1 (black) and AQP4 (gray) residues and engineered AQP1 ML and CRK mutations. Truncation sites are shown above autoradiograms. Topology of fusion proteins is shown beneath autoradiograms. Downward arrows represent protease-protected polypeptides. Glycosylation consensus sites are indicated as in Fig. 3. det, Triton X-100.
FIG. 9. Water channel activity of mutant and chimeric AQP1 constructs. Plasmids encoding full-length wild type (WT), mutant, and chimeric AQP1 constructs were linearized, transcribed, and expressed in mature oocytes. Regions of AQP1 that were mutated or replaced with corresponding residues from AQP4 are shown at left. 24 h following injection, oocyte water permeability (Pf) was determined by osmotically induced swelling in response to a 20-fold dilution of medium. Shown are results from a representative experiment. n = 5-8 oocytes in each group ± S.E.
Why then should homologous proteins with similar function and predicted structure utilize different folding pathways? Our results suggest that the answer lies in the close relationship between determinants required for protein function and determinants that influence protein topogenesis. In the simplest case, polytopic protein topology is established cotranslationally from N to C terminus through the sequential action of alternating, independent signal and stop transfer sequences (28,43,44,55). Any residue that interfered with these independent topogenic activities would thus disrupt protein folding. However, variations on this cotranslational model have been increasingly observed in a variety of engineered and naturally occurring polytopic proteins (23, 31, 45, 56-61). For example, translocation across and integration into the bilayer may involve cooperative interactions between topogenic determinants that have either failed to acquire or lost the ability to independently carry out specific biogenesis activities (31, 33, 58, 59, 61-66). Topology may also be established posttranslationally by downstream signal sequences (59,67,68) or during late events of protein folding that likely involve intramolecular interactions within the nascent chain itself, as has been demonstrated for Sec61p (60). In the case of AQP1, the topogenic activities of TM2 and TM3 are directly influenced by residues required for protein function. This finding is consistent with the high degree of conservation of TM2 and TM3 flanking residues across different species (Fig. 10). Similarly, critical charged residues in the N terminus of the cystic fibrosis transmembrane conductance regulator disrupted TM1 signal sequence activity and required that a posttranslational pathway properly orient TM1 in the membrane (Refs. 59 and 69, and data not shown).
Our data thus support a model in which alternate folding pathways provide an evolutionary advantage by enabling polytopic proteins to acquire sequence diversity that would not be possible if topogenesis were restricted solely to cotranslational events.
Our studies provide evidence that transient translocation of cytosolic loops occurs in normal and efficient physiologic folding pathways. This finding contrasts with recent studies in which unusual or inefficient topogenic determinants have been implicated in protein degradation. In truncated forms of cystic fibrosis transmembrane conductance regulator and the Na,K-ATPase, predicted cytosolic peptide loops and/or TM segments translocate into the ER lumen (70,71). In both cases transient exposure of peptide segments to the lumenal environment has been proposed to facilitate recognition by ER quality control machinery. AQP1 is not a substrate for ER-associated degradation, and it is efficiently converted from an immature four membrane-spanning topology to its mature six membrane-spanning topology in cells (26). However, AQP1 topological reorientation was highly dependent on the synthesis of all six TM segments. Truncated AQP1 proteins containing four or five TM segments were markedly less efficient at reorienting TM2 and TM3 and were rapidly degraded (data not shown). Thus, polytopic protein topogenesis may involve complex folding information, and removing portions of this information may contribute to protein misfolding even among physically distant TM segments. Understanding how intramolecular interactions within the nascent polypeptide as well as intermolecular interactions with the ER translocation machinery determine the final topology and ultimate fate of polytopic membrane proteins remains a significant challenge in this area.