Structure and Intermolecular Interactions of the Luminal Dimerization Domain of Human IRE1α*

Accumulation of unfolded proteins in the lumen of the endoplasmic reticulum activates a signal transduction cascade that culminates in the transcriptional induction of genes encoding adaptive functions. One proximal sensor for this unfolded protein response is the protein kinase/endoribonuclease IRE1α. IRE1α is a type-I transmembrane glycoprotein for which the N-terminal luminal domain (NLD) senses the accumulation of unfolded proteins. Previously we demonstrated that the NLD forms a stable ligand-independent dimer linked by disulfide bridges. In this report we have identified the cysteine residues responsible for intermolecular disulfide bonding. However, this covalent interaction was not required for dimerization and/or signaling, suggesting that a cryptic dimer interface exists in the NLD that is independent of covalent disulfide interactions. Limited proteolysis of the NLD revealed characteristic fragments, all retaining the same N-terminal sequences as full-length NLD. Biochemical and functional studies using NLD truncation mutants indicated that the dimerization domain of the NLD is confined to the conserved motifs at the N-terminal regions where putative hydrophobic interactions exist. In addition, the peptide binding domain of the endoplasmic reticulum protein chaperone BiP interacted with the N-terminal region within the NLD. Our findings suggest that the NLD has at least two distinct types of interactions mediating dimerization and function in signaling, i.e. covalent interactions involving disulfide bond formation and hydrophobic interactions, with the hydrophobic interaction being the driving force for dimerization.

The endoplasmic reticulum (ER) monitors the folding status of newly synthesized secretory and transmembrane proteins and ensures that only properly folded proteins transit to the Golgi compartment. In response to accumulation of unfolded proteins in the ER, cells activate an intracellular signal transduction pathway called the unfolded protein response (UPR).
The yeast UPR is a linear pathway in which signaling by the protein kinase/endoribonuclease Ire1p mediates transcriptional activation of UPR target genes. When Ire1p in Saccharomyces cerevisiae is activated, it functions as a site-specific endoribonuclease (RNase) that splices HAC1 mRNA encoding Hac1p, a basic-leucine zipper-containing transcription factor. Hac1p binds to the UPR element in the promoter region and induces transcription of target genes, including KAR2, encoding the polypeptide-binding protein chaperone BiP/GRP78 that is a classical hallmark of UPR activation (1).
IRE1, PERK, and ATF6 are the three proximal ER stress transducers that regulate UPR signaling in metazoan species. IRE1 (yeast scIre1p homolog) and PERK are two type-I ER transmembrane serine/threonine protein kinase receptors, and ATF6 is a type-II ER transmembrane-activating transcription factor. The mammalian UPR includes three adaptive cellular responses that are activated to cope with the accumulation of unfolded proteins in the ER: 1) transcriptional induction of ER chaperones and folding catalysts; 2) transcriptional activation of genes encoding components of ER-associated protein degradation; and 3) general translational attenuation (1-3). Recent studies at the organismal level showed that IRE1 plays critical roles in normal embryogenesis in early development, and PERK function is required for glucose homeostasis in vivo (4-7).
IRE1 and PERK are structurally similar to serine/threonine protein kinase receptors. Dimerization and trans-autophosphorylation constitute a universal mechanism for activation of this class of cell surface receptors (8-9). However, the biochemical and structural basis for this transmembrane signaling in response to conditions of ER stress is not understood.
A biochemical and structural analysis of the NLD should provide insights into this novel transmembrane signaling event, which will in turn establish the foundation to understand the physiological functions of IRE1 and PERK.
IRE1 and PERK contain a remarkably large N-terminal domain that resides in the ER lumen. The N-terminal luminal domains (NLDs) of IRE1 and PERK sense the accumulation of unfolded proteins by a common mechanism and transmit the signal across the ER membrane to induce receptor activation (10). To provide an experimental system amenable to study of the biochemical and structural basis for transmembrane signaling mediated by the NLD, the entire IRE1α luminal region, termed the NLD, was produced in a soluble form by transient DNA transfection of COS-1 cells (11). The soluble NLD formed homodimers in a ligand-independent manner. In addition, the NLD interacted with the membrane-bound full-length IRE1α receptor and the ER chaperone BiP. Interestingly, the NLD homodimer was stabilized by disulfide bridges.
In this report we analyzed the biochemical and structural properties of the purified NLD homodimer. The cysteine residues responsible for intermolecular disulfide bond formation were identified, and their requirement for UPR signaling was examined. The core domain required for dimerization was defined by limited proteolysis and analysis of truncation mutants. Our studies demonstrated that sequences required for dimerization and signaling are confined to conserved motifs at the N terminus. The existence of a cryptic dimer interface and the significance of multiple types of intermolecular interactions within the NLD dimer are discussed.
* This work was supported in part by National Institutes of Health Grant HL52173. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18

Materials
Proteases Glu-C, Lys-C, and trypsin, N-α-tosyl-L-lysine chloromethyl ketone, phenylmethylsulfonyl fluoride, and other protease inhibitors were purchased from Roche Applied Science. Mouse α-His5 antibody and Ni-NTA agarose were from Qiagen. Mouse α-NLD antibody was previously described (12). S-protein-agarose and S-protein horseradish peroxidase conjugate were purchased from Novagen. Dithiothreitol was from Calbiochem, and β-mercaptoethanol was from Sigma. All other reagents were from Sigma, Fisher, or Calbiochem.
Yeast Expression Constructs. pRS316-IRE1, pRS316-IRE1-AD (AccIII and PacI sites introduced), C193, and C35 were described previously (10). To make C6, a pair of primers encoding ISCSNS was designed. The two primers were annealed and inserted into the AccIII and PacI sites in pRS316-AD. In pRS316-AD-hsIRE1α-NLD, the entire yeast scIre1p-NLD was replaced with the entire NLD of hsIRE1α, which was functional for UPR induction (10). The DNA fragments corresponding to the truncations of hsIRE1α-NLD, including 2M, 3M, 4M, D1, D2, and D4, were amplified and inserted into pRS316-AD at the AccIII and PacI sites to replace the full-length NLD.
Site-directed Mutagenesis. Site-directed mutagenesis by overlapping PCR was performed to generate single, double, or triple cysteine mutants in the pED-NLD mammalian expression vector and in pRS316-AD-hsIRE1α-NLD. For human IRE1α-NLD, the mutants are C109S (
BiP Expression Constructs. The wild-type hamster BiP expression vector pEmcBiP was described previously (13). A sequence encoding the entire hamster BiP cDNA was amplified using pEmcBiP as a template. The 5′ sense primer (CTGCAGGACCGCTGAGCACTGGCC) within the 5′-untranslated region introduced a PstI site. The 3′ antisense primer was designed so that it introduced an XbaI site at the 3′ end, and a 24-amino acid sequence motif was inserted into the C terminus of BiP between GEEDTS and EKDEL. This 24-amino acid sequence included a thrombin cleavage site (LVPRGS), an extra glycine residue, and an S-tag sequence (KETAAAKFERQHMDS). The DNA was cloned into the PstI and XbaI sites of the pED vector to generate pED-BiP-S. To make pED-BiP-1AD-S, an MscI DNA fragment (850 bp) from pEmc-1AdelBiP (14) was inserted into the MscI site of pED-BiP-S to replace the corresponding original sequence. In the 1AD construct, a 27-residue sequence (Tyr-175 to Glu-201) was deleted from the 1A domain of the ATP binding cleft.
To generate expression constructs for the peptide binding domain (PBD) of BiP, we first constructed pED-BiP-S (I33V), in which ATCGAC (encoding Ile-33 and Asp-34) was mutated to GTCGAC (encoding Val-33 and Asp-34) to introduce a SalI site. Val-33 is the 15th residue in the mature BiP protein. The 5′ sense primer used in the PCR contains a SalI site and encodes DGDLVLLD (starting at Asp-410), the first 8 amino acids of the PBD. All the 3′ antisense primers contain an XbaI site at the 3′ end and encode 9 amino acids corresponding to the C terminus of the 32-kDa (32K), 20-kDa (20K), or 15-kDa (15K) protein, respectively, followed by an S tag and an EKDEL sequence. These fragments were inserted into the SalI and XbaI sites of pED-BiP-S (I33V). PBD-32K-S encodes the full-length PBD encompassing Asp-410 to Ser-649 (β1-β8, αA-αE). 20K-S encodes Asp-410 to Asp-578 (β1-β8, αA), and 15K-S encodes Asp-410 to Thr-527 (β1-β8).
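The I33V design described above can be checked with a few lines of code. The following is a minimal sketch, not part of the original methods: it uses a codon table restricted to the three codons named in the text, and the function names are illustrative assumptions.

```python
# Minimal codon table covering only the codons named in the text above.
CODONS = {"ATC": "Ile", "GTC": "Val", "GAC": "Asp"}

# SalI recognition sequence.
SALI_SITE = "GTCGAC"


def translate(dna):
    """Translate an in-frame DNA string codon by codon (three-letter residue names)."""
    return [CODONS[dna[i:i + 3]] for i in range(0, len(dna), 3)]


def describe(dna):
    """Report the encoded residues and whether a SalI site is present."""
    return {"residues": translate(dna), "has_SalI": SALI_SITE in dna}


wild_type = "ATCGAC"  # encodes Ile-33 and Asp-34
mutant = "GTCGAC"     # encodes Val-33 and Asp-34

print(describe(wild_type))
print(describe(mutant))
```

A single A-to-G change in the first codon position is enough to create the restriction site while making only the conservative Ile-to-Val substitution.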

Limited Proteolysis of NLD-N176Q
N176Q was subjected to proteolysis with endoproteinase trypsin, Lys-C, or Glu-C. Lys-C cleaves at the carboxylic side of lysine. Glu-C, also known as V8 protease, cleaves specifically at the carboxylic side of glutamate in the presence of ammonium ion. In the absence of ammonium ion, it cleaves at both aspartic and glutamic acids. The reaction buffer for trypsin and Lys-C was 25 mM sodium phosphate, pH 7.9, 150 mM NaCl, 0.5 mM dithiothreitol. To ensure the specificity of Glu-C, 50 mM ammonium bicarbonate was included in the reaction buffer. In a typical reaction, 20 µl of protein (1 mg/ml) was used in a 50-µl reaction. The reaction was allowed to proceed at room temperature for 30 min and then terminated by adding protease inhibitors followed by boiling. Initially the reaction was tested in a series of reactions with protease to protein ratios of 1/2, 1/10, 1/30, 1/100, 1/300, and 1/1000. The optimum ratio was 1/30-1/10 for Lys-C, 1/10-1/2 for Glu-C, and 1/300-1/100 for trypsin.
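The protease specificities and reaction quantities above can be illustrated with a short sketch. It makes several assumptions that are not stated in the text: the protease to protein ratios are taken to be weight/weight, trypsin is taken to cleave after lysine and arginine, and the toy sequence is the reported N-terminal sequence STVTLPETLL followed by a hypothetical four-residue extension (KAER).

```python
# Residues after which each protease cleaves (C-terminal side).
# Glu-C is listed for the ammonium-containing buffer (glutamate only).
# Missed cleavages and proline effects are ignored in this sketch.
SPECIFICITY = {"Lys-C": "K", "Glu-C": "E", "trypsin": "KR"}


def cleavage_sites(sequence, protease):
    """Return 1-based positions of residues after which cleavage can occur."""
    residues = SPECIFICITY[protease]
    return [i + 1 for i, aa in enumerate(sequence) if aa in residues]


def protease_mass_ug(protein_volume_ul, protein_conc_mg_per_ml, ratio):
    """Protease mass in µg for a protease:protein ratio (assumed to be w/w)."""
    protein_ug = protein_volume_ul * protein_conc_mg_per_ml  # µl x mg/ml = µg
    return protein_ug * ratio


# Reported N-terminal sequence plus a hypothetical extension, for illustration only.
TOY_SEQUENCE = "STVTLPETLL" + "KAER"

for name in SPECIFICITY:
    print(name, cleavage_sites(TOY_SEQUENCE, name))

# 20 µl of a 1 mg/ml protein solution is 20 µg of substrate.
for ratio in (1 / 2, 1 / 10, 1 / 30, 1 / 100, 1 / 300, 1 / 1000):
    print(f"ratio {ratio:.4f}: {protease_mass_ug(20, 1.0, ratio):.3f} µg protease")
```

Under the weight/weight assumption, the optimum Lys-C range of 1/30-1/10 corresponds to roughly 0.67-2 µg of protease per 20 µg of NLD-N176Q.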

Characterization of Proteolytic Peptide Fragments
The proteolytic fragments were immediately analyzed by SDS-PAGE. For amino acid sequencing, separated proteins were transferred onto polyvinylidene difluoride membrane (Schleicher & Schuell) by electroblotting. The blot was stained with Coomassie Blue R-250 and destained with 20% methanol. Individual protein bands were excised and rinsed with water, and dried membrane slices were subjected to N-terminal amino acid sequencing. For matrix-assisted laser desorption/ionization mass spectrometry, proteolytic reactions were stopped by the addition of 1 mM N-α-tosyl-L-lysine chloromethyl ketone or 1 mM phenylmethylsulfonyl fluoride. Mass spectrometry and protein sequencing were done either in the Protein and Carbohydrate Structure Facility at the University of Michigan Medical School or in the Howard Hughes Medical Institute/Keck Biotechnology Resource Laboratory at Yale University.

Other Methods Used in This Study
Transfection of COS-1 cells and cell lysate preparation, protein binding assay using S-protein-agarose and Ni-NTA agarose, protein gel electrophoresis, Western blotting, and fast protein liquid chromatography gel filtration were performed essentially as described previously (11). Yeast cell lysates were prepared, and β-galactosidase activity was determined as described previously (10). Prediction of potential BiP-binding sites was performed based on previously published methods (15, 16). The program for calculating statistical energy distribution (ΔΔG_k) to predict BiP-binding sites was obtained from Dr. Bernd Bukau at the University of Heidelberg, Heidelberg, Germany (15).

IRE1α Residues Cys-148 and Cys-332 Participate in Intermolecular Disulfide Bond Formation. Previous studies demonstrated that the NLD dimer is linked by disulfide bonds (11). Amino acid sequence alignment showed that there are two highly conserved cysteines among IRE1 proteins and three among PERK proteins (Fig. 1A). Interestingly, the positions of the conserved cysteines between IRE1 and PERK have diverged. There are three cysteines in human IRE1α: Cys-109, Cys-148, and Cys-332. To examine the roles of cysteines in the formation of covalently linked dimers, Cys to Ser single mutations were introduced to generate C109S, C148S, and C332S mutants of the NLD. COS-1 cells were transfected with plasmid DNAs. After 48 h COS-1 cells were pretreated with 20 mM N-ethylmaleimide, and cell lysates were prepared in the presence of N-ethylmaleimide. N-Ethylmaleimide is a membrane-permeable SH-group alkylating agent. The inclusion of N-ethylmaleimide modifies free SH groups of cysteines and prevents post-lysis NLD self-association. Cell extracts were subjected to SDS-PAGE and analyzed by immunoblotting. Under reducing conditions (100 mM β-mercaptoethanol), all the NLD mutants were detected as monomers. Under non-reducing conditions, wild-type NLD and all the mutants were present mostly as dimers on SDS-PAGE (data not shown). Thus, dimerization appears to involve more than one intermolecular disulfide bond.
To identify cysteines that form intermolecular disulfide bonds, NLD-D5 mutants with two or three Cys to Ser mutations were made. NLD-D5 is a functional truncation of the NLD (see "Discussion"). Cys mutations did not affect protein expression in transfected COS-1 cells. Under non-reducing conditions, a majority of wild-type D5 was detected as dimers and higher multimeric forms by Western blot analysis, and a substantial amount of the double mutants D5-C109S/C148S and D5-C109S/C332S was also present in dimeric form (Fig. 1B, lanes 2-4). In contrast, the double mutant C148S/C332S and the triple mutant C109S/C148S/C332S migrated at a position corresponding to the monomeric form (Fig. 1B, lanes 5-6). These results showed that double mutations at Cys-148 and Cys-332 completely eliminated the intermolecular disulfide bonding.
The difference in migration between the different dimeric forms likely represents different conformations resolved by electrophoresis. Lanes 3 and 4 in Fig. 1B likely represent homodimers with Cys-332/Cys-332 and Cys-148/Cys-148 bonds, respectively. Lane 2 in Fig. 1B likely represents a mixture of these two species. Note that, unlike wild-type D5, the higher multimeric forms were not observed in the Cys mutants. Taken together, Cys-148 and Cys-332 are essential for formation of intermolecular disulfide linkages. These results support a model in which two disulfide bonds bridge each homodimer, one through Cys-148/Cys-148 and the other through Cys-332/Cys-332. At this point we cannot rule out that a disulfide bridge occurs between Cys-148 and Cys-332 in some homodimers.
NLD Dimer Formation Is Independent of Covalent Disulfide Interactions. Next we directly examined the ability of these Cys mutant NLDs to form dimers. Wild-type or mutant D5-H (His6-tagged) was cotransfected with wild-type or mutant NLD-S (S-tagged). NLD-S bound to S-protein-agarose (Fig. 2A, top and middle panel, pellet, lane 1), whereas wild-type and mutant D5-H did not (Fig. 2A, bottom panel, pellet, lanes 1-7). All the D5 constructs were pulled down by S-protein-agarose through their associations with NLD-S or with NLD-S-C109S/C148S/C332S (Fig. 2A, top and middle panel). In all the transfection experiments, D5 and NLD were expressed to a similar level as assessed by Coomassie Blue staining of total lysate (data not shown). The two asterisks (Fig. 2A, top, lane 2, and middle, lane 7) represent NLD and D5 heterodimer formation in the presence or absence of intermolecular disulfide bridges, respectively. This result demonstrates that NLD dimer formation is independent of disulfide bonding. However, elimination of the two disulfide bridges within the NLD did weaken the subunit association in the dimer (Fig. 2A, top, lanes 2-4, middle, lanes 3-7), suggesting that disulfide interaction within the NLD dimer contributes to affinity and/or stability. It should be noted that the intermolecular interactions between NLD and D5-C148S/C332S, between NLD and D5-C109S/C148S/C332S, and between NLD-C109S/C148S/C332S and D5 were weak, with only a fraction of D5 proteins pulled down in the assay (Fig. 2A, top, lanes 5-7, middle, lane 2). This observation indicates that dimer formation is favored when the two subunits in the dimeric complex both contain either C148/C332 or S148/S332. When one subunit of the dimer contains C148/C332 and the other contains S148/S332, the subunit association within the dimer is weaker. The basis and significance of this observation are not known and await structural determination.
Intermolecular Disulfide Bonding in the NLD Is Not Required for the UPR. Previously we developed an assay in S. cerevisiae to examine the function of the NLD (10). In this assay system, Δire1 yeast cells harbor a single copy of the lacZ gene under the control of the UPR element from KAR2. Activation of this β-galactosidase reporter upon tunicamycin treatment requires the introduction of a single copy of wild-type Ire1. When the NLD of yeast Ire1p (scIre1p) was replaced by the NLD of human IRE1α while retaining the signal peptide sequence, transmembrane sequence, and the cytoplasmic domain of the scIre1p, the resultant chimeric Ire1p receptor restored UPR-dependent β-galactosidase induction in Δire1. Introduction of the luminal domain of PERK also restored UPR signaling (10). Therefore, mutations in the NLD region can be introduced into chimeric Ire1p receptors, and their function can be tested by monitoring UPR-dependent β-galactosidase expression. Receptors carrying single cysteine point mutations in hsIRE1α-NLD (C109S, C148S, and C332S) were able to sense ER stress in a manner identical to wild-type NLD (Fig. 2B). The double cysteine mutant C148S/C332S and the triple mutant C109S/C148S/C332S also restored Ire1p receptor function in Δire1 (Fig. 2B). Mutation of Cys to Ala in human IRE1α-NLD and in yeast Ire1p-NLD (C263A, C274A, and C325A) did not reduce receptor signaling (data not shown). Finally, receptors carrying Cys mutations in the cePERK-NLD, including C166A/C171A, C346A, and C166A/C171A/C346A, were also functional in replacing the scIre1p-NLD (Fig. 2C).
These results demonstrated that intermolecular disulfide bonds are not required for either receptor dimerization or UPR signaling upon ER stress. Our data suggest that the driving force for NLD dimerization is from interactions other than intermolecular disulfide bonds.
Proteolytic Cleavage Analysis of the NLD. To identify the molecular interactions responsible for dimerization and to understand the structural organization of the NLD, we performed limited proteolysis of the non-glycosylated NLD-N176Q. Limited proteolysis by Lys-C gave rise to one major and two minor bands (Fig. 3A, fragments 3, 6, and 7). Proteinase Glu-C digestion also generated one major and two minor, albeit different, bands (Fig. 3A, fragments 1, 2, and 4). Trypsin digestion resulted in yet another different major band (Fig. 3A, fragment 5) and five other minor bands. Western blotting confirmed that all these trypsinized bands originated from the NLD (Fig. 3B). Each of the three major bands and two relatively strong minor bands were excised from a polyvinylidene difluoride blot and sequenced at the N-terminal end by Edman degradation. The N-terminal amino acid sequences of all five proteolytic fragments were STVTLPETLL, identical to that of the full-length NLD (Fig. 3C) (11). It should be noted that there are 32 Glu and 28 Lys residues distributed throughout the entire NLD sequence. These results indicate that the NLD possesses relatively few sites accessible to proteases despite many potential cleavage sites.
All five bands were subjected to analysis by mass spectrometry. The determined molecular masses were 41.2, 39.9, 39.0, 34.5, and 30.3 kDa. Based on their masses, we mapped five major accessible cleavage sites to Glu-388, Glu-377, Lys-374, Glu-331, and Lys-288, respectively (Fig. 3C). The fact that all the major protease-accessible sites are located in the C-terminal part of the NLD indicated that the C-terminal part of the molecule is much more flexible than the N-terminal part. In addition, mass spectrometry analysis showed that the three major proteolytic fragments generated by Glu-C were dimers (data not shown). Therefore, the N-terminal region is likely to contain the core structure for dimerization.

FIG. 2. Intermolecular disulfide bonding is not required for NLD function. A, the His6-tagged wild-type D5 or D5 mutants (D5-H, 2 µg each) were co-transfected into COS-1 cells with 2 µg of S-tagged NLD (NLD-S) (top), mutant NLD-S-(C109S/C148S/C332S) (middle), or pED empty vector (bottom). Cell lysates were incubated with S-protein-agarose, and bound protein complexes were analyzed by SDS-PAGE and silver-staining. Lanes 5 and 6 represent two individually isolated mutant clones of C148S/C332S. The asterisks in lane 2 and lane 7 indicate two types of heterodimer-forming partners, wild-type NLD·D5 or NLD·D5 with each monomer carrying triple Cys mutations. B and C, different chimeric Ire1p receptors carrying wild-type or mutant hsIRE1α-NLD in B and cePERK-NLD in C were tested for their ability to induce lacZ expression upon ER stress. K1058A is an RNase mutant of scIre1p used as a negative control. Cells expressing the indicated forms of receptors were grown to mid-log phase in media lacking uracil and treated with tunicamycin (2 µg/ml) for 0, 1, or 3 h as indicated. At each time point, cells were harvested, and extracts were prepared. Specific β-galactosidase activity (milliunits/mg of protein crude extract) represents an average of three independent assays with at least two independent clones. Error bars indicate S.D.

FIG. 3. Proteolytic analysis of the soluble NLD-N176Q. A and B, proteolytic cleavage of NLD-N176Q with Lys-C, Glu-C, or trypsin generated characteristic fragments. The proteolytic reaction was analyzed on a 4-15% gradient SDS-PAGE and visualized by Coomassie Blue staining. The trypsinized fragments were confirmed in B by Western blotting with α-NLD antibody. Fragments 1-7 are the major or strong minor cleavage products. C, proteolytic cleavage mapping of the NLD-N176Q. The full-length N176Q is presented as a thin line, and mature proteins are shown as dot-filled rectangles. The full-length protein contains HHHHHHEKDEL (residues 442-452) at the C terminus as represented by a filled rectangle. Met-1 to Thr-23 is the signal peptide. The five fragments (1-5) from A were sequenced, and their molecular masses were determined. The vertical arrows indicate these five cleavage sites.
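Because every fragment retained the intact N terminus, each measured mass fixes the position of a C-terminal cut. The mapping step can be sketched as follows; this is an illustration rather than the procedure actually used. It assumes standard average residue masses, ignores modifications, and uses hypothetical function names. The toy sequence in the usage example is the reported N-terminal sequence with a hypothetical extension, since the full NLD sequence is not reproduced here.

```python
# Standard average residue masses in Da (amino acid minus one water).
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water per free polypeptide chain


def fragment_mass(sequence, end):
    """Average mass (Da) of the N-terminal fragment sequence[:end]."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in sequence[:end]) + WATER


def best_site(sequence, observed_mass, residues, first_residue_number=1):
    """Return the residue number of the cleavage site whose N-terminal
    fragment mass is closest to observed_mass.

    residues: one-letter codes the protease cuts after (e.g. "E" for Glu-C).
    first_residue_number: number of sequence[0] in the full-length protein
    (24 for a mature protein that starts at Ser-24).
    """
    candidates = [i + 1 for i, aa in enumerate(sequence) if aa in residues]
    end = min(candidates, key=lambda e: abs(fragment_mass(sequence, e) - observed_mass))
    return end + first_residue_number - 1


# Toy example: reported N terminus plus a hypothetical extension.
toy = "STVTLPETLL" + "KAER"
observed = fragment_mass(toy, 13) + 1.0  # pretend measurement with 1 Da error
print(best_site(toy, observed, "E", first_residue_number=24))
```

With the real mature sequence and `first_residue_number=24`, the same search over Glu or Lys positions would be expected to return the sites reported above, within the accuracy of the mass measurement.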

C-terminal Truncation Mutants Can Form Dimers. To test the hypothesis that the core structure for NLD dimerization is located at the N terminus, we constructed C-terminal truncation mutants of the NLD: D5, D4, D3, D2, D1, 4M, and 3M (Fig. 4A). Truncations were expressed in COS-1 cells by transient DNA transfection. D5 was expressed to a level comparable with full-length NLD, and expression levels of D3 and D4 were lower. Surprisingly, D2 and D1 expression levels were significantly lower than that of D3 (Fig. 4B), and the expression of 4M and 3M could not be detected (data not shown). Although the mechanism for this decreased protein expression was not examined in this study, it is likely that the C-terminal regions contribute to the stability of the soluble NLD.
These deletion mutants were tested for their ability to form heterodimers with full-length NLD. NLD-S (S-tagged) was co-transfected into COS-1 cells with the truncation mutants, and heterodimer formation was detected by an S-protein binding assay. All five truncations were pulled down by S-protein-agarose through their interaction with NLD-S (Fig. 4C, lanes 7-11). In the absence of NLD-S, no interacting protein was detected (Fig. 4C, lanes 2-6). Because 3M and 4M did not express well in transfected cells, their dimerization with the NLD was not examined. These results demonstrated that the N-terminal region of the NLD can form dimers.
The Conserved Motifs at the N Terminus of the NLD Are Sufficient and Required for the UPR. Within the N-terminal region of the NLD there are four motifs that are conserved among IRE1 homologues from different species. These motifs are also conserved in the NLD of PERK (10). Therefore, we asked whether the N-terminal region of the NLD alone is able to signal the UPR. First of all, removal of the N-terminal region of the NLD or the entire luminal domain from Ire1p receptor abolished its ability to induce the UPR (Fig. 5A) (10). It is noted that none of the N-terminal deletion mutants contains the four conserved motifs. To define the sequence requirement for NLD function, Ire1p receptors containing deletions from the C terminus of hsIRE1α-NLD were generated, and their function in UPR activation was tested using the β-galactosidase reporter assay. Like full-length hsIRE1α-NLD, truncations D4, D2, D1, 4M, and 3M were all able to induce lacZ expression upon ER stress. However, 2M was defective in UPR induction, suggesting that deletion of motif 3 and motif 4 abolished NLD function and Ire1p signaling (Fig. 5B).

FIG. 4. The C-terminal truncations of the NLD can form dimers. A, schematic diagram of the C-terminal truncation constructs of the NLD. All the mature NLD proteins contain a Ser-24 at the N terminus and an 11-residue sequence (HHHHHHEKDEL) at the C terminus as indicated by a filled rectangle. Their relative expression in transfected cells is shown on the right. B, analysis of transfected cell lysates by Coomassie Blue staining showing the relative expression of D1, D2, and D3. C, the NLDs deleted of the C-terminal regions can form heterodimers with full-length NLD. The full-length NLD-S (S-tagged) (2 µg) was co-transfected into COS-1 cells with NLD truncation constructs. Two µg of plasmid DNA was used for D3, D4, and D5, and 4 µg was used for D1 and D2. Cell lysates were incubated with S-protein-agarose, and bound protein complexes were analyzed by SDS-PAGE and silver-staining. Note that the endogenous BiP was pulled down through its association with the NLD-S.

There are only about 16 identical/similar residues (~6%) among the NLDs of IRE1 and PERK, including five glycines, and yet both NLDs were able to substitute for yeast Ire1p-NLD to signal the UPR. All of the highly conserved residues are localized to the four motifs at the N terminus of the NLD (10). To test the requirement of the conserved residues, they were each singly or doubly mutated to alanine in scIre1p-NLD. All of the single or double mutant receptors constructed were able to restore tunicamycin-dependent β-galactosidase induction in Δire1 cells (Fig. 5C) (10). Similarly, single mutations of the conserved residues of human IRE1α-NLD, including D39N, W54A, D79N, P97A, R158E, and N176Q, did not affect human NLD function in inducing the UPR when compared with wild-type human IRE1α-NLD (data not shown).
Taken together, these results demonstrated that the three conserved motifs at the N terminus are sufficient and required for dimerization and UPR induction. It is interesting to note, however, that none of the individual conserved residues within this region of the NLD is critical for NLD structure and/or function.
The Peptide Binding Domain of BiP Interacts with the NLD. We asked whether BiP interacts with the NLD. BiP has two major domains, the N-terminal ATPase domain with no peptide affinity and the C-terminal PBD. The crystal structures of the ATPase domain from bovine HSC70 and of the PBD-peptide complex of Escherichia coli DnaK aided our understanding of the mechanism of BiP function (17, 18). The ATPase domain has a four-domain structure with the nucleotide bound in a deep cleft. The PBD consists of a compact β sandwich followed by α helical elements, and the peptide substrate is bound in an extended conformation in the β sandwich. The α helical domain acts as a hinged lid to regulate peptide binding and release in communication with the ATPase domain. We first constructed a wild-type hamster BiP expression construct, pED-BiP-S (S-tagged). 1AD-S is a 27-amino acid deletion mutant of the 1A domain of the ATP binding cleft. Although 1AD is capable of binding ATP and immunoglobulin heavy chain, this mutant BiP has no ATPase activity (19). 32K-S is the full-length PBD without the ATPase domain, containing all eight β-strands (β1-β8) and five α-helices (αA-αE). 20K-S contains β1-β8 and αA, and 15K-S contains only β1-β8.
First, wild-type BiP interacted with D5 and D3 in a protein binding assay using S-protein-agarose (Fig. 6, A, lane 6, and B, lane 2). Mutant 1AD also interacted with D3 (Fig. 6B, lane 3), D4, and D5 (data not shown). Second, all three PBD constructs, 32K, 20K, and 15K, bound to D5 or D3 (Fig. 6, A, lanes 7-9, and B, lanes 4-6). Third, although the interaction between BiP and D2 was detectable (Fig. 6C, lanes 1-3), the interaction was much weaker compared with that of D3 (Fig. 6C, lanes 4-6). The amounts of D2 and D3 proteins bound to Ni-NTA beads represent their expression levels in total cell lysates (Fig. 6C, bottom panel). The interaction between D1 and BiP was also weak (data not shown). We were unable to obtain 4M protein in COS-1 cells, and therefore, its interaction with BiP was not examined. Under our experimental washing conditions, none of the NLD truncations from D1 to D5 bound to S-protein-agarose to a level detectable by either silver-staining or Western blot analysis (Fig. 6, A, lane 1, B, lane 1, and D and C, lanes 2-6). Our results showed that the PBD of BiP interacts with the N-terminal regions of the NLD. However, the Val-307 to Ile-334 region, which is absent in D2 but present in D3, contributes to high affinity BiP binding.
Potential BiP-binding Sites in the NLDs of Human IRE1α and PERK Are Distributed Similarly. To further analyze potential BiP-binding sites within the NLD, we used an algorithm based on peptide binding motifs for DnaK identified by an extensive peptide scan (15). This algorithm allows for prediction of DnaK-binding sites with high accuracy in natural proteins by sequence alignment. The DnaK binding motif consists of 13 residues: a central hydrophobic core of 5 residues and two 4-residue flanking regions. The nature of this consensus DnaK recognition motif is similar to the extended hydrophobic heptapeptide consensus BiP binding motif previously described (16). This algorithm is applicable to BiP also because the general features of the peptide-binding sites of HSP70 family proteins are conserved. In particular, all of the amino acids that contact the bound peptides in DnaK are conserved in BiP (18). This algorithm calculates the statistical energy distribution (ΔΔG_k) of each amino acid in the motif. ΔΔG_k is a measure of the potential for a particular peptide to bind BiP. The lower the ΔΔG_k value obtained for a specific segment, the higher the predicted affinity for BiP. A positive score by this scoring system indicates the peptide motif has a low probability to bind to BiP, whereas scores below −5 indicate a significant probability of binding to BiP (15). In the NLD of human IRE1α and human PERK, four motifs (M1-4) at the N terminus and three other regions (R5-7) at the C terminus were found to contain peptides with scores below or close to −5 (dotted line) and, therefore, represent potential BiP-binding sites (Fig. 7A). Consistent with this finding, these sites are also predicted to be hydrophobic in nature (data not shown). However, it remains to be determined if any of these predicted sites are in a binding-competent conformation in the folded native NLD.
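The window-based scoring described above can be sketched in code. The sketch below is only a schematic of the idea (a 13-residue window with a 5-residue core, two 4-residue flanks, summed per-residue energies, and a cutoff of −5). The energy values are hypothetical placeholders chosen for illustration, not the published ΔΔG_k tables, and the published method uses position-specific contributions that this simplification ignores.

```python
# Hypothetical per-residue energy contributions (arbitrary units).
# These are placeholders for illustration, NOT the published values.
HYPOTHETICAL_CORE = {"L": -1.5, "I": -1.2, "F": -1.2, "V": -1.0,
                     "Y": -0.8, "W": -0.8, "D": 1.5, "E": 1.5}
HYPOTHETICAL_FLANK = {"K": -0.5, "R": -0.5, "D": 0.5, "E": 0.5}

WINDOW, FLANK = 13, 4  # 4-residue flank + 5-residue core + 4-residue flank


def window_score(window):
    """Sum the contributions over one 13-residue window (lower = better binder)."""
    if len(window) != WINDOW:
        raise ValueError("window must be 13 residues long")
    left, core, right = window[:FLANK], window[FLANK:WINDOW - FLANK], window[WINDOW - FLANK:]
    core_score = sum(HYPOTHETICAL_CORE.get(aa, 0.0) for aa in core)
    flank_score = sum(HYPOTHETICAL_FLANK.get(aa, 0.0) for aa in left + right)
    return core_score + flank_score


def predicted_sites(sequence, threshold=-5.0):
    """Return (1-based window start, score) for windows scoring below threshold."""
    hits = []
    for start in range(len(sequence) - WINDOW + 1):
        score = window_score(sequence[start:start + WINDOW])
        if score < threshold:
            hits.append((start + 1, score))
    return hits


# A window with a leucine core and basic flanks scores well below the cutoff.
print(predicted_sites("KRKRLLLLLKRKR"))
```

In this scheme a hydrophobic core flanked by basic residues gives a strongly negative score, whereas acidic residues raise the score, which mirrors the qualitative behavior described for the DnaK motif.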
Sequence alignment suggested that these potential BiP binding regions are similarly distributed between IRE1 and PERK (Fig. 7B). The four conserved motifs at the N terminus each contain one or more potential BiP-binding sites, and the positions of these sites are also well conserved. A number of potential BiP-binding sites are distributed throughout the non-conserved C-terminal region of the NLD. These sites can be grouped into regions 5, 6, and 7, although their positions are not well conserved (Fig. 7B). In summary, although the primary sequences of the NLDs of IRE1 and PERK have diverged, they share a common feature of hydrophobicity and multiple potential BiP-binding sites, which may provide the biochemical basis for a common mechanism of dimerization and function.
[Figure legend, opening text missing] ... H) in B (2 μg each) were co-transfected into COS-1 cells with S-tagged BiP constructs as indicated (3 μg each, except 6 μg for 1AD). Cell lysates were incubated with S-protein-agarose, and bound protein or protein complexes were analyzed by Western blotting using mouse α-His5 antibody and S-protein horseradish peroxidase conjugate (α-S). D5 and D3 expression was examined by Western blot (IB) analysis as shown in the lower panel. 1AD expression was significantly lower than that of the other BiP constructs. C, the interaction of PBD with D2 is weaker than that with D3. Pull-down percentages (%) indicate the fraction of D2 or D3 protein that was pulled down by S-protein-agarose relative to total D2 or D3 protein in cell lysates, respectively. Protein quantitative analysis was performed using Quantity One software (Bio-Rad). Protein binding analysis was performed using S-protein-agarose and Ni-NTA agarose, and bound proteins were analyzed by silver staining. D2 and D3 proteins bound on Ni-NTA reflect their relative expression in transfected cells. D, as a control, singly transfected 32K-S did not bind to Ni-NTA, and D2-H and D3-H did not bind to S-protein-agarose.

DISCUSSION
The NLD Dimerizes through Hydrophobic Interactions-In this study we analyzed the molecular nature by which the luminal domain of IRE1α forms dimers. Both full-length IRE1α receptor and the NLD form homodimers through intermolecular disulfide bonds (11). However, Cys → Ser mutations did not disrupt dimer formation or UPR induction, suggesting that a cryptic dimer interface exists that is independent of the covalent interaction. To probe the subdomain structure of the NLD, we performed limited proteolysis. The results indicated that all the major sites accessible to cleavage are located at the C-terminal region of the NLD. C-terminally truncated forms of the NLD were functional in their ability to associate with full-length NLD and to induce the UPR in yeast. Furthermore, we showed that deletion of the conserved motifs from the N terminus abolished UPR signaling. In searching for the interactions responsible for dimerization, we demonstrated that the cysteine-less NLD dimer is sensitive to SDS detergent, indicative of noncovalent interactions such as hydrophobic and electrostatic interactions. We showed that the N-terminal fragment (D1, Ser24-Ala246) formed dimers and was functional. Although low level expression and/or instability precluded analysis of the dimer status of shorter truncations, 3M (Ser24-Leu147), which constitutes the first three conserved motifs at the N terminus, was functional in sensing ER stress to signal the UPR. We conclude that the NLD dimerizes mainly through these three conserved motifs at the N terminus. Previously we demonstrated that a basic-leucine zipper dimerization motif conferred ER stress-inducible Ire1p receptor activation and UPR induction (10). This implies that dimerization and the subsequent receptor activation can occur simply through hydrophobic interactions within the leucine zipper. It is interesting to note that the conserved motifs in the NLDs of IRE1 and PERK are characterized by an abundance of hydrophobic amino acids and hydrophobic motifs. Taken together, we propose that hydrophobic interactions within the N terminus of the NLD, rather than intermolecular disulfide bonding, are the driving force for dimer formation.
The NLD Evolved as a Robust ER Stress Sensor-The ER is a unique protein-folding compartment and differs from the cytosol in its high oxidizing potential, high Ca2+ concentration, and the presence of glycosylation machinery. Perturbation of this unique environment elicits the up-regulation of chaperones and folding catalysts that are present at basal levels under normal conditions. Our results indicate that the NLD structure is not sensitive to conditions that activate the UPR. First, because NLD dimerization does not depend upon covalent disulfide interactions, its function will not be disturbed by reductive stresses such as dithiothreitol or β-mercaptoethanol treatment. Second, although IRE1α is a glycoprotein, NLD glycosylation is not essential for dimer formation or UPR signaling. This allows the NLD to function even under conditions that disrupt oligosaccharide addition and/or removal. Third, the maintenance of the purified dimeric receptor does not require nucleotides such as Mg2+-ATP or metal ions such as Ca2+, for which the ER is the major storage site (11). Thus, the function of the NLD is not sensitive to fluctuations in ATP or calcium concentrations in the ER. Fourth, the NLD can tolerate amino acid mutations of conserved residues and even extensive deletions from the C terminus. The only region within IRE1 and PERK that is conserved, although very weakly, is at the N terminus of the NLD. A common feature between these regions is the hydrophobic character of amino acids that may provide a hydrophobic interface for dimerization. This may account for the ability of these two divergent domains to respond to the same ER stress despite their low sequence homology. Finally, the NLD displays high affinity self-association, also suggesting that a large hydrophobic dimer interface exists. Indeed, prediction of potential BiP-binding sites identified extensive hydrophobic regions in the conserved N-terminal motifs (see below).
Our data demonstrate that the NLD can tolerate a local disturbance of its three-dimensional framework without disrupting the dimer interface. This ensures that the NLD maintains structural integrity and function even in the presence of mutations under conditions that reduce the fidelity of protein translation or under conditions that disrupt protein folding in the ER. We, therefore, conclude that the NLD has evolved as a robust ER stress sensor, suitable for responding to conditions that induce protein misfolding in the ER.
FIG. 7. Potential BiP-binding sites in the NLDs of human IRE1α and PERK are distributed similarly. A, multiple potential BiP-binding sites were identified by averaged energy distribution (ΔΔGk) analysis. Four motifs (M1-4) and three other regions (R5-7) were found to contain peptides with scores below −5 (dotted line). Peptides with energy scores of −5 or lower have a significant probability of binding to BiP (15). The x axis represents relative residue numbers where the first residue is Ser24 for human IRE1α-NLD and Glu96 for human PERK-NLD. B, sequence alignment of conserved BiP binding regions between human IRE1α and PERK. The five-residue hydrophobic core for each potential BiP-binding site is underlined, and the ΔΔGk values are shown above the core sequence for IRE1 and below the core sequence for PERK. Dots denote identical/similar residues. Dashes represent omitted sequences, and spaces represent gaps to obtain maximum sequence alignment. Arrows show the three truncations of IRE1α-NLD (D1, D2, and D3) and a deletion mutant of PERK-NLD (Δ4).
Molecular Nature of UPR Regulation by BiP-BiP is a negative regulator of the UPR and interacts with all three ER stress sensors, IRE1, PERK, and ATF6, under non-stress conditions (10, 11, 20-23). Upon accumulation of unfolded proteins in the ER, these stress sensors are released from BiP to initiate downstream signaling. It is proposed that unfolded proteins bind and sequester BiP so that it is no longer available for interaction with IRE1, PERK, and ATF6. Upon release from BiP, ATF6 transits to the Golgi compartment, where it is processed to its transcriptionally active form (21). The release of IRE1 and PERK from BiP promotes their respective homodimerization and trans-autophosphorylation for activation. In this manner BiP senses unfolded proteins in the ER lumen and activates the UPR.
Although BiP is the master regulator of UPR activation, the molecular nature of its interaction with the luminal domains of the stress transducers remains to be defined. Three distinct regions within the ATF6 luminal domain were identified that have different affinities for BiP (23). A region (Lys411-Leu481) in human PERK-NLD was identified as a strong BiP-binding site (20). In the current study, we analyzed the interaction of IRE1α-NLD truncations with BiP and concluded that BiP interacts with the N-terminal region of the NLD, with apparently higher affinity for fragments that contain residues Val307-Ile334. Interestingly, both the Val307-Ile334 region of IRE1α-NLD and the Δ4 region (Lys411-Leu481) of PERK-NLD are in region 6. This region in human IRE1α-NLD is not required for signaling the UPR in yeast (10). It remains to be determined whether this region in human PERK is required to confer ER stress-dependent UPR activation.
Our studies also demonstrated that the NLD binds to the eight-stranded compact β-sandwich within the peptide binding domain of BiP (Asp410-Thr527). Therefore, we assume that the NLD binds to the peptide binding pocket in BiP. Our results suggest that there are multiple regulatory BiP-binding sites in the NLD, and these sites may be functionally redundant. BiP binding to the N-terminal region of the IRE1-NLD may be functionally significant. Our studies suggest that the Val307-Ile334 motif binds BiP with high affinity. Interestingly, proteolytic and deletion analysis showed that the Val307-Ile334 motif is not present in the dimer interface, allowing its association with BiP. It is, therefore, likely that hydrophobic motifs within the N terminus up to residue Val307 form a dimer interface, and the Val307-Ile334 motif is an exposed site available for BiP binding. To test this hypothesis, we constructed N-terminal deletion mutants of the NLD, Δ2M (Ser112-Leu441), Δ4M (Leu185-Leu441), ΔD1 (Ala246-Leu441), and ΔD3 (Ile334-Leu441). We showed that all these truncations also interacted with BiP (data not shown), although sequences in Δ4M were not required for the UPR. It remains to be examined whether these regions, including the Val307-Ile334 motif, are the bona fide BiP-binding sites in vivo for regulating NLD dimerization. We speculate that the weak BiP interaction with the N-terminal region, compared with the high affinity homodimer association, may be sufficient for BiP to regulate IRE1 dimerization. The great excess of endogenous BiP over the level of endogenous IRE1 provides a kinetic advantage for BiP to interact with the NLD. The high affinity self-association of the NLD would provide the driving force for receptor dimerization and activation under conditions of ER stress.
In summary, we have identified cysteine residues responsible for intermolecular disulfide bond formation. We showed both biochemically and functionally that removal of the intermolecular disulfide linkages does not affect receptor dimerization or its function in UPR signaling. The structural organization of the NLD was defined by limited proteolysis and functional analysis. We showed that the functional dimerization domain is located at the N terminus of the NLD. The characterization of the NLD permitted the production of functional truncations that are neither glycosylated nor covalently disulfide-linked. These proteins should prove useful in generating crystals for structure determination of IRE1 to elucidate how the accumulation of unfolded proteins in the ER leads to receptor activation.