Structural Basis for the Recognition of Tyrosine-based Sorting Signals by the μ3A Subunit of the AP-3 Adaptor Complex*

Background: Tyrosine-based, YXXØ-type signals mediate protein sorting through binding to adaptor μ subunits. Results: X-ray crystallography shows how YXXØ signals bind to the immunoglobulin-like fold of μ3A. Conclusion: The binding site for YXXØ signals on μ3A is similar to that of μ2 but distinct from that of μ4. Significance: The study explains the basis for the recognition of diverse YXXØ signals by μ subunits.

Tyrosine-based signals fitting the YXXØ motif mediate sorting of transmembrane proteins to endosomes, lysosomes, the basolateral plasma membrane of polarized epithelial cells, and the somatodendritic domain of neurons through interactions with the homologous μ1, μ2, μ3, and μ4 subunits of the corresponding AP-1, AP-2, AP-3, and AP-4 complexes. Previous x-ray crystallographic analyses identified distinct binding sites for YXXØ signals on μ2 and μ4, which were located on opposite faces of the proteins. To elucidate the mode of recognition of YXXØ signals by other members of the μ family, we solved the crystal structure at 1.85 Å resolution of the C-terminal domain of the μ3 subunit of AP-3 (isoform A) in complex with a peptide encoding a YXXØ signal (SDYQRL) from the trans-Golgi network protein TGN38. The μ3A C-terminal domain consists of an immunoglobulin-like β-sandwich organized into two subdomains, A and B. The YXXØ signal binds in an extended conformation to a site on μ3A subdomain A, at a location similar to the YXXØ-binding site on μ2 but not μ4. The binding sites on μ3A and μ2 exhibit similarities and differences that account for the ability of both proteins to bind distinct sets of YXXØ signals. Biochemical analyses confirm the identification of the μ3A site and show that this protein binds YXXØ signals with 14–19 μM affinity. The surface electrostatic potential of μ3A is less basic than that of μ2, in part explaining the association of AP-3 with intracellular membranes having less acidic phosphoinositides.

Sorting of transmembrane proteins to different compartments of the endomembrane system is most often mediated by recognition of signals in the cytosolic domains of the proteins by adaptor molecules that are components of protein coats (1). Recognition leads to selective incorporation of the transmembrane proteins into coated vesicles that serve as vehicles for intercompartmental transport. Studies over the past three decades have identified a variety of sorting signals and adaptors that participate in different transport steps. Many signals are linear arrays of amino acids that fit one of several canonical motifs (2). Among them, tyrosine-based signals conforming to the YXXØ motif (where X is any amino acid and Ø is an amino acid with a bulky hydrophobic side chain) (3) have prominent roles in endocytosis (4), as well as sorting to lysosomes (5), the basolateral plasma membrane of polarized epithelial cells (6), and the somatodendritic domain of neurons (7). YXXØ signals are recognized by the homologous μ1, μ2, μ3, and μ4 subunits of the heterotetrameric adaptor protein (AP) complexes AP-1 (γ-β1-μ1-σ1), AP-2 (α-β2-μ2-σ2), AP-3 (δ-β3-μ3-σ3), and AP-4 (ε-β4-μ4-σ4), respectively (subunit composition in parentheses) (8-12). The μ1 and μ3 subunits occur as two isoforms (denoted A and B) that are encoded by different genes. The amino acid sequence identity among μ subunits from different AP complexes is 25-38%, whereas that of μ1 and μ3 isoforms is 79-84%. All of the μ subunits have a conserved organization consisting of an N-terminal domain that mediates assembly into the corresponding AP complex and a C-terminal domain that binds subsets of YXXØ signals (13). Two other proteins, the μ5 subunit of the AP-5 complex (14) and the δ subunit of the COPI complex (15), are homologous to the AP μ subunits over their entire sequence, but to date they have not been shown to recognize any signals. Finally, several monomeric proteins, including the human proteins Stonin 1 and Stonin 2 (16,17), and FCHO1, FCHO2, and SGIP1 (18), have a domain that is homologous to the C-terminal domain of the μ subunits. These proteins also function as cargo adaptors, although likely through recognition of folded structures rather than linear motifs, as shown for Stonin 2 (16,17) and FCHO1 (19).
X-ray crystallographic analyses have provided insights into the mechanisms of signal recognition by μ1A, μ2, and μ4 (20-22). The C-terminal domain of these proteins consists of an elongated immunoglobulin-like β-sandwich fold with 16 β-strands organized into two subdomains (A and B). In μ2, YXXØ signals bind to a site on strands β1 and β16 in subdomain A, with the Y and Ø residues fitting into two hydrophobic pockets (20). The structure of μ1A was solved as part of a ternary complex with the cytosolic tail of an MHC class I (MHC-I) molecule and the Nef protein of HIV-1. The MHC-I tail has a Tyr residue that fits into a pocket similar to that in μ2 but lacks an Ø residue that could bind to the other pocket (22). Instead, both the MHC-I tail and Nef establish additional interactions with other parts of μ1A (22). Of the μ subunits that have been characterized to date, μ4 exhibits the most distinct specificity of YXXØ signal recognition. Although μ4 weakly binds some generic YXXØ signals (10-12), it displays a strong preference for a subset of YXXØ signals fitting the YX(FYL)(FL)E motif, which occur in the cytosolic tails of members of the amyloid precursor protein family (21). Surprisingly, the latter signals bind to a distinct site located on strands β4, β5, and β6 in subdomain A, which also has hydrophobic pockets for the Tyr and (FL) residues (21). The crystal structure of μ4 predicts the presence of an additional site similar to that on μ2 (21). Mutations in this site abolish the weak binding of a canonical YXXØ signal (YEQF) from the lysosomal membrane protein Lamp-2 (12), suggesting that this site also functions in signal recognition. Thus, μ4 has two binding sites for YXXØ signals on opposite faces of subdomain A. This raises the possibility that other μ subunits have more than one signal-binding site as well.
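As an illustration of the two sequence motifs described above, the following minimal Python sketch scans a one-letter peptide sequence for YXXØ and YX(FYL)(FL)E matches. It is not part of the study's methods. The residue set used for Ø (L, I, M, F, V) is an assumption, because the text defines Ø only as a residue with a bulky hydrophobic side chain; the function name `find_signals` is likewise hypothetical.

```python
import re

# Assumption: "bulky hydrophobic" (the O-slash position) is taken as L, I, M, F, or V.
# The text defines this position only qualitatively, so the set is illustrative.
BULKY_HYDROPHOBIC = "LIMFV"

# YXXO: Tyr, any two residues, then a bulky hydrophobic residue.
# A lookahead is used so that overlapping matches are all reported.
YXXO = re.compile(rf"(?=Y..[{BULKY_HYDROPHOBIC}])")

# YX(FYL)(FL)E: the subset of signals preferred by the second site on mu4.
YX_FYL_FL_E = re.compile(r"(?=Y.[FYL][FL]E)")


def find_signals(sequence: str) -> dict:
    """Return the 0-based start positions of each motif in a one-letter sequence."""
    seq = sequence.upper()
    return {
        "YXXO": [m.start() for m in YXXO.finditer(seq)],
        "YX(FYL)(FL)E": [m.start() for m in YX_FYL_FL_E.finditer(seq)],
    }


if __name__ == "__main__":
    # Peptides named in the text: TGN38, its Tyr-to-Ala variant, and the APP signal.
    for peptide in ("SDYQRL", "SDAQRL", "YKFFE"):
        print(peptide, find_signals(peptide))
```

Under this assumed definition of Ø, a YX(FYL)(FL)E sequence such as YKFFE also satisfies the more general YXXØ pattern, in line with the description of these signals as a subset of YXXØ signals.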
The μ3A and μ3B subunit isoforms also bind YXXØ signals (9,23,24), but the structural basis for this recognition remains to be elucidated. In light of the diversity of YXXØ-binding modes, outstanding questions concern the location and characteristics of the YXXØ-binding site on μ3A and μ3B. To address these questions, we solved the crystal structure of the C-terminal domain of μ3A in complex with a YXXØ-containing peptide from the trans-Golgi network (TGN)-localized protein TGN38 at 1.85 Å resolution. We found that the C-terminal domain of μ3A possesses an immunoglobulin-like β-sandwich fold made up of 16 strands, similar to the C-terminal domains of μ1A (22), μ2 (20), and μ4 (21). The TGN38 peptide binds to μ3A at a site equivalent to that on μ2, albeit with fewer stabilizing contacts. Yeast two-hybrid (Y2H) analyses validated the identity of this binding site and, consistent with the crystallographic data, isothermal titration calorimetry (ITC) showed that μ3A has lower affinity for YXXØ signals than μ2. Analysis of the surface of μ3A revealed a less basic electrostatic potential compared with that of μ2, providing a likely explanation for the preference of AP-3 for binding to endosomes rather than the plasma membrane (24-26).

Recombinant DNAs, Site-directed Mutagenesis, and Y2H Assays-To generate a His6-fusion construct with the C-terminal domain of μ3A, the sequence encoding residues 165-418 of rat μ3A was amplified by PCR and cloned in-frame into the EcoRI and SalI sites of pHis-Parallel-1 (27). TGN38, CD63, and Lamp-1 constructs for Y2H assays were described previously (12). Single amino acid substitutions were introduced using the QuikChange mutagenesis kit (Stratagene, La Jolla, CA). The nucleotide sequences of all recombinant constructs were confirmed by dideoxy sequencing. Y2H assays were performed as described previously (21).
Expression and Purification of μ3A C-terminal Domain Constructs-Recombinant μ3A C-terminal domain (μ3A-C) constructs tagged with an N-terminal His6 tag followed by a tobacco etch virus protease cleavage site were expressed in Escherichia coli B834(DE3)pLysS (Novagen, Madison, WI) after induction with 0.5 mM isopropyl 1-thio-β-D-galactopyranoside at 25°C for 16 h. Pellets were resuspended in 50 mM Tris-HCl (pH 8.0), 0.5 M NaCl, 5 mM β-mercaptoethanol and protease inhibitors (Sigma-Aldrich), and lysed by sonication. The clarified supernatant was purified on nickel-nitrilotriacetic acid resin (Qiagen, Valencia, CA) and eluted with 300 mM imidazole. N-terminal, His6-tagged tobacco etch virus protease was used to cleave the His6 moiety from μ3A-C. The His6 moiety and the His6-tagged tobacco etch virus protease were removed by an additional passage through nickel-nitrilotriacetic acid resin, and μ3A-C was further purified on a Superdex 200 column (GE Healthcare) equilibrated with buffer containing 25 mM Tris-HCl (pH 8.0), 150 mM NaCl, 5% glycerol, and 2.5 mM β-mercaptoethanol.
Crystallization, Data Collection, and Structure Determination-Unless otherwise stated, solutions and crystallization reagents were from Hampton Research (Aliso Viejo, CA). Crystals of the μ3A C-terminal domain in complex with the TGN38 peptide SDYQRL (New England Peptide, Gardner, MA) were grown by the hanging drop method at 21°C. The reservoir solution contained 0.1 M sodium acetate (pH 5.0) and 1.75 M sodium formate. Drops contained 1 μl of reservoir solution and 1 μl of 5 mg/ml protein-peptide complex. Prior to crystallization, the protein was incubated at room temperature for 1 h with 2.5 mM peptide. Under these conditions, crystals appeared after 48-60 h. Crystals were cryoprotected in the reservoir solution supplemented with 30% glycerol and then flash-cooled in liquid nitrogen. Crystals belonged to space group C2 and diffracted to 1.85 Å resolution. The structure was determined by molecular replacement using the rat μ2 C-terminal domain (PDB code 1BXX) as the search model (20). A native data set was collected from a single crystal using a MAR CCD detector at the SER-CAT beamline 22-ID at Advanced Photon Source, Argonne National Laboratory. Diffraction images were processed and scaled with the program HKL2000 (28). Data collection statistics are shown in Table 1. Iterative manual model building and initial refinement were done using COOT (29) and REFMAC. The final model has a single chain of 248 residues with 136 water molecules, and five residues (DYQRL) from the TGN38 cytosolic tail peptide. Molecular model figures were generated with PyMOL software. Crystallographic coordinates and structure factors have been deposited in the Protein Data Bank under accession code 4IKN.
Isothermal Titration Calorimetry-Recombinant μ3A-C constructs were dialyzed overnight at 4°C against excess ITC buffer (50 mM Tris-HCl, pH 7.4, 150 mM NaCl). TGN38 and CD63 peptides (SDYQRL, SDAQRL, SGYEVM, and SGAEVM; New England Peptide) were also prepared in ITC buffer. All ITC experiments were carried out at 28°C using an iTC200 instrument (MicroCal LLC, Northampton, MA). Typically, the chamber contained 0.2 ml of 100-375 μM μ3A-C constructs, and the peptides (1-3.75 mM) were added in 18 injections of 2.45 μl each. Titration curves were analyzed using Origin software (MicroCal). The binding constant was calculated by fitting the curves corresponding to μ3A-C to a one-site model.
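The one-site model mentioned above rests on a 1:1 binding equilibrium. The sketch below is not the Origin fitting routine used in the study; it only illustrates, under stated assumptions, how the fraction of protein in complex would evolve over a titration with the injection scheme described. The function names, the default volumes, and the simple volume-growth treatment of dilution are illustrative assumptions, and the Kd value in the usage example is taken from the 14-19 μM range reported in the abstract.

```python
import math


def complex_concentration(p_total: float, l_total: float, kd: float) -> float:
    """Concentration of 1:1 complex from the quadratic mass-balance solution.

    All three arguments must share the same concentration unit.
    """
    s = p_total + l_total + kd
    return (s - math.sqrt(s * s - 4.0 * p_total * l_total)) / 2.0


def fraction_bound_per_injection(
    p_cell_um: float,
    l_syringe_um: float,
    kd_um: float,
    cell_ul: float = 200.0,
    injection_ul: float = 2.45,
    n_injections: int = 18,
) -> list:
    """Fraction of protein in complex after each injection.

    Simplification (assumption): the cell volume simply grows with each
    injection; real instruments keep a fixed cell volume and displace liquid.
    """
    fractions = []
    for i in range(1, n_injections + 1):
        volume = cell_ul + i * injection_ul
        p_total = p_cell_um * cell_ul / volume
        l_total = l_syringe_um * i * injection_ul / volume
        bound = complex_concentration(p_total, l_total, kd_um)
        fractions.append(bound / p_total)
    return fractions


if __name__ == "__main__":
    # 100 uM protein in the cell, 1 mM peptide in the syringe, illustrative Kd of 14 uM.
    fractions = fraction_bound_per_injection(100.0, 1000.0, 14.0)
    for i, f in enumerate(fractions, start=1):
        print(f"injection {i:2d}: fraction bound = {f:.3f}")
```

A real ITC analysis fits the heat released per injection, which also requires the binding enthalpy and stoichiometry as fit parameters; this sketch covers only the equilibrium part of that model.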

RESULTS AND DISCUSSION
We solved the crystal structure of the C-terminal domain of rat μ3A (residues 165-418) in complex with a SDYQRL peptide derived from the cytosolic tail of rat TGN38 at 1.85 Å resolution (Fig. 1 and Table 1). The SDYQRL peptide encodes a YXXØ signal in which the tyrosine residue is referred to as Y0 (position 0 corresponds to the most critical residue of the motif) and the leucine residue at the Ø position is denoted as L3 (position +3 from Y0). Similar to μ1A (22, 30), μ2 (20), and μ4 (21), the μ3A C-terminal domain has an immunoglobulin-like β-sandwich fold consisting of 16 strands organized into two subdomains, A and B (Fig. 1A and Figs. 2 and 3). The overall root mean square deviation for superimposable Cα coordinates for the C-terminal domain of μ3A and the C-terminal domain of the other μ subunits is 1.50 Å for μ1A, 1.70 Å for μ2, and 3.65 Å for μ4 (Fig. 2).
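For readers unfamiliar with the root mean square deviation quoted above, the following minimal sketch shows the arithmetic for paired Cα coordinates. It assumes that the two coordinate sets are already superimposed and matched residue by residue; the superposition step itself is not shown, and the program used for it in the study is not stated in this section. The function name `rmsd` is illustrative.

```python
import math


def rmsd(coords_a, coords_b) -> float:
    """Root mean square deviation (same unit as the input, e.g. angstroms).

    Assumes coords_a[i] and coords_b[i] are equivalent atoms and that the two
    structures have already been superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


if __name__ == "__main__":
    # Toy coordinates, not taken from any deposited structure.
    model_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
    model_b = [(0.5, 0.0, 0.0), (3.8, 0.5, 0.0)]
    print(f"RMSD = {rmsd(model_a, model_b):.2f}")
```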
Only the DYQRL segment from the peptide is visible in the density map (Fig. 1B). This segment binds in an extended conformation to parallel strands β1 and β16 of μ3A subdomain A (Fig. 1, A and B), similarly to the binding of the TGN38 peptide to μ2 (Figs. 2B and 4) (20). Two hydrophobic pockets accommodate the Y0 and L3 residues of the signal on either side of strand β16 (Figs. 1 and 4). The signal-binding site on μ3A (Figs. 1, 2, and 4C) is at a location similar to that on μ1A (22) and μ2 (20) (Figs. 2, A and B, and 4A). It differs, however, from the binding site for YX(FYL)(FL)E signals on μ4, which is on the opposite face of the protein (Fig. 2C) (21). The area of the interface involving the YXXØ signal from TGN38 is 416 Å² on μ3A and 434 Å² on μ2, comparable with that of the YX(FYL)(FL)E signal bound to μ4, which is 431 Å², as calculated by the PISA server (31).
The μ3A-YXXØ signal interface has substantial polar character, with four direct hydrogen bonds (distance ≤ 3.1 Å) between peptide and protein (Fig. 4D); this polarity is lower than that of the μ2-YXXØ signal interface, which has seven direct hydrogen bonds (Fig. 4B). The phenolic hydroxyl group of Y0 forms the shortest side chain to side chain hydrogen bond with the carboxylate of Asp-182 in μ3A (Fig. 4D). The critical role of this interaction was demonstrated by Y2H analyses. Substitution of alanine for Y0 in the YXXØ motif from the cytosolic tails of TGN38 or the lysosomal membrane proteins CD63 (YEVM) or Lamp-1 (YQTI) completely abolished binding to μ3A (Fig. 5A). Reciprocally, substitution of alanine or serine for Asp-182 in μ3A precluded binding to the TGN38, CD63, and Lamp-1 signals (Fig. 5B). These determinants of interaction were confirmed in vitro by ITC using purified components. We found that a synthetic TGN38 SDYQRL peptide, but not a substituted SDAQRL variant, bound to a single site on recombinant μ3A C-terminal domain with a Kd of 14.0 ± 2.8 μM (Fig. 6A). Similarly, a synthetic CD63 SGYEVM peptide, but not a substituted SGAEVM variant, bound to a single site with a Kd of 18.7 ± 1.6 μM (Fig. 6B). Single substitution of serine for Asp-182 rendered the interaction with both peptides undetectable (Fig. 6, A and B).
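To put the dissociation constants above on an energy scale, the standard relation ΔG° = RT ln(Kd / 1 M) can be applied. The sketch below is an illustrative calculation rather than an analysis reported in the study. The default temperature is the 28°C used for the ITC experiments; the function name and the choice of kcal/mol are assumptions made for the example.

```python
import math

# Gas constant in kcal/(mol*K).
GAS_CONSTANT_KCAL = 1.987204e-3


def binding_free_energy(kd_molar: float, temp_celsius: float = 28.0) -> float:
    """Standard binding free energy in kcal/mol: dG = R * T * ln(Kd / 1 M)."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    temp_kelvin = temp_celsius + 273.15
    return GAS_CONSTANT_KCAL * temp_kelvin * math.log(kd_molar)


if __name__ == "__main__":
    # Kd values reported in the text for the TGN38 and CD63 peptides.
    for name, kd_um in (("TGN38 SDYQRL", 14.0), ("CD63 SGYEVM", 18.7)):
        dg = binding_free_energy(kd_um * 1e-6)
        print(f"{name}: Kd = {kd_um} uM, dG = {dg:.2f} kcal/mol")
```

With these inputs the two peptides come out at about -6.7 and -6.5 kcal/mol, a difference that is small relative to the stated uncertainties in Kd.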
In addition to hydrogen bonds, there are hydrophobic interactions between Y0 in the peptide and Tyr-180 and Phe-402 of μ3A, as well as stacking on the side chain of Lys-406 of μ3A (Fig. 4, C and D). Y2H analyses showed that the interaction of YXXØ signals from TGN38, CD63, or Lamp-1 with μ3A was completely abrogated by substitution of alanine for Tyr-180 or serine for Phe-402 (Fig. 5B). Substitution of alanine for Lys-406 resulted in varied effects, with interaction with TGN38 being seemingly unaffected, Lamp-1 completely abolished, and CD63 partially diminished (Fig. 5B). The differential effects of the Lys-406 mutation inversely correlate with the overall binding affinity of the signals (TGN38 > CD63 > Lamp-1) (Figs. 5 and 6), a fact that can be explained by the loss of the hydrophobic stacking interaction on Y0 having a greater effect on the weaker signals.
Unlike μ2, in which the hydroxyl group of Y0 participates in a network of hydrogen bonds with Asp-176, Lys-203, and Arg-423 (Fig. 4, A and B) (20), in μ3A, the hydroxyl group of Y0 forms a hydrogen bond only with Asp-182 (Fig. 4, C and D). In place of μ2 Lys-203, μ3A contains Cys-209, which is too far away to contribute to the binding of Y0 (Figs. 3 and 4C). Consistent with this observation, Y2H analysis showed that substitution of alanine for Cys-209 did not affect the interaction of μ3A with YXXØ signals from TGN38 or CD63 (Fig. 5B). Interaction with Lamp-1, however, was reduced (Fig. 5B).

FIGURE 2. Comparison of the crystal structure of μ subunits. A, superposition of rat μ3A (red) and mouse μ1A (green; PDB code 4EN2) (22). B, superposition of μ3A (red) and rat μ2 (blue; PDB code 1BXX) (20); and C, superposition of μ3A (red) and human μ4 (orange; PDB code 3L81) (21) shown in ribbon representation. The bound peptides SYSQAAGSDSAQ on μ1A (shown with carbon atoms colored blue; oxygen colored red; nitrogen colored blue), DYQRLN on μ2 (carbon atoms colored magenta), DYQRL on μ3A (carbon atoms colored yellow), and TYKFFEQ on μ4 (carbon atoms colored green) are shown in stick representation.
The binding pocket for the peptide L3 is lined by the hydrophobic side chains of Phe-181, Val-389, and Leu-392 in μ3A (Fig. 4, C and D). The size of this pocket accommodates L3 in the same way as the pocket formed by Leu-175, Val-401, and Leu-404 in μ2 (Fig. 4, A and B) (20). Peptide library screening has revealed a preference for an arginine residue at position Y+2 (R2) (9). In μ3A, R2 forms mainly hydrophobic interactions with Phe-402. In contrast, in μ2, R2 is stabilized by hydrophobic interactions with Ile-419 and Trp-421, but also by hydrogen bonding between its Nε and the carbonyl group of Lys-420 (Fig. 4, A and B) (20). It has been suggested that replacement of Trp-421 in μ2 by Gly-404 in μ3A would remove the specificity for arginine at the Y+2 position (20). However, both Gly-404 and Phe-402 (Fig. 4, C and D) contribute to binding, as their substitution by lysine and alanine, respectively, abrogates binding to the YXXØ signals from TGN38, CD63, and Lamp-1 in Y2H assays (Fig. 5B).

FIGURE 4. A and C, surface complementarity between TGN38 peptides and μ2 (A) and μ3A (C). Surface colors for residues in contact with the TGN38 peptide are gray for hydrophobic interactions, except for Leu-175 in μ2 and Phe-181 in μ3A that are colored black. Residues forming hydrogen bonds are colored orange, except for Asp-176 in μ2 and Asp-182 in μ3A, which are colored green. The bound peptides DYQRLN on μ2 (shown with carbon atoms colored magenta; oxygen is colored red; nitrogen is colored blue; PDB code 1BXX) and DYQRL on μ3A (carbon atoms colored yellow) are shown in stick representation. B and D, two-dimensional, schematic representation of the interactions shown in A and C using LIGPLOT (48).
Because the YX(FYL)(FL)E-type signal from amyloid precursor protein (YKFFE) binds to a different site on μ4 (21), it was of interest to test whether residues on the equivalent site on μ3A played any role in the recognition of YXXØ signals from TGN38, CD63, and Lamp-1. Y2H assays showed that single substitution of Phe-255 to alanine or Arg-283 to aspartate drastically reduced binding of the amyloid precursor protein tail to μ4 (21). In contrast, single substitution of the corresponding Phe-233 to alanine or Ser-261 to aspartate in μ3A did not affect binding to YXXØ signals from TGN38, CD63, and Lamp-1 (Fig. 5C). Likewise, single mutation of other residues predicted to be in this binding site, such as Pro-235 to glutamine, Phe-239 to alanine, Trp-242 to alanine, or Glu-243 to alanine, produced essentially no effect on the binding of μ3A to the YXXØ signals (Fig. 5C). This corroborates and extends the structural finding that the YXXØ signals bind to μ3A exclusively through the conserved, canonical binding site revealed by the crystal structure.
AP complexes are organized as a "core" with two "hinge-ear" projections. Structural analyses have shown that the AP-2 core occurs in two conformations: a locked conformation in which the binding sites for YXXØ signals and for dileucine-based sorting signals fitting the (DE)XXXL(LI) motif are occluded by the β2 subunit of the complex and an open conformation in which both sites are accessible for binding (Fig. 7, A and E) (32-34). The structure of the AP-3 core has not yet been solved but, based on structural homology, the YXXØ-binding site in μ3A would likewise be expected to be accessible in the open core conformation (Fig. 7, C and G).
The basic electrostatic potential of μ2 near the binding site for the YXXØ motif in the open conformation of the AP-2 core has been postulated to be important for interaction with the negatively charged head groups of phosphatidylinositol 4,5-bisphosphate at the plasma membrane (Fig. 7, A and E) (34,35). The same region has a considerably lower positive electrostatic potential in μ3A (Fig. 7G), as well as in μ1A and μ4 (Fig. 7, F and H). Unlike AP-2, which binds phosphatidylinositol 4,5-bisphosphate, AP-1 and AP-3 preferentially bind to the less negatively charged phosphatidylinositol 4-phosphate and phosphatidylinositol 3-phosphate, respectively (36,37). In particular, phosphatidylinositol 4,5-bisphosphate binding residues Lys-341, Lys-343, Lys-345, and Lys-354 in μ2 are replaced by Ser-329, Thr-331, Asp-333, and Asp-342 in μ3A (Fig. 7, I and J). These differences might contribute to the preferential binding of AP-1 and AP-3 to intracellular membranes enriched in less acidic phospholipids.
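The effect of the four substitutions listed above on local charge can be tallied with a simple formal-charge count. The sketch below is a back-of-the-envelope illustration, not the electrostatic surface calculation shown in Fig. 7. It assumes fully ionized Lys, Arg, Asp, and Glu side chains near neutral pH and treats every other residue, including His, as neutral; the names used are illustrative.

```python
# Formal side-chain charges near neutral pH (simplifying assumption:
# Lys/Arg fully protonated, Asp/Glu fully deprotonated, all others neutral).
SIDE_CHAIN_CHARGE = {"Lys": 1, "Arg": 1, "Asp": -1, "Glu": -1}

# Residues named in the text at equivalent positions of the two subunits.
MU2_RESIDUES = ["Lys", "Lys", "Lys", "Lys"]   # Lys-341, Lys-343, Lys-345, Lys-354
MU3A_RESIDUES = ["Ser", "Thr", "Asp", "Asp"]  # Ser-329, Thr-331, Asp-333, Asp-342


def formal_charge(residues) -> int:
    """Sum of formal side-chain charges for a list of three-letter residue names."""
    return sum(SIDE_CHAIN_CHARGE.get(name, 0) for name in residues)


if __name__ == "__main__":
    mu2 = formal_charge(MU2_RESIDUES)
    mu3a = formal_charge(MU3A_RESIDUES)
    print(f"mu2 patch:  {mu2:+d}")
    print(f"mu3A patch: {mu3a:+d}")
    print(f"difference: {mu2 - mu3a:+d}")
```

Under these assumptions the four positions carry a formal charge of +4 in μ2 and -2 in μ3A, which is consistent in direction with the less basic surface described for μ3A.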
The ability of μ3A to recognize YXXØ signals explains the requirement of AP-3 for efficient sorting of a subset of lysosomal membrane proteins such as CD63, Lamp-1, and Lamp-2 from endosomes to lysosomes in various cell types (26,38,39). This activity may also contribute to the sorting of YXXØ-containing proteins to lysosome-related organelles such as pigment granules/melanosomes and platelet-dense bodies, a process in which AP-3 is critically involved (38, 40-42). In this regard, it is noteworthy that the affinity of YXXØ-signal binding to μ3A (Fig. 6) is one order of magnitude lower than that of μ2 (43), consistent with the smaller number of interactions that stabilize the binding of YXXØ signals to μ3A (Fig. 4). This difference is in line with results from previous combinatorial Y2H screens showing that μ2 exhibits the strongest binding and broadest specificity for YXXØ signals among all μ family members (9,12). We believe that this explains why most YXXØ signals mediate AP-2-dependent endocytosis, whereas only a subset function in AP-3-dependent intracellular sorting events (2). The lower affinity of μ3A relative to μ2 might also explain the observation that changing the spacing of the YXXØ signal relative to the transmembrane domain of Lamp-1, a manipulation that affects optimal presentation of the signal, decreases transport from endosomes to lysosomes without affecting the rate of endocytosis (44).

FIGURE 5. Y2H analysis of the interaction of μ3A with cytosolic tails containing a YXXØ motif. A-C, yeast were co-transformed with plasmids encoding Gal4bd fused to the wild-type or Tyr-to-Ala mutant of the cytosolic tails of TGN38, CD63, or Lamp-1 constructs indicated on the left, and Gal4ad fused to wild-type or mutant μ3A constructs indicated on top of each panel. B, Y2H analysis of μ3A with mutations on the YXXØ-binding site. C, Y2H analysis of μ3A with mutations on a putative YX(FYL)(FL)E-binding site. Mouse p53 fused to Gal4bd and SV40 large T antigen (T Ag) fused to Gal4ad were used as controls. Co-transformed cells were spotted onto His-deficient (-His) or His-containing (+His) plates and incubated at 30°C. Growth is indicative of interactions.
The μ3A structure presented here is the first of any portion of the AP-3 complex, and only the second of a μ subunit (after μ2) in complex with a canonical YXXØ signal, to be solved by x-ray crystallography. Our findings demonstrate the conservation of the canonical YXXØ-binding site and thus the generality of the signal-recognition mode first shown for μ2 (20). Biochemical and structural analyses indicate that μ1 (A and B isoforms) (7,22,45) and μ4 (12) are likely to have a similar binding site (21,22,30,45-47), but this remains to be definitively established by x-ray crystallographic studies of μ1 and μ4 in complex with canonical YXXØ signals. It also remains to be determined whether μ1, μ2, and μ3 have a second site similar to that binding YX(FYL)(FL)E signals in μ4 (21).