Novel interaction partners of the CD2BP2-GYF domain.

The GYF domain of CD2BP2 serves as an adapter that recognizes proline-rich sequences in intracellular proteins. Although the T cell adhesion molecule CD2 and the core splicing protein SmB/B' were previously shown to interact with CD2BP2-GYF, we are now using a general approach to identify putative GYF domain target sites within the human proteome. The phage display-derived recognition motif for CD2BP2-GYF is PPG(W/F/Y/M/L). SPOT analysis confirmed that the GYF domain interacts with peptides from human proteins containing the consensus site. Epitope mapping by NMR spectroscopy performed for several peptides revealed a conserved binding surface. A direct interaction of the CD2BP2-GYF domain with the novel protein interaction partners PI31 and NPWBP was verified by yeast two-hybrid analysis.

The CD2BP2 protein is a cellular adapter protein that was originally identified as a binding partner of the T cell adhesion protein CD2 in the context of T cell signaling (1). Subsequently, a nuclear role for CD2BP2 was suggested, based on reports that identified CD2BP2 as a protein associated with spliceosomal complexes and proteins involved in splicing (2,3). CD2 and SmB/B' share the presence of (R/K/G)XXPPGX(R/K) motifs that were shown to bind to CD2BP2 in vitro and in vivo (1,3,4). CD2BP2 contains a C-terminal fragment of ~60 amino acids that confers binding to these proline-rich sequences. Structurally, this so-called GYF domain represents a small α/β protein that displays a set of aromatic residues from a unique bulge-helix-bulge motif, creating a hydrophobic pocket that accommodates the PPG core of the binding peptide (1,4,5). In addition to this core sequence, the positively charged arginine residues in the vicinity of the PPG motif contribute to binding of the CD2 and SmB/B' ligands. Interestingly, the PPPPGHR motifs in CD2 can also be bound by the SH3 domain of the Fyn tyrosine kinase, which shows that proline-rich sequence recognition ultimately has to be viewed across domain borders (5). Functionally, the protein-protein interactions mediated by the PPPPGHR motifs are associated with a modulation of CD2-mediated signaling events, for example interleukin-2 production (6,7). In the case of the CD2BP2-SmB/B' interaction, it was shown that both proteins colocalize to the nucleus of Jurkat and HeLa cells and that the GYF domain of CD2BP2 can coprecipitate SmB/B' from cellular lysates (3). However, CD2BP2 has not been detected in the active spliceosome (8), and its role in splicing or splicing-associated processes is still poorly understood. Because proline-rich sequence recognition is characterized by high off-rates and limited specificity (9), additional intracellular proteins are likely to interact with CD2BP2-GYF.
Therefore, we attempted to find new binding partners for CD2BP2 based on the recognition code of its GYF domain. Our results show that, in addition to CD2 and SmB/B', several human proteins contain high affinity target sites for the CD2BP2-GYF domain. Yeast two-hybrid analyses confirm that two of these novel interactions, namely with NPWBP and PI31, can take place under cellular conditions.
Cloning of the focused library RKRSHRXXPPPXXXVQ into PC89 was similar to the procedure described elsewhere (10). PC89 and the PC89 nonapeptide library (X9) were a gift from Gianni Cesareni (Dipartimento di Biologia, Universita di Roma).
Phage Display-Phage displaying the nonapeptide (X9) or the focused peptide library fused to the major capsid protein was produced by transforming Escherichia coli XL-1 Blue cells with PC89 constructs followed by superinfection with the VCS-M13 helper phage (Stratagene). After overnight incubation in 2×YT medium in the presence of ampicillin and kanamycin (30°C, 270 rpm), phage particles were purified by three polyethylene glycol/NaCl precipitations (11). Library screening was performed as follows: 30-50 μl of GST-GYF-loaded glutathione-Sepharose 4B gel (Amersham Biosciences) were incubated with 5 × 10^9 to 5 × 10^11 infectious particles at 4°C overnight in phosphate-buffered saline. After washing three times with phosphate-buffered saline, bound phages were eluted with 100 mM glycine-HCl, pH 2.2. For phage amplification, E. coli XL-1 Blue cells were infected with eluted phage followed by superinfection with helper phage and subsequent incubation as described above. After three or six rounds of panning, the inserts of selected phage were sequenced to identify their displayed peptide.
SPOT Analysis-Peptides were synthesized on Whatman 50 paper using an Auto-Spot Robot (ASP 222; Invartis AG, Köln, Germany). Peptide synthesis using Fmoc chemistry on β-alanine-functionalized cellulose membranes was performed according to standard protocols (12). Membranes were probed with GST-GYF protein as described elsewhere (13). Briefly, the membranes were incubated with GST-GYF (40 μg/ml) overnight. After washing, bound GST fusion protein was detected with rabbit polyclonal anti-GST antibodies (Z-5; Santa Cruz Biotechnology) and horseradish peroxidase-coupled anti-rabbit IgG antibodies (Rockland). An enhanced chemiluminescence substrate (SuperSignal West Pico; Pierce) was used for detection by a Lumi-Imager™ (Roche Applied Science).
Preparation of Peptides-Peptides were synthesized automatically (ABI 433A) by the solid phase method using standard Fmoc chemistry in a batch-wise mode as described previously for the synthesis of CRF analogs (14). After final cleavage/deprotection using trifluoroacetic acid/H2O (9/1), crude peptides were purified by preparative HPLC to give final products of 95% purity according to HPLC analysis. The expected masses of peptides were verified by mass spectrometry.
* This work was supported by Grant FR 1325/2-1 from the Deutsche Forschungsgemeinschaft (to C. F.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
1 To whom correspondence should be addressed.
NMR Titration Experiments-All NMR experiments were performed at 297 K using either Bruker DRX600 or DMX750 instruments equipped with standard triple resonance probes. Data processing and analysis were carried out with XWINNMR (Bruker) and SPARKY.
Yeast Two-hybrid Analysis-Yeast two-hybrid experiments were performed with the MATCHMAKER GAL4 two-hybrid system 3 according to the manufacturer's manual (Clontech). The yeast strain AH109 was cotransformed with the pGBKT7 bait constructs encoding the GAL4 DNA binding domain fused to the proline-rich regions of selected candidates and with full-length CD2BP2 or its GYF domain in pGADT7. Colonies were grown on low, medium, or high stringency (synthetic drop-out) media without the amino acids Leu and Trp, or His, Leu, and Trp, or Ala, His, Leu, and Trp, respectively.

Phage Display and Peptide Substitution Analysis-
The sequence requirements for the binding of peptides to the CD2BP2-GYF domain were analyzed by screening either a focused library of the format X2PPPX3 or a randomized X9 phage display library. Similar results were obtained for both libraries (TABLE ONE). Almost all sequences contain the PPG motif, in agreement with the previously defined recognition signature (R/K/G)XXPPGX(R/K) (3), and SPOT analysis confirmed that only peptides comprising the PPG motif conferred binding to the CD2BP2-GYF domain (TABLE ONE). A third, N-terminal proline is often present in peptides selected from the X9 library, but other amino acids are also found at this position, indicating that an extended stretch of PPII helix is not required for binding. Tryptophan is preferentially selected as the amino acid C-terminally adjacent to glycine. Other hydrophobic amino acids such as phenylalanine, tyrosine, methionine, and leucine were also observed at this position, whereas the β-branched valine and isoleucine side chains were not found. Of 62 sequences analyzed, only one contained the CD2 motif PPPGHR. Interestingly, for peptides containing the PPGW motif, no requirement for the presence of a positively charged amino acid was observed. This is in contrast to the recognition signature PPGX(R/K) of peptides derived from previous studies on CD2 and SmB/B', termed the CD2 class of peptides in this study (1,3). Therefore, based on the phage display results, we suggest a second class of ligands for the CD2BP2-GYF domain, the PPGW class, which does not depend on positively charged amino acids flanking the motif.
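The two ligand classes described above can be expressed as simple sequence patterns. The following minimal Python sketch assigns a peptide to the PPGW class, the CD2 class, or both, using only the motifs PPG(W/F/Y/M/L) and PPGX(R/K) named in the text. The function and variable names are illustrative assumptions, and the sketch is not the procedure used to evaluate the phage display data.

```python
import re

# Patterns follow the motifs named in the text; all names are illustrative.
PPGW_CLASS = re.compile(r"PPG[WFYML]")  # hydrophobic residue directly after glycine
CD2_CLASS = re.compile(r"PPG.[RK]")     # positive charge at position i+2 (glycine = i)

def classify_ligand(peptide):
    """Return the set of ligand classes whose core motif occurs in the peptide."""
    classes = set()
    if PPGW_CLASS.search(peptide):
        classes.add("PPGW")
    if CD2_CLASS.search(peptide):
        classes.add("CD2")
    return classes

# Two peptides quoted in this study: the phage display peptide and the CD2 control.
for name, seq in [("PD1", "EFGPPPGWLGR"), ("CD2", "SHRPPPPGHRV")]:
    print(name, sorted(classify_ligand(seq)))
```

A peptide such as one containing PPGWR would be reported in both classes, which is consistent with the idea that the two binding modes are not mutually exclusive.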
In the case of the CD2 class, the nonconserved residue neighboring glycine within the PPGX(R/K) motif points away from the GYF domain surface, whereas the arginine or lysine side chain at position i+2 (relative to glycine at position i) contributes van der Waals interactions via its aliphatic CH2 groups, thereby placing its charged head group relatively close in space to the negatively charged side chains of Glu-9, Glu-15, and Asp-36 of the GYF domain (5) (Fig. 1). In contrast, for the PPG(W/F/Y/M/L)-containing peptides, a hydrophobic interaction between the peptide residue following glycine (W/F/Y/M/L) and residue Trp-8 of the domain is likely to take place.
TABLE ONE. Phage display results after 3 and 6 rounds of panning with the X9 and the X2PPPX3 libraries. The CD2BP2 GST-GYF fusion construct was used for screening. The consensus is derived from results of both library screens, except for the number of prolines, which was deduced exclusively from the screen with the X9 library. The relaxed consensus is defined to include all binding peptides. The triple-proline stretch in the focused library as well as prolines of the binding motif are depicted in white letters with gray background. Other residues with significant presence in both screens are shown in white with black background. Motifs of selected sequences after 6 rounds of panning and the CD2-derived control peptide SHRPPPPGHR were subjected to SPOT analysis, performed as described in Fig. 2.
To further corroborate the conclusions drawn from the phage display analysis, the peptide of the melanoma-associated antigen D1 (MAGD1) containing the central PPGW motif was subjected to SPOT analysis. Fig. 2A shows the result for a cellulose membrane on which each position of the binding peptide was individually exchanged for all other naturally occurring amino acids. The motif PPG is indispensable for binding, and a preference for Trp, Phe, Tyr, and Leu at the position adjacent to the PPG core could be observed. The Trp → Arg/Lys mutants also show strong spot intensity, and according to our model (Fig. 1) a partial switch from the PPGW class binding mode to the CD2 class binding mode for the resulting PPG+ motif would account for the binding.
In accordance with the results from phage display, we do not observe a modulation of binding affinities for positions flanking the PPGW motif. This implies that charge complementarity, as was observed for the CD2 peptide, does not play a significant role for the MAGD1 peptide. To better understand this difference between the two classes of peptides, we analyzed peptides in which two positions were mutated simultaneously. Fig. 2B shows the results of a double substitution analysis for peptides of the sequence SHX1PPPPG(H/A/F/L/W)X2V. Although the excess of a single negative charge in the PPG(H/A/L)-containing peptides abolishes binding, the PPG(F/W)-containing peptides tolerate a negative net charge of −1. The H9W and H9F substitutions evidently override, to a certain degree, the influence of the arginine residues flanking the core motif. This suggests that the aromatic residues interacting with the domain, presumably with Trp-8, substitute for the interactions mediated by arginine in the CD2 class of ligands (Fig. 1).
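To make the double substitution scheme concrete, the following Python sketch enumerates the analogs of SHX1PPPPG(H/A/F/L/W)X2V with X1 and X2 drawn from Ala, Glu, and Arg, and computes a formal net charge for each. The charge model is a simplifying assumption (Arg/Lys = +1, Asp/Glu = −1, His and both termini treated as neutral); the function names are illustrative and do not reflect how charges were assigned in the study.

```python
from itertools import product

# Simplified formal side-chain charges; His and the termini are treated as
# neutral, which is an assumption made for this illustration only.
CHARGE = {"R": 1, "K": 1, "D": -1, "E": -1}

def net_charge(peptide):
    """Sum of formal side-chain charges over all residues."""
    return sum(CHARGE.get(aa, 0) for aa in peptide)

def double_substitution_analogs(position9, substitutes="AER"):
    """Yield all X1/X2 combinations for the scaffold SH-X1-PPPPG-(position9)-X2-V."""
    for x1, x2 in product(substitutes, repeat=2):
        yield f"SH{x1}PPPPG{position9}{x2}V"

for aa9 in "HAFLW":
    for pep in double_substitution_analogs(aa9):
        print(pep, net_charge(pep))
```

Under this model the CD2 control peptide SHRPPPPGHRV carries a net charge of +2, and analogs with one Glu and one Ala at X1/X2 carry the net charge of −1 discussed above.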

Selection of Putative Interaction Peptides by Data Bank Analysis-
The phage display results define the relaxed consensus PPG(A/E/F/H/I/L/M/S/T/W/Y) as the signature that should comprise all high affinity binding motifs for the CD2BP2-GYF domain. In addition, the presence of two such motifs that are close in sequence will enhance avidity and represent more likely interaction sites, as has been shown for the repetitive PPPPGHR motif in the CD2 cytoplasmic tail (5). The SwissProt and TREMBL databanks were searched for proteins that contain at least two of the relaxed consensus sites or a single strict consensus motif PPGW. Subsequently, the individual peptides from the respective proteins were synthesized on a cellulose membrane and tested for binding to a GST-GYF fusion protein. The results are presented in Fig. 3 and show that many of the natural sequences are bound by CD2BP2-GYF. We observed that the domain bound preferentially to neutral or positively charged peptides, including CD2 class ligands (data not shown). In the few cases of efficient binding to a negatively charged peptide, tryptophan or phenylalanine following the PPG motif was found, in agreement with our results from the double substitution analysis (Fig. 2B). Based on the intensities of the spots, the peptides were sorted, and a set of candidates containing several binding sites, of which at least one fits the strict consensus PPG(W/F/Y/M/L), was selected for further analysis (TABLE TWO).
FIGURE 2. A, substitution analysis of a sequence from the candidate MAGD1, containing the recognition motif PPGW found by phage display. All possible single substitution analogs of the peptide were synthesized on a membrane. The single letter code above each column indicates the amino acid that replaces the corresponding wild-type residue; the row defines the position of the substitution within the peptide. The membrane was incubated with the CD2BP2-GYF-GST fusion construct. Bound domain was detected with an anti-GST primary antibody and a horseradish peroxidase-coupled secondary antibody. The relative spot intensities correlate qualitatively with the binding affinities (25). B, influence of charge on the CD2 and PPGW class of binding motifs. CD2-derived peptides comprising the CD2 binding motif PPGX(R/K) (top panel, left) or the phage display motif PPG(A/L/F/W) were subjected to a double substitution analysis. Analogs of the peptides were synthesized on a membrane and incubated with GST-GYF of CD2BP2 as described in panel A. Positions X1 and X2 in the peptides were simultaneously substituted by Ala, Glu, or Arg. Rows and columns indicate X1 and X2 substitutions, respectively.
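The selection rule used for the databank search can be sketched in a few lines of Python: a protein qualifies if it contains the strict motif PPGW at least once, or the relaxed consensus at least twice within 40 residues. How the 40-residue distance was measured is not specified in the text; the sketch assumes the distance between motif start positions. The sequences in the usage lines are made-up toy strings, not databank entries, and all names are illustrative.

```python
import re

RELAXED = re.compile(r"PPG[AEFHILMSTWY]")  # relaxed consensus from phage display
STRICT = re.compile(r"PPGW")               # strict motif used for single-site hits
MAX_DISTANCE = 40                          # assumed to be measured start to start

def is_candidate(sequence):
    """True if the sequence meets the strict or the paired relaxed criterion."""
    if STRICT.search(sequence):
        return True
    # Relaxed sites cannot overlap each other, so finditer finds all of them.
    starts = [m.start() for m in RELAXED.finditer(sequence)]
    return any(b - a <= MAX_DISTANCE for a, b in zip(starts, starts[1:]))

# Hypothetical toy sequences for illustration only.
print(is_candidate("MKAPPGWSE"))                    # single strict site
print(is_candidate("PPGH" + "A" * 10 + "PPGL"))     # two relaxed sites, close together
print(is_candidate("PPGH" + "A" * 50 + "PPGL"))     # two relaxed sites, too far apart
```

Comparing only neighboring sites is sufficient here, because if any two sites lie within the window, at least one pair of adjacent sites does as well.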
Analysis of Selected Peptides in Vitro-To independently confirm the binding of the CD2BP2-GYF domain to peptides identified as binders by phage display and SPOT analysis, NMR titration experiments were performed (Fig. 4). Because GYF domain and peptide are in fast exchange with respect to the NMR time scale, the change of individual NH resonances as a function of ligand concentration could be followed, and a dissociation constant (KD) could be determined assuming a simple two-state binding mechanism. All of the putative target peptides bound to the GYF domain, and the KD varied between 13 μM for the phage display-derived peptide (PD1 peptide) EFGPPPGWLGR and 159 μM for the CD2 control peptide SHRPPPPGHRV. This result shows that the phage display method indeed selected for a high affinity ligand. The NMR experiments also enabled the binding epitope for the different peptides to be determined. The differences in chemical shift between unligated and bound GYF domain for individual residues are shown in Fig. 4B and clearly indicate that the major interaction surface is formed by the bulge-helix-bulge motif of the domain, as was previously shown in detail for the GYF-CD2 peptide complex. The number of NH groups of the GYF domain that display significant chemical shift changes is similar for all peptides, showing that an extension of the GYF domain interaction epitope is not required for binding of the different peptide classes (Fig. 4B).
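For a two-state equilibrium in fast exchange, the observed shift change is the population-weighted average of free and bound states, so the titration curve follows the standard quadratic 1:1 binding isotherm. The Python sketch below evaluates that isotherm and recovers KD from noise-free synthetic data by a simple grid search. The protein concentration, the titration points, the maximal shift change, and the fitting strategy are all assumptions chosen for illustration; they are not the values or the software used in this study.

```python
import math

def shift_change(ligand_total, protein_total, kd, delta_max):
    """Observed shift change for 1:1 binding in fast exchange (same concentration units)."""
    s = protein_total + ligand_total + kd
    bound_fraction = (s - math.sqrt(s * s - 4.0 * protein_total * ligand_total)) / (
        2.0 * protein_total
    )
    return delta_max * bound_fraction

def fit_kd(ligand_conc, observed, protein_total, kd_grid):
    """Grid search over kd; for each kd, delta_max follows from linear least squares."""
    best = None
    for kd in kd_grid:
        f = [shift_change(l, protein_total, kd, 1.0) for l in ligand_conc]
        delta_max = sum(x * y for x, y in zip(f, observed)) / sum(x * x for x in f)
        sse = sum((delta_max * x - y) ** 2 for x, y in zip(f, observed))
        if best is None or sse < best[0]:
            best = (sse, kd, delta_max)
    return best[1], best[2]

# Synthetic, noise-free titration (all numbers are assumed): 200 uM protein,
# KD = 13 uM, maximal shift change 0.3 ppm.
protein = 200.0
ligand = [20.0, 50.0, 100.0, 200.0, 400.0, 800.0]
data = [shift_change(l, protein, 13.0, 0.3) for l in ligand]
kd_fit, dmax_fit = fit_kd(ligand, data, protein, range(1, 301))
print(kd_fit, round(dmax_fit, 3))
```

With real data, a nonlinear least-squares routine and a simultaneous fit over several NH resonances would normally replace the grid search.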
Yeast Two-hybrid Analysis-To obtain evidence that the GYF-peptide interactions can take place in the context of the corresponding proteins within a cellular environment, full-length CD2BP2 or its GYF domain alone was cloned into the prey vector, and fragments of putative interaction partners were cloned into the corresponding bait vector of the Clontech yeast two-hybrid system. Possible interaction partners selected for the analysis were AKNA, MAGD1, NEDD4, NPWBP, SWAN, and WWP2, comprising PPGW class interaction sites; the two known interaction partners CD2 and SmB, belonging to the CD2 class; and PI31, which contains a binding site of each class (TABLE TWO). The results of this interaction trap analysis are shown in Fig. 5. Some of the putative binding partners, namely MAGD1, NEDD4, and WWP2, show autoactivation when transformed as bait constructs into yeast; therefore, their potential interactions cannot be analyzed by this method. AKNA fails to interact with any of the CD2BP2 constructs. SWAN seems to promote only weak or even nonspecific interactions with the CD2BP2-GYF construct under growth conditions that repress its intrinsic autoactivity. In these two cases, folding of the fragments might render the binding motifs inaccessible. Alternatively, these fragments may not be expressed stably enough in yeast to promote interactions with the CD2BP2-GYF domain. However, NPWBP and PI31, in addition to CD2 and SmB, clearly result in selective colony growth, indicative of protein-protein-mediated activation of the reporter genes. This interaction occurred for both the full-length CD2BP2 and the GYF domain-only constructs, verifying that these interactions are mediated by the GYF domain.

DISCUSSION
In this study we have used phage display and peptide substitution analysis to derive the recognition code for the CD2BP2-GYF domain. Furthermore, the experimental data were used to highlight binding motifs within natural proteins by profile-based database searches. Peptides derived from these proteins comprising the suggested binding motif were validated for binding on cellulose membranes, and subsequently proteins of interest were subjected to a yeast two-hybrid analysis. Interestingly, the CD2 motif SHRPPPPGHRV is a suboptimal ligand that binds with a roughly 10-fold lower affinity than the optimal phage display motif EFGPPPGWLGR. However, the presence of two such suboptimal motifs enhances the interaction and results in yeast two-hybrid interactions that lead to growth on selective media (Fig. 5). The same argument holds true for the SmB/B' C-terminal tail, but in this case 3-4 partially intercalated GYF binding motifs are displayed by the protein. According to the results presented here, a second class of ligands, the PPGW class, was defined in addition to the previously described CD2 class (3). The PPGW class is characterized by the motif PPG(W/F/Y/M/L) and shows reduced charge dependence. Aromatic residues in particular render the interaction tolerant to negative charges. The CD2 class requires a positive charge two positions C-terminal to the PPG core sequence and is defined in this study as PPGX(R/K). For this class, it was shown that avidity effects due to the presence of repetitive motifs compensate for the low affinity of individual binding sites and constitute relevant in vivo interaction sites (3,5).
FIGURE 3. Identification of CD2BP2-GYF PPGW class binding sites of the human proteome. SwissProt and TREMBL databases were searched for sequences meeting the relaxed and strict consensus for CD2BP2-GYF. The corresponding sequences were synthesized on a membrane and tested for CD2BP2-GYF binding as described in Fig. 2. Consensus sequences applied for the search are depicted above the membranes (upper membrane, relaxed consensus; lower membrane, strict consensus). The relaxed consensus had to be present twice in the protein with a maximal distance of 40 amino acids to each other to be selected for synthesis on a membrane.
The occurrence of a defined set of GYF domain-containing proteins in most, if not all, eukaryotic species raises the question whether the function of GYF domain-containing proteins has been conserved. In humans CD2BP2 has been identified as an interaction partner of SmB/B', and other proteins associated with spliceosomal complexes also comprise potential GYF domain binding sites (3). The yeast homolog of CD2BP2, LIN1, was identified as an interaction partner of the essential splicing protein PRP8 (15). The authors of that study suggest that two proline-rich sequences at the N terminus of PRP8 are responsible for binding to the GYF domain of LIN1. One of these motifs, PPGF, is also found at the C terminus of the yeast homolog of SmB. Therefore, the interaction between GYF domains of CD2BP2-like proteins and splicing-associated proteins, notably SmB, might be conserved across species.
The PI31 protein was characterized recently as a protein that inhibits the proteasome in vitro (16). Later the protein was shown to be a regulator of the immunoproteasome, thereby modulating the presentation of peptides by the major histocompatibility complex (17). Our analysis clearly shows that the PI31 protein contains favorable interaction sites for the CD2BP2-GYF domain in vitro. However, the critical questions are whether the interaction takes place in living cells and whether the two proteins localize to the same compartment in vivo. Confocal microscopy of GFP and YFP variants of the two proteins should reveal whether colocalization can be observed under certain cellular conditions.
A further interesting candidate that matched the selection criteria of our protocol is the nuclear protein NPWBP (alternatively named WBP11, SNP70, or SIPP1). This protein was initially described as a nuclear protein with RNA binding capability that associates with the protein Npw38 (18). A subsequent study that investigated the localization of NPWBP more extensively showed a cytoplasmic localization in dividing cells, while partial colocalization with perinuclear intermediate filaments could be observed in human embryonic kidney 293 cells (19). Recently, NPWBP has been described as a component of the spliceosome (20) and was specifically identified as a protein of the U4/U6 snRNP (8). The proline-rich sequences of NPWBP that interact with CD2BP2-GYF were also shown to interact with the WW domain of Npw38 (18). Interestingly, both Npw38 and the GYF domain of CD2BP2 were found to interact with U5-15K (21,22), an essential splicing protein associated with the U5 snRNP (23,24). Although CD2BP2 is not a component of the active spliceosome, it is suggested to be important for the assembly and/or stability of the U5 snRNP prior to the formation of the U4/U6.U5 tri-snRNP (21). Therefore, a possible interaction between NPWBP and CD2BP2 fits well with the functional context of splicing. Future experiments will have to define the exact role of CD2BP2 and associated proteins in the maturation of the snRNPs during the spliceosomal assembly process.