Systematic Peptide Array-based Delineation of the Differential β-Catenin Interaction with Tcf4, E-Cadherin, and Adenomatous Polyposis Coli*

Nuclear accumulation of the complex between β-catenin and proteins of the T-cell factor (Tcf) family is a hallmark of many cancers. Targeting this interaction for drug development is complicated by the fact that E-cadherin and adenomatous polyposis coli (APC) bind to overlapping sites on β-catenin. Inhibiting their interactions might actually promote tumor growth. To identify selective β-catenin binding hot spots of Tcf4, E-cadherin, and APC, array technology with peptides of up to 53 amino acids length was used. Interactions were monitored by a quantitative fluorescent readout, which was shown to represent a monitor of true equilibrium binding constants. We identified minimal binding motifs in the β-catenin ligands and showed that most of the 15-mer and 20-mer repeats of APC did not interact, at least when non-phosphorylated, and defined a consensus binding motif also present in APC. We confirmed previously found hot spots and identified new ones. The method allowed us to locate a hydrophobic pocket that was relevant for the Tcf, but not the E-cadherin interaction, and would thus constitute an ideal drug target site.

In humans there are around 35,000 protein-coding genes (1,2). Alternative splicing and posttranslational modifications further increase the number of different proteins. To understand their function, it is necessary to know the interactions they are involved in. In yeast, with its ~6000 genes (3), different high-throughput methods have defined a total of around 80,000 protein-protein interactions, with the real number probably being somewhat lower (4). These searches have also shown that most of the proteins bind multiple other proteins. This may be coupled to diverse and sometimes opposing biological effects. Many interactions are involved in signal transduction pathways leading to differentiation, growth, or apoptosis.
Therapeutic intervention with such complexes has proven challenging, however, because protein-protein interfaces are rather large and often flat and are therefore difficult to disrupt with small molecules. Nevertheless, in some cases this has been shown to be possible (5-8). Even though a growing number of complex structures are available, the relation between interface features and binding strength is not well understood from a theoretical point of view. No parameter of an interface residue that is observable in crystal structures correlates with that residue's importance for the interaction. On the other hand, empirical analysis of interfaces has shown that generally only a few residues, the so-called hot spots, contribute the bulk of the binding free energy (9,10). A detailed understanding of protein-protein interfaces and the rapid identification of hot spot residues are thus required to understand the thermodynamics and specificity of interactions and to apply such knowledge for the development of specific and potent inhibitory drugs.
β-Catenin is an important paradigm for those studies. It exists in two distinct cellular pools, where it carries out two different functions through its binding to a number of diverse proteins. For its role in adherens junctions at the cell membrane, it interacts with the cytoplasmic tail of E-cadherin and with α-catenin. This is crucial for stable cell adhesion (11). The cytoplasmic β-catenin pool serves as a signaling molecule in the Wnt pathway, which plays an important role in development and tissue maintenance and is often found deregulated in cancers (12,13). In the absence of an extracellular Wnt signal, the cytoplasmic β-catenin concentration is kept low through proteolysis. For β-catenin degradation, its interaction with APC in a multiprotein complex is essential (14). Wnt stimulation results in stabilization of β-catenin, which then translocates into the nucleus, where its complex with Tcf proteins (Tcf1, LEF1, Tcf3, and Tcf4) transcriptionally activates a number of target genes (15,16).
β-Catenin accumulates in the nucleus of most colorectal tumors and certain other cancers. This is due to mutation in either β-catenin itself or in negative regulators such as APC (17). The β-catenin·Tcf4 complex plays a central role in oncogenesis (18), and its inhibition by dominant negative Tcf4 stops proliferation of colorectal cancer cells and induces differentiation (19), proving that β-catenin is an important anti-cancer target (20). Recently, compounds inhibiting the association between β-catenin and Tcf4 have been described (7). Targeting β-catenin implies another level of complexity due to its interactions with E-cadherin and APC, which should remain unaffected by potential drugs. Loss of the E-cadherin interaction is causal in the transition from adenomas to invasive carcinomas (21), so that its inhibition might actually promote tumor metastasis. Because the β-catenin·APC interaction regulates the level of β-catenin in healthy cells and its localization (22), interfering with it might cause inappropriate activation of β-catenin signaling. The structures of β-catenin complexes with Tcf (23-25), E-cadherin (26), and APC (27) have shown that these proteins all bind to the superhelical groove in the central armadillo repeat region of β-catenin and have partially overlapping binding sites, so that selectivity might be difficult to achieve (Fig. 1). The molecules described by Lepourcelet et al. (7) leave the E-cadherin interaction undisturbed but inhibit APC binding, which indicates they need to be optimized. To accomplish this goal, it is necessary to know the specific and differential binding determinants of the respective interactions.
* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
For β-catenin, hot spots have been defined by introducing alanine mutations in the armadillo domain and measuring the effect on LEF-1, APC, and conductin binding (28). This has identified residues that are equally or differentially important for these interactions. Similarly, hot spots were defined in the catenin-binding domain (CBD) of Tcf (29-31), which is located in the first 50-60 residues (32).
We have analyzed β-catenin binding of Tcf4, E-cadherin, and APC in a systematic and comprehensive manner using peptide arrays and fluorescence-labeled β-catenin. The method allowed us to quickly identify specific and overlapping hot spots for different protein-protein interactions. Peptide array data could be directly correlated to affinity by equilibrium measurements using fluorescence polarization titration. We also further narrowed down the minimal binding fragment of Tcf4 and defined binding regions of E-cadherin. Surprisingly, the peptide array technology also showed that β-catenin binding of APC is not strictly defined by or restricted to its 15-mer and 20-mer repeats and allowed us to more rigorously define β-catenin binding regions in APC.
During preparation of the manuscript, our conclusions about binding motifs in APC were corroborated by x-ray structural studies (33,34).

EXPERIMENTAL PROCEDURES
Expression and Purification of β-Catenin, Tcf4, and E-Cadherin-Recombinant protein expression was done in Escherichia coli BL21(DE3) using the vector pGEX-4T1 and TB medium supplemented with 100 mg/liter ampicillin. After induction with 0.3 mM isopropyl 1-thio-β-D-galactopyranoside, the temperature was lowered from 37°C to 18°C for overnight production of the GST fusion proteins. The bacteria were lysed in a Microfluidizer M-110S (Microfluidics, Newton, MA). For human β-catenin, the armadillo domain (residues 151-666) was used. The buffer was 30 mM Tris/HCl, pH 8, 300 mM NaCl, and 3 mM dithioerythritol. For cell lysis, 1 mM phenylmethylsulfonyl fluoride was added. After affinity chromatography on reduced glutathione-Sepharose (Amersham Biosciences), β-catenin was cleaved off the column by thrombin and further purified by size exclusion chromatography on Superdex 75 (Amersham Biosciences). For fluorescent β-catenin (bcat-GFP), a GST-β-catenin-(151-666)-enhanced GFP fusion protein was produced as described above, and bcat-GFP cleaved off the reduced glutathione column was used directly. A Cys(Ser-Gly)3 linker was engineered to the N terminus of the human Tcf4 CBD (residues 1-53) for fluorescence labeling. Point mutations were introduced by QuikChange (Stratagene, Amsterdam, the Netherlands). The buffer was 30 mM Tris/HCl, pH 8, 200 mM NaCl, and 3 mM dithioerythritol. For lysis, 5 mM EDTA and 1 mM phenylmethylsulfonyl fluoride were added. The proteins were purified by reduced glutathione affinity columns, thrombin cleavage of the GST tag on the column, and size exclusion chromatography on Superdex 75. Two constructs of mouse E-cadherin, amino acids 652-700 and 628-723, were used. Point mutations and an N-terminal linker were introduced as described for Tcf4. The buffer was 30 mM Tris/HCl, pH 7.5, 200 mM NaCl, and 3 mM dithioerythritol. For lysis, 5 mM EDTA and 1 mM phenylmethylsulfonyl fluoride were added. Purification was done as described for Tcf4.
FIG. 1 (partial legend). Although E-cadherin binding to β-catenin shows two leucines in the same position, these contacts are not important for the interaction (see "Results"). In C and D, the backbone is shown throughout, and only those side chains are shown that contribute significantly to binding energy (see "Results"). The figure was generated with PyMOL (pymol.sourceforge.net).
Synthesis of Peptide Arrays-Membranes with peptide arrays were synthesized by the SPOT technique (35) using an ASP222 automated SPOT synthesizer (Intavis Bioanalytic Instruments AG, Cologne, Germany). Briefly, acid-resistant ACS01 amino-pegylated cellulose membranes (0.6 μmol amino groups/cm2; AIMS Scientific Products, Braunschweig, Germany) were derivatized in a grid of spots of 0.3 cm in diameter, defining the location of the peptides by applying to each 0.1 μl of a solution containing 0.03 M Fmoc-β-alanine and 0.27 M N-acetylalanine, both as in situ formed 1-hydroxybenzotriazole esters. After 2 h of reaction while covered with glass plates, the intermediate membrane parts were blocked with 2% acetic anhydride in N,N-dimethylformamide overnight. Throughout solid phase synthesis, Fmoc-amino acid 1-hydroxybenzotriazole esters were used. After every coupling round, unreacted amino groups were blocked with acetic anhydride, Fmoc protection was cleaved with 20% piperidine in N,N-dimethylformamide, and the terminal amino functions of the growing peptides were stained with bromphenol blue. Finally, all peptides were acetylated at their N termini, and the side chain protecting groups were cleaved by treatment with 82% trifluoroacetic acid, 3% triisobutylsilane, 5% dichloromethane, and 10% water for 16 h.
Kd Determinations-Exchange of Tcf4 and E-cadherin into a degassed buffer (30 mM Tris/HCl, 200 mM NaCl, and 10 mM ascorbic acid) was performed by ultrafiltration. For fluorescence labeling, a protein concentration of 100 μM and a 3-5-fold excess of a thiol-reactive fluorophore were used. After overnight incubation at 4°C, the excess fluorophore was removed by buffer exchange to 30 mM Tris/HCl, pH 7.5, 200 mM NaCl, and 3 mM dithioerythritol. Fluorescence polarization equilibrium titrations were performed in a FluoroMax II spectrometer (Spex Instruments, Grasbrunn, Germany) using 20-200 nM fluorescent protein and adding β-catenin-(151-666). The excitation/emission wavelengths were 491/513, 494/516, and 346/449 nm for fluorescein, Alexa-488 (Molecular Probes, Leiden, the Netherlands), and 7-amino-4-methylcoumarin-3-acetic acid, respectively. Isothermal titration calorimetry was performed in a MicroCal instrument. A 100 μM solution of the Tcf4-(1-53) peptides was titrated to a 10 μM solution of β-catenin-(151-666). Both binding partners were in the same buffer as described above, plus 10% glycerol and 0.005% Tween 20. Fitting was done with the MicroCal software package using a one-site binding model.
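A polarization titration of this kind is commonly analyzed with a single-site model that accounts for depletion of both partners, because the labeled peptide concentration (20-200 nM) is well above a sub-nanomolar dissociation constant. The source does not state the equation used for fitting, so the following Python sketch is an assumption rather than the authors' procedure; the function and parameter names are hypothetical.

```python
import math


def fraction_bound(protein_total, peptide_total, kd):
    """Fraction of labeled peptide in complex for 1:1 binding.

    Assumed model (not stated in the source): exact solution of
    P + L <-> PL with depletion of both partners. All three
    arguments must use the same concentration unit.
    """
    if peptide_total <= 0:
        raise ValueError("peptide_total must be positive")
    s = protein_total + peptide_total + kd
    # Smaller root of the quadratic for the complex concentration.
    complex_conc = (s - math.sqrt(s * s - 4.0 * protein_total * peptide_total)) / 2.0
    return complex_conc / peptide_total


def polarization(protein_total, peptide_total, kd, p_free, p_bound):
    """Observed polarization as a linear mix of free and bound signals.

    Assumes the fluorophore brightness does not change on binding.
    """
    f = fraction_bound(protein_total, peptide_total, kd)
    return p_free + (p_bound - p_free) * f
```

When the peptide concentration greatly exceeds the dissociation constant, the curve approaches a stoichiometric titration, which limits how precisely the constant can be extracted from a fit.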

RESULTS
Design and Preparation of Peptide Arrays-In order to get a comprehensive view of the energetics and specificity of the different β-catenin interactions with Tcf4, E-cadherin, and APC, we applied a systematic single amino acid replacement strategy. Synthetic peptide arrays as available by the SPOT technique (36) can easily be assembled in parallel on a microscale. The longest peptide successfully synthesized thus far by this technique consists of 44 residues (37). Because the CBD of Tcf contains an additional 9 residues, its synthesis is, by itself, challenging. However, because the respective CBDs are unstructured (27,31,38), we reasoned that their synthesis and presentation in a binding assay should not be complicated by secondary structure and folding problems. We were therefore confident in extending the synthetic aim to longer sequences.
By using the peptide companion software (www.5z.com/csps/), we predicted the absence of difficult synthetic steps. To avoid inhibitory molecular crowding due to the large molecular weight of the product, we found it necessary to reduce the concentration of starting amino functions on the membrane by blending the Fmoc-β-alanine with a 9-fold excess of the elongation-incapable N-acetyl-alanine (data not shown). The completeness of the coupling reaction was assured by repeating the spotting procedure up to four times, if necessary. Additionally, we chose a new, acid-resistant cellulose membrane, which allows an extended treatment with high concentrations of trifluoroacetic acid and thus a complete deprotection of the peptides (39).
Alanine Scan of Tcf4 Binding to β-Catenin-Peptide arrays with individual alanine point mutants of the complete CBD of human Tcf4 (residues 1-53) (24) were incubated with a fusion protein of the armadillo domain of human β-catenin and enhanced GFP (bcat-GFP). The amount of bcat-GFP bound to Tcf4 was quantified with a fluorescence imager. The quantitative evaluation is shown in Fig. 2A. The error bars represent the standard deviation from results obtained with three individually synthesized peptide arrays, showing that interaction measurement by the SPOT method was quite reproducible.
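The layout of such an alanine scan can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, the example sequence is a toy string rather than a sequence from the paper, and skipping positions that are already alanine is an assumption, because the text does not say how native alanines were handled.

```python
def alanine_scan(sequence, first_residue=1):
    """Return (label, peptide) pairs, one per non-alanine position.

    Positions that are already Ala are skipped; this is an assumption,
    not a procedure described in the source.
    """
    variants = []
    for i, residue in enumerate(sequence):
        if residue == "A":
            continue
        label = f"{residue}{first_residue + i}A"
        variants.append((label, sequence[:i] + "A" + sequence[i + 1:]))
    return variants


def normalize_to_wild_type(signals, wild_type_signal):
    """Express raw spot intensities as a fraction of the wild-type signal."""
    return {label: value / wild_type_signal for label, value in signals.items()}


# Toy example (hypothetical 5-residue peptide numbered from position 16).
library = alanine_scan("DELAF", first_residue=16)
```

Each label follows the usual mutant notation (original residue, position, replacement), so the array spots can be matched directly to the bars of a plot such as Fig. 2A.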
Mutation of Asp16 or Phe21 had the strongest effect on β-catenin binding. Replacement of either residue with Ala resulted in a signal indistinguishable from the negative controls (randomized sequence), indicating almost complete loss of β-catenin binding to Tcf4. At least 50% reduction in signal resulted from the mutations of Glu17, Leu18, Ile19, Asp23, Leu41, or Leu48. The mutations V49A and E51A in the C-terminal part of the Tcf4-CBD resulted in a >30% signal reduction. These data confirm and extend previous hot spot studies (see below), establishing peptide array analysis as a valuable method to study β-catenin interactions. In contrast, the acidic region from residue 24 to residue 29 of Tcf4, which binds an important β-catenin lysine through alternative conformations (24), was of only moderate importance for β-catenin binding. Surprisingly, mutation of either Gly13 or Asn15 led to a consistent 30% increase in the amount of bcat-GFP bound to the membrane spot.
Solution Assay to Characterize Hot Spot Mutants-We wondered whether peptide arrays as a non-equilibrium method reflect true relative affinities. Thus, recombinant wild-type and mutant Tcf4 peptides were prepared with an additional linker and a cysteine at the N terminus, to allow coupling of a fluorophore. The affinity of the peptides was determined by titration with the armadillo domain of human β-catenin and measurement of the increase in fluorescence polarization due to formation of a large molecular weight complex. Representative titration curves are shown in Fig. 3A for the L12A and V49A mutants, and the results are summarized in Table I.
The equilibrium dissociation constant Kd of the wild-type CBD peptide was very low (0.84 nM). The most important β-catenin binding residue in Tcf4 was Asp16, in agreement with previous studies using various qualitative and quantitative methods. Mutation of Asp16 to Ala lowered the affinity 1700-fold, corresponding to a ΔΔG of 4.3 kcal/mol at 20°C. A strong effect was also observed for Phe21, Leu41, and Leu48, whose substitution each reduced affinity 250-300-fold (3.2-3.3 kcal/mol). A considerable contribution to binding was also made by Glu17, Leu18, and Ile19 (1.5-2.1 kcal/mol). In support of the reliability of the peptide array analysis, mutation of Gly13 slightly increased the affinity.
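The free energy values quoted above follow from the fold change in affinity through the standard relation ΔΔG = RT ln(Kd,mutant / Kd,wild type). The short Python sketch below, with a hypothetical function name, reproduces that arithmetic for the fold changes and the temperature of 20°C given in the text.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_from_fold_change(fold_change, temp_celsius=20.0):
    """Binding free energy change (kcal/mol) for a fold change in Kd."""
    temp_kelvin = temp_celsius + 273.15
    return R_KCAL * temp_kelvin * math.log(fold_change)


# Fold changes reported in the text for the Tcf4 mutants.
ddg_d16a = ddg_from_fold_change(1700)
ddg_range = (ddg_from_fold_change(250), ddg_from_fold_change(300))
```

Rounded to one decimal place, these evaluate to the 4.3 kcal/mol and 3.2-3.3 kcal/mol stated for the respective mutants.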
Isothermal titration calorimetry (ITC) was also employed to measure the interaction between unlabeled recombinant Tcf4 peptides and β-catenin. Here, weak binding mutants were employed because the affinity of the wild-type peptide was too high to be measured accurately by ITC. For the mutants D16A and F21A, a Kd of 0.56 and 0.22 μM, respectively, was determined (Fig. 3B). This was in good agreement with the polarization data and showed that the fluorescence label did not perturb the interaction.
A comparison between the membrane-bound fluorescence intensity data and the equilibrium binding constants measured in solution showed a good correlation, demonstrating that relative affinities can be correctly assigned by the peptide array method (Fig. 4). The comparison also showed, however, that peptide arrays, due to the washing steps involved, were not capable of measuring true equilibrium constants but more likely reflect the different dissociation rate constants, which are likely to be related to affinities.
The Minimal β-Catenin Binding Region of Tcf4-Structures of Xenopus Tcf3-CBD and human Tcf4-CBD in complex with β-catenin showed that Tcf can adopt different conformations and secondary structures in the binding groove of the armadillo domain (Fig. 1, A and B). In addition to an elongated core structure that makes similar contacts to β-catenin, the three structures showed variations in the β-catenin-Tcf interface (23-25). To further delineate and quantify the contributions of different CBD parts and search for a minimal β-catenin binding sequence, a peptide array of overlapping 15-mer peptides covering the complete CBD of Tcf4 was synthesized. We reasoned that a peptide of 15 amino acids might be of sufficient length to bind to β-catenin (27). Filter binding showed that not all of the 53 residues of the CBD as defined previously have to be present. Whereas residues 7-9 were marginally important for binding, strong (relative) binding encompassed the 15-mers starting from residue 11 to residue 15 (Fig. 2B). 15-mers starting from residue 17 or higher did not show any binding by themselves, although some residues in the C-terminal part of the CBD such as Leu41 and Leu48 did contribute to the binding energy (see Fig. 2A and Table I). Minimal binding sequences required both of the two most crucial hot spots, Asp16 and Phe21, for detectable affinity. Affinity measurements of soluble peptides by fluorescence polarization titration showed that
FIG. 2. Peptide array analysis of the β-catenin·Tcf4 interaction. A, alanine scan. Arrays with all single alanine point mutants of the catenin-binding domain of human Tcf4 (residues 1-53) were incubated with a fluorescent variant of β-catenin (bcat-GFP). Binding was quantified by fluorescence imaging. All signals were normalized to the wild-type signal. Separate bar symbols denote alanine mutants, negative controls (randomized sequence), and wild-type peptides. The sequence is indicated above the bars of the alanine mutants. B, scan for a minimal Tcf4 binding domain. A membrane with overlapping 15-mer peptides covering the whole Tcf4-CBD was probed with bcat-GFP as described previously. Different-colored bars are from three different experiments and are normalized to the strongest signal.
Hot Spot Scan of E-Cadherin Binding to β-Catenin-Having established that peptide arrays can be used to elegantly define minimal binding epitopes and interaction hot spots, we applied the method to other β-catenin binding partners. The complete β-catenin-binding region of E-cadherin, as deduced from visible residues in the complex structure (26), might contain as many as 96 residues (residues 628-723 in mature mouse E-cadherin). Because this exceeds the maximum length of a synthetically accessible peptide, a somewhat shorter construct (residues 652-700) was chosen for a systematic investigation of hot spots. Its central region contains sequences that are highly conserved among classical cadherins (residues 667-673) and also contains the extended region (residues 674-682) that superimposes well with the central core region of Tcf and forms essential contacts with β-catenin (26, 40) (Fig. 1). It comprises the complete minimal binding region of 23-27 residues that has been identified for Drosophila E-cadherin (41) and covers a similar surface of β-catenin as the Tcf4-CBD. To account for the effects of phosphorylation, every serine residue phosphorylated in the β-catenin·E-cadherin complex plus every glycogen synthase kinase-3β and casein kinase II site (Ser684, Ser686, Ser692, Ser693, Ser697, and Ser699) (42) was mimicked by a glutamate (41). As for Tcf, variants of the minimal CBD of E-cadherin (residues 652-700) with all possible single alanine mutations were synthesized on a membrane, and β-catenin binding was quantified as described above.
The two most essential residues for β-catenin binding in E-cadherin-(652-700) are Asp674 and Phe679 because their mutation reduced the fluorescence signal by >90% (Fig. 5). These amino acids are homologous to and superimposable onto Asp16 and Phe21, respectively, in Tcf4 (24-26) (Fig. 1C). Other crucial amino acids in this region are Leu676 and Leu677 (80% reduction), which correspond to Tcf4 Leu18 and Ile19 in the overlay. The E-cadherin region from Tyr681 to phospho-Ser684 (mimicked by a Glu) appeared to be more relevant for β-catenin binding than the corresponding residues 23-26 of Tcf4. In the C terminus of the E-cadherin peptides, no hot spots were detected. In particular, apart from position 684, the mimicked phosphoserines did not have an appreciable influence on binding. Notably, Leu691 and Leu694, which are in the same structural position as the important Tcf4 leucines 41 and 48, were dispensable for β-catenin binding. In contrast to Tcf4, the E-cadherin region N-terminal to the essential aspartate (Asp674) contained some relevant binding determinants (Leu661, Asp665, Asp667, Pro668, and Tyr673). As observed for the Tcf interaction, we also found residues such as Asn660 and Gly685, whose mutation apparently increased affinity.
In order to determine the equilibrium binding affinities of the shortened version (residues 652-700) versus the long version (residues 628-723) of E-cadherin and quantify the effect of the six phosphoserine mimics on binding, affinities of recombinant peptides for β-catenin were measured using fluorescence polarization titration as described above (Table I). The short peptide had an affinity in the low micromolar range. Upon glutamate substitution of the six relevant serine phosphorylation sites, the affinity increased by a factor of 18, confirming that glutamate was a suitable phosphoserine mimic. The long peptide with the respective glutamates had a Kd of 0.52 nM. This 190-fold difference in affinity, corresponding to a ΔΔG of 3.1 kcal/mol at 20°C, indicated that for E-cadherin, and in contrast to Tcf, one or more residues outside the elongated binding motif are required to achieve sub-nanomolar affinity. As in the case of Tcf4, peptide array data were a monitor of equilibrium affinities of wild-type and mutant E-cadherin peptides (Fig. 4).
β-Catenin Binding Motifs in APC-APC interacts with β-catenin through a region of >1000 amino acids (43,44). This region does not show any recognizable compact domains and contains four conserved 15-amino acid repeats (repeats A-D) and seven 20-amino acid repeats (repeats 1-7). These are generally accepted to constitute the β-catenin binding regions (27), which were, however, examined predominantly in the presence of flanking sequences. The peptide array methodology appeared to be a perfect tool to analyze the binding motifs in the >1000-residue APC fragment to see whether all the 15- and 20-mer repeats interact with β-catenin and to delineate the binding requirements.
For a first screen, we used a peptide array with overlapping 50-mer peptides covering the 15- and 20-mer region of human APC (residues 1000-1049, residues 1010-1059, and so forth, until residues 2050-2099). Repeat A (residues 1021-1035), for example, was thus present in the 50-mer peptides starting from amino acid 1000, 1010, and 1020, all of which showed strong β-catenin binding when probed with bcat-GFP (Fig. 6A). In the 50-mers starting from residue 1110 to 1170, at least one of repeats B, C, or D was present. Surprisingly, we found that the presence of 15- or 20-mer repeats did not necessarily coincide with strong binding. The peptide starting from residue 1170, which contained repeat D, showed no detectable binding, but the peptide starting from residue 1100, which did not contain any of repeats A-D, apparently bound to bcat-GFP. Furthermore, we found 50-mer peptides containing the 20-mer repeats 3 and 5 that bound more weakly to β-catenin than adjacent peptides that did not contain the respective repeat. As an example, whereas the 1450-1499 peptide outside repeat 3 bound, the 1490-1539 peptide containing repeat 3 had no appreciable affinity.
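The tiling scheme used for this screen (50-residue windows shifted by 10 residues) can be illustrated with a small Python sketch. The helper names are hypothetical; the window length, step, first residue, last residue (2099), and the position of repeat A are taken from the text.

```python
def tile_windows(first, last, length, step):
    """Inclusive (start, end) residue ranges of fixed length."""
    windows = []
    start = first
    while start + length - 1 <= last:
        windows.append((start, start + length - 1))
        start += step
    return windows


def windows_containing(windows, region_start, region_end):
    """Windows that fully contain the inclusive region."""
    return [w for w in windows if w[0] <= region_start and region_end <= w[1]]


apc_50mers = tile_windows(1000, 2099, length=50, step=10)
with_repeat_a = windows_containing(apc_50mers, 1021, 1035)
```

Under these settings, the windows that fully contain repeat A are those starting at residues 1000, 1010, and 1020, in line with the description above.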
For a higher resolution of binding sequences, a peptide array with overlapping 15-mer peptides of the region encompassing repeats A-D (residues 1001-1214) was used. The highest fluorescence signal was observed for the peptide 1021-1035 (Fig. 6B), which corresponded to the 15-mer repeat A, whose binding to β-catenin has been analyzed structurally (27). Additional significant peaks were found for 15-mers starting with residue 1135, 1140, 1158, or 1159, which overlapped with but were not identical to repeats B, C, and D.
In order to explain these surprising results and find a common binding principle, we compared the sequences of certified β-catenin binding motifs from Tcf proteins, inhibitor of β-catenin and Tcf, and E-cadherin with the APC sequences and searched for motifs close to repeat 3 (residues 1450-1529) and repeat 5 (residues 1810-1899) that would explain the positive and negative binding data. We indeed found homology between sequences in the repeat 3 and repeat 5 areas and previously established binding sites, which is also present in the 15-mer repeat APC-A (Fig. 6C). Alignment of these sequences allowed us to define a consensus motif D-{K}-[LMPV]-[HILM]-X-[FY]-X(2,7)-E. This pattern does not coincide with the conventional repeat alignment. An invariant Asp is followed by a position with selection against positively charged residues, two hydrophobic residues, and any residue. Following this is a position with either Phe or Tyr and, with a spacing of 2-7 arbitrary residues, an invariant Glu (Fig. 6C). Several β-catenin complex structures and various mutagenesis studies including the present one provide a rationale for this particular binding motif (see below). Fig. 6C also shows that the other 15-mer repeats (repeats B-D) and all the actual 20-mer repeats do not contain the consensus motif, in agreement with the binding data shown in Fig. 6, A and B. However, the consensus binding motif is present in sequences immediately N-terminal to the 20-mer repeats 3 and 5, but not in the others.
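The consensus motif is written in PROSITE-style notation, which maps directly onto a regular expression. The following Python sketch shows one way to scan a sequence for it; the helper names are hypothetical, the converter handles only the pattern elements that occur in this motif, and the two test strings are toy sequences made up for illustration, not sequences from the paper.

```python
import re

CONSENSUS = "D-{K}-[LMPV]-[HILM]-X-[FY]-X(2,7)-E"

_ELEMENT = re.compile(r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Z])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a Python regex.

    Supports literals, [..] sets, {..} exclusions, X wildcards,
    and (n) or (n,m) repeat counts; nothing else.
    """
    parts = []
    for element in pattern.split("-"):
        m = _ELEMENT.fullmatch(element)
        if m is None:
            raise ValueError(f"unsupported element: {element}")
        core, low, high = m.groups()
        if core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # excluded residues
        elif core == "X":
            core = "."  # any residue
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


def find_motifs(sequence, pattern=CONSENSUS):
    """Return (0-based start index, matched text) for each non-overlapping hit."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]


hits = find_motifs("AADELISFKDEAA")      # toy sequence with the motif
no_hits = find_motifs("AADKLISFKDEAA")   # Lys after Asp violates {K}
```

Note that the `{K}` element excludes only lysine; a stricter selection against all positively charged residues would need a wider exclusion set, which is not what the published pattern encodes.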
As before, the sequence requirements for binding of the 15-mer APC repeat A were analyzed by a peptide array alanine scan as described above, and 8 of the 15 amino acids were found to be crucial for binding (Fig. 6D). As expected from sequence alignment (Fig. 6C) and the structural analysis, where the extended regions of the β-catenin binding partners overlapped, Asp1022 (homologous to the essential Asp in Tcf4 and E-cadherin) and Tyr1027 (homologous to Phe) were important binding determinants (Fig. 1C). These residues mark the boundaries of a region in which the conformation of all three peptides bound to β-catenin is virtually identical (24-27) (Fig. 1C). Ile1025 was equally as important as the homologous residues in Tcf4 and E-cadherin, whereas the important Asn1026 had no equally important counterpart. Residues Leu1029, Tyr1031, Asp1033, and Glu1034, which were also apparently required for tight binding, are in a region that allows more conformational freedom for the different β-catenin binding partners. In general, we found that certain residues imperfectly conserved among the 15-amino acid repeats A-D, such as Asp1022 or Ile1025, were much more crucial for the interaction than, for example, the invariant Ser1028, arguing that conservation might not reflect the importance for binding in this case.

DISCUSSION
Here, we used peptide arrays for the first complete alanine mutagenesis scan of β-catenin-binding domains from E-cadherin, Tcf4, and a 15-amino acid repeat of APC to determine crucial determinants (hot spots) for the interaction with β-catenin. The peptides covalently bound to a membrane support were synthesized by automated SPOT synthesis and incubated with a fluorescent β-catenin protein. Furthermore, using this method, minimal binding sequences were defined for APC and Tcf4.
Quantitative peptide array data correlated well with fluorescence polarization equilibrium titrations of the respective peptides in solution, which in turn agreed with ITC. The results obtained showed that peptide arrays are a method well suited to identify β-catenin binding hot spots for different ligands.
The affinity we found for the wild-type Tcf peptide was in the same range but somewhat higher than the previously published values obtained by ITC (29). However, an affinity below the nanomolar level might be difficult to determine accurately by ITC because concentrations high above the Kd have to be used in order to measure reasonable enthalpy changes. An IC50 of 15 nM has been found using an enzyme-linked immunosorbent assay (30) in which GST-Tcf was bound to a solid support, which might not be ideally suited to measure true equilibrium constants. The vital importance of Asp16 of Tcf proteins for interaction with β-catenin has been noted previously (28-31), and the structure has shown that this residue contacts one of the two positively "charged buttons" formed by crucial β-catenin lysines, which are the major anchor site for Tcf (23-25).
For several other residues, including Phe21, conflicting results have been presented. Fasolini et al. (29) determined a Kd of 1.9 nM for the F21A mutant by ITC, an affinity even higher than that of the wild-type peptide. In our hands, a consistent reduction of β-catenin affinity was found for this mutant using peptide arrays, fluorescence polarization, and ITC. Moreover, for the homologous LEF1, an alanine mutation of this phenylalanine, which is invariant in the Tcf family, completely blocks β-catenin interaction in a quantitative yeast two-hybrid assay and abolishes nuclear translocation of β-catenin (28). In the case of Tcf4, a double mutant of Ile19 and Phe21 to alanine consistently reduces reporter gene transcriptional activation (25). In contrast to Poy et al. (25), we also found an important contribution by Leu18, Ile19, and Asp23, by using both peptide arrays and fluorescence polarization. Supporting our data, a lack of β-catenin nuclear translocation for the LEF1 mutant homologous to Asp23 has been detected (28).
FIG. 5. Alanine scan of the β-catenin·E-cadherin interaction. Arrays with all single alanine point mutants of a minimal catenin-binding domain of mouse E-cadherin (residues 652-700) were probed for β-catenin binding as described in the Fig. 2 legend. The sequence is indicated above the bars of the alanine mutants. Potentially phosphorylated serines were mimicked by glutamates (bold letters).
FIG. 6. Peptide array analysis of the β-catenin·APC interaction. Membranes contained peptide fragments of the human APC protein and were probed with bcat-GFP as described previously. A, overlapping 50-mer peptides of the region encompassing all 15-mer (repeats A-D) and 20-mer repeats (repeats 1-7). 50-mers containing the respective repeats are labeled with a horizontal bar (see "Results" for explanation). B, overlapping 15-mer peptides of the region encompassing 15-mer repeats A-D, which are each indicated with an arrow. C, structural alignment of certified β-catenin binding motifs (+) and the binding region adjacent to repeats 3 and 5. These are contrasted with the APC repeat regions that were defined in this study to fail to bind β-catenin (-). D, alanine scan of the β-catenin·APC-A interaction. Peptides with single alanine point mutants were probed with bcat-GFP as described in the Fig. 2 legend.
The importance of the extended region of Tcf family members (residues 16-24 in Tcf4) for interaction with β-catenin has been well described (29-31). This is also reflected in the peptide array and fluorescence polarization measurements reported in this work. The importance of Leu41 and Leu48 in the C-terminal helix of Tcf4 has also been noticed previously (29,31). We found two previously undetected residues, Val49 and Glu51, with significant energetic contribution to the interaction with β-catenin. The slight increase in affinity after mutation of Gly13 to alanine, which has not been described previously, might be due to a reduced conformational flexibility and thus enhanced interaction of the region N-terminal to residue 11, which is invisible in any β-catenin·Tcf4 structure (24,25) but visible in the β-catenin·Xenopus Tcf3 structure (23). Xenopus Tcf3 differs from human Tcf4 in only two positions in the region before Gly13, one of which is a conservative substitution (Glu instead of Asp), and the other is a Ser instead of a Gly in a stretch of multiple glycine residues.
For anticancer drugs targeted against the oncogenic β-catenin·Tcf4 complex, such as the one found recently by high-throughput screening (7), it is mandatory not to affect the binding of β-catenin to either E-cadherin or APC because this could potentially lead to metastasis of tumor cells or, even worse, an inappropriate accumulation of β-catenin in healthy cells, respectively. Our comparative studies on the binding of Tcf4, E-cadherin, and APC showed that although the extended region of Tcf4 comprising the residues from Asp16 to Phe21 is most relevant for affinity, it does not represent the ideal target site. Residues of Tcf4 in this region almost perfectly superimpose with the respective residues of E-cadherin and APC, and our binding studies suggest that they are also of comparable importance for binding. In E-cadherin, Asp674 and Phe679 made the largest contributions to binding, and two intervening residues were also very relevant for binding, as were Asp1022 and Tyr1027 for APC. Although we did identify differences in this region of the three ligands with respect to their binding interactions, it might prove too difficult to produce specific inhibitors targeted to this region of β-catenin.
A more promising region would be one in which there are crucial interactions for Tcf4, but none for APC or E-cadherin. One possibility would be the region of the C-terminal α-helix of Tcf4, in which Leu41 and Leu48 bind in their respective pockets (Fig. 1B) and provide strong binding energy, whereas in E-cadherin the corresponding leucines in the same positions are irrelevant for the interaction. Because mutation of each leucine reduced the affinity by 300-fold and the distance between both sites on β-catenin is 10 Å, the region appears to be suitable for the development of a specific bivalent inhibitor, e.g. by using the structure-activity relationship NMR technique (45). Thus, a simultaneous blockade of both interaction sites should not interfere with E-cadherin binding. Furthermore, due to the larger interaction surface as compared with Tcf4, the E-cadherin interaction might be more resistant to inhibition. Affinity measurements 2 indicated that E-cadherin region 701-723 also contributes significantly to binding. No interactions of the APC 15-amino acid repeat are found in this region. Provided that the analysis of an isolated repeat mirrors the situation with the full-length protein in vivo, the interaction of β-catenin with APC should not be disturbed either.
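To put the 300-fold affinity loss on an energy scale, the standard relation ΔΔG = RT ln(Kd,mutant / Kd,wild type) can be applied. The short sketch below is an illustration only: it assumes that the 300-fold figure is a ratio of equilibrium dissociation constants and that the temperature is 25 °C, which is not stated in this section. The function name is hypothetical. Under these assumptions, each leucine mutation corresponds to a loss of roughly 3.4 kcal/mol.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def ddg_from_fold_change(fold: float, temperature_k: float = 298.15) -> float:
    """Return the binding free energy change (kcal/mol) for a fold loss in affinity.

    Uses ddG = R * T * ln(Kd_mutant / Kd_wildtype).
    The default temperature (25 degrees C) is an assumption, not a value from the paper.
    """
    if fold <= 0:
        raise ValueError("fold change must be positive")
    return R_KCAL * temperature_k * math.log(fold)


if __name__ == "__main__":
    per_leucine = ddg_from_fold_change(300.0)
    print(f"ddG per leucine mutation: {per_leucine:.2f} kcal/mol")
```

The value is temperature dependent only through the RT prefactor, so a different assay temperature would shift it by a few percent at most.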
For E-cadherin, minimal binding regions of about 25 residues have been defined (40,41), which overlap with the important residues identified in this study. We showed that mutation of individual glutamates, which have been demonstrated to be a suitable mimic of phosphoserines in E-cadherin (41), had only a small effect on β-catenin binding. This agrees well with previous data (40,41) showing that mutation of more than one serine of this cluster is necessary to abolish the interaction in vivo. Clustered mutation of several residues in an E-cadherin minimal binding region has been tested for in vivo β-catenin binding (41), and the results agree well with the individual hot spots identified here. We identified some additional hot spots N-terminal to the extended core region; however, the most important β-catenin binding residues are inside the core, namely, the conserved Asp674 and Phe679.
For APC, the correlation between the 15-mer/20-mer repeats and β-catenin binding has not been unequivocal in previous studies. A 27-amino acid fragment containing the 15-mer repeat A interacts with β-catenin (43), and the x-ray structure of the β-catenin·APC-A complex has been solved (27). However, most of the previous studies addressing binding of individual repeats used fragments that contained additional flanking sequences. On the other hand, Rubinfeld et al. (46) found that, in the context of a longer fragment that bound to β-catenin, mutation of several invariant serines inside repeat 2 did not inhibit the interaction, arguing against an important role of repeat 2. In the same study, it was shown that a fragment containing repeat 2 (residues 1342-1476) does not bind at all. The isolated unphosphorylated APC-3 does not bind to β-catenin, suggesting that flanking amino acids are required (47). This is all under the assumption that the 15-mer repeats B-D and the 20-mer repeats 1-7 are unstructured in the absence of β-catenin, as shown for 15-mer repeat A (27) and the catenin-binding regions of E-cadherin (38) and Tcf4 (31). Our results showed that in the unphosphorylated state, β-catenin binding was not mediated by the sequences defined as the 20-mer repeats 1-7 and the 15-mer repeats B-D, but only by repeat A and two motifs in the close vicinity of repeats 3 and 5. These regions align perfectly with validated β-catenin binding partners and allowed us to propose a general β-catenin binding motif D-{K}-[LMPV]-[HILM]-X-[FY]-X(2,7)-E. This canonical motif is imperfect in the APC repeats that failed to bind to β-catenin. Its presence close to repeats 3 and 5 explains the binding seen with longer fragments in previous studies. We would expect a mode of binding similar to those of the 15-mer repeat APC-A and the extended regions of Tcf and E-cadherin.
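The proposed motif is written in PROSITE-style notation, in which {K} means "any residue except Lys", square brackets list the allowed residues, X is any residue, and X(2,7) is a spacer of two to seven residues. As a minimal sketch of how such a pattern could be used to scan a protein sequence, it can be translated into a regular expression. The helper names below are hypothetical, and the test peptides are invented toy sequences for illustration, not sequences taken from this study.

```python
import re

# Consensus motif as stated in the text (PROSITE-style notation).
MOTIF = "D-{K}-[LMPV]-[HILM]-X-[FY]-X(2,7)-E"


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regular expression.

    Handles single residues, [allowed] sets, {excluded} sets, X wildcards,
    and repeat counts of the form (n) or (n,m). Anchors are not supported.
    """
    parts = []
    for element in pattern.split("-"):
        # Separate the residue specification from an optional repeat count.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core in ("X", "x"):
            token = "."                       # any residue
        elif core.startswith("{") and core.endswith("}"):
            token = "[^" + core[1:-1] + "]"   # excluded residues
        else:
            token = core                      # single residue or [set]
        if low is not None:
            token += "{" + low + ("," + high if high else "") + "}"
        parts.append(token)
    return "".join(parts)


def find_motif(sequence: str, pattern: str = MOTIF):
    """Return (start_index, matched_segment) for each non-overlapping match."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group(0)) for m in regex.finditer(sequence.upper())]


if __name__ == "__main__":
    print(prosite_to_regex(MOTIF))
    # Invented toy peptides: one that fits the motif and one with Lys at position 2.
    for peptide in ("AADALIAFAAEAA", "AADKLIAFAAEAA"):
        print(peptide, find_motif(peptide))
```

Because `re.finditer` reports non-overlapping matches and the spacer is matched greedily, a scan of a real sequence with several closely spaced candidate sites would need a lookahead-based search to list every possible alignment.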
APC is phosphorylated in vivo by at least two kinases, glycogen synthase kinase-3β and casein kinase-1ε. β-Catenin binding of an APC fragment covering repeats 2 through 7 is strongly augmented by phosphorylation (46), and mimicking of phosphoserines by aspartates in repeat 3 increases the affinity of an 88-residue fragment (47). It has been proposed that the 20-mer region of APC can bind to two alternative sites on β-catenin and that phosphorylation serves as a switch (27,48). We cannot exclude, however, that phosphorylation would allow binding of the 20-mers that failed to bind in our study with unphosphorylated peptides.
While this work was in preparation, two studies on the structure and biochemistry of 20-mer repeat binding to β-catenin appeared (33,34). They collectively show that the previously identified 15- and 20-mer repeats may not define a consensus binding motif, that some of these do not seem to bind at all, and that motifs outside the repeats are necessary for binding, which is modulated by phosphorylation (33,34).