Structure of the Cadherin-related Neuronal Receptor/Protocadherin-α First Extracellular Cadherin Domain Reveals Diversity across Cadherin Families*

The recent explosion in genome sequencing has revealed the great diversity of the cadherin superfamily. Within the superfamily, protocadherins, which are expressed mainly in the nervous system, constitute the largest subgroup. Nevertheless, the structures of only the classical cadherins are known. Thus, to broaden our understanding of the adhesion repertoire of the cadherin superfamily, we determined the structure of the N-terminal first extracellular cadherin domain of the cadherin-related neuronal receptor/protocadherin-α4. The hydrophobic pocket essential for homophilic adhesiveness in the classical cadherins was not found, and the functional significance of this structural domain was supported by exchanging the first extracellular cadherin domains of protocadherin and classical cadherin. Moreover, potentially crucial variations were observed mainly in the loop regions. These included the protocadherin-specific disulfide-bonded Cys-X5-Cys motif, which showed Ca2+-induced chemical shifts, and the RGD motif, which has been suggested to be involved in heterophilic cell adhesion via the active form of β1 integrin. Our findings reveal that the adhesion repertoire of the cadherin superfamily is far more divergent than would be predicted by studying the classical cadherins alone.

In recent years, the cadherins have emerged as an important superfamily, and their structures and biological functions are proving to be complex. Originally thought of as calcium-dependent cell adhesion molecules, the cadherin superfamily molecules are now known to be involved in many biological processes, including cell recognition, cell signaling, cell communication during embryogenesis, and the formation of neural circuits in the central nervous system (1-3).
However, the classical cadherins account for only a fraction of the cadherin superfamily, which has a multitude of diverse members. Protocadherins are now known to constitute the largest subgroup within the cadherin superfamily (see Fig. 1A) (4,9,11,14). Most protocadherins have a divergent cytoplasmic domain and six or seven EC domains, with low sequence similarities to the EC domains of the classical cadherin group (15). Here, we have focused on one of the major cluster-type protocadherins, the cadherin-related neuronal receptor/protocadherin-α (CNR/Pcdhα) family. The CNR/Pcdhα genome is organized into an unusual gene cluster that is similar to the organization of the immunoglobulin and T-cell receptor genes (see Fig. 1B) (16). The mouse CNR/Pcdhα gene cluster is composed of 14 variable-region exons and a set of three constant-region exons (17,18). Mature CNR/Pcdhα mRNAs are generated from one of these variable-region exons and the three constant-region exons, and their differential and combinatorial expression is observed at the individual neuron level (19-21).
Among the six tandemly repeated EC domains of the CNR/Pcdhα protein, the N-terminal EC1 domain shows several unique features. First, the sequence of the EC1 domain is well conserved among mouse CNR/Pcdhα family members (Fig. 1B) (19). Second, the EC1 domain has a sequence containing the RGD motif, which is known to function in protein-protein interactions. Recently, the RGD motif of the CNR/Pcdhα EC1 domain was shown to be involved in the adhesion activity of the CNR/Pcdhα4 EC1 domain in HEK293T cells, which occurs via β1 integrin (22). Third, the CNR/Pcdhα EC1 domain lacks Trp2, which is highly conserved among the classical (type I), type II, and desmosomal cadherins and is critical for their homophilic binding activity (11). In fact, CNR/Pcdhα appears to possess no homophilic binding activity (22).
These characteristics of the CNR/Pcdhα EC1 domain raise the possibility that the EC1 domains of the cadherin superfamily, which are thought to be essential for adhesion, may be structurally and functionally rather divergent. However, this issue has been difficult to investigate because no structures of cadherin superfamily EC1 domains, except those of classical cadherins, have been determined. Furthermore, current knowledge about the functions of CNR/Pcdhα is still meager compared with what is known about the classical cadherins. Here, to better understand the adhesion repertoire of the cadherin superfamily and to help elucidate the unknown functions of the CNR/Pcdhα family proteins, we determined the solution structure of the CNR/Pcdhα4 N-terminal EC1 domain by NMR. We further investigated the function of the domain on the basis of its determined structure.

EXPERIMENTAL PROCEDURES
NMR Sample Preparation-The DNA encoding the EC1 domain of mouse CNR/Pcdhα4 (Glu1-Asn103) was amplified by PCR from mouse CNR/Pcdhα4 cDNA and subcloned into the pET-11d vector (Novagen). The recombinant protein, which contained an additional methionine at the N terminus, was expressed from the resulting plasmid. The CNR/Pcdhα4 EC1 domain was labeled uniformly with stable 15N and 13C isotopes by culturing the bacteria (strain BL21(DE3)) in M9 minimal medium containing 1.0 g/liter D-[U-13C]glucose and 0.5 g/liter 15NH4Cl as the sole carbon and nitrogen sources, respectively. For expression of the 15N-labeled sample, 1.0 g/liter D-glucose instead of [13C]glucose was added to M9 minimal medium. The cells were incubated at 37°C with shaking, and protein expression was induced by adding 1 mM isopropyl β-D-thiogalactopyranoside and culturing overnight. The protein was purified as described (23). The solvent for the purified protein solution was exchanged with 50 mM Tris-HCl (pH 8.0) containing 80 mM NaCl, 1.5 mM CaCl2, 0.02% (w/v) NaN3, 0.3 mM Pefabloc, and 8% (v/v) D2O by ultrafiltration (Ultra-4, Amicon). The final concentration of the protein was ~0.1 mM. The sample solution was sealed in a 5-mm Shigemi microtube (CMS-005B) with a cylindrical piston. To confirm that the recombinant protein used for the structural analyses was functional, we used it in an assay that detects the cell adhesion activity of the CNR/Pcdhα4 EC1 domain. Our previous study showed that the CNR/Pcdhα4 EC1-Fc fusion protein produced by mammalian cells possesses adhesion activity for parental HEK293T cells (22). In the same assay system, the recombinant CNR/Pcdhα4 EC1 domain also adhered to HEK293T cells (data not shown), indicating that the recombinant protein used in the structural analyses retained its physiologically and functionally active conformation.
NMR Spectroscopy-NMR spectra for the resonance assignments of the main chain and nuclear Overhauser effect (NOE) were acquired at 303 K with a Bruker DRX-800 NMR spectrometer equipped with a triple resonance (1H, 15N, and 13C) cryogenic probe with a z axis gradient coil. Spectra for the side chain assignments were acquired with a DRX-500 spectrometer equipped with a triple resonance probe with a self-shielded triple axis gradient coil.
Resonance Assignments-To assign the 1H, 15N, and 13C resonances, a series of two- and three-dimensional experiments were performed. These were 15N-1H HSQC-wg, 13C-1H HSQC-se, and 1H-1H TOCSY with a mixing time of 57.6 ms; 15N-edited TOCSY-se with a mixing time of 69.0 ms; and HNCO-wg, HN(CA)CO-se, CBCA(CO)NH-se, HNCACB-wg, HBHA(CBCACO)NH-se, C(CO)NH-se, H(CCO)NH-se, and HCCH TOCSY with mixing times of 20.2 ms each, where "wg" refers to the WATERGATE and water flip-back method and "se" refers to the sensitivity enhancement and gradient echo method (24). In the NMR experiments in which amide protons were detected directly, the spectral width of the 15N dimension was set to 14.0 ppm. The 1H carrier was set at the frequency of the residual water resonance (4.7 ppm), and the 15N carrier was set at 121.0 ppm. The acquisition times were 20 and 71 ms for the 15N and 1H dimensions, respectively, independent of the scales of the static magnetic fields applied. The other indirect dimensions were acquired using the TPPI-States method (24). All data were processed with the program NMRPipe (25). The peaks were analyzed with the program Sparky (developed by T. D. Goddard and D. G. Kneller, University of California, San Francisco). The assignment results and accompanying details have been described elsewhere (26) and deposited in the BioMagResBank under accession number 6405. A number of resonances were missing in the 1H-15N HSQC spectra (Gly1, Asn2, Ser3, Glu12, His15, Lys43, Gly46, Glu67, Glu68, Arg72, Ala74, Cys76, and Asp102), probably because of rapid amide proton exchanges with the solvent at pH 8.0 and conformational exchanges (26). Samples prepared at a low pH (6.0-7.0) immediately formed precipitates, which prevented us from detecting amide resonances for these residues. Relaxation analyses for amide 15N spins were performed as described (27,28).
Structural Restraints and Calculations-To obtain interproton distance restraints, two-dimensional NOESY, three-dimensional 15N-edited NOESY, and 13C-edited NOESY with mixing times of 100 ms each were acquired. The NOE connectivities derived from strong, medium, and weak cross-peaks were categorized and assumed to correspond to the upper limits for the interproton distances of 3.0, 4.0, and 5.0 Å, respectively (26). The distance restraints for the hydrogen bonds were applied for the amides that were judged to form a β strand on the basis of the characteristic NOE patterns and the φ and ψ dihedral angles predicted using the software TALOS (29), using 2.8-3.3 Å for nitrogen-oxygen pairs and 1.8-2.3 Å for hydrogen-oxygen pairs, after the root mean square deviations for the overlaid backbone nuclei of calculated structures reached 1.0 Å. The backbone torsion angles were predicted using TALOS (29), with chemical shifts of 13Cα, 13Cβ, 13C′, 1Hα, and 15N. The restraints for the backbone φ and ψ dihedral angles were applied as the mean ± two or three times the S.D. as long as TALOS categorized the corresponding angles as "Good" and "New," respectively.
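The restraint rules described above can be sketched in a few lines of Python. This is only an illustration of the stated bookkeeping, not the scripts used in this study: the function names `noe_upper_limit` and `angle_restraint` and the table names are hypothetical, and the example mean and S.D. values in the usage note are invented for demonstration.

```python
# Upper distance limits (in angstroms) for the three NOE cross-peak classes,
# taken from the text: strong, medium, and weak cross-peaks.
NOE_UPPER_LIMITS = {"strong": 3.0, "medium": 4.0, "weak": 5.0}

# Multiples of the S.D. used for the dihedral angle window, as stated in the
# text: mean +/- 2 S.D. for TALOS class "Good", mean +/- 3 S.D. for "New".
TALOS_SD_FACTOR = {"Good": 2.0, "New": 3.0}


def noe_upper_limit(intensity_class):
    """Return the upper interproton distance limit for an NOE intensity class."""
    return NOE_UPPER_LIMITS[intensity_class.lower()]


def angle_restraint(mean_deg, sd_deg, talos_class):
    """Return (lower, upper) bounds in degrees for a phi or psi restraint.

    Angles whose TALOS class is neither "Good" nor "New" receive no
    restraint, which is signalled here by returning None.
    """
    factor = TALOS_SD_FACTOR.get(talos_class)
    if factor is None:
        return None
    half_width = factor * sd_deg
    return (mean_deg - half_width, mean_deg + half_width)
```

For example, under these assumptions a "Good" prediction of -120° with an S.D. of 10° would yield a window of -140° to -100°, whereas the same prediction classified as "New" would be restrained more loosely, to -150° to -90°.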
The methyl pairs of all 12 leucine and all 11 valine residues were assigned stereospecifically using 15% fractionally 13C-labeled protein (30). The structures were calculated with the software CYANA 2.1 by molecular dynamics in a torsion angle space with 20,000 steps (31). The parameters were set at the default values of CYANA, except for the pseudoatom corrections, which were applied to the upper bound restraints involving methyl, methylene, and aromatic ring protons as described previously (32). The best 30 structures of 100 calculated were analyzed with MOLMOL (33) and AQUA and PROCHECK-NMR (34) software.
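The selection of the final ensemble can be illustrated with a short Python sketch. It assumes that each calculated structure is summarized by a name and its final target function value and that a lower target function indicates better agreement with the restraints; the function name `select_best` and the synthetic values in the usage note are hypothetical and are not output from CYANA.

```python
def select_best(structures, n_best=30):
    """Return the n_best structures with the lowest target function values.

    `structures` is a list of (name, target_function) tuples. The ordering
    among structures with equal target functions follows the input order
    because Python's sort is stable.
    """
    ranked = sorted(structures, key=lambda item: item[1])
    return ranked[:n_best]
```

With 100 synthetic entries such as `[("model_%03d" % i, float(100 - i)) for i in range(100)]`, the call `select_best(entries)` would return the 30 entries with the smallest values, which is the sense in which "the best 30 structures of 100 calculated" is used here.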
Bead Aggregation Assay-Bead aggregation was assayed by a modification of a method described previously (35). In brief, mouse anti-human IgG Fc (Chemicon International) was passively adsorbed to red fluorescent microspheres (0.39-μm beads; Duke Scientific Corp.). The IgG-coupled beads were washed with 20 mM HEPES (pH 7.2) containing 100 mM NaCl and 5% fetal bovine serum and then with 20 mM HEPES (pH 7.2) containing 100 mM NaCl and 0.1% bovine serum albumin (BSA) (Sigma). One microliter of IgG-coupled beads was coated with purified Fc-tagged protein (250 nM in a volume of 100 μl) in 20 mM HEPES (pH 7.2) containing 100 mM NaCl, 1 mM CaCl2, and 0.1% BSA overnight at 4°C with gentle agitation. Following capture of the chimeric Fc proteins, the beads were sonicated for 5 min in an ice-bath cup horn (Misonix) in polystyrene tubes. The beads were then incubated at 25°C with shaking (1400 rpm). At various time points, 3-μl aliquots of the beads were transferred into 600 μl of 20 mM HEPES (pH 7.2) containing 100 mM NaCl, 1 mM CaCl2, and 0.1% BSA, and 20,000 events were analyzed by FACS (EPICS ALTRA, Beckman Coulter). Following the FACS analysis, 4 μl of beads were spotted onto a glass microslide for visualization. To assess the levels of Fc-tagged proteins on the beads, the beads were pelleted after [...] containing 100 mM NaCl and 0.1% BSA; boiled in SDS sample buffer; and analyzed by SDS-PAGE, with subsequent immunoblotting using anti-Fc antibody. All of the protein-coated beads used in the assay were found to contain equivalent levels of purified proteins (see Fig. 5C).
FIGURE 1. EC1 domain sequences of the cadherin superfamily. A, a phylogenetic tree of the EC1 domain of the mouse cadherin superfamily proteins drawn with TreeView (54). Note that the protocadherin family (shown in cyan) distinctly branches from the other subgroups of the cadherin superfamily as the largest cluster. B, schematic representation of the mouse protocadherin family gene clusters, the CNR/Pcdhα gene cluster, mRNA splicing sites, and protein. The conservation levels of the amino acids, which were scored with a PAM120 matrix using ClustalX quality index scores (QI) (55), are shown below the schematic representation of the full-length protein structure. Alignment of the amino acid sequences of the EC1 domains of mouse CNR/Pcdhα members (except c1 and c2) using GENETYX-MAC Version 11.0 is shown with the quality index scores. V, variable exons; C, constant exons; S, signal peptide; TM, transmembrane region; CP, cytoplasmic region.
Cell Adhesion Assay-Microtiter plates (96-well; Nunc) were coated with each protein (200 nM) dissolved in 20 mM HEPES (pH 7.2), 100 mM NaCl, and Hanks' balanced saline solution at 40 μl/well for 1 h at 37°C in a humidified CO2 incubator. The wells were subsequently washed twice with Hanks' balanced saline solution and then blocked with 1% BSA and phosphate-buffered saline at 37°C for 2 h. HEK293T cells were washed twice with phosphate-buffered saline and treated with phosphate-buffered saline containing 2 mM EDTA for 5 min at 37°C. The cells were collected and washed with 20 mM HEPES (pH 7.2) containing 137 mM NaCl, 3 mM KCl, and 0.1% BSA. The pellets were suspended again at 1 × 10⁵ cells/ml. Aliquots of the cell suspension were placed on the test dishes with 1 mM CaCl2 and 1 mM MgCl2. To activate β1 integrin, antibody TS2/16 was added at 10 μg/ml for 15 min before plating. The plated cells were incubated for 20 min and photographed with a camera fitted onto a microscope (Olympus IX70/DP50 system). The attached cells were counted in two independent fields per well.
FIGURE 2. [...] (33). The β strand regions (dark blue and green) are superimposed. The side chains of the aromatic and hydrophobic residues forming the hydrophobic core are shown in magenta, and that of Tyr7, which is conserved among the protocadherin family members, is shown in red. The backbone regions of the RGD (positions 45-47) and Cys-X5-Cys (positions 70-76) sequences are shown in pink and yellow, respectively (Protein Data Bank code 1WUZ). B, ribbon diagram of the representative NMR structure (closest to the mean) of the CNR/Pcdhα4 EC1 domain drawn with the PyMOL molecular graphics system (developed by W. L. DeLano, www.pymol.org). C, Greek key topology of the CNR/Pcdhα4 EC1 domain. The β strands are shown in green (βB, βD, and βE) and blue (βA, βC, βF, and βG), representing the two different β sheets. In B and C, each loop is colored differently for clarity. Conserved calcium-binding residues and the disulfide-bonded loop between Cys70 and Cys76 are shown in red and yellow, respectively.

RESULTS AND DISCUSSION
Overall Structure of the EC1 Domain of CNR/Pcdhα and Comparison with Those of Other Cadherins-The solution structure of the CNR/Pcdhα4 EC1 domain was determined by multidimensional double and triple resonance NMR spectroscopy. The stability of the protein could be maintained only at a low protein concentration of 0.1 mM and a high pH of 8.0, conditions that are generally not suitable for NMR measurements. We overcame this problem by using a high magnetic field NMR spectrometer (800 MHz) with a cryogenic probe. The structure was determined from the distance, torsion angle, and hydrogen bond restraints listed in Table 1. An overlay of the final 30 structures, which exhibited the best target functions of 100 calculated structures, showed that the coordinates of the backbone atoms were well defined (Fig. 2A). The average root mean square deviations for these structures were 0.63 Å for the backbone (0.25 Å when limited to the β strand regions) and 1.19 Å for all heavy atoms (0.81 Å for the β strand regions). The detailed statistics for the best 30 structures are also listed in Table 1.
FIGURE 3. Comparison of the EC1 domain structures of the cadherin superfamily. A, structure-based sequence homology alignment of the CNR/Pcdhα4 EC1 domain with the EC1 domains of classical cadherins whose structures have been determined. Secondary structure elements of CNR/Pcdhα are indicated by green and sky blue arrows, and those of the classical cadherins are indicated by gray arrows. The residues responsible for calcium binding are highlighted in blue; the residues in the adhesion interface of the classical cadherins and the corresponding residues forming a hydrophobic cluster in CNR/Pcdhα4 are shown in red; the RGD motif conserved in the CNR/Pcdhα family is shown in green; and the disulfide-bonded sequence between Cys70 and Cys76 is highlighted in orange. B, ribbon representation of the structures of the CNR/Pcdhα4, C-cadherin (Protein Data Bank code 1L3W), N-cadherin (code 1NCI), and E-cadherin (code 1SUH) EC1 domains (displayed with PyMOL). The structural similarity between CNR/Pcdhα4 and each of the other cadherin proteins was estimated by Dali Server Version 2.0 (56). The pairwise root mean square deviations of the Cα atoms and Z-scores obtained were 3.0 Å and 5.8 for C-cadherin, 3.3 Å and 6.0 for N-cadherin, and 3.0 Å and 6.0 for E-cadherin, respectively. The same coloring is used as described for Fig. 2B.
Structure of the EC1 Domain of CNR/Protocadherin-α
The EC1 domain of CNR/Pcdhα4 has the Greek key topology of a β sandwich-like structure containing two β sheets that are packed face to face (Fig. 2). One sheet is composed of four β strands (βA, βC, βF, and βG), and the other is composed of three β strands (βB, βD, and βE). All strands are arranged antiparallel, except for the parallel pairing between βA and βG. Three loops that are spatially close to the N terminus (loops BC, DE, and FG) and three that are close to the C terminus (loops AB, CD, and EF) connect the corresponding pairs of β strands. The β sandwich scaffold is stabilized by an extensive hydrogen bond network between neighboring β strands and by a hydrophobic core formed by inwardly facing residues from the β sheets (Fig. 2A). The coordinates of the residues forming the hydrophobic core are well defined in the final structures. Of the three aromatic residues (Tyr7, Phe38, and Phe91) contained in the hydrophobic core, Tyr7 is conserved among protocadherin family members.
Despite low sequence similarities between the EC1 domains of CNR/Pcdhα4 and the classical cadherins (30% at the maximum), the overall topology of the CNR/Pcdhα4 EC1 domain is similar to that of classical cadherin domains in that both contain seven β strands, and the corresponding β strands have similar lengths (Fig. 3). Potentially crucial variations between the structures of the EC1 domains of CNR/Pcdhα4 and the classical cadherins are found mainly in the loop regions. Significant differences in the lengths of the loop regions between β strands were demonstrated by aligning the sequences on the basis of the positions of their corresponding β strands (Fig. 3A). First, loop BC of the CNR/Pcdhα4 EC1 domain is longer and contains more hydrophobic residues than that of classical cadherins, whereas loop FG is shorter. In the homophilic interaction between classical cadherins, these two loops form and surround the hydrophobic pocket that accommodates the N-terminal Trp2 residue of the other EC1 domain. Second, loop CD of the CNR/Pcdhα4 EC1 domain is much shorter than loop CD of classical cadherins, which has a quasi-β helix conformation. Instead, loop CD of the CNR/Pcdhα EC1 domain contains the RGD motif, which is known to be a consensus sequence in integrin interactions. Third, a Cys-X5-Cys sequence, which is not found in classical cadherins, is inserted into loop EF of the CNR/Pcdhα EC1 domain. Interestingly, we found that the Cys-X5-Cys sequence forms a disulfide bond between Cys70 and Cys76 in loop EF (Fig. 2, B and C). This bond was strongly predicted by the characteristic chemical shift values of the α- and β-carbons of the associated cysteine residues (36) and was confirmed by mass spectrometric analysis of the CNR/Pcdhα4 EC1 protein (data not shown).
Calcium Binding Property of the CNR/Pcdhα EC1 Domain-As in all classical cadherins, the conserved calcium-binding sites of the CNR/Pcdhα EC1 domain (Glu11, Asp65-Arg-Glu67, and Asp99-X-Asn-Asp-Asn103) are clustered near the C terminus, close to the linker region between the EC1 and EC2 domains (Figs. 2B and 4D). A comparison of the 1H-15N HSQC spectra of the CNR/Pcdhα EC1 domain in the presence and absence of calcium ions in solution (1.5 mM) exhibited calcium-induced chemical shift changes in a subset of cross-peaks, including those of Glu11, Arg66, and Asp99. These sites match well with the conserved calcium-binding regions mentioned above. Calcium-induced chemical shift changes were also seen for other residues in the region surrounding these sites (Fig. 4, A and B).
Interestingly, calcium-induced resonance shifts were also observed for the cross-peaks from the Cys70, Gly71, and Ser73 amides forming the disulfide-bonded loop Cys70-Gly-Arg-Ser-Ala-Glu-Cys76 (Fig. 4B). Moreover, the disulfide-bonded loop is spatially located very close to the clusters of calcium-binding residues near the C terminus (Fig. 4D). As the Cys-X5-Cys sequence is conserved among the cluster-type protocadherin families (Pcdhα, Pcdhβ, and Pcdhγ) (Fig. 4C) and among other non-cluster-type protocadherins (protocadherin-8, Arcadlin), but not in classical cadherins, Cys-X5-Cys may be an additional novel calcium-binding motif unique to the protocadherin family.
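The two sequence motifs discussed here, Cys-X5-Cys and RGD, are simple enough to locate with a pattern search. The following Python sketch shows one way to scan a one-letter amino acid sequence for both; the function names and the toy sequence in the usage note are hypothetical, and only the heptapeptide CGRSAEC corresponds to the loop sequence given in the text. It is not the alignment procedure used for Fig. 4C.

```python
import re

# Cys, any five residues, Cys. The lookahead allows overlapping occurrences
# to be reported, because each match itself consumes no characters.
CYS_X5_CYS = re.compile(r"(?=(C[A-Z]{5}C))")


def find_cys_x5_cys(sequence):
    """Return (start, end, peptide) tuples using 1-based residue positions."""
    seq = sequence.upper()
    return [(m.start() + 1, m.start() + 7, m.group(1))
            for m in CYS_X5_CYS.finditer(seq)]


def find_rgd(sequence):
    """Return the 1-based positions of the Arg residue of every RGD tripeptide."""
    seq = sequence.upper()
    return [i + 1 for i in range(len(seq) - 2) if seq[i:i + 3] == "RGD"]
```

For a made-up sequence such as `"AARGDLLCGRSAECKK"`, `find_rgd` reports position 3 and `find_cys_x5_cys` reports the loop CGRSAEC spanning positions 8 to 14. A pattern match of this kind only identifies candidate motifs; whether the two cysteines are actually disulfide-bonded must be established experimentally, as was done here by chemical shift analysis and mass spectrometry.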
Previous analyses of the crystal structures of the EC1 and EC2 domains of E-cadherin (37,38) and N-cadherin (39) suggested that these calcium-binding residues are involved in calcium-mediated protein-protein interactions. Likewise, the disulfide-bonded loop of the CNR/Pcdhα4 EC1 domain may be affected by calcium binding and eventually participate in interactions with other unknown proteins. Generally, the disulfide-bonded loops of Cys-X5-Cys sequences are involved in the protein-protein interactions of viral glycoproteins. For example, a disulfide-bonded loop in the human immunodeficiency virus-1 envelope glycoprotein gp41 plays a central role in gp41-gp120 association and the Env fusion function (40,41). This example suggests that the Cys-X5-Cys sequence of the CNR/Pcdhα family could function as a novel calcium-dependent adhesion interface.
Homophilic Adhesion Interfaces in the Cadherin Superfamily-In the following analysis, we sought to highlight the structural differences between the adhesion interface of classical cadherins and the corresponding region of protocadherins to better understand the adhesion repertoire of the cadherin superfamily. The CNR/Pcdhα4 EC1 domain does not have Trp2 or a hydrophobic pocket, which is essential for the adhesiveness of the classical cadherins (Fig. 5A). Among the classical, type II, and desmosomal cadherins, the amino acid residues constituting the hydrophobic pocket (Ile24, Tyr36, Ala78, Ala80, and Ile92 in the case of N-cadherin) are almost completely conserved (Figs. 3A and 5A). Although the CNR/Pcdhα4 EC1 domain does have a hydrophobic cluster in the corresponding region, the pocket is not as deep as that in classical cadherins, in which the pocket is large enough to accommodate the side chain of Trp2. The CNR/Pcdhα4 EC1 domain hydrophobic cluster consists of Ile22, Leu26, Leu28, Leu33, Phe38, Val82, Val84, and Phe91 (Fig. 5A). Among these, Ile22, Phe38, Val82, Val84, and Phe91 correspond to the amino acids forming the hydrophobic pocket in N-cadherin, respectively. The short side chains of Ala78 and Ala80 contribute to the formation of the deep hydrophobic space in N-cadherin, whereas the corresponding amino acids in CNR/Pcdhα4 (Val82 and Val84) have bulkier side chains that make the cavity smaller. Moreover, some other bulky amino acid residues located in loop BC (Leu26, Leu28, and Leu33) participate in the hydrophobic cluster, filling the hydrophobic cavity and diminishing the open volume of the pocket. Although the calcium-binding sites are conserved in protocadherins as in classical cadherins, the lack of a hydrophobic pocket suggests that the homophilic adhesion interface that is important for classical cadherins does not exist in protocadherins.
FIGURE 4. [...] A) showing the assignments of the perturbed cross-peaks. Peaks of significantly perturbed resonances are labeled. C, alignments of the Cys-X5-Cys sequences of the mouse protocadherin family, of CNR/Pcdhα4 from various species, and of members of the mouse and zebrafish CNR/Pcdhα families using ClustalX (55). D, space-filling models of the CNR/Pcdhα4 EC1 domain structure viewed from the same direction as in Fig. 3B (left) and from the C-terminal bottom (right) showing the Cys-X5-Cys site in yellow. The residues corresponding to the calcium-binding site in classical cadherins are in red.
NOVEMBER 3, 2006 • VOLUME 281 • NUMBER 44
FIGURE 5. [...] Fig. 3B. Residues that form the hydrophobic pocket that is important for adhesiveness in classical cadherins (N-cadherin) and the corresponding cluster of hydrophobic residues in CNR/Pcdhα are highlighted with their side chains shown in red. B, schematic diagram of the chimeric Fc protein variants. The variants include Ncad-Fc (blue diamond), CNREC1/Ncad-Fc (green triangle), CNR/Pcdhα-Fc (red circle), and Fc (black X). sig, signal peptide; PRO, prodomain; TM, transmembrane region; CP, cytoplasmic region. C, Coomassie Blue-stained gel of the purified Fc proteins used in the binding assays (left panel) and beads coated with chimeric Fc protein variants containing similar levels of proteins as assessed by immunoblotting using anti-Fc antibody following the aggregation assay (right panel). Note that the CNR/Pcdhα-Fc fusion protein exhibited almost the same molecular mass as Ncad-Fc and NcadEC1/CNR-Fc, whereas it should be larger by ~100 amino acids. As the CNR/Pcdhα-Fc fusion protein also showed almost the same molecular mass as another classical cadherin (E-cadherin-Fc) in our previous study (22), it appears that classical cadherins undergo more modification (such as glycosylation) than does CNR/Pcdh.

These structural differences between the EC1 domains of CNR/Pcdhα and N-cadherin led us to speculate that the previously reported difference in homophilic adhesiveness might be attributable to the difference in EC1 domains alone. We previously detected no homophilic binding activity for the CNR/Pcdhα family proteins in adhesion experiments using protein-coated beads, the design and interpretation of which largely followed those made for classical cadherins (22). To compare the adhesion activity of the EC1 domains more specifically, we replaced the EC1 domain of N-cadherin with the EC1 domain of CNR/Pcdhα4 and compared the homophilic binding activities of the mutant ectodomains. In this experiment, the extracellular domains were tagged at their C termini with the Fc region of human IgG (Fig. 5, B and C). Chimeric Fc proteins were captured via their Fc tags on fluorescent beads (0.39 μm) coated with anti-Fc antibody. We used smaller beads (0.39 μm in diameter) than those in our previously reported assay (6.2 μm) (22). Recently, these smaller beads were used to detect subtle differences in the homophilic adhesion activity among members of another diversified neuronal transmembrane family, Down syndrome cell adhesion molecule (35). We examined the ability of these mutant extracellular domains to bind each other by using FACS to monitor bead aggregation over a 120-min time period (Fig. 5D). As reported previously (42), the ectodomain of full-length N-cadherin induced robust aggregation, whereas that of CNR/Pcdhα showed no aggregation. Notably, the domain-exchanged CNREC1/Ncad-Fc-coated beads exhibited no aggregation. Visualization of the beads by fluorescence microscopy at the 120-min time point revealed aggregates of N-cadherin-coated beads, but not of the other beads, which were not coated with proteins possessing the EC1 domain of N-cadherin (Fig. 5E) (42).
As a control experiment, we also exchanged the EC1 domain of CNR/Pcdhα with its counterpart in N-cadherin in the context of CNR/Pcdhα (supplemental Fig. 1). In this experiment, we also removed the EC6 domain of CNR/Pcdhα and constructed NcadEC1/CNR-Fc so that the chimera would have the same number of EC domains as N-cadherin. (N-cadherin has five EC domains, whereas native CNR/Pcdhα has six.) However, we observed no significant difference between the aggregation activities of the beads coated with CNR/Pcdhα-Fc and those coated with NcadEC1/CNR-Fc (supplemental Fig. 1). We were not surprised by this result because recent studies demonstrated that the EC1 domain of classical cadherin alone is not sufficient to mediate adhesion activity. Different studies showed that the EC1 and EC2 domains (in a cell aggregation assay) (43) or EC1-EC3 domains (in a bead aggregation assay) (44) are the minimal elements essential for homophilic adhesion activity, even though the EC1 domains are still mainly required because they provide the essential region for the interaction.
These functional data suggest that the difference between the homophilic adhesion of CNR/Pcdhα4 and that of the classical cadherins is due to the structural differences in the hydrophobic pocket in their EC1 domains. However, we cannot exclude the possibility that structural differences in regions other than the hydrophobic pocket are also partly responsible for the lack of CNR/Pcdhα homophilic interaction.
Our structural study of the CNR/Pcdhα EC1 domain and the functional EC1 domain-exchanging experiment provide molecular evidence for differences in the homophilic binding activities of the protocadherins and classical cadherins. The protocadherins may use a different interface, including characteristic loop regions, to exert their functions.
Heterophilic Adhesion Interface Unique to Protocadherins-Our solution structure of the CNR/Pcdhα4 EC1 domain shows that the RGD motif (positions 45-47) is exposed to solvent as a loop. RGD motifs are often found in extracellular proteins as a consensus sequence for interactions with integrins (Fig. 6A). The tenth type III domain of human fibronectin also contains the RGD sequence in loop FG, which is known to be its functional binding site for integrin (45). As with the CNR/Pcdhα4 EC1 domain, the RGD site in the tenth fibronectin type III domain is exposed to solvent as a loop (Fig. 6A). In addition, we determined that the conformation of the RGD site of the CNR/Pcdhα4 EC1 domain is too flexible to take on a particular secondary structure. This was based on the results of amide 15N spin relaxation analysis by 500 and 600 MHz NMR assuming anisotropic rotational diffusion (27), which showed the RGD site to exhibit relatively low squared generalized order parameters (S² = 0.63 for Arg45), relatively low 1H-15N steady-state NOEs (0.62 for Arg45 and 0.58 for Asp47), and relatively high chemical exchange rates (Rex = 1.63 s⁻¹ for Arg45 and 3.29 s⁻¹ for Asp47). Generally, low S² and NOE values indicate the presence of pico- to nanosecond time-scale motions, and high Rex values indicate the presence of micro- to millisecond time-scale motions. Furthermore, the RGD site exhibited amide peaks with very weak intensities in the 1H-15N HSQC spectra and almost no long-range NOE cross-peaks involving non-labile protons in the two-dimensional NOESY spectra. The solution structure of a mouse fibronectin cell-attachment section consisting of the linked ninth and tenth type III modules, mFnFn3(9,10), also shows a disordered RGD site (46). Therefore, the interaction of RGD sites with integrins may be accompanied by an induced fit of the RGD sites.
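The qualitative reading of the relaxation parameters given above can be expressed as a small Python sketch. The per-residue values for Arg45 and Asp47 in the usage note are those quoted in the text; the three cutoffs, the `Relaxation` container, and the `classify` function are hypothetical choices made for illustration only, because the text describes the values as "relatively" low or high without stating numerical thresholds.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical cutoffs, chosen only for this illustration.
S2_CUTOFF = 0.75   # squared generalized order parameter below this: flexible
NOE_CUTOFF = 0.65  # 1H-15N steady-state NOE below this: flexible
REX_CUTOFF = 1.0   # chemical exchange rate (s^-1) above this: slow exchange


@dataclass
class Relaxation:
    residue: str
    s2: Optional[float]   # None when no value is reported
    noe: Optional[float]
    rex: Optional[float]


def classify(r: Relaxation) -> List[str]:
    """Return the time scales of motion suggested by the parameters."""
    flags = []
    low_s2 = r.s2 is not None and r.s2 < S2_CUTOFF
    low_noe = r.noe is not None and r.noe < NOE_CUTOFF
    if low_s2 or low_noe:
        flags.append("ps-ns")   # pico- to nanosecond motions
    if r.rex is not None and r.rex > REX_CUTOFF:
        flags.append("us-ms")   # micro- to millisecond motions
    return flags
```

With these assumed cutoffs, `classify(Relaxation("Arg45", 0.63, 0.62, 1.63))` and `classify(Relaxation("Asp47", None, 0.58, 3.29))` both report motions on the two time scales, consistent with the description of the RGD site as flexible. The S² entry for Asp47 is left as `None` because no value is quoted for it in the text.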
These characteristics of the CNR/Pcdhα4 EC1 domain RGD site meet the two structural criteria required for the biologically active conformation of an RGD site. These criteria are 1) high surface accessibility of the RGD sequence and 2) placement of the RGD site on a loop or a β turn (59).
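For readers less familiar with the relaxation parameters quoted for the RGD site, the relations below give the textbook Lipari-Szabo model-free picture in its simplest isotropic form. This is only an orienting sketch and not the anisotropic-diffusion analysis used in this study (27); the symbols τm (overall tumbling time), τe (internal correlation time), and R2^0 (exchange-free transverse rate) do not appear in the original text and are introduced here for illustration.

```latex
\begin{align}
J(\omega) &= \frac{2}{5}\left[
      \frac{S^{2}\,\tau_m}{1+\omega^{2}\tau_m^{2}}
    + \frac{\left(1-S^{2}\right)\tau}{1+\omega^{2}\tau^{2}}
  \right],
  \qquad \frac{1}{\tau} = \frac{1}{\tau_m} + \frac{1}{\tau_e}, \\
R_2^{\mathrm{obs}} &= R_2^{0} + R_{ex}.
\end{align}
```

In this picture S2 = 1 corresponds to a rigid N-H bond vector and S2 approaching 0 to unrestricted internal motion, so a value such as 0.63 signals substantial fast motion, while a nonzero Rex adds a slower exchange contribution to the observed transverse relaxation rate.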
We recently showed that the RGD motif of the CNR/Pcdhα4 EC1 domain is functionally involved in cell adhesion via β1 integrin (22). Here, we sought to extend this previous functional insight to include the context of the newly determined protein structure. The adhesion capacities of integrins generally require the transition of their structure from a highly bent conformation, which has only low affinities for biological ligands, to an extended conformation through a switchblade-like opening process. The factors known to induce this activation of integrins include divalent cations such as Mn2+ in the extracellular medium and several specific "stimulatory" monoclonal antibodies that provoke a conformational change upon binding to a particular integrin (47). To determine whether the conformational activation of integrin is required for it to bind the CNR/Pcdhα EC1 domain, we modified our previous cell adhesion assay from one using serum in the cell suspension buffer, which may cause the nonspecific activation of integrins (22), to one using Mn2+ or the β1 integrin stimulatory monoclonal antibody TS2/16 (48) to induce directly the activation of β1 integrins. In the revised assay system, HEK293T cells exhibited no adhesion to wells coated with CNR/Pcdhα4 EC1-Fc protein in the absence of manganese (1 mM MgCl2 and CaCl2) (data not shown). However, the addition of manganese (1 mM MnCl2) caused a dramatic rise in the adhesion activity of the HEK293T cells (Fig. 6, B, panel a; and C). The adhesion efficacy of CNR/Pcdhα4 EC1-Fc was similar to that of CNR/Pcdhα-Fc (supplemental Fig. 2), as reported using the previous assay system. Moreover, the specific activation of β1 integrin by antibody TS2/16 also increased the cell adhesion to CNR/Pcdhα4 EC1-Fc, even in the absence of manganese (1 mM MgCl2 and 1 mM CaCl2). Control IgG induced no adhesion (Fig. 6, B, panels b-d; and C). The activation with antibody TS2/16 was specific because the control Fc protein did not exhibit cell adhesion (Fig. 6, B, panel e; and C). These data indicate that the activation of β1 integrin is at least partly required for the binding of HEK293T cells to CNR/Pcdhα4 EC1-Fc. Although our present and previous data do not exclude the possibility of other sites outside the RGD being involved in the interaction, these structural and functional data suggest that the RGD loop of the CNR/Pcdhα4 EC1 domain acts as at least one of the binding sites for integrins in the active conformation.
FIGURE 7. Evolutionary features of the CNR/Pcdhα EC1 domain structure. A, quality index scores (QI) of multiple alignments of the CNR/Pcdhα family members from various species. Note that, in the zebrafish EC1 domains, the sequences are greatly diversified. B, the degree of EC1 domain sequence conservation for each species mapped using the Consurf server (58). Paralogously conserved residues are mapped on the structures of CNR/Pcdhα4. Highly conserved residues are in red, and diversified residues are in blue.
Evolutionary Features of the Structure of the CNR/Pcdhα Family EC1 Domain-One feature of the CNR/Pcdhα family is its molecular diversity among species (18, 49-53). Although the EC1 domain of CNR/Pcdhα is conserved in terms of its sequence among family members in mouse, rat, chick, and human, the EC1 domain of the zebrafish CNR/Pcdhα differs significantly among family members (Fig. 7A). To assess the significance of this diversity, we mapped the information from multiply aligned sequences for each species onto the CNR/Pcdhα4 EC1 domain structure (Fig. 7B). In every species, most of the sequentially diverse regions are restricted to the surface area of the structure. In particular, the amino acid sequences of the disulfide-bonded loops between Cys70 and Cys76 tend to be diverse. In the case of zebrafish, there is extensive diversity over the entire molecular surface. The conservation of the EC1 domain in mammals, in contrast to the diversification of this domain in zebrafish, may reflect different roles played by the CNR/Pcdhα family in the markedly different brain structures of different vertebrate classes.
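As a rough illustration of how sequence diversity can be scored position by position before it is mapped onto a structure, the sketch below computes the Shannon entropy of each column in a multiple alignment. This is a deliberately simpler stand-in and not the ConSurf procedure used for Fig. 7B (58); the function names and the toy sequences are hypothetical and are not taken from the CNR/Pcdhα alignments.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column.

    0.0 means the column is fully conserved; larger values mean more diversity.
    Gap characters, if present, are counted like any other symbol.
    """
    counts = Counter(column)
    total = len(column)
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def conservation_profile(alignment):
    """Return one entropy value per column of an equal-length alignment."""
    if not alignment:
        raise ValueError("alignment is empty")
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    return [column_entropy([seq[i] for seq in alignment]) for i in range(length)]


# Hypothetical toy alignment (illustration only, not real CNR/Pcdh-alpha data).
toy_alignment = [
    "ACDRGDK",
    "ACDRGDK",
    "ACERGDR",
    "ACQRGDT",
]

profile = conservation_profile(toy_alignment)
for position, entropy in enumerate(profile, start=1):
    print(f"column {position}: {entropy:.2f} bits")
```

Per-column scores of this kind could then be written into a per-residue field of a structure file and colored in a molecular viewer, which is the general idea behind the red (conserved) to blue (diversified) mapping described above.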
Conclusion-We have determined, for the first time, the structure of a protein in the protocadherin family, which accounts for the largest subgroup within the cadherin superfamily. Several characteristic features were revealed in the structure of the EC1 domain for this family. These include (i) the lack of an interface for the homophilic adhesiveness that is typically found in classical cadherins, (ii) the loop region structures as molecular interfaces distinct from those of classical cadherins, and (iii) diversity in the surface structures relating to evolutionary differences.
For years, our understanding of the adhesion repertoire of the cadherin superfamily was derived largely from studies of the classical cadherin subfamily. Our analysis of the much larger protocadherin subfamily extends this understanding, indicating that the adhesion repertoire of this superfamily is greater than previously expected. Moreover, these results provide a framework for increasing our understanding of the functions of the cadherin superfamily. Because the members of the protocadherin family are expressed mainly in the nervous system, their diverse structures and functional repertoire may be specifically required for the construction of the highly organized brain. Our study also paves the way for studies designed to reveal more about the molecular basis of the extraordinary diversity of brains.