Mechanism of dimerization and structural features of human LI-cadherin

Liver intestine (LI)-cadherin is a member of the cadherin superfamily, which encompasses a group of Ca2+-dependent cell-adhesion proteins. The expression of LI-cadherin is observed on various types of cells in the human body, such as normal small intestine and colon cells, and gastric cancer cells. Because its expression is not observed on normal gastric cells, LI-cadherin is a promising target for gastric cancer imaging. However, because the cell adhesion mechanism of LI-cadherin has remained unknown, rational design of therapeutic molecules targeting this cadherin has been hampered. Here, we have studied the homodimerization mechanism of LI-cadherin. We report the crystal structure of the LI-cadherin homodimer containing its first four extracellular cadherin repeats (EC1-4). The EC1-4 homodimer exhibited a unique architecture different from that of other cadherins reported so far, driven by the interactions between EC2 of one protein chain and EC4 of the second protein chain. The crystal structure also revealed that LI-cadherin possesses a noncanonical calcium ion–free linker between the EC2 and EC3 domains. Various biochemical techniques and molecular dynamics simulations were employed to elucidate the mechanism of homodimerization. We also showed that the formation of the homodimer observed in the crystal structure is necessary for LI-cadherin–dependent cell adhesion by performing cell aggregation assays. Taken together, our data provide structural insights necessary to advance the use of LI-cadherin as a target for imaging gastric cancer.

Liver intestine (LI)-cadherin is a member of the cadherin superfamily, which encompasses a group of Ca 2+ -dependent cell-adhesion proteins. The expression of LI-cadherin is observed on various types of cells in the human body, such as normal small intestine and colon cells, and gastric cancer cells. Because its expression is not observed on normal gastric cells, LI-cadherin is a promising target for gastric cancer imaging. However, because the cell adhesion mechanism of LI-cadherin has remained unknown, rational design of therapeutic molecules targeting this cadherin has been hampered. Here, we have studied the homodimerization mechanism of LI-cadherin. We report the crystal structure of the LI-cadherin homodimer containing its first four extracellular cadherin repeats (EC1-4). The EC1-4 homodimer exhibited a unique architecture different from that of other cadherins reported so far, driven by the interactions between EC2 of one protein chain and EC4 of the second protein chain. The crystal structure also revealed that LI-cadherin possesses a noncanonical calcium ion-free linker between the EC2 and EC3 domains. Various biochemical techniques and molecular dynamics simulations were employed to elucidate the mechanism of homodimerization. We also showed that the formation of the homodimer observed in the crystal structure is necessary for LI-cadherin-dependent cell adhesion by performing cell aggregation assays. Taken together, our data provide structural insights necessary to advance the use of LI-cadherin as a target for imaging gastric cancer.
Cadherins are a family of glycoproteins responsible for calcium ion-dependent cell adhesion (1). There are more than 100 types of cadherins in humans, and many of them are not only responsible for cell adhesion but also involved in tumorigenesis (2). Human liver intestine-cadherin (LI-cadherin) is a nonclassical cadherin composed of an ectodomain consisting of seven extracellular cadherin (EC) repeats, a single transmembrane domain, and a short cytoplasmic domain (3). Previous studies have reported the expression of LI-cadherin on various types of cells, such as normal intestine cells, intestinal metaplasia, colorectal cancer cells, and lymph node metastatic gastric cancer cells (4,5).
Because human LI-cadherin is expressed on gastric cancer cells but not on normal stomach tissues, LI-cadherin has been proposed as a target for imaging metastatic gastric cancer (6). Previous studies have reported that LI-cadherin works as a calcium ion-dependent cell adhesion molecule as other cadherins do (7). Also it has been shown that trans-dimerization of LI-cadherin is necessary for water transport in normal intestinal cells (8). Sequence alignment of mouse LI-, E-, N-, and P-cadherins (classical cadherins) has revealed significant sequence similarity between EC1-2 of LI-cadherin and EC1-2 of E-, N-, and P-cadherins, as well as between EC3-7 of LIcadherin and EC1-5 of E-, N-, and P-cadherins ( Fig. 1) (9). From the sequence similarity and the proposed absence of calcium ion-binding motifs (10,11) between EC2 and EC3 repeats, there is speculation that LI-cadherin has evolved from the same five-repeat precursor as that of classical cadherins (9).
However, LI-cadherin is different from classical cadherins in several aspects, such as the number of EC repeats and the length and sequence of the cytoplasmic domain. Classical cadherins possess five EC repeats, whereas LI-cadherin displays seven (2). Classical cadherins possess a conserved cytoplasmic domain comprising more than 100 amino acids, whereas that of LI-cadherin is only 20 residues long with little or no sequence homology to that of classical cadherins (7,12).
The characteristics of LI-cadherin at the molecular level, including the homodimerization mechanism, remain unknown. Homodimerization is the fundamental event in cadherin-mediated cell adhesion as shown previously (13,14,15). For example, classical cadherins form a homodimer mediated by the interaction between their two N-terminal EC repeats (EC1-2) (10,14).
In this study, we aimed to characterize LI-cadherin at the molecular level because the molecular description of the target protein may play a significant role for the rational design of therapeutic approaches. We have extensively validated LI-cadherin to identify its homodimer architecture. Here, we report the crystal structure of the homodimer form of human LI-cadherin EC1-4. The crystal structure revealed a dimerization architecture different from that of any other cadherin reported so far. It also showed canonical calcium-binding motifs between EC1 and EC2, and between EC3 and EC4, but not between EC2 and EC3. By performing various biochemical and computational analyses based on this crystal structure, we interpreted the characteristics of the LI-cadherin molecule. In addition, we showed that the formation of the EC1-4 homodimer is necessary for LI-cadherin-dependent cell adhesion through cell aggregation assays. Our study revealed possible architectures of LI-cadherin homodimers at the cell surface and suggested the differential role of the two additional EC repeats at the Nterminus compared with classical cadherins.

Homodimerization propensity of human LI-cadherin
To investigate the homodimerization mechanism of LIcadherin, we expressed the entire ectodomain comprising EC1-7 (Table S1) and analyzed the homodimerization propensity using sedimentation velocity analytical ultracentrifugation (SV-AUC). Although formation of a homodimer was observed, it was not possible to determine its dissociation constant (K D ) because no concentration dependence in the weight-average of the sedimentation coefficient was discerned, suggesting a very slow dissociation rate (Fig. 2). Therefore, to understand the homodimerization mechanism of LI-cadherin in more detail, we prepared truncated versions of LI-cadherin containing various numbers of EC repeats and evaluated their homodimerization potency. The constructs were designed based on the sequence homology between LI-cadherin and classical cadherins. We compared the sequence of human LI-cadherin and human classical cadherins (E-, N-, and P-cadherins) using EMBOSS Needle (16). As it has been pointed out in a previous study (9), EC1-2 and EC3-7 of human LI-cadherin had an approximately 30% sequence homology with EC1-2 and EC1-5 of human classical cadherins, respectively (Fig. 1, Tables S2 and S3).
Notably, Trp239 locates at the N-terminal end of LI-cadherin EC3, and because of that, it has been suggested that this Trp residue might function as an adhesive element equivalent to that of the conserved residue Trp2 of EC1 of classical cadherins, playing a crucial role in the formation of strand swap-dimer (9,10,(17)(18)(19). Considering the degree of sequence homology and that EC1-2 of classical cadherins is the element responsible for homodimerization, we hypothesized that EC1-2 and EC3-4 of LI-cadherin would be responsible for its dimerization. Therefore, we determined the degree of homodimerization of EC1-2 and EC3-4, as well as those of EC1-4 and EC1-5, using SV-AUC (Table S1).
Homodimerization of EC1-2, EC1-4, and EC1-5 was observed, and unlike EC1-7, the weight-average of the sedimentation coefficient increased in a concentration-dependent manner. The K D values determined were 75.0 μM, 39.8 μM, and 22.8 μM, respectively. In contrast, we did not observe dimerization when using EC3-4 despite the sequence similarity with EC1-2 of classical cadherins and the presence of Trp239 in EC3, a residue located at the analogous position to that of the key Trp2 residue in EC1 of classical cadherins (Fig. 2). The solution structure of EC3-4 even at a higher concentration was monomeric as determined by small-angle X-ray scattering (SAXS), supporting the results of SV-AUC (Fig. S1, A-E and Table S4).

Crystal structure analysis of EC1-4 homodimers
To determine the EC repeats responsible for the homodimerization of LI-cadherin, we determined the crystal structure of EC1-4 expressed in mammalian cells at 2.7 Å resolution ( Fig. 3 and Table 1). Each EC repeat was composed of the typical seven β-strands seen in classical cadherins, and three calcium ions bound to each of the linkers connecting EC1 and EC2, and EC3 and EC4 (Fig. 3). We also observed four N-glycans and two N-glycans bound to chains A and B, respectively, as predicted from the amino acid sequence. We could not resolve these N-glycans in their entire length because of their intrinsic flexibility. From the portion resolved, all N-glycans face the side opposite to the dimer interface.
Two unique characteristics were observed in the crystal structure of LI-cadherin EC1-4: (i) the existence of a calcium-free linker between EC2 and EC3 and (ii) an unusual homodimerization architecture not described before for cadherins. A previous study had suggested that LI-cadherin lacks a calcium-binding motif between EC2 and EC3 (9), and our crystal structure has confirmed that hypothesis experimentally. Crystal structures of cadherins displaying a calcium-free linker have been reported previously, and the biological significance of the calcium-free linker has been discussed (20,21).
The EC1-4 region of LI-cadherin was assembled as an antiparallel homodimer in which EC2 of one chain interacts with EC4 of the opposite chain. This architecture is different from that of other cadherins, such as classical cadherins, which exhibit a two-step binding mode (14), and to that of protocadherin γB3, which forms an antiparallel EC1-4 homodimer (22) stabilized by intermolecular interactions in which all the EC repeats participate.
The fact that the affinity of the EC1-5 homodimer is almost twice as high as that of the EC1-4 homodimer suggested the presence of contacts between EC1 and EC5, as can be predicted from the arrangement of EC1 of one chain and EC4 of the other chain in the crystal structure, although this interaction does not seem to be strong. In addition, there was no interaction between EC1-2 of one chain and EC1-2 of the other chain in the crystal structure, suggesting that the architecture of the EC1-2 homodimer detected by SV-AUC should be different from that of the EC1-4 homodimer.

Calcium-free linker
We investigated the calcium-free linker between EC2 and EC3. Classical cadherins generally adopt a crescent-like shape (15,18). However, in LI-cadherin, the arch shape was disrupted at the calcium-free linker region and because of that EC1-4 exhibited unique positioning of EC1-2 with respect to EC3-4.
Generally, three calcium ions bound to the linker between each EC repeat confer rigidity to the structure (11). In fact, a previous study on calcium-free linker of cadherin has shown that the linker showed some flexibility (21). To compare the rigidity of the canonical linker with three calcium ions and the calcium-free linker in LI-cadherin, we performed molecular dynamics (MD) simulations. In addition to the monomeric states, we also used the structure of the EC1-4 homodimer as the initial structure of the simulations. After confirming the convergence of the simulations by calculating RMSD values of Cα atoms (Fig. S2, A and B, see Experimental procedures for the details), we compared the rigidity of the linkers by calculating the RMSD values of Cα atoms of EC1 and EC3, respectively, after superposing those of EC2 alone (Fig. 4, A-C). The EC3 in the monomer conformation exhibited the largest RMSD. The RMSD values of EC3 in the homodimer were significantly smaller than those of EC3 in the monomer form. Dihedral angles consisting of Cα atoms of residues at the edge of each EC repeat also indicated that the EC1-4 monomer bends largely at the Ca 2+ -free linker (Fig. S3, A-C). These results showed that the calcium-free linker between EC2 and EC3 is more flexible than the canonical linker (Movies S1 and S2).
Another unique characteristic in the region surrounding the calcium-free linker was the existence of an α-helix at the bottom of EC2. To the best of our knowledge, this element at the bottom of the EC2 is not found in classical cadherins. The multiple sequence alignment of the EC1-2 of human LI-, E-, N-, and P-cadherin by ClustalW indicated that the insertion of the α-helix-forming residues corresponded to the position immediately preceding the canonical calciumbinding motif DXE in classical cadherins (10) (Fig. S4). The Asp and Glu residues of the DXE motif in LI-cadherin dimers EC1 and EC3 coordinate with calcium ions (Fig. S5, A and B) and were maintained throughout the simulation (Fig. S5, C-J). The α-helix in EC2 might compensate for the Figure 3. Crystal structure of the EC1-4 homodimer. Calcium ions are depicted in magenta. No calcium ions were observed between EC2 and EC3 in either molecule. Four partial N-glycans were modeled in chain A (light green) and two in chain B (cyan) (the amino acid sequence of EC1-4 is given in Table S1). EC, extracellular cadherin. Dimerization mechanism of LI-cadherin absence of calcium by conferring some rigidity to the molecule.

Interaction analysis of EC1-4 homodimers
To validate if LI-cadherin-dependent cell adhesion is mediated by the formation of the homodimer observed in the crystal structure, it was necessary to find a mutant exhibiting reduced dimerization tendency. First, we analyzed the interaction between two EC1-4 molecules in the crystal structure using the PISA server (Table S5) (23). The interaction was largely mediated by EC2 of one chain of LI-cadherin and EC4 of the other chain, engaging in hydrogen bonds and nonpolar contacts (Fig. 5). The dimerization surface area was 1254 Å 2 , and a total of seven hydrogen bonds (distance between heavy atoms < 3.5 Å) were observed. Based on the analysis of these interactions, we conducted site-directed mutagenesis to assess the contribution of each residue to the dimerization of LIcadherin. Eleven residues showing a percentage of buried area greater than 50% or one or more intermolecular hydrogen bonds (distance between heavy atoms <3.5 Å) were individually mutated to Ala (Tables S1 and S5). To quickly identify mutants with weaker homodimerization propensity, sizeexclusion chromatography with multiangle light scattering (SEC-MALS) was used. EC1-4WT (or mutants) at 100 μM were injected in the chromatographic column. Analysis of the molecular weight (MW) showed that the MW of F224A was the smallest among Dimerization mechanism of LI-cadherin all the samples evaluated ( Fig. 6A and Table 2). Analogous observations were made when the protein samples were injected at 50 μM ( Fig. S6A and Table 2). Similarly, the sample with the greatest elution volume among the 12 samples analyzed corresponded to F224A (Fig. 6, B and C and Fig. S6, B and C).
We must note that the samples eluted as a single peak, corresponding to a fast equilibrium between monomers and dimers as reported in a previous study using other cadherins (14). Although the samples were injected at 100 μM, they eluted at 4 μM because SEC will dilute the samples as they advance through the column. Considering that the K D of dimerization of EC1-4WT determined by AUC was 39.8 μM, at a protein concentration of 4 μM, the largest fraction of the eluted sample should be monomer. This explains why the MW of the WT sample was smaller than the MW of the homodimer (99.6 kDa), and why the differences in the MW among the constructs were small. We reasonably assumed that the decrease of MW and the increase of the elution volume indicated a lower fraction of the homodimer in the eluted sample, thus indicating a smaller dimerization tendency caused by the mutations introduced in the protein.
To confirm the disruption of the homodimer by mutation of Phe224, we performed SV-AUC measurement for EC1-4F224A. In agreement with the result of SEC-MALS, it was revealed that this construct does not form homodimer even at the highest concentration examined (120 μM, Fig. 7). Collectively, the mutational study showed that the mutation of Phe224 to Ala abolished homodimerization of EC1-4.

Contribution of Phe224 to dimerization
Although Phe224 does not engage in specific interactions (such as H-bonds) with the partner molecule of LI-cadherin in the crystal structure (Fig. S7, A and B), it buries a significant surface (71.8 Å 2 ), that is, 95% of its total accessible surface, upon dimerization as determined by the PISA server. To understand the role of Phe224 in the dimerization of LI-cadherin, we conducted separate MD simulations of the monomeric forms of EC1-4WT and that of EC1-4F224A. We first calculated the intramolecular distance between Cα atoms of residues 224 and 122. The simulations revealed that Ala224 in the mutant moves away from the strand that contains Asn122, whereas the original Phe224 remains within a closer distance   to Asn122 (Fig. 8A, Fig. S8, Movies S3 and S4). The movement in the mutated residue suggests that the side chain of Phe224 engages in intramolecular interactions, being stabilized inside the pocket. Superposition of EC2 (chain A) in the crystal structure of EC1-4 and EC2 during the simulation of the EC1-4F224A monomer suggests that the large movement of the loop containing Ala224 would cause steric hindrance and would inhibit dimerization (Fig. 8B).  The analysis of the thermal stability using differential scanning calorimetry (DSC) revealed that EC1-4F224A exhibited two unfolding peaks, whereas that of EC1-4WT displayed a single peak (Fig. 8C). These results suggested that a part of the EC1-4F224A molecule was destabilized by the mutation. In combination with the data from MD simulations, we propose that Phe224 contributes to the dimerization of LI-cadherin by restricting the movement of the residues around Phe224 and thus preventing the steric hindrance triggered by the large movement observed in the MD simulations of the alanine mutant. DSC measurements showed that some other mutants exhibited lower thermal stability than the WT protein ( Table 2 and Fig. S9). However, because the value of T M1 of F224A is the lowest among all the mutants examined, and because other mutants displaying lower T M1 than WT did not exhibit a drastic decrease in homodimer affinity like F224A, we conclude that among the residues evaluated by Ala scanning, Phe224 was the most critical for the maintenance of the homodimer structure and thermal stability.

Functional analysis of LI-cadherin on cells
To investigate if disrupting the formation of EC1-4 homodimer influences cell adhesion, we established a CHO cell line expressing full-length LI-cadherin WT or the mutant F224A (including the transmembrane and cytoplasmic domains fused to GFP) that we termed EC1-7GFP and EC1-7F224AGFP, respectively (Table S1 and Fig. S10). We conducted cell aggregation assays and compared the cell adhesion ability of cells expressing each construct and mock cells (nontransfected Flp-In CHO) in the presence of calcium or in the presence of EDTA. The size distribution of cell aggregates was quantified using a micro-flow imaging (MFI) apparatus. EC1-7GFP showed cell aggregation ability in the presence of CaCl 2 . In contrast, EC1-7F224AGFP and mock cells did not show obvious cell aggregates in the presence of CaCl 2 (Fig. 9, A-C). From this result, it was revealed that Phe224 was crucial for LI-cadherin-dependent cell adhesion and the formation of EC1-4 homodimer in a cellular environment.
We next performed cell aggregation assays using CHO cells expressing various constructs of LI-cadherin in which EC repeats were deleted, to elucidate the mechanism of cell adhesion induced by LI-cadherin. LI-cadherin EC1-5 and EC3-7 expressing cells were separately established (EC1-5GFP and EC3-7GFP) (Table S1 and Fig. S10). Importantly, neither EC1-5 nor EC3-7 expressing cells showed cell aggregation ability in the presence of CaCl 2 (Fig. 10).
Because the expression of EC1-5GFP was not conducive to cell aggregation, it is suggested that effective dimerization at the cellular level requires full-length protein. Combined with the observation from SV-AUC for EC1-7 from above, EC6 and/or EC7 should contribute to the slow dissociation of the homodimer. In the absence of EC6-7, it is likely that the dissociation rate of LI-cadherin would increase, thus impairing cell adhesion.
Similarly, expression of EC3-7 on the surface of the cells did not result in cell aggregation. In this case, the observation agreed with the results of SV-AUC and SAXS, showing that EC3-4 does not dimerize. The truncation of EC1-2 from LI-cadherin generates a cadherin similar to classical cadherin in the point of view that it has five EC repeats and that it has a Trp residue at the N-terminus. Together with the crystal structure of the EC1-4 homodimer, which showed that Trp239 was buried in its own hydrophobic pocket and not participating in homodimerization (Fig. S11), the fact that LI-cadherin EC3-7 did not aggregate represent a unique dimerization mechanism in LI-cadherin.

Discussion
This is the first report examining the architecture of LIcadherin EC1-4 homodimer and the flexibility of the Ca 2+free linker in LI-cadherin. The mutational study and the cell aggregation assay showed that LI-cadherin-dependent cell adhesion is mediated by the formation of the dimerization interface between EC2 in one chain and EC4 in the other chain, and the contribution from other EC repeats. Our findings regarding the novel EC1-4 homodimer advances the understanding of LI-cadherin at the molecular level. The EC1-2 homodimer, which was observed by SV-AUC, and the contribution of EC6 and/or EC7 to slow the dissociation rate of the homodimer are also important to understand the mechanism of cell adhesion mediated by LI-cadherin.
The EC1-2 homodimer observed by SV-AUC appeared not to be sufficient to maintain LI-cadherin-dependent cell adhesion. There is a possibility that the weaker dimerization of EC1-2 cannot maintain cell adhesion because of the mobility of the Ca 2+ -free linker between EC2 and EC3. Contrary to the canonical Ca 2+ -bound linker, such as the linker between EC1 and EC2, the linker between EC2 and EC3 in LI-cadherin does not contain Ca 2+ . The lack of Ca 2+ resulted in greater mobility when the EC1-4 homodimer observed by crystal structure (Fig. 3) was not formed. The combination of low dimerization affinity and high mobility likely explains the absence of EC1-2 driven cell adhesion (Fig. S13, A and B). Considering that there are several families of cell adhesion proteins in the human body, we cannot rule out the possibility that the EC1-2 homodimer is formed on cells where cell adhesion is maintained by other cell adhesion proteins. In LI-cadherin-dependent cell adhesion, we assume that the unique architecture of the EC1-4 homodimer was necessary to restrict the movement of the Ca 2+ -free linker and to maintain LI-cadherin-dependent cell adhesion (Fig. S14, A-C).
The noncanonical α-helix in EC2 may also contribute to the unique characteristics of LI-cadherin. A previous study on the Ca 2+ -free linker of protocadherin-15 indicated that a unique 3 10 helix in the middle of the Ca 2+ -free linker is one of the factors conferring mechanical strength to the linker (21). We assume that the α-helix in LI-cadherin EC2 is required to maintain structural rigidity in the absence of the coordination of negatively charged residues to Ca 2+ .
Contribution of EC6 and/or EC7 on homodimerization was also a notable factor. C-cadherin is known to form strandswap dimer, which is mediated by interactions at the N-terminal strand of EC1 (18). However, the bead aggregation assay and laminar flow assay suggested that C-cadherindependent binding activity is maintained through interactions of multiple EC repeats (24). Likewise, our findings that EC1-5GFP and EC3-7GFP do not show cell aggregation ability, whereas EC1-7GFP does, suggested that dimerization of LI-cadherin results from a collective effort from several repeats throughout the protein. According to data from the cell aggregation assays of the mutant EC1-7F224AGFP, it is clear that LI-cadherin-dependent cell adhesion is abolished by the mutation of Phe224, a residue located within the EC2 repeat. This result suggests that EC6 and/or EC7 contribute to cell adhesion after the formation of the EC1-4 homodimer, an idea that is also supported by the observation that EC3-7GFP does not show cell aggregation ability. The mechanism underlying the contribution of EC6 and EC7 repeats to the slow dissociation rate of the EC1-7 homodimer needs to be further investigated.
As for the reasons why EC1-5GFP did not aggregate, there are some alternative explanations. For example, the EC1-4 homodimer observed by X-ray crystallography and detected by SV-AUC cannot be replicated in the short EC1-5 construct in a cellular environment. We hypothesize that the overhang EC1 repeat in the dimer belonging to one cell would collide with the membrane of the opposing cell (steric hindrance) (Fig. S15A). It is also possible that inappropriate orientation of the approaching LI-cadherin molecules would contribute to the inability of EC1-5 to dimerize (Fig. S15B).
Several differences between LI-cadherin and E-cadherin might explain the reason for the unique biological characteristics of LI-cadherin. Both LI-cadherin and E-cadherin are expressed on normal intestine cells; however, their sites of expression are different. LI-cadherin is expressed at the intercellular cleft and is excluded from the adherens junctions (7), where E-cadherin is precisely expressed (25). Although LI-cadherin is not present at the adherens junctions, trans-interaction of LI-cadherin is necessary to maintain water transport through the intercellular cleft of intestine cells (8). Clustering on the cell membrane might also be different. Classical cadherins including E-cadherin are thought to form clusters on the cell membrane to facilitate cell adhesion. The lateral interaction interface of these cadherins was suggested from packing contacts in the lattice of protein crystals (15,18). In contrast to classical cadherins, we did not observe crystal packing contacts that might suggest lateral (cis) interactions in our crystal structure. Indeed, our crystal structure indicates that the few Nglycans present in LI-cadherin are directed toward the opposite side of the homodimer interface, suggesting that the protein chains belonging to the homodimer do not participate in cis-interactions. We speculate that LIcadherin form homodimers with a broad interface to maintain trans-interactions without the need of cis-clusters on the cell membrane.
In summary, our study with LI-cadherin has unveiled novel molecular-level features for the dimerization of a cadherin molecule. We expect that our data will provide fundamental information for the development of diagnostic tools and/or therapeutic solutions targeting LI-cadherin.

Protein sequence
Amino acid sequence of recombinant protein and LI-cadherin-expressing CHO cells are summarized in Table S1.

Expression and purification of recombinant LI-cadherin
All LI-cadherin constructs were expressed using the same method. All constructs were cloned in pcDNA 3.4 vector (Thermo Fisher Scientific). Recombinant protein was expressed using Expi293F Cells (Thermo Fisher Scientific) following manufacturer's protocol. Cells were cultured for 3 days after transfection at 37 C and 8% CO 2 .
Purification method was identical for all the constructs except for EC1-7 (see below). The supernatant was collected and filtered followed by dialysis against a solution composed of 20 mM Tris-HCl, pH 8.0, 300 mM NaCl, and 3 mM CaCl 2 . Immobilized metal affinity chromatography (IMAC) was performed using Ni-NTA Agarose (Qiagen). Protein was eluted by 20 mM Tris-HCl, pH 8.0, 300 mM NaCl, 3 mM CaCl 2 , and 300 mM imidazole. Final purification was performed by SEC using HiLoad 26/600 Superdex 200 pg column (Cytiva) at 4 C equilibrated in buffer A (10 mM Hepes-NaOH at pH 7.5, 150 mM NaCl, and 3 mM CaCl 2 ). Unless otherwise specified, samples were dialyzed against buffer A before analysis and the filtered dialysis buffer was used for assays.
For the purification of EC1-7, dialysis after the collection of the supernatant and IMAC were performed by the same method as other constructs. After the purification with IMAC, the fraction containing the protein was dialyzed against 20 mM Tris-HCl, pH 8.0, 5 mM NaCl, and 3 mM CaCl 2 , and anion-exchange chromatography was performed using a HiTrap Q HP column (1-ml size; Cytiva). The column was Figure 10. Cell adhesion mediated by short constructs. Cell aggregation assay using EC1-5GFP and EC3-7GFP. EC1-7GFP and Flp-In CHO (mock cells) were used as positive and negative controls, respectively. Particles that were 25 μm or larger were considered as cell aggregates. The number of cell aggregates of both EC1-5GFP and EC3-7GFP in the presence or absence of Ca 2+ was determined. Data are corresponded to the mean ± SEM of four measurements. EC, extracellular cadherin.
washed with anion A buffer (20 mM Tris-HCl, pH 8.0, 10 mM NaCl, and 3 mM CaCl 2 ) before the injection of the protein.
The percentage of anion B buffer (20 mM Tris-HCl, pH 8.0, 500 mM NaCl, and 3 mM CaCl 2 ) was increased in the stepwise manner in increments of 12.5% to elute the protein. Elution at an anion B buffer percentage of approximately 25% to 37.5% was collected for final purification. The final purification was performed by injecting the collected fractions onto a HiLoad 26/600 Superdex 200 pg column (Cytiva) at 4 C equilibrated in buffer A.
The collected data were analyzed using continuous c(s) distribution model implemented in program SEDFIT (version 16.2b) (26) fitting for the frictional ratio, meniscus, timeinvariant noise, and radial-invariant noise using a regularization level of 0.68. The sedimentation coefficient ranges of 0 to 15 S were evaluated with a resolution of 150. The partial specific volumes of EC1-7, EC1-2, EC3-4, EC1-4, EC1-5, and EC1-4F224A were calculated based on the amino acid composition of each sample using program SEDNTERP 1.09 (27) and were 0.732 cm 3 /g, 0.730 cm 3 /g, 0.733 cm 3 /g, 0.732 cm 3 /g, and 0.734 cm 3 /g, and 0.732 cm 3 /g, respectively. The buffer density and viscosity were calculated using program SEDNTERP 1.09 as 1.0055 g/cm 3 and 1.025 cP, respectively. Figures of c(s 20, w ) distribution were generated using program GUSSI (version 1.3.2) (28). The weightaverage sedimentation coefficient of each sample was calculated by integrating the range of sedimentation coefficients where peaks with obvious concentration dependence were observed. For the determination of the dissociation constant of monomer-dimer equilibrium, K D , the concentration dependence of the weight-average sedimentation coefficient was fitted to the monomer-dimer self-association model implemented in program SEDPHAT (version 15.2b) (29).

Solution structure analysis using SAXS
All measurements were performed at beamline BL-10C (30) of the Photon Factory. The experimental procedure is described previously (19). Concentrations of EC3-4 was 157 μM. Data were collected using a PILATUS3 2M (Dectris), and image data were processed by SAngler software (31). A wavelength was 1.488 Å with a camera distance 101 cm.
Exposure time was 60 s, and raw data between s values of 0.010 and 0.84 Å −1 were measured. The background scattering intensity of the buffer was subtracted from each measurement. The scattering intensities of four measurements were averaged to produce the scattering curve of EC3-4. Data are placed on an absolute intensity scale. Conversion factor was calculated based on the scattering intensity of water. The calculation of the theoretical curves of SAXS and χ 2 values were performed using FoXS server (32,33).

MD simulations
MD simulations of LI-cadherin were performed using GROMACS 2016.3 (34) with the CHARMM36m force field (35). A whole crystal structure of the EC1-4 homodimer, EC1-4 monomer form, EC1-4F224A monomer form, and EC3-4 monomer form was used as the initial structure of the simulations, respectively. EC1-4 and EC3-4 of chain A was extracted from EC1-4 homodimer crystal structure to generate the EC1-4 monomer form and EC3-4 monomer form, respectively. Sugar chains were removed from the original crystal structure. Missing residues were modeled by MOD-ELLER 9.18 (36). Solvation of the structures was performed with TIP3P water (37) in a rectangular box such that the minimum distance to the edge of the box was 15 Å under periodic boundary conditions through the CHARMM-GUI (38). Addition of N-bound type sugar chains (G0F) and the mutation of Phe224 in EC1-4 monomer to Ala224 were also performed through the CHARMM-GUI (38,39). The protein charge was neutralized with added Na or Cl, and additional ions were added to imitate a salt solution of concentration 0.15 M. Each system was energy-minimized for 5000 steps and equilibrated with the NVT ensemble (298 K) for 1 ns. Further simulations were performed with the NPT ensemble at 298 K. The time step was set to 2 fs throughout the simulations. A cutoff distance of 12 Å was used for Coulomb and van der Waals interactions. Long-range electrostatic interactions were evaluated by means of the particle mesh Ewald method (40). Covalent bonds involving hydrogen atoms were constrained by the LINCS algorithm (41). A snapshot was saved every 10 ps.
All trajectories were analyzed using GROMACS tools. RMSD, dihedral angles, distances between two atoms, and clustering were computed by rms, gangle, distance, and cluster modules, respectively.
The convergence of the trajectories was confirmed by calculating RMSD values of Cα atoms (Figs. S2, A and B and S8). As the molecule showed high flexibility at Ca 2+ -free linker, as for the EC1-4WT monomer, EC1-4F224A monomer, and EC1-4 dimer, RMSD of each EC repeat was calculated individually. Five Cα atoms at the N terminus were excluded from the calculation of RMSD of EC1 as they were disordered. As the RMSD values were stable after running 20 ns of simulations, we did not consider the first 20 ns when we analyzed the trajectories.

Generation of EC3-4_plus
MD simulation of the EC3-4 monomer was performed for 220 ns. The trajectories from 20 to 220 ns were clustered using the 'cluster' tool of GROMACS. The structure which exhibited the smallest average RMSD from all other structures of the largest cluster was termed EC3-4_plus and used for the purpose of comparison with the data in solution (SAXS).

Crystallization of LI-cadherin EC1-4
Purified LI-cadherin EC1-4 was dialyzed against 10 mM Hepes-NaOH at pH 7.5, 30 mM NaCl, and 3 mM CaCl 2 before crystallization. After the dialysis, the protein was concentrated to 314 μM. Optimal condition for crystallization was screened using an Oryx8 instrument (Douglas Instruments) using commercial screening kits (Hampton Research). The crystal used for data collection was obtained in a crystallization solution containing 200 mM sodium sulfate decahydrate and 20% w/v PEG 3350 at 20 C. Suitable crystals were harvested, briefly incubated in the mother liquor supplemented with 20% glycerol, and transferred to liquid nitrogen for storage until data collection.

Data collection and refinement
Diffraction data from single crystals of EC1-4 were collected in beamline BL-5A at the Photon Factory under cryogenic conditions (100 K). Diffraction images were processed with the program MOSFLM and merged and scaled with the program SCALA (42) of the CCP4 suite (43). The structure of EC1-4 was determined by the method of molecular replacement using the coordinates of P-cadherin (PDB entry code 4ZMY) (44) and LI-cadherin EC1-2 (PDB entry code 7EV1) with the program PHASER (45). The model thus obtained was further refined with the programs REFMAC5 (46) and extensively built with COOT (47). Validation was carried out with PRO-CHECK (48). Data collection and structure refinement statistics are given in Table 1. UCSF Chimera was used to render all of the molecular graphics (49).

Site-directed mutagenesis
Site-directed mutagenesis was performed as described previously (50).

SEC-MALS
The MW of LI-cadherin was determined using Superose 12 10/300 GL column (Cytiva) with inline DAWN8+ MALS (Wyatt Technology), UV detector (Shimadzu), and refractive index detector (Shodex). Protein samples (45 μl) were injected at 100 μM or 50 μM. Analysis was performed using ASTRA software (Wyatt Technology). The concentration at the end of the chromatographic column was measured based on the UV absorbance. The protein conjugate method was used for the analysis as sugar chains were bound to LI-cadherin. All detectors were calibrated using bovine serum albumin (Sigma-Aldrich).

Comparison of thermal stability by DSC
DSC measurement was performed using MicroCal VP-Capillary DSC (Malvern). The measurement was performed from 10 to 100 C at the scan rate of 1 C min −1 . Data were analyzed using Origin7 software.

Establishment of CHO cells expressing LI-cadherin
The DNA sequence of monomeric GFP was fused at the C-terminus of all human LI-cadherin constructs of which stable cell lines were established and was cloned in the pcDNA5/FRT vector (Thermo Fisher Scientific). CHO cells stably expressing LI-cadherin-GFP were established using Flp-In-CHO cell line following the manufacturer's protocol (Thermo Fisher Scientific). Cloning was performed by the limiting dilution-culture method. Cells expressing GFP were selected and cultivated. Observation of the cells was performed by In Cell Analyzer 2000 (Cytiva). The cells were cultivated in Ham's F-12 nutrient mixture (Thermo Fisher Scientific) supplemented with 10% fetal bovine serum (FBS), 1% L-glutamine, or 1% GlutaMAX-I (Thermo Fisher Scientific), 1% penicillin-streptomycin, and 0.5 mg ml −1 Hygromycin B at 37 C and 5.0% CO 2 .

Cell aggregation assay
The cell aggregation assay was performed by modifying the methods described previously (51,52). Cells were detached from the cell culture plate by adding 1× HMF supplemented with 0.01% trypsin and placing on a shaker at 80 rpm for 15 min at 37 C. FBS was added to the final concentration of 20% to stop the trypsinization. Cells were washed with 1× HMF supplemented with 20% FBS once and with 1× HMF twice to remove trypsin. Cells were suspended in 1× HMF at 1 × 10 5 cells ml −1 . Five hundred microliter of the cell suspension was loaded into a 24-well plate coated with 1% w/v bovine serum albumin. EDTA was added if necessary. After incubating the plate at RT for 5 min, the 24-well plate was placed on a shaker at 80 rpm for 60 min at 37 C.

MFI
MFI (Brightwell Technologies) was used to count the particle number and to visualize the cell aggregates after the cell aggregation assay. After the cell aggregation assay described above, the plate was incubated at RT for 10 min and 500 μl of 4% paraformaldehyde phosphate buffer solution (Nacalai Tesque) was loaded to each well. The plate was incubated on ice for more than 20 min. Images of the cells were taken using EVOS XL Core Imaging System (Thermo Fisher Scientific) if necessary. After that, cells were injected to MFI.
MFI View System Software and MFI View Analysis Suite were used for the measurements and analyses. The instrument was flushed with detergent and ultrapure water before the experiments. Cleanliness of the flow channel was checked by performing the measurement using ultrapure water and confirming that less than 100 particles/ml was detected. The flow path was washed with 1× HMF before the measurements of the samples. Purge volume and analyzed volume were 200 μl and 420 μl, respectively. Optimize Illumination was performed before each measurement. Particles within a size of 100 to 1 μm were counted.

Data availability
The coordinates and structure factors of LI-Cadherin EC1-4 have been deposited in the Protein Data Bank with entry code 7CYM. All remaining data are contained within the article.