Crystal Structure of HAb18G/CD147

CD147, a member of the immunoglobulin superfamily (IgSF), plays fundamental roles in intercellular interactions in numerous pathological and physiological processes. Importantly, our previous studies have demonstrated that HAb18G/CD147 is a novel hepatocellular carcinoma (HCC)-associated antigen, and HAb18G/CD147 stimulates adjacent fibroblasts and HCC cells to produce elevated levels of several matrix metalloproteinases, facilitating invasion and metastasis of HCC cells. In addition, HAb18G/CD147 has also been shown to be a novel universal cancer biomarker for diagnosis and prognostic assessment of a wide range of cancers. However, the structural basis underlying the multifunctional character of CD147 remains unresolved. We report here the crystal structure of the extracellular portion of HAb18G/CD147 at 2.8Å resolution. The structure comprises an N-terminal IgC2 domain and a C-terminal IgI domain, which are connected by a 5-residue flexible linker. This unique C2-I domain organization is distinct from those of other IgSF members. Four homophilic dimers exist in the crystal and adopt C2-C2 and C2-I dimerization rather than V-V dimerization commonly found in other IgSF members. This type of homophilic association thus presents a novel model for homophilic interaction between C2 domains of IgSF members. Moreover, the crystal structure of HAb18G/CD147 provides a good structural explanation for the established multifunction of CD147 mediated by homo/hetero-oligomerizations and should represent a general architecture of other CD147 family members.

CD147 (also known as EMMPRIN (extracellular matrix metalloproteinase inducer), Basigin, M6, neurothelin, and HAb18G) is a type I integral transmembrane glycoprotein that belongs to the immunoglobulin superfamily (IgSF). It plays pivotal roles in intercellular interactions involved in tumor metastasis and angiogenesis (1,2), spermatogenesis and fertilization (3,4), neural network formation and development (5,6), and human immunodeficiency virus infection and rheumatoid arthritis (7,8). CD147 is known to act as a central factor in the stimulation of matrix metalloproteinases (MMPs) required for tumor invasion and metastasis via cell-cell and cell-matrix interactions. Several studies have shown that the expression and functional activities of CD147 are associated with and regulated by other important molecules, including integrins (α3β1 and α6β1) (9), monocarboxylate transporters (MCT1 and MCT4) (10), and cyclophilins (CypA, CypB, and Cyp60) (7,11,12). It has also been reported that CD147 stimulates the production of vascular endothelial growth factor and hyaluronan, leading to angiogenesis and multidrug resistance, respectively (2,13). It is widely accepted that CD147 is a multifunctional transmembrane protein mediating molecular events that are crucial in many biological contexts. However, the molecular mechanisms and structural basis whereby CD147 modulates these numerous phenomena are still unclear.
HAb18G/CD147, a novel hepatocellular carcinoma (HCC)-associated antigen, was cloned in our laboratory by screening the HCC cDNA expression library using anti-HCC mAb HAb18 (14). Over the years, we have conducted comprehensive research on this antigen and its specific antibody, mAb HAb18, both in vitro and in vivo. Our studies have found that HAb18G/CD147 stimulates adjacent fibroblasts and HCC cells to produce elevated levels of several MMPs, facilitating invasion and metastasis of the HCC cells (15). Our studies have also suggested that HAb18G/CD147 may be responsible for the increased MMP secretion, cell invasion, and CypA-mediated cell migration into the joints, thus contributing to the cartilage and bone destruction of rheumatoid arthritis (8). HAb18G/CD147 may also function in facilitating the entry of host cells by [...]. Research has led to a better understanding of the biological functions of CD147, but the structural basis underlying its multifunctional character remains to be elucidated, especially its homo-oligomerization and its heterophilic interactions with other biological molecules such as integrins, cyclophilins, and its mAbs. The lack of a three-dimensional structure of HAb18G/CD147 hampers the design of new antibodies and small molecule regulators against this antigen. Thus, defining the structure of HAb18G/CD147 is of great interest and pharmaceutical importance.
The x-ray crystal structure of the soluble extracellular portion of HAb18G/CD147, solved to 2.8 Å resolution, consists of an N-terminal IgC2 domain and a C-terminal IgI domain, which are connected by a 5-residue flexible linker. Although it has been predicted that CD147 belongs to the IgSF and is highly homologous to the Ig variable (V) domain and major histocompatibility complex class II β chain (17,18), the C2-I domain organization of HAb18G/CD147 revealed by the crystal structure is unique in comparison with all known structures of IgSF members, which usually adopt a V or I set Ig fold at their N termini. This structure provides for the first time an atomic view of an IgSF member with a C2-I domain organization and is likely to represent a general architecture of other CD147 family members. Notably, four copies of the HAb18G/CD147 molecule are present in the asymmetric unit with virtually identical individual domains but divergent interdomain orientations and buried surface areas, indicating that these molecules adopt different conformations due to the flexibility of the linker. Moreover, four different pairwise combinations of HAb18G/CD147 molecules were observed in the crystal. Taken together, the unique domain organization, overall flexibility, and diverse modes of dimerization of HAb18G/CD147 provide structural clues for the multiple oligomerization-dependent functions of this molecule.

EXPERIMENTAL PROCEDURES
Expression and Purification of Extracellular Portion of HAb18G/CD147-A cDNA encoding amino acids 22-205 of HAb18G/CD147 was inserted into pET21a(+) (Novagen) with NdeI and XhoI, and the integrity was confirmed by automated sequencing. This construct was chemically transformed into the Escherichia coli strain and grown in LB, yielding secretion of soluble HAb18GEP. The bacterial pellet was resuspended and sonicated in 20 mM Tris-HCl, pH 8, and centrifuged at 18,000 × g for 30 min. The supernatant was applied directly to a HiTrap Q HP column, followed by a Mono Q ion exchange column (GE Healthcare). A Superdex 75 gel-filtration column (GE Healthcare) was used as the final purification step with 20 mM Tris-HCl, 150 mM NaCl, pH 8. The eluate was concentrated to 20 mg/ml for crystallization.
Crystallization and Data Collection-Tetragonal crystals were obtained by the hanging drop vapor diffusion method in a crystallization solution containing 0.5 M ammonium sulfate, 0.1 M trisodium citrate dihydrate, pH 5.6, and 1.0 M lithium sulfate monohydrate at 4 °C. Crystals were cryoprotected in mother liquor supplemented with 30% of 6 M lithium acetate and flash-cooled in liquid nitrogen for data collection at 100 K. The native diffraction data were collected to 2.8 Å resolution at a wavelength of 1.0000 Å at beamline BL17A of the Photon Factory, Japan, using an ADSC Quantum 270 charge-coupled device detector. The single wavelength anomalous dispersion data were collected to 3.1 Å resolution at a wavelength of 0.9792 Å at beamline NW12 of the Photon Factory using an ADSC Quantum 210 charge-coupled device detector. All data were indexed, integrated, and scaled using the program HKL2000 (19). Crystals belong to space group P4₁2₁2, contain four molecules in the asymmetric unit, and have a solvent content of ~70%. Statistics for data collection are summarized in Table 1.
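A solvent content such as the ~70% quoted above is conventionally estimated from the Matthews coefficient (unit cell volume per dalton of protein). The following is a minimal sketch of that arithmetic, not the authors' own calculation. The unit cell dimensions and molecular mass used in the example are hypothetical placeholders, because the cell parameters are given only in Table 1, which is not reproduced here. The 1.23 Å³/Da factor is the usual protein partial specific volume term in Matthews' estimate.

```python
# Sketch of a Matthews-coefficient solvent estimate.
# Cell dimensions and molecular mass below are HYPOTHETICAL placeholders,
# not values taken from this study.

SYMMETRY_OPERATORS_P41212 = 8   # general positions in space group P4(1)2(1)2
MOLECULES_PER_ASU = 4           # stated in the text


def tetragonal_cell_volume(a, c):
    """Volume in cubic angstroms of a tetragonal cell (a = b, all angles 90 degrees)."""
    return a * a * c


def matthews_coefficient(cell_volume, molecules_per_cell, molecular_mass_da):
    """Matthews coefficient Vm in cubic angstroms per dalton."""
    return cell_volume / (molecules_per_cell * molecular_mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from Vm using the standard 1.23 factor."""
    return 1.0 - 1.23 / vm


if __name__ == "__main__":
    a, c = 120.0, 180.0          # hypothetical cell edges in angstroms
    mass_da = 20000.0            # hypothetical mass for a ~184-residue fragment
    molecules = SYMMETRY_OPERATORS_P41212 * MOLECULES_PER_ASU
    vm = matthews_coefficient(tetragonal_cell_volume(a, c), molecules, mass_da)
    print(f"Vm = {vm:.2f} A^3/Da, solvent fraction = {solvent_fraction(vm):.2f}")
```

With the real cell parameters from Table 1 and the true molecular mass substituted, the same two functions would reproduce the reported estimate.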
Structure Determination and Refinement-The structure of the HAb18G/CD147 extracellular portion was solved using the single wavelength anomalous dispersion method as implemented in SOLVE (20). The single wavelength anomalous dispersion phases were improved by statistical density modification, including solvent flattening and histogram matching, using the program RESOLVE (21), increasing the overall figure-of-merit from 0.30 to 0.62 at 3.1 Å resolution. RESOLVE automatically built a partial polyalanine model of 368 residues. A complete model of a single HAb18G/CD147 molecule was manually built in program O (22). This model was directly used as the search model to determine the initial phases of the 2.8 Å native data by the molecular replacement method implemented in program CNS (23). Structure refinement was carried out with the standard protocols of CNS. At the final step of refinement, TLS parameters generated by using the TLSMD server and restrained refinement were used in program O (24). 95% of the data were used for structure refinement, and 5% of the data were used for monitoring the free R factor. The structure quality was evaluated by using the program PROCHECK (25). The Ramachandran plot indicated that 84.7%, 12.9%, 1.9%, and 0.5% of the backbone torsion angles were in most favored, additionally allowed, generously allowed, and disallowed regions, respectively. The statistics of the structure refinement and the quality of structure models are summarized in Table 1. Secondary structures were assigned by DSSP (26). The graphics were prepared by using PyMOL.
Molecular Dynamics Simulations of HAb18G/CD147 Dimers-The crystal structures of dimers were taken as the starting points for molecular dynamics (MD) simulations. Each MD simulation was carried out using the AMBER suite of programs (version 8.0) with the parm99 force field (27). Each structure was prepared by using the xLeap module in AMBER, in which protons were added to the structure.
All ionizable side chains were maintained in their standard protonation states at pH 7.0. The proteins were solvated in a cubic box of water molecules, with the water layer extending at least 10 Å from the protein surface. To avoid instability during the MD simulations, the solvated system was subjected to minimization for 5000 cycles with the protein restrained, followed by another 5000 cycles with the whole system relaxed. Then, the system was gradually heated from 0 K to 300 K during the first 60 ps in three intervals, followed by equilibration for 80 ps under constant particle number (N), volume (V), and temperature (T) (NVT) conditions. Afterward, the system was switched to constant particle number (N), pressure (P), and temperature (T) (NPT) conditions and equilibrated for 100 ps to adjust the system to the correct density. Finally, the production simulations were carried out in the absence of any restraint under NPT conditions, and the data were collected. This protocol was applied to all of the simulation systems.
All the MD simulations were performed using the parallel version of PMEMD in the AMBER suite. The particle mesh Ewald method was employed to calculate the long range electrostatic interactions, whereas the lengths of the bonds involving hydrogen atoms were fixed with the SHAKE algorithm (28) of Miyamoto and Kollman (29). During the simulations, an integration time step of 2 fs was adopted, and structural snapshots were saved every 500 steps (1 ps). The non-bonded cutoff was set to 10.0 Å, and the non-bonded pair list was updated every 25 steps. Each production simulation was coupled to a 300 K thermal bath at 1.0 atm pressure by applying the Berendsen algorithm (30). The temperature and pressure coupling constants were set to 2.0 and 1.0 ps, respectively.
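The timing figures in the protocol above are related by simple arithmetic, which the following sketch makes explicit. It only restates numbers given in the text (2 fs time step, snapshots every 500 steps, pair list update every 25 steps, and the 60, 80, and 100 ps preparation stages). The function and variable names are illustrative and are not AMBER input keywords.

```python
# Bookkeeping for the MD schedule described in the text.
# Names are illustrative only; they are not AMBER keywords.

TIME_STEP_FS = 2.0               # integration time step
SNAPSHOT_INTERVAL_STEPS = 500    # coordinates saved every 500 steps
PAIR_LIST_UPDATE_STEPS = 25      # non-bonded pair list refresh

# Preparation stages (ps) that precede the production run
PRE_PRODUCTION_STAGES_PS = {
    "heating 0 K to 300 K": 60.0,
    "NVT equilibration": 80.0,
    "NPT equilibration": 100.0,
}


def steps_to_ps(steps, time_step_fs=TIME_STEP_FS):
    """Convert a number of integration steps to picoseconds."""
    return steps * time_step_fs / 1000.0


def ps_to_steps(duration_ps, time_step_fs=TIME_STEP_FS):
    """Convert a duration in picoseconds to integration steps."""
    return round(duration_ps * 1000.0 / time_step_fs)


def snapshots_saved(duration_ps):
    """Number of snapshots written during a run of the given length."""
    return ps_to_steps(duration_ps) // SNAPSHOT_INTERVAL_STEPS


if __name__ == "__main__":
    print("snapshot spacing (ps):", steps_to_ps(SNAPSHOT_INTERVAL_STEPS))
    print("pair list update (ps):", steps_to_ps(PAIR_LIST_UPDATE_STEPS))
    total = sum(PRE_PRODUCTION_STAGES_PS.values())
    print("pre-production time (ps):", total, "=", ps_to_steps(total), "steps")
```

The snapshot spacing works out to 1 ps, matching the value stated in the text, and the three preparation stages together occupy 240 ps before production begins.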
On the basis of the equilibrated dynamic trajectory, the binding free energy of each dimer was calculated by using the MM-GBSA method implemented in the AMBER 8.0 program. Coordinates were taken from the dynamic trajectory every 20 ps (5 snapshots per 100 ps), and the MM-GBSA calculation was performed on each of them. For each snapshot collected during the simulation, the dimer binding free energy ($\Delta G_{\mathrm{binding}}$) was calculated using Equation 1 (31,32),
$$\Delta G_{\mathrm{binding}} = \Delta G_{\mathrm{dimer}} - \left(\Delta G_{\mathrm{monomer1}} + \Delta G_{\mathrm{monomer2}}\right) \tag{1}$$
where $\Delta G_{\mathrm{dimer}}$, $\Delta G_{\mathrm{monomer1}}$, and $\Delta G_{\mathrm{monomer2}}$ are the free energies of the dimer and the two monomers. Each free energy term in Equation 1 was calculated from the absolute free energy of the species (two monomers and their dimer) in the gas phase ($E_{\mathrm{gas}}$), the solvation free energy ($\Delta G_{\mathrm{solvation}}$), and the entropy term ($T\Delta S$) using Equation 2.
$$\Delta G = E_{\mathrm{gas}} + \Delta G_{\mathrm{solvation}} - T\Delta S \tag{2}$$
$E_{\mathrm{gas}}$ is the sum of the internal strain energy ($E_{\mathrm{int}}$), van der Waals energy ($E_{\mathrm{vdw}}$), and electrostatic energy ($E_{\mathrm{electrostatic}}$) (Equation 3).
$$E_{\mathrm{gas}} = E_{\mathrm{int}} + E_{\mathrm{vdw}} + E_{\mathrm{electrostatic}} \tag{3}$$
$E_{\mathrm{int}}$ is the energy associated with vibrations of covalent bonds and bond angles and with rotation of single bond torsional angles (Equation 4).
$$E_{\mathrm{int}} = E_{\mathrm{bond}} + E_{\mathrm{angle}} + E_{\mathrm{torsion}} \tag{4}$$
The solvation free energy, $\Delta G_{\mathrm{solvation}}$, is approximated as the sum of the polar contribution ($G_{\mathrm{GB}}$) and nonpolar contribution ($G_{\mathrm{nonpolar}}$) using a continuum representation of the solvent (Equation 5).
$$\Delta G_{\mathrm{solvation}} = G_{\mathrm{GB}} + G_{\mathrm{nonpolar}} \tag{5}$$
The polar contribution ($G_{\mathrm{GB}}$) to the solvation energy was calculated using the GB model in the sander module. The nonpolar contribution ($G_{\mathrm{nonpolar}}$) was estimated using a simple equation, $G_{\mathrm{nonpolar}} = \gamma \times \mathrm{SASA} + b$ (33). SASA is the solvent-accessible surface area, which was estimated using the MSMS algorithm with a probe radius of 1.4 Å (34). The surface tension proportionality constant $\gamma$ and the free energy of nonpolar solvation for a point solute $b$ were set to 0.00542 kcal/(mol·Å²) and 0.092 kcal/mol, respectively.
The entropy calculation is extremely time-consuming for large systems. In addition, the main aim of calculating the binding free energy is to address the influence of mutations on the binding affinity. In principle, site-directed mutation may not result in dramatic conformational changes in the protein. Accordingly, entropy contributes little to the relative binding free energy changes when the binding strength of one dimer is compared with that of other dimers. Therefore, in this study, $\Delta G_{\mathrm{binding}}$ (without the $-T\Delta S$ term) was estimated to address the relative binding strength of the protein dimers. All the MD simulations and MM-GBSA calculations were performed on an Origin3800 SGI computer.
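The free energy bookkeeping described above (gas-phase energy plus polar and nonpolar solvation terms, with the entropy term omitted, and the dimer value minus the two monomer values) can be sketched in a few lines. This is an illustrative reimplementation of the arithmetic only, not the AMBER MM-GBSA code. The class and function names are hypothetical, the per-species energies would come from the MD snapshots, and summarizing snapshots as mean and sample standard deviation is an assumption about how the reported ± values were obtained.

```python
# Sketch of the MM-GBSA arithmetic described in the text (entropy omitted).
# Class and function names are hypothetical; energies are in kcal/mol.
from dataclasses import dataclass
from statistics import mean, stdev

GAMMA = 0.00542     # kcal/(mol*A^2), surface tension constant from the text
B_OFFSET = 0.092    # kcal/mol, nonpolar term for a point solute from the text


def nonpolar_solvation(sasa):
    """Nonpolar solvation energy from solvent-accessible surface area (A^2)."""
    return GAMMA * sasa + B_OFFSET


@dataclass
class SpeciesEnergy:
    """Energy components of one species (dimer or monomer) for one snapshot."""
    e_int: float            # bond + angle + torsion energy
    e_vdw: float            # van der Waals energy
    e_electrostatic: float  # electrostatic energy
    g_gb: float             # polar (generalized Born) solvation energy
    sasa: float             # solvent-accessible surface area in A^2

    def free_energy_without_entropy(self):
        """Gas-phase energy plus solvation free energy, with no -T*dS term."""
        e_gas = self.e_int + self.e_vdw + self.e_electrostatic
        return e_gas + self.g_gb + nonpolar_solvation(self.sasa)


def binding_free_energy(dimer, monomer1, monomer2):
    """Dimer free energy minus the sum of the two monomer free energies."""
    return dimer.free_energy_without_entropy() - (
        monomer1.free_energy_without_entropy()
        + monomer2.free_energy_without_entropy()
    )


def summarize(per_snapshot_values):
    """Mean and sample standard deviation over snapshots (assumed convention)."""
    return mean(per_snapshot_values), stdev(per_snapshot_values)
```

Because the same omission of the entropy term is applied to every dimer, the resulting values are meaningful mainly as a relative ranking, which is how the text uses them.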

RESULTS
Overall Structure of the Extracellular Portion of HAb18G/CD147-HAb18G/CD147 crystallizes in space group P4₁2₁2 with four monomers in the asymmetric unit (designated A, B, C, and D) (Fig. 1). All residues are traced from the electron density except the first residue at the N terminus and 3-4 residues at the C terminus. The HAb18G/CD147 structure comprises two immunoglobulin-like domains connected by a five-residue linker. This architecture agrees with the previous prediction that CD147 is a member of the IgSF. Structurally, the N-terminal Ig-like domain (D1, residues 22-101) strongly resembles the C2 set fold, whereas the C-terminal one (D2, residues 107-205) shows strong homology to the Ig variable domain (V set). Each domain adopts similar conformations among the four molecules in the crystal. The r.m.s.d. values from superposing the Cα atoms are 0.6-0.7 Å and 0.6-0.9 Å for D1 and D2, respectively. However, the relative orientation between the two domains is highly divergent, because the elbow angle (D1 mass center-hinge residue Pro-104-D2 mass center) varies from 116.4° to 163.0° (Fig. 2B).
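The elbow angle defined above is the angle at the hinge residue between the vectors pointing to the two domain centers. A minimal sketch of that geometry follows. It uses an unweighted centroid as a stand-in for the mass center, which is an assumption (it is exact only if all selected atoms have equal mass, for example Cα atoms only), and the function names are illustrative.

```python
# Sketch of an elbow-angle calculation: angle at the hinge between the
# centers of two domains. Unweighted centroids approximate mass centers.
import math


def centroid(points):
    """Unweighted mean position of a list of (x, y, z) coordinates."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def angle_deg(a, vertex, b):
    """Angle a-vertex-b in degrees."""
    u = [a[i] - vertex[i] for i in range(3)]
    v = [b[i] - vertex[i] for i in range(3)]
    dot = sum(ui * vi for ui, vi in zip(u, v))
    norm_u = math.sqrt(sum(ui * ui for ui in u))
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    # Clamp to guard against rounding slightly outside [-1, 1]
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))


def elbow_angle(d1_coords, hinge_coord, d2_coords):
    """Angle between the D1 center, the hinge atom, and the D2 center."""
    return angle_deg(centroid(d1_coords), hinge_coord, centroid(d2_coords))
```

In practice `d1_coords` and `d2_coords` would hold the atom coordinates of residues 22-101 and 107-205 of one chain and `hinge_coord` a representative atom of Pro-104; a fully extended molecule would give an angle near 180°.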
Four different pairwise combinations of HAb18G/CD147 are detected in the crystal, joined mainly via edge-by-edge packing and association of β-sheets (Fig. 1). Three of these dimers involve molecules within the asymmetric unit (AC, BC, and AD dimers), and one is formed by crystallographic 2-fold symmetry (DD′ dimer) (Fig. 1). All three potential glycosylation sites deduced from the HAb18G/CD147 sequence are distant from the dimer interfaces (Fig. 2A), thereby eliminating the possibility of their interference with dimerization, which is in accordance with the finding that the glycosylation of CD147 is irrelevant to its oligomerization (35). Chain A has the strongest electron density and thus will be used hereafter for analysis unless explicitly noted.
Structures of the Domains-The N-terminal domain (D1) is a typical C2 set immunoglobulin domain consisting of a β-barrel formed by sheets EBA and GFCC′ and a conserved disulfide bond between strands B and F (Fig. 2A). Remarkably, none of the N-terminal Ig domains of the cell surface molecules with known structures has been found to adopt the C2 Ig fold. Structural superpositions suggest that the overall size (~38 Å in longest axis) and strand topology of D1 are similar to those of the C2 domains from other cell surface glycoprotein antigens, including CD2 (~40 Å, PDB entries 1HNF and 1HNG) (36,37), CD4 (~40 Å, PDB entries 1CDH and 1CID) (38,39), and CD58 (~40 Å, PDB entry 1CCZ) (40). However, owing to the reduced size of its GFCC′ sheet, D1 is smaller than the C2 domains from cell adhesion molecules, such as VCAM (~48 Å, PDB entry 1VCA) (41), ICAM-1 (~49 Å, PDB entry 1IAM) (42), and ICAM-2 (~49 Å, PDB entry 1ZXQ) (43). Noticeably, the EBA sheet is more negatively charged than the opposite GFCC′ sheet. The loops flanking the EBA sheet are rich in conserved acidic residues (Fig. 3). Among them, the EF loop is the most negatively charged; in it, a conserved aspartate-rich DXDD motif forms a short 3₁₀ helix located at the tail of D1 and presumably pointing toward the cell membrane. The C-terminal domain (D2) falls into the category of the I set. The β-sheets DEBA and A′GFCC′ pack against each other, and a disulfide bond between Cys-126 and Cys-185 connects strands B and F (Fig. 2A). Because the strands A and C′ are rather short, D2 could also be regarded as a shortened V-like Ig domain, as Vaughn and Bjorkman have suggested (44). D2 (~39 Å in longest axis) is of similar size to the heavy-chain variable domain of Fab NEW (~35 Å in longest axis, PDB entry 7FAB (Fig. 4A)). In Fab, the CC′ and FG loops, also known as β-bulges, which introduce twists to strands C′ and G, are believed to be crucial to antibody V domain dimerization (45).
In comparison, the F-G β-bulge of HAb18G/CD147 is substantially shorter and results in a less twisted A′GFCC′ sheet. Meanwhile, the C-C′ bulge (Thr-143 to Asp-147) adopts a conformation similar to that of the C-C′ loop of Fab, but is rich in conserved acidic residues (DXX(D/E)) (Fig. 4A). In addition, the C-C′ bulge is one of the most flexible regions in D2, because the r.m.s.d. value from superposing this region is as high as 1.57 Å within the asymmetric unit.
Structural superposition of the two domains of HAb18G/CD147 revealed their striking resemblance. Although these two domains belong to distinct Ig domain types and share a sequence identity as low as 21%, their three-dimensional structures are very similar, with an r.m.s.d. value of ~1.68 Å for 68 aligned Cα atoms (Fig. 4B). The intersheet spaces, strand topologies, disulfide bond positions, and even the conformations of some loops (especially the B-C loop, whose sequence is conserved between D1 and D2) are quite constant. These similarities imply that the two Ig domains might have evolved from one common ancestral Ig domain. Their difference, however, in addition to the lack of strands A′ and D in D1, mainly occurs at the (A′)GFCC′ sheet side, where strands F and G are slightly shorter in D1, and the characteristic C-C′ β-bulge of D2 is absent in D1 (Fig. 4B).
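An r.m.s.d. such as the ~1.68 Å quoted above is the root-mean-square distance between paired Cα atoms after the two structures have been superposed. The sketch below computes only that final quantity for coordinates that are assumed to be already superposed and already paired; it does not perform the optimal superposition itself (for which an algorithm such as Kabsch's would be needed), and it is not the program used by the authors.

```python
# Sketch: r.m.s.d. over paired atoms of two ALREADY SUPERPOSED structures.
# The optimal-superposition step is deliberately not included here.
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length coordinate lists."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must pair atoms one-to-one")
    if not coords_a:
        raise ValueError("at least one atom pair is required")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))
```

For the D1 and D2 comparison, each list would hold the 68 aligned Cα positions, so the sum runs over 68 atom pairs.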
The potential N-linked glycosylation site, Asn-44, lies at the end of strand B, i.e. the outermost position of the EBA sheet. The other two glycosylation sites, Asn-152 and Asn-186, are located in the middle of the C′D loop and strand F, respectively, with their side chains protruding in opposite directions from the A′GFCC′ and DEBA planes (Fig. 2A). These evenly distributed glycosylation sites around D2 might enable the molecule to stand upright on the cell surface and fully expose the functional surface of D1 for binding physiological ligands, as observed in ICAM-2 (46).
Linker and Interdomain Geometry-The domain arrangements of the four molecules in the asymmetric unit differ considerably: chains A and B are the most bent, chain C is the most extended, and chain D is in an intermediate state. This diversity of interdomain geometry is mainly caused by the 5-residue flexible linker containing a GPP motif (residues 103-105) that is conserved among all mammalian CD147s (Fig. 3) and even exists in the distantly related KIRR3 protein (Kin of IRRE-like protein, 28% sequence identity with HAb18G/CD147). Because proline and glycine are typically associated with polypeptide turns, this motif is believed to introduce the obvious bend observed between D1 and D2. More detailed inspection reveals that in chains A and B, the GPP motifs form hydrogen bonds (H-bonds) with the Ig domains, whereas these H-bonds are absent in chains C and D. This observation implies that the overall conformational variation among the four monomers may be associated with the making and breaking of H-bonds between the GPP motifs and the two domains. Supposing the last strand of D2 (strand G, which connects to the transmembrane region) is perpendicular to the cell surface, the flexible linker might enable the EBA and GFCC′ sheets of D1 to sweep within a large area and form putative functional surfaces at the top of the molecule for ligand binding (Fig. 2, A and B), as seen in CD2 (36).
Molecular Association in the Crystal-Four different pairwise combinations of CD147 are observed in the crystal, representing possible dimer forms, and designated by their chain names as BC, AC, AD, and DD′ dimers. Interestingly, these dimers further adhere to each other by sharing some conserved β-strands located at either edge of the β-barrels, leading to homo-oligomer formation in the crystal lattice (Figs. 1 and 5).
BC Dimer-Many Ig domain-containing cell surface molecules, such as mouse and human JAM1 (47,48), CD2 (36), CD4 (39), and CD80 (49), form homodimers with a 2-fold axis between pairs of Ig domains. Likewise, in the crystal structure of HAb18G/CD147, the N-terminal domains from chains B and C are related by a non-crystallographic 2-fold axis perpendicular to the GFCC′ sheets, and pack against each other, burying 797.4 Å² of their surface area (Fig. 5A, left). In this dimer, the two chains contact "head-by-head" in an antiparallel manner, mimicking a transcellular homophilic interaction between two HAb18G/CD147 molecules. The overall length of the dimer (110 Å) is comparable to the assumed distance between two opposing cell surfaces (105-110 Å) (47). The dimer interface is composed of almost equivalent residues from chains B and C. At the center of the interface, two patches formed by Val-61, Leu-62, Lys-63, Leu-67, and Phe-74 contact each other through hydrophobic interactions. In addition, the dimer is further stabilized by two H-bonds between C′ strands (Lys-63/B···Lys-63/C) and two salt bridges between E strands (Glu-73/B···Lys-75/C and Lys-75/B···Glu-73/C) (Fig. 5A, right). Consequently, the two GFCC′ sheets assemble into a large eight-strand β-sheet (GFCC′-C′CFG). The formation of the BC dimer is distinct from either the classic V-domain dimerization or the CD2 adhesion interaction (36,50), but resembles the D1-D1 dimerization of ICAM-1 (42), though the D1 domain of ICAM-1 belongs to the I set.
From the sequence alignment of CD147 with neuroplastins (NPTN) and embigins from various species (Fig. 3), we could find two clusters of conserved residues involved in the dimer interface. The first cluster includes Val-61 and Leu-62 of strand C′. Val-61 is conserved among all mammalian CD147s, and Leu-62 is completely conserved even in neuroplastins and embigins. This implies that the hydrophobic interaction between the two C′ strands and flanking loops is essential for the dimerization of HAb18G/CD147. The second cluster comprises Glu-73, Phe-74, and Lys-75. In CD147s from human, bovine, rabbit, and Chinese hamster, as well as in neuroplastins and embigins, the position of Glu-73 is usually occupied by an acidic residue, and the position corresponding to Lys-75 is always occupied by a basic residue, suggesting that salt bridges analogous to those observed in HAb18G/CD147 could probably be formed between E strands in the homodimer. The position of Phe-74 is always occupied by an aromatic residue (phenylalanine or tyrosine) except for the histidine in rabbit CD147. In the BC dimer, Phe-74 interacts hydrophobically with the conserved residue Leu-67 from the opposite monomer. Moreover, the potential N-glycosylation sites (Asn-44) from the two D1 domains are located far away from the dimer interface, further supporting the plausibility of this dimer form, because glycosylation was found to have no effect on CD147 oligomerization (35). The binding free energies for all the dimerizations were calculated by using MD simulations. For the BC dimer, the calculated binding free energy was −20.84 ± 4.74 kcal/mol (supplemental Table S1), which also supports the existence of this dimerization.
AC Dimer-Chains A and C dimerize through their D1 domains with a buried surface area of 1004.6 Å² (Fig. 5B, left). In contrast to the BC dimer, the two EBA sheets of these two monomers constitute a large twisted six-strand β-sheet. The dimer interface is stabilized by H-bonds and hydrophobic interactions contributed by residues conserved among all mammalian CD147s (Gly-24/A, Thr-25/A, Val-26/A, Thr-28/A, Thr-28/C, Val-30/A, and Val-30/C) (Fig. 5B, right). The assumption that the AC dimer might represent another form of transcellular homophilic interaction between two HAb18G/CD147 molecules is supported by the following observations: First, the two D2 domains extend in opposite directions, and the overall length of the dimer is ~103 Å, comparable with the assumed cell-cell distance. Second, the simulated binding free energy for this dimer (−16.18 ± 4.62 kcal/mol) is close to that of the BC dimer (−20.84 ± 4.74 kcal/mol) (supplemental Table S1). Third, the N-glycosylation sites (Asn-44), although located near the dimer interface, point their side chains away from the interface into the bulk solvent and thus are not likely to interfere with dimer formation.
AD Dimer-The D1 domain of chain D dimerizes with the D2 domain of chain A, forming an unusual C2-I interaction interface (Fig. 5C, left). Strand A of D1 from chain D adjoins antiparallel to strand G of D2 from chain A, resulting in a twisted EBA-GFCC′ β-sheet. The contact surface area of this dimer is ~855.5 Å². There are six H-bonds between the two domains (Fig. 5C, right). The binding free energy predicted for this dimerization is −29.69 ± 2.22 kcal/mol (supplemental Table S1), which is the largest in magnitude among all observed dimers. In this case, the three potential N-glycosylation sites (Asn-44, Asn-152, and Asn-186) are all positioned outside the dimer interface. This dimerization model implies that, in addition to homo-oligomerization (e.g. BC and AC dimers), the D1 domain of HAb18G/CD147 might also be involved in heterophilic interaction with other IgSF cell surface molecules that commonly use I/V-like Ig domains for ligand binding.
DD′ Dimer-When inspecting the crystallographic symmetry, we found that the crystallographic 2-fold axis between chain D and its partner D′ is very similar to the non-crystallographic 2-fold axis relating chains B and C. Both axes lie between the two C′ strands of D1 domains and are perpendicular to the GFCC′ sheets (Fig. 5D, left). The overall shape of the DD′ dimer is analogous to that of the BC dimer. In addition, the Lys-63 (D)···Lys-63 (D′) backbone H-bonds and the hydrophobic interaction between the two C′ strands were also observed in the DD′ dimer (Fig. 5D, right). However, the salt bridges between the two E strands were absent in the DD′ dimer, resulting in a comparatively flatter eight-strand GFCC′-GFCC′ β-sheet. Accordingly, the DD′ dimer has a small buried surface area of only 613.4 Å², and the calculated binding free energy for this dimer (−12.13 ± 3.79 kcal/mol (supplemental Table S1)) indicates the weakest intermolecular affinity among all possible dimers.
Therefore, this dimer might be merely a product of lattice packing, but its close resemblance to the BC dimer implies that HAb18G/CD147 may prefer to associate via such 2-fold symmetry.
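The four dimers can be compared side by side using the interface areas and calculated binding free energies quoted in the preceding paragraphs. The sketch below simply collects those reported values and sorts them; the dictionary layout and function names are illustrative, and DD′ is written as DD' in the keys.

```python
# Interface areas (A^2) and MM-GBSA binding free energies (kcal/mol, mean and
# reported +/- value) as quoted in the text for the four dimers.
DIMERS = {
    "BC": {"interface": "C2-C2", "area": 797.4, "dg": -20.84, "dg_err": 4.74},
    "AC": {"interface": "C2-C2", "area": 1004.6, "dg": -16.18, "dg_err": 4.62},
    "AD": {"interface": "C2-I", "area": 855.5, "dg": -29.69, "dg_err": 2.22},
    "DD'": {"interface": "C2-C2", "area": 613.4, "dg": -12.13, "dg_err": 3.79},
}


def rank_by_binding_energy(dimers):
    """Dimer names ordered from most to least favorable binding free energy."""
    return sorted(dimers, key=lambda name: dimers[name]["dg"])


def rank_by_interface_area(dimers):
    """Dimer names ordered from largest to smallest interface area."""
    return sorted(dimers, key=lambda name: dimers[name]["area"], reverse=True)


if __name__ == "__main__":
    print(rank_by_binding_energy(DIMERS))   # ['AD', 'BC', 'AC', "DD'"]
    print(rank_by_interface_area(DIMERS))   # ['AC', 'AD', 'BC', "DD'"]
```

The two orderings agree only on DD′ being last: the AC dimer buries the largest surface, whereas the AD dimer has the most favorable calculated energy, so interface area alone does not predict the computed binding strength.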
Because HAb18G/CD147 forms a homo-oligomer in the crystal structure, we further analyzed its oligomeric state in solution by using native PAGE. As indicated in Fig. 6, the purified extracellular portion of HAb18G/CD147 exists as a mixture of monomer and different forms of oligomer in solution. The intermolecular association observed first in the crystal structure and then confirmed in solution provides a basis for understanding the homo-oligomerization of HAb18G/CD147. Further mutagenesis studies are being performed to verify the interfaces observed in the crystal and probe the oligomerization-function relationship of this important IgSF molecule.

DISCUSSION
The unique C2-I domain arrangement distinguishes HAb18G/CD147 from all IgSF members with known structures. The x-ray crystal structures of many IgSF adhesion molecules, such as CD4, CD2, VCAM-1, ICAM-2, and ICAM-1, have been determined and characterized as having an I-C2 domain organization (42). These cell surface molecules tend to utilize C2 set Ig domains as C-terminal building blocks to project the functional N-terminal domain (I/V-like Ig domain) away from the cell membrane (42). But little has been reported about the function of the C2 domain other than its supportive role. To our knowledge, the present study provides, for the first time, the three-dimensional structure of an IgSF molecule with an IgC2-IgI domain arrangement. Because of the special structural architecture of HAb18G/CD147, the role of the C2 domain has shifted from a "supportive" to a "functional" one, which results in the C2-C2 (BC, AC, and DD′ dimers) and C2-I (AD dimer) dimerization in the crystal rather than V domain dimerization. Because classic V domain dimerization (50) and CD2 homophilic adhesion (36) have been regarded as typical paradigms for IgSF-IgSF adhesive interactions, the C2 domain-mediated dimerization presented in the HAb18G/CD147 crystal structure might provide a novel example adding to the existing IgSF adhesion models. These C2-C2 and C2-I oligomerizations observed in the crystal also offer good structural clues for the established important functions of CD147 mediated by homo/heterophilic adhesive interactions, such as counter-receptor binding, MMP induction, and integrin association.
Four HAb18G/CD147 monomers in the asymmetric unit associate with each other, forming three possible dimers, namely BC, AC, and AD. The four molecules adhere together by sharing monomer C between the BC and AC dimers and by sharing monomer A between the AC and AD dimers. The extensive intermolecular interactions observed in the crystal provide structural evidence for the nature of CD147 as an adhesion molecule of the IgSF. Moreover, we presume that an adhesive "network" could be formed on cell surfaces via tandemly repeated dimeric units of CD147, thereby facilitating its oligomerization-dependent multifunctions. Among these dimers, both the BC and AC dimers pack "head-by-head" via their N-terminal domains, with end-to-end distances comparable to the assumed distance between two opposing cell surfaces. Thus, it is very likely that these dimers represent a trans-cellular homophilic interaction between two HAb18G/CD147 molecules on adjacent cell surfaces. In addition, numerous biological studies have emphasized the critical role of oligomerization in the multiple functions of CD147. Direct evidence has been provided for the presence of homo-oligomers of CD147 in chicken and mouse tissues by chemically cross-linking the membrane fraction (35,51). Yoshida et al. have found that oligomerization of CD147 occurs in a cis-dependent manner on the plasma membrane and that the N-terminal Ig-like domain is essential and sufficient for oligomerization, probably via hydrophobic interactions (35). Sun et al. have shown that CD147 may be a counter-receptor for itself to induce MMP production, and its trans-homophilic adhesion is mediated by the N-terminal domain in either heterotypic (tumor cell-fibroblast) or homotypic (tumor cell-tumor cell) circumstances. It has also been suggested that the cis-multimerization of CD147 on the cell surface could enhance the overall avidity of its trans-interactions (52).
Altogether, these lines of biological evidence strongly suggest that the oligomerization-dependent biological functions of CD147 may require both cis and trans homophilic interactions on the surface of the plasma membrane. In good agreement with this point, both cis and trans dimers were found in the crystal lattice of HAb18G/CD147, and the dimer models we described could provide structural insights into the oligomerization of this molecule and supply a basis for mapping functionally critical regions as potential drug targets. A similar structural analysis of cis and trans oligomerization of IgSF molecules can be found in the crystallographic studies of human and mouse junctional adhesion molecules, in which the researchers deduced that the cis-dimers enhance and probably occur prior to any adhesive trans interaction (47,48).
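The monomer sharing described above for the asymmetric unit (dimers BC, AC, and AD) can be illustrated with a minimal sketch. It treats each dimer as an edge between two chains and reports which monomers take part in more than one dimer and whether all four chains form a single connected assembly. The function names are hypothetical and chosen for illustration only; the sketch encodes nothing beyond the dimer pairs named in the text and makes no use of atomic coordinates.

```python
from collections import defaultdict

# Dimer pairs within the asymmetric unit as named in the text (BC, AC, AD).
DIMERS = [("B", "C"), ("A", "C"), ("A", "D")]


def contact_map(dimers):
    """Build a chain -> set of partner chains map from a list of dimer pairs."""
    partners = defaultdict(set)
    for m1, m2 in dimers:
        partners[m1].add(m2)
        partners[m2].add(m1)
    return partners


def shared_monomers(dimers):
    """Return the monomers that take part in more than one dimer."""
    partners = contact_map(dimers)
    return {m: sorted(p) for m, p in partners.items() if len(p) > 1}


def is_connected(dimers):
    """Check that every monomer is reachable from any other via dimer contacts."""
    partners = contact_map(dimers)
    start = next(iter(partners))
    seen, stack = {start}, [start]
    while stack:
        for nxt in partners[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen == set(partners)


print(shared_monomers(DIMERS))  # {'C': ['A', 'B'], 'A': ['C', 'D']}
print(is_connected(DIMERS))     # True
```

Under these assumptions, monomers C and A are the only shared chains, which corresponds to the linear B-C-A-D arrangement implied by the three dimers.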
In addition to the homophilic interactions, CD147 has been reported to associate with integrins α3β1 and α6β1 also via its N-terminal domain and presumably in a lateral fashion (53). In the present crystal structure, the "head-to-tail" dimer AD could be regarded as a heterophilic-like dimer. This C2-I dimerization provides a possible paradigm for the interactions of CD147 with other IgSF molecules that usually have IgI or IgV as their N-terminal functional domains.
By inspecting the surface electrostatic charge distribution of the N-terminal domain D1 of HAb18G/CD147, we can easily detect that the two edges of this β-barrel are substantially more hydrophobic than the sheet surfaces, implying that these two sides could potentially be involved in hydrophobic interactions. In fact, the conserved strand A on one edge is used in both homophilic (dimer AC) and heterophilic-like (dimer AD) adhesion, indicating that this side might be critical in the ligand recognition of HAb18G/CD147. Additionally, the conserved strands C′ + E on the other edge constitute the interface of dimers BC and DD′, suggesting the indispensable role of this side in oligomerization.
Taken together, the homophilic IgC2-IgC2 and heterophilic-like IgC2-IgI dimerization models in the crystal structure of HAb18G/CD147 may provide a rational structural explanation for its oligomerization feature and further support the previous biological evidence that CD147 is an IgSF adhesion molecule mediating cell-cell and cell-matrix adhesion interactions. Mutagenesis of the key dimer-interface residues and relevant functional assays in cellular systems are being performed.
Besides the unique domain arrangement and unusual mode of oligomerization, another apparent characteristic of the HAb18G/CD147 structure is the flexibility of the linker. The 5-residue linker and relative domain orientation of CD147 resemble those of CD2 but are considerably different from those of human and rat CD4, which have tightly abutting domain junctions (38,39). The interdomain bend introduced by the flexible linker varies remarkably (116-163.0°), similar to the range seen in neural cell adhesion molecule (50-140°) (54). Likewise, the linker of HAb18G/CD147 is believed to provide sufficient internal flexibility for the molecule to interact with different ligands and to satisfy various functional requirements in intercellular events.
Considering the clinical significance of mAb HAb18 against HAb18G/CD147 (16), identifying the antibody epitope will be useful in explaining the excellent efficacy of HAb18 and in providing structural clues for drug discovery (supplemental Figs. S1 and S2). Interestingly, as indicated in supplemental Fig. S1C, the epitope, predicted by molecular docking and confirmed by a mutagenesis study, partially overlaps with the interface of dimer BC. Thus, the antibody might interfere with the oligomerization-dependent activities of HAb18G/CD147 that are critical for tumor metastasis, especially secretion of MMPs. This finding suggests that blocking the oligomerization of the molecule could be a reasonable strategy for developing more effective antibodies and small molecule antagonists against HAb18G/CD147 for the treatment of hepatocellular carcinoma.