Two variable regions in carcinoembryonic antigen-related cell adhesion molecule1 N-terminal domains located in or next to monoclonal antibody and adhesion epitopes show evidence of recombination in rat but not in human.

In this paper, we have characterized the structure, evolutionary origin, and function of rat and human carcinoembryonic antigen-related cell adhesion molecule1 (CEACAM1) multifunctional Ig-like cell adhesion proteins that are expressed by many epithelial tissues. Restriction enzyme digestion reverse transcriptase-PCR analysis identified three cDNAs encoding novel CEACAM1 N-domains. Comparative sequence analysis showed that human and rat CEACAM1 N-domains segregated into two groups differing in similarity to rat CEACAM1(a)-4L and human CEACAM1. Sequence variability analysis indicated that both human and rat N-domains possessed two variable regions, and one contained a major adhesive epitope. Recombination analysis showed that the group of rat but not human N-domains with high sequence similarity was derived at least in part by recombination. Binding assays revealed that three monoclonal antibodies with strong reactivity for the CEACAM1(a)-4L N-domain showed no reactivity with CEACAM1(b)-4S, an allele with a different N-domain sequence. CEACAM1(b)-4S displayed adhesive activity efficiently blocked by a synthetic peptide corresponding to the adhesive epitope in CEACAM1(a)-4L. Blocking analysis also showed that the adhesive epitope for rat CEACAM1 was located downstream from the equivalent human and mouse epitopes. Glycosylation analysis demonstrated O-linked sugars on rat CEACAM1(b)-4S from COS-1 cells. However, this was not the alteration responsible for the lack of monoclonal antibody reactivity. When considered together with previous studies, our findings suggest an inverse relationship between functionality and amino acid sequence similarity to CEACAM1. Like IgG, the N-domain of CEACAM1 appears to tolerate 10-15% sequence diversification without loss of function but begins to show either altered specificity or diminished functionality at higher levels.

Carcinoembryonic antigen-related cell adhesion molecule1 (CEACAM1) is a member of a large family of multifunctional Ig-like cell adhesion molecules (CAMs) structurally related to carcinoembryonic antigen (CEA) (2, 3). CEACAM1 from both rodents and humans is composed of an ectodomain with an N-terminal Ig V-like domain (N-domain), three Ig-like C-domains, a single transmembrane domain, and a cytoplasmic (cyto) domain that through differential splicing varies in length from 6 to 71 amino acids (4-6). Multiple genes with unique N-domain sequences and a variety of splice variants have been reported in both rodents and humans (2, 3, 5-12). The major splice variants in rodents and humans have from 2 to 4 Ig-like domains and cyto domains with either 70-71 (L forms) or 9-10 amino acids (S forms) (1-3, 5, 6). In rodents, allelic variants (Ceacam1 a and 1 b ) or separate genes differing in both the nucleotide and amino acid sequence of their N-terminal Ig domains (rats and mice) have also been described (1, 13).
Interest in the role of CEACAM1 in cancer has blossomed since early reports showed that this gene was lost or greatly down-regulated in rodent hepatocellular carcinomas (4-16) and colon carcinomas (13, 17, 18) and that restoration of CEACAM1-4L expression in human prostate (19, 20), mouse colon (21, 22), or human bladder carcinomas (23) produced a marked decrease or loss of tumorigenicity. Relatively recent studies have also shown that the ectodomain is not required for tumor suppression; the CEACAM1 a -4L cyto domain is necessary and sufficient for inhibiting tumorigenicity (24). Cell-cell adhesion, on the other hand, appears to be an activity that is mediated primarily by the N-terminal V-like domain (4, 25-27). Because all of the CEACAM1 alleles and genes have unique N-domain sequences, it follows that these sequence variations may in turn produce changes in conformation that alter the adhesive properties of the N-terminal domain. This possibility is consistent with a recent report by Watt et al. (28) demonstrating that single amino acid mutations could alter monoclonal antibody (mAb) binding and cell-cell adhesion activity.
Over the past 10 years, considerable insight has been gained into the effects that changes in amino acid sequence have on the structure, tumor suppression, or cell-cell adhesion activity of N-domains from human and mouse CEACAM1. Surprisingly, however, there have only been a few studies of this nature for rat CEACAM1. Sippel et al. (25), for example, demonstrated that mutation of Ser-503 abrogated CEACAM1 a -mediated aggregation, whereas Estrera et al. (29) determined that phosphorylation of Ser-503 was essential for tumor suppression. To gain further insight into the characteristics and origin of Ceacam1 N-domains, we have utilized genomic PCR to search the rat genome for additional N-domain encoding sequences. By using this approach, we were successful in identifying three novel N-domains by their resistance to REs that cut the N-domains of Ceacam1 a , -1 b , and Ceacam10. These novel N-domains displayed only small differences in amino acid sequence when compared with Ceacam1 a , -1 b , and Ceacam10 N-domains. Moreover, the sequence differences were clustered into two locations, suggesting the presence of variable (V) regions similar to those seen in N-domains of the PSG family (30). Computer analysis of rat Ceacam1 N-domains further revealed evidence for recombination events that were located within or adjacent to V regions. V regions were also found in the N-domains of human CEACAM1 genes with low sequence diversity, but in contrast to the rat, there was no evidence for recombination.
Diversity without loss of function is a characteristic of variable V domains in antibody genes that is at the heart of the ability of the immune system to respond to mutations in infectious agents (31). N-domains for CEACAM1 family members appear to have retained this characteristic, and in a recent report by Watt et al. (28), there were only a few single site mutations capable of causing a major reduction in mAb binding and/or cell adhesion activity. Although small and naturally occurring changes in CEACAM1 amino acid sequence have not been investigated in any detail, it seemed likely, based on previous studies by Lin and co-workers (4, 32), that these differences would also result in changes in the structural and functional properties of CEACAM1 N-domains. To determine the effects of naturally occurring differences in the amino acid sequences of the six most similar rat N-domains, we examined the reactivity of mAbs with CEACAM1 a -4L and CEACAM1 b -4S N-domains, the two major N-domain variants in the rat. In three different assays, mAbs 362.50 and 9.2, specific for N-domain epitopes in CEACAM1 a , showed no reactivity with CEACAM1 b , suggesting that the differences in amino acid sequence, although small, had a significant impact on epitope presentation. Epitope mapping by using peptide arrays defined two N-terminal epitopes and one C-terminal epitope that was recognized most strongly by mAb 9.2. Because mAb 362.50 reacted with peptide arrays spanning an epitope domain shared by CEACAM1 a -4L and CEACAM1 b -4L, these conformational effects did not appear to emanate directly from differences in primary sequence. Adhesion blocking assays showed that the C′ and G β-strands contained major and minor adhesive domains, respectively, the former of which shared sequence with a peptide that blocked human CEA adhesion (33).
mAb binding assays with peptide arrays spanning the C′ region from CEACAM1 a -4L and CEACAM1 b -4L showed that mAb 362.50 reacted equally well with both C′ peptide arrays despite the differences in primary sequence, thus suggesting that factors in addition to primary sequence were responsible for the lack of reactivity with CEACAM1 b -4L. Taken together with previous studies, our results show that the small differences in the amino acid sequences of the CEACAM1 a and 1 b N-domains produce subtle differences in structure that alter mAb binding and adhesive interactions.

MATERIALS AND METHODS
Peptide Synthesis-The numbering of the peptides corresponds to the published amino acid sequences of CEACAM1 a -4L (5) with residue 1 corresponding to the initiation methionine. Peptides were synthesized and purified by high performance liquid chromatography, as described previously (34). Peptides corresponding to residues 57-72 (
Preparation of Antibodies-The origin and characteristics of pAb 669 (35) and mAbs 362.50 and 5.4 (16, 36) have been described previously. mAb 9.2 (37) was provided by Drs. Werner Reutter and Oliver Baum at the Free University, Berlin, Germany. Fluorescence and electron microscopic analysis of cells labeled in suspension with these mAbs indicated that the reactive epitopes were located in the ectodomain of CEACAM1 a -4L. The preparation and characteristics of rabbit antisera against synthetic peptides corresponding to sequences unique to the cyto71 (pAb C3) and the cyto10 domain of CEACAM1 a -4L (pAb OB2) have been described previously by Lin et al. (6) and Baum et al. (38), respectively. mAb 324.5 recognizing a tumor-associated Ig-like membrane protein, TuAg.1 (39), was used as a negative control for the radioimmunoprecipitation analysis and adhesion blocking experiments.
Indirect Immunofluorescence Analysis-At 24-48 h after transfection with Ceacam1 a -4L or Ceacam1 b -4S expression plasmids, COS-1 cultures growing in chamber slides were fixed in acetone and labeled by an indirect immunofluorescence protocol as described previously (36) with pAb 669, a polyclonal antibody recognizing both CEACAM1 a -4L and CEACAM1 b -4S (38), and mAb 5.4 (36) or mAb 362.50 (15). Liver tissue was harvested from adult male ACI or Fischer F344 rats (Harlan Sprague-Dawley, Indianapolis, IN) as described previously (14, 15, 36). For flow cytometric analysis, cells in suspension were labeled by indirect immunofluorescence (IIF) as described previously (40) by using phycoerythrin-conjugated anti-mouse IgG. CEACAM1 expression levels were analyzed according to the manufacturer's protocol by using a Guava PC flow cytometer equipped with Guava Express software (Guava Technologies Inc., Hayward, CA). Data were presented as a graph of cell counts versus fluorescence intensity. Median and peak fluorescence values were determined using Guava Express software.
Radiolabeling Procedures and Immunoprecipitation Analysis-Retrovirally transduced COS cells stably expressing either the gene for Ceacam1 a -4L or Ceacam1 b -4S were surface-labeled with 125I (carrier-free; Amersham Biosciences) by the lactoperoxidase:glucose oxidase procedure of Keski-Oja et al. (41). Specific activity of labeled cells was ~2-3 cpm/cell. Immunoprecipitation of radiolabeled proteins from precleared, 0.5% Nonidet P-40 detergent extracts using pAb 669 or mAb 362.50 was carried out by procedures described previously (15, 39). Immunoreactive components were boiled in 1× SDS sample buffer and resolved by one-dimensional SDS-PAGE. Staining, destaining, drying of gels, and autoradiography were carried out as described by Hixson et al. (16).
Affinity Purification of CEACAM1 from Transfected Cells and Hepatocytes-pAb 669 was cross-linked to protein-A-agarose by a modified protocol of Schneider et al. (42). Briefly, 50 μl of pAb 669 was incubated with 100 μl (packed) of Affi-Gel protein-A agarose (Bio-Rad) suspended in PBS overnight at 4°C with constant end-over-end mixing. Matrix was pelleted at 1,000 × g and washed sequentially with PBS, 0.5% Triton X-100 in PBS, and 0.2 M triethanolamine, pH 8.0. The IgG was cross-linked to the matrix using 50 mM dimethyl pimelimidate-2HCl (Pierce) in 0.2 M triethanolamine, pH 8.0, for 45 min with end-over-end mixing. After a 1-min centrifugation at 1,000 × g, the cross-linking solution was removed; the reaction was stopped by a 45-min incubation with 50 mM ethanolamine, pH 8.0, and the matrix was washed three times with 0.2 M triethanolamine and 0.5% Triton X-100 in 0.5 M NaCl. Prior to deglycosylation and immunoblot analysis, 5 × 10^5 transfected COS cells or 1 × 10^6 rat hepatocytes isolated by a collagenase perfusion technique described previously (36) were lysed in 1 ml of Nonidet P-40 detergent lysis buffer (20 mM sodium phosphate buffer, pH 7.4, + 0.5% Nonidet P-40) containing the protease inhibitor Pefabloc (Roche Applied Science) at a final concentration of 0.5 mM. Lysates were cleared by centrifugation at 30,000 × g for 20 min at 4°C and incubated at 4°C for 2-4 h with end-over-end mixing in the presence of the pAb 669 cross-linked protein-A matrix. The matrix was pelleted by a 1-min centrifugation at 1,000 × g; unbound proteins in the supernatant were removed, and the matrix was washed sequentially with 0.5% Triton X-100 in PBS and 0.05 M Tris/HCl, pH 8.0, containing 0.5 M NaCl. Matrix was then washed five times with 20 mM sodium phosphate buffer, pH 7.4, to remove free chloride ions that inhibit the O-glycanase enzyme. Bound proteins were eluted from the cross-linked matrix with 500 μl of 50 mM diethylamine, pH 11.5, and neutralized to pH 7.4 with 0.5 M monobasic sodium phosphate. Proteins were acetone-precipitated in 3 volumes of ice-cold acetone in the presence of 5 μg of BSA as a carrier protein, pelleted at 18,750 × g for 15 min at 4°C, and resuspended in ~100 μl of 0.1% SDS in 20 mM sodium phosphate buffer, pH 7.4.
Analysis of Potential N- and O-Linked Glycosylation Sites in C1D1 and C2D1 Domains-Analysis of the D1 domains from CEACAM1 a -4L and CEACAM1 b -4S for potential O-glycosylation sites was performed according to the rules proposed by Pisano et al. (43) or was carried out using the NetOGlyc 2.0 Prediction Server, a collection of artificial neural networks that recognize O-glycosylation sites based on sequence context and surface accessibility.
Deglycosylation Procedures-Affinity-purified CEACAM1 a -4L or CEACAM1 b -4S and 1 μg of purified asialofetuin (Sigma) as a positive control for deglycosylation by O-glycanase were resuspended in 0.1% SDS solution and boiled for 5 min to denature the proteins. After boiling, Triton X-100 was added to a final concentration of 0.6%. The denatured protein samples were separated into three microcentrifuge tubes and subjected to the following treatments: 1) incubation with buffer, 2) digestion with neuraminidase only (removal of sialic acid residues), and 3) sequential digestion with neuraminidase and O-glycanase. For neuraminidase digestion, proteins were incubated with 0.05 units of Clostridium perfringens neuraminidase (Sigma) in 20 mM sodium phosphate buffer, pH 7.4, for 30-60 min at 37°C. Samples that were to be digested only with neuraminidase were diluted in 0.25 volume of 5× SDS sample buffer (10 mM Tris-HCl, 10% SDS, 10% 2-mercaptoethanol, 30% glycerol, pH 6.8) and boiled for 5 min. The remaining digested samples were incubated an additional 18 h at 37°C with 0.5 units of O-glycanase (Genzyme, Cambridge, MA). Digestions were terminated by adding 0.25 volume of 5× SDS sample buffer and boiling for 5 min. Aliquots (50 μl) from the glycosidase-digested samples or from detergent extracts of Sf9 cells expressing deletion mutants (prepared as described previously and boiled in SDS sample buffer) were resolved by one-dimensional SDS-PAGE (44) on 7.5%, 0.75-mm slab gels at a constant voltage of 200 V. Transfer of proteins separated by SDS-PAGE onto nitrocellulose membrane filters (Schleicher & Schuell) was carried out as described (4, 5). Fetuin controls run on separate gels were stained by using a silver staining kit (Bio-Rad). After transfer, nitrocellulose filters were labeled with mAb 9.2, mAb 362.50, or pAb 669 by a previously described indirect immunoperoxidase protocol with a 1:1000 dilution of horseradish peroxidase-conjugated goat anti-rabbit (for pAb 669) or goat anti-mouse (for mAbs 9.2 and 362.50) antibodies from Cappel/Organon Teknika. Blots prepared with deglycosylated CEACAM1 a -4L, CEACAM1 b -4S, and CEACAM1 a -4L and -4S from normal Fischer rat hepatocytes were developed with a chemiluminescent Western blotting detection system from Amersham Biosciences according to the manufacturer's specifications. Blots were exposed to X-Omat AR x-ray film (Eastman Kodak Co.) for 30 s up to 5 min. For detection of the deletion mutant CEACAM1 a -4L proteins, filters were incubated with an alkaline phosphatase-conjugated goat anti-rabbit (pAb 669) or goat anti-mouse antibody (mAbs 362.50 and 9.2) as described previously (4). Deletion mutant CEACAM1 a -4L blots were developed using nitro blue tetrazolium and 5-bromo-4-chloro-3-indolyl phosphate as described previously (4). Apparent molecular weights of immunoreactive proteins were calculated from prestained protein standards (Sigma) run concurrently with the samples and transferred to the nitrocellulose.
Epitope Mapping of mAbs Using Peptide Arrays-Membranes containing a series of overlapping peptides 10 amino acids in length that spanned the CEACAM1 a N-domain were obtained from the following three different sources: the Institute for Applied Microbiology, University of Agriculture, Forestry and Biotechnology, Vienna, Austria; Genosys, The Woodlands, Texas; or Jerini Biotools, Berlin, Germany. mAbs 9.2 and 362.50 purified on protein-G-Sepharose were labeled with 125I (Amersham Biosciences) using IODO-GEN beads and passed over a Sepharose G-25 column to remove the unincorporated iodine. Membranes with peptide arrays were blocked overnight in Tris-buffered saline (TBS, 50 mM Trizma (Tris base), 140 mM NaCl, 3 mM KCl), pH 8.0, containing the Genosys blocking solution and 5% (w/v) sucrose. Membranes were then washed for 10 min in TBS containing 0.05% (v/v) Tween 20 and incubated overnight with 5 × 10^5 cpm per ml of 125I-mAb 9.2 diluted in blocking buffer. After washing according to the protocol provided by the manufacturer (Genosys), membranes were sealed in plastic wrap and exposed to x-ray film in the presence of intensifying screens at -80°C for up to 5 days.
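As an illustration of how an array of overlapping 10-residue peptides can be laid out across a domain, the following minimal Python sketch generates the windows and their 1-based start positions. The offset between neighboring peptides (`step`) is not stated in the text and is an assumption, as are the function name and the demonstration sequence, which is an arbitrary string and not the CEACAM1 a N-domain.

```python
def overlapping_peptides(sequence, length=10, step=1):
    """Return (start_residue, peptide) pairs for a sliding window.

    start_residue is 1-based. `step` (the offset between neighboring
    peptides) is an assumed parameter; the text gives only the length.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    return [
        (i + 1, sequence[i:i + length])
        for i in range(0, len(sequence) - length + 1, step)
    ]


# Arbitrary 16-residue string, used only to show the windowing.
demo = "ACDEFGHIKLMNPQRS"
for start, peptide in overlapping_peptides(demo, length=10, step=2):
    print(start, peptide)
```

A reactive spot on the membrane can then be mapped back to a residue range as `start` to `start + length - 1`.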
Insect Cell Adhesion Assays-Cell adhesion assays were performed with cultured Sf9 (Spodoptera frugiperda) insect cells expressing the CEACAM1 a -4L isoform as described previously by Cheung et al. (4). Antibody blocking experiments were carried out by adding different amounts of ascites fluid containing CEACAM1 a -specific mAbs 362.50 and 9.2, as well as a control mAb 324.5, to Sf9 cells infected with Ceacam1 a -4L recombinant viruses at 24 h post-infection. All cell suspensions contained 50 μg/ml DNase to reduce nonspecific aggregation caused by DNA from damaged cells. IgG concentrations in ascites containing mAb 9.2, mAb 362.50, and mAb 324.5 were determined by measuring the total IgG eluted from protein A-Sepharose incubated with a known volume of ascites (up to 1 ml). The number of single cells was determined for each antibody concentration at 72 h post-infection. Percent inhibition was defined as the percent of single cells (number of single cells/total number of cells infected × 100) at 72 h after infection (48 h after addition of antibodies), i.e. the greater the inhibition of aggregation, the higher the number of single cells. Counts of single cells were made using a hemocytometer. To assess the ability of peptides corresponding to various sequences in the N-domain to block aggregation of Sf9 cells, peptides dissolved in PBS or Me2SO were added to single cell suspensions containing 5 × 10^5 cells/ml to give final peptide concentrations from 0 to 60 μg/ml. After 90 min, the extent of aggregation was determined. For COS-1 cells expressing CEACAM1 a -4L and CEACAM1 b -4S, cells were suspended in aggregation buffer (0.2% BSA (w/v), 20 μg/ml DNaseI, 6 g/ml HEPES, pH 7.4) containing peptides at a concentration of 30 or 60 μg/ml. For aggregation assays, 600 μl of cell suspension (5 × 10^5/ml) was placed in each well of a 24-well plate coated with 10 μg/ml BSA. At 45 and 90 min, the number of single cells in each well was counted by using an inverted microscope, and the percentage of single cells was calculated as described above. Separate plates were used for each time point.
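The readout defined above (single cells as a percentage of total cells) can be sketched in a few lines of Python. The function name and the cell counts below are hypothetical and serve only to show the arithmetic; they are not data from this study.

```python
def percent_single_cells(single_cells, total_cells):
    """Percent of single cells = single cells / total cells x 100."""
    if total_cells <= 0:
        raise ValueError("total_cells must be positive")
    if not 0 <= single_cells <= total_cells:
        raise ValueError("single_cells must lie between 0 and total_cells")
    return 100.0 * single_cells / total_cells


# Hypothetical counts: peptide concentration -> (single cells, total cells).
counts = {0: (60, 400), 30: (180, 400), 60: (320, 400)}
for concentration, (single, total) in counts.items():
    print(concentration, percent_single_cells(single, total))
```

Under this definition a higher value means less aggregation, so a blocking antibody or peptide is expected to raise the percentage relative to the untreated control.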
RNA Isolation, RT-PCR, and RE Analysis-Total RNA was isolated from male rat liver snap-frozen in liquid nitrogen and stored at -80°C using the Tri Reagent® kit (Molecular Research Center, Inc., Cincinnati, OH) according to the manufacturer's specifications. Liver samples were obtained from Japanese Fischer 344 males (Charles River, Japan), American Fischer F344 males (Charles River, Wilmington, DE), and German Fischer 344 males selected from a colony maintained at Rhode Island Hospital that was initiated with breeder pairs obtained from Charles River, Germany; ACI and Sprague-Dawley males were from Harlan Sprague-Dawley, Indianapolis, IN. RT-PCR on 1 μg of total RNA was performed as described previously (19) by using random hexamers (PerkinElmer Life Sciences) in the RT step and the following Ceacam1 a N-domain primers in the amplification step: 1) 5′-UTR/Sig, 5′-CAGGCAGCAGAGACTATGGAGCTA-3′ (nucleotides -14 to 9), and 2) D2AS, 5′-GGGATTGGAGTTGTTACCTGTGAC-3′ (nucleotides 445-468). Control templates for the PCR included 20 μg of pCDM8 plasmid carrying either the Ceacam1 a -4L or Ceacam1 b -4S cDNAs. No-template controls were also run. The 485-bp products of the 5′-UTR/Sig + D2AS RT-PCR were subjected to RE analysis with DraI and HincII (New England Biolabs, Beverly, MA) according to the directions of the manufacturer. Digested products were visualized on 2% agarose gels containing ethidium bromide.
Genomic DNA Isolation, PCR, and RE Analysis-Genomic DNA was isolated from ~100 mg of American Fischer rat liver tissue snap-frozen in liquid nitrogen according to the Easy DNA kit (Invitrogen) specifications for small amounts of tissue (≤100 mg). The following Ceacam1 exon 2-specific primers were used for genomic PCR: 1) GD1S, 5′-CAAGTCACCGTAGACGCTGTGC-3′ (nucleotides 103-124), and 2) GD1AS, 5′-CGAAATTGCACAGACGTTTGTA-3′ (nucleotides 394-415). PCR was performed by using 4 μl of genomic DNA or 20 μg of control plasmid DNA (pCDM8 carrying either the Ceacam1 a -4L or the Ceacam1 b -4S cDNA) in a 100-μl PCR using Taq polymerase (Promega, Madison, WI) or ULTma proofreading polymerase (PerkinElmer Life Sciences) according to the manufacturer's specifications. Controls without template were also run. The 314-bp products of the GD1S + GD1AS PCR were subjected to RE analysis with DraI, HincII, and XbaI (New England Biolabs) according to the manufacturer's directions. Digested products were visualized on 2% agarose gels containing ethidium bromide.
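The logic of the RE analysis, predicting fragment sizes from the position of a recognition site in a linear PCR product, can be sketched as follows. The recognition sequences (HincII, GTY^RAC; DraI, TTT^AAA; XbaI, T^CTAGA) are the standard ones for these enzymes; the function name and the demonstration sequence are assumptions. The sequence is synthetic and was built only so that a single HincII cut falls after base 234 of a 314-nt molecule; it is not the Ceacam1 amplicon.

```python
import re

# Recognition sequences (IUPAC Y and R expanded to regex classes) and the
# offset of the top-strand cut from the first base of the site.
ENZYMES = {
    "HincII": ("GT[CT][AG]AC", 3),  # GTY^RAC, blunt
    "DraI": ("TTTAAA", 3),          # TTT^AAA, blunt
    "XbaI": ("TCTAGA", 1),          # T^CTAGA, 5' overhang
}


def digest(sequence, enzyme):
    """Return top-strand fragment lengths after complete digestion
    of a linear DNA molecule with one enzyme."""
    site, offset = ENZYMES[enzyme]
    seq = sequence.upper()
    # The lookahead allows overlapping sites to be found as well.
    cuts = [m.start() + offset for m in re.finditer(f"(?={site})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


# Synthetic 314-nt sequence (NOT the real amplicon) with one HincII site.
demo = "C" * 231 + "GTTAAC" + "C" * 77
print(digest(demo, "HincII"))  # [234, 80]
print(digest(demo, "DraI"))    # [314], i.e. uncut
```

Because all three sites are palindromic, scanning one strand is sufficient. A product that returns a single fragment equal to its full length for every enzyme corresponds to the RE-resistant class.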
Cloning of Ceacam1 N-terminal Ig Domains-Ceacam1 N-domain cDNAs were amplified with the GD1S and GD1AS primers using genomic DNA isolated from Fischer 344 rat liver. PCR products were ligated into the pCR2 TA cloning vector (Invitrogen) according to the manufacturer's specifications. The ligation mix was then used to transform competent INVαF′ cells (Invitrogen). Blue-white screening of recombinants was subsequently performed according to the protocol provided by the manufacturer. Plasmid DNA obtained from transformed white colonies using a Qiaprep Spin Miniprep Kit (Qiagen, Valencia, CA) was screened for the presence of Ceacam1 N-domain insert by both restriction digest and the PCR/RE protocol described above. Ceacam10 and RE-resistant (RER) N-domains were each cloned in a slightly different manner. Total N-domain PCR product from genomic DNA was amplified with the GD1S and GD1AS primers as described above. For the cloning of Ceacam10 N-domain, genomic PCR products were directly ligated with the pCR2 TA cloning vector. The ligation mix was then digested with HincII to eliminate any plasmids containing the Ceacam1 a N-domain cDNA. To obtain a population of PCR products enriched in RER N-domains, the initial PCR products were sequentially digested to completion with HincII and XbaI, enzymes specific for the Ceacam1 a and Ceacam10 N-domains. The digested PCR products were run on 2% agarose gels, and the undigested products, enriched for RER N-domains, were gel-purified by using the GeneCleanII Kit (Bio 101, Inc., Vista, CA) according to the manufacturer's specifications. To ensure a high percentage of product with the A overhangs needed for TA cloning, the gel-purified product enriched for RER N-domains was reamplified using the GD1S and GD1AS primers. Following ligation of the resulting PCR product into the pCR2 TA cloning vector, the ligation mix was digested with HincII to eliminate any contaminating Ceacam1 a N-domain-containing plasmids. The Ceacam10 and RER N-domain-enriched ligation mixtures were used to transform competent INVαF′ cells, and plasmid DNAs from recombinant white colonies were screened by RE digest and PCR/RE assays. The authenticity of the Ceacam1 a and Ceacam10 N-domain inserts and the sequence of the unique RER N-domain inserts were confirmed by automated sequencing as described previously (19), using the following vector-specific primers: forward SP6 primer (5′-CTATTTAGGTGACACTATAG-3′) and a reverse T7 promoter primer (5′-TAATACGACTCACTATAGGG-3′). Automated sequence readouts were transferred into the MacVector program for Macintosh and were aligned with the known sequences of the Ceacam1 a , -1 b , and -10 N-domains.
Cloning and Sequence Analysis of N-domain PCR Product Resistant to RE Digestion-The undigested 314-bp PCR products remaining from the sequential HincII/XbaI digestion were cloned into the pCR2 TA cloning vector (Invitrogen), and forward (T7) and reverse (SP6) primers were used to obtain the sequence of both strands of DNA from each clone. To verify the accuracy of the PCR-derived inserts, sequences obtained from clones determined by RE digestion to contain the Ceacam1 a and the Ceacam10 N-domains were shown to be identical to published sequences for these two N-domain gene fragments (4, 6, 8). The residual product resistant to digestion with HincII and XbaI was cloned into the pCR2 TA cloning vector, and inserts from several clones were sequenced. Three clones, designated Nx, Ny, and Nz, were identified that contained 4-24 nucleotide changes relative to the Ceacam1 a N-domain nucleotide sequence and lacked all of the unique RE sites. To determine whether these new N-domains were being expressed, RE-RT-PCR analysis with the same primer set used for genomic PCR was performed with total RNA from liver, lung, colon, small intestine, heart, kidney, brain, and prostate as template.
Phylogenetic Analysis-N-domain nucleotide sequences were aligned without ambiguity using the ClustalW alignment program in the SeqPup software package (45). Patterns of sequence evolution were examined with MEGA version 1.02 (46) using two different sets of sequences: 1) the six rat N-domain sequences (Ceacam1 a , -1 b , -10, Nx, Ny, and Nz) and 2) the six rat sequences plus the two mouse sequences (mCeacam1 a and -1 b ) and a human N-domain sequence (hCEACAM1). The sequences were analyzed for dS (synonymous differences per synonymous site) and dN (nonsynonymous differences per nonsynonymous site). Statistical tests of the differences between dN and dS were performed as described in the MEGA manual (46). Briefly, the significance of the difference, d, between dN and dS is evaluated using the ratio t = d/s(d), where s(d) is the standard error of the difference between dN and dS. A t test with infinite degrees of freedom was used to determine the significance of the ratio. The six rat and the three human genes in the high sequence similarity groups were examined for patterns of recombination using DnaSP 2.2 (47). This software package examines pairs of variable sites where two nucleotides are segregating at each site in the sample of sequences and tests for the presence of all four gametic phases in the sample. From this, a minimum number of recombination events can be inferred.
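The two calculations described above can be illustrated with a minimal Python sketch. The first function evaluates t = d/s(d); because a t distribution with infinite degrees of freedom is the standard normal distribution, the two-sided p value is taken from the normal tail. The second function applies the four-gamete criterion to every pair of biallelic sites. Function names, the sign convention d = dN - dS, and all numbers and toy sequences are assumptions for illustration; this is not the MEGA or DnaSP implementation, and it does not compute the minimum number of recombination events.

```python
import math
from itertools import combinations


def dn_ds_test(d_n, d_s, se_diff):
    """Return (t, two-sided p) for t = d / s(d), with d = dN - dS.

    With infinite degrees of freedom the t distribution equals the
    standard normal, so p = erfc(|t| / sqrt(2)).
    """
    if se_diff <= 0:
        raise ValueError("standard error must be positive")
    t = (d_n - d_s) / se_diff
    return t, math.erfc(abs(t) / math.sqrt(2.0))


def four_gamete_pairs(sequences):
    """Return pairs of 0-based site indices showing all four gametic phases.

    Only sites at which exactly two nucleotides segregate are compared.
    """
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    n_sites = len(sequences[0])
    biallelic = [i for i in range(n_sites)
                 if len({s[i] for s in sequences}) == 2]
    return [(i, j) for i, j in combinations(biallelic, 2)
            if len({(s[i], s[j]) for s in sequences}) == 4]


# Hypothetical values, for illustration only.
print(dn_ds_test(0.10, 0.04, 0.03))
print(four_gamete_pairs(["AA", "AG", "GA", "GG"]))  # [(0, 1)]
print(four_gamete_pairs(["AA", "AG", "GG"]))        # []
```

A pair of sites carrying all four two-site combinations cannot be explained by single mutations on one genealogy without recurrent mutation, which is why such pairs are taken as evidence of at least one recombination event between them.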
Binding of mAbs to CEACAM1 a Ectodomain Deletion Mutants-The binding of mAb 9.2 and mAb 362.50 to the extracellular domains of CEACAM1 was tested by Western immunoblot analysis using cell lysates prepared from insect cells infected with recombinant viruses containing full-length C-CAM1 or its deletion mutants. After SDS-PAGE, the proteins were transferred onto nitrocellulose and immunoblotted with mAbs or antibody 669 and alkaline phosphatase-conjugated goat anti-rabbit antibody. Antibodies were used at a 1:500 dilution.
Transfection of Ceacam1 a and -1 b Expression Vectors-pCDM8 expression vectors containing Ceacam1 a -4L and Ceacam1 b -4S cDNAs were those described previously by Lin and Guidotti (35) (48) were transfected by essentially the same protocol. The AML-12 cell line was kindly provided by Dr. Nelson Fausto, University of Washington, Seattle, WA. The construction of the retroviral expression plasmid LNCL containing the Ceacam1 a -4L cDNA has been described previously (19). Another retroviral expression plasmid, LNCS, containing the Ceacam1 b -4S cDNA was also constructed for these experiments (19). Retroviral plasmids were transfected into the ecotropic BOSC23 (49) and the amphotropic PA317 (50) retroviral packaging cell lines as described previously (19). The human colon carcinoma cell lines (M-HuCC) stably transfected with MMTN expression vectors carrying either the Ceacam1 a -4L or Ceacam1 b -4S cDNAs were kindly provided by Dr. Sue-Hwa Lin. COS-1, BOSC23, and PA317 cells were examined 48 h post-transfection by IIF with polyclonal antibody (pAb) 669 to determine the efficiency of transfection before further manipulation of the cells.
Expression of Ceacam1 a -4L and Ceacam1 b -4S Isoforms of CEACAM in Retrovirally Transduced Cells-Culture medium containing the LNCL (Ceacam1 a -4L) or LNCS (Ceacam1 b -4S) ecotropic retrovirus was produced in BOSC23 packaging cell lines and used to infect PA317 cells. Culture supernatants containing amphotropic retroviruses produced by PA317 cells as described by Comegys et al. (19) were used to infect the COS-1 cells, the ACI rat-derived transplantable hepatocellular carcinoma cell lines, 253T and 1682A (16, 51), and the human prostate carcinoma cell line, PC-3 (19). Following infection, COS-1, 253T, 1682A, and PC-3 cells were cultured in the presence of 750-1200, 650, 300, and 500 μg/ml G418, respectively, for 7-14 days, and drug-resistant colonies were selected by ring cloning. Drug-resistant colonies were maintained in Dulbecco's modified Eagle's medium (COS-1), Waymouth's (Invitrogen) (253T and 1682A), or RPMI (Invitrogen) (PC-3) medium supplemented with 10% Rehatuin fetal bovine serum (Intergen, Purchase, NY). Clonal lines derived by limiting dilution cloning were screened by IIF with pAb to identify clones expressing the highest levels of CEACAM1 protein. To eliminate contamination with negative cells, CEACAM1 a -4L-expressing cells were selected from COS-1, 253T, and 1682A clones using anti-mouse IgG magnetic beads (Dynabeads M-450) (Dynal, Inc., Great Neck, NY) coated with mAb 5.4 according to methods described previously (52). Negative cells were removed from 253T and 1682A clones by using a panning procedure adapted from Wysocki and Sato (53) and Sigal et al. (54) to select for CEACAM1 b -4S-expressing cells. Briefly, sterile 100-mm Petri dishes coated with 10 μg/ml unconjugated goat anti-rabbit IgG (Pierce) in carbonate buffer (15 mM sodium carbonate, 35 mM sodium bicarbonate, pH 9.5) for 90 min at room temperature were washed sequentially with Hanks'-buffered saline solution (HBSS) (Invitrogen) and HBSS containing 0.1% bovine serum albumin (BSA) and 0.2 mM EGTA (HBSS-BE). Infected cells (1-2 × 10^7) trypsinized from culture plates were collected in HBSS-BE containing 4.2 μg/ml aprotinin (Sigma), washed in HBSS-BE, resuspended in a 1:200 dilution of pAb 669 in HBSS-BE, and incubated on a rotator at 4°C for 20-30 min. Following two washes in HBSS-BE, antibody-coated cells were plated into the goat anti-rabbit IgG-coated plates, incubated for 40 min at 4°C, swirled to redistribute unbound cells, and incubated an additional 30 min at 4°C. Dishes were then washed six times with HBSS-BE, decanting and discarding the supernatants from each wash. After the final wash, culture medium with serum was added to each plate, and bound cells were released in a stream of medium generated with a pipette.

FIG. 1 (partial legend). B, analysis of PCR products from control plasmid DNAs using 5′-UTR + Sig/D2AS primer pair followed by RE analysis with DraI (D) specific for C1 b , HincII (H) specific for C1 a , or both enzymes (D+H). Undigested (U) samples appear adjacent to (left side) digested samples, and product sizes are indicated on the right. Lane 1, 1-kbp DNA ladder; 500-bp marker indicated at the left; lanes 2-5, pCDM8 C1 b -4S DNA; and lanes 6-9, pCDM8 C1 a -4L DNA. C, analysis of RT-PCR products from liver RNAs of different rat strains using 5′-UTR/Sig/D2AS primer pair followed by RE analysis as described in B.

The CEACAM1 b Allele Is Not Present in the Fischer Rat Genome-PCR of control plasmid DNA pCDM8 carrying the Ceacam1 a -4L or Ceacam1 b -4S cDNAs using the 5′-UTR/Sig and D2AS primer set (Fig. 1A) generated predicted products of 485 bp (Fig. 1B). RE analysis of the 485-bp products amplified from the control plasmids showed that Ceacam1 a -4L-derived N-domain products, as expected, were cut only by HincII, resulting in fragments of 352 and 133 bp, whereas Ceacam1 b -4S-derived products were cut only by DraI, resulting in fragments of 245 and 240 bp (Fig. 1B). RT-PCR analysis of total RNA from rat liver of several different strains, including American Fischer 344, German Fischer 344, Japanese Fischer 344, ACI, and Sprague-Dawley, using the 5′-UTR/Sig and D2AS primers also resulted in the amplification of 485-bp products (Fig. 1C). RE analysis of these 485-bp products revealed that they could be cut with HincII, resulting in fragments of 352 and 133 bp identical in size to those produced by digestion of control Ceacam1 a -4L plasmid DNA (Fig. 1C). However, none of the 485-bp products amplified from rat liver contained the DraI enzyme site unique to the Ceacam1 b -4S N-domain (Fig. 1C), indicating that both of the two major cyto domain splice variants expressed in the liver of these rat strains and substrains had the Ceacam1 a -4L N-domain.
To determine whether the lack of Ceacam1 b -4S transcripts was due to the absence of this allele in the genome, PCR/RE analysis using primers confined to the N-domain of Ceacam1, designated GD1S and GD1AS (Fig. 1A), was performed on genomic DNA from rat liver. PCR of pCDM8 plasmids carrying Ceacam1 a -4L or Ceacam1 b -4S cDNAs and genomic DNA from American Fischer rat liver resulted in products of the predicted 314-bp size (Fig. 1D). Digestion of the CEACAM1 a -derived and CEACAM1 b -derived control PCR products with HincII and DraI produced fragments of the expected sizes of 234/80 and 192/122 bp, respectively (Fig. 1D). In addition to DraI and HincII, PCR products derived from genomic DNA were cut with XbaI, the RE unique to the Ceacam10 N-domain, which, although not expressed by rat liver, would be present in the genomic DNA. DraI and/or HincII digestion of N-domains amplified from American Fischer liver genomic DNA produced only 234- and 80-bp fragments, consistent with the presence of a Ceacam1 a but not a Ceacam1 b N-domain (Fig. 1E). This demonstrated that the Ceacam1 b allele is not present in the genome of this inbred strain of rats, thereby eliminating the need to digest with DraI when searching for RE-resistant N-domains. Subsequent XbaI digestions produced predicted fragment sizes of 190 and 124 bp (Fig. 1E), and the results were consistent with the presence of the Ceacam10 N-domain.
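The diagnostic logic of the RE analysis on the 314-bp genomic product can be summarized in a short sketch. The fragment sizes below are the ones reported in the text; the dictionary layout, the function name `classify_n_domain`, and the 5-bp sizing tolerance are illustrative assumptions and were not part of the original analysis.

```python
# Expected restriction fragments (bp) for the 314-bp genomic N-domain product,
# as reported in the text. Names and structure here are illustrative only.
PRODUCT_SIZE = 314
EXPECTED_FRAGMENTS = {
    "Ceacam1 a": ("HincII", (234, 80)),
    "Ceacam1 b": ("DraI", (192, 122)),
    "Ceacam10": ("XbaI", (190, 124)),
}


def classify_n_domain(digests, tolerance=5):
    """Return the N-domains whose diagnostic digest pattern is observed.

    digests maps an enzyme name to the fragment sizes (bp) seen on the gel.
    tolerance is a hypothetical allowance for gel sizing error.
    A product cut by none of the enzymes is reported as RE-resistant.
    """
    matches = []
    for name, (enzyme, expected) in EXPECTED_FRAGMENTS.items():
        observed = sorted(digests.get(enzyme, ()), reverse=True)
        if len(observed) == len(expected) and all(
            abs(o - e) <= tolerance for o, e in zip(observed, expected)
        ):
            matches.append(name)
    return matches or ["RE-resistant"]
```

A clone whose product remains at 314 bp after HincII, DraI, and XbaI digestion is returned as `RE-resistant`, which corresponds to the class of products examined in the next section.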
The Fischer Rat Genome Contains Three Novel N-domains Resistant to RE Digestion-After sequential and complete digestion of the 314-bp amplified N-domain products from genomic DNA with HincII and XbaI, a residual product resistant to digestion (Fig. 1E) was consistently observed, suggesting the existence of unidentified N-domains. Ceacam1 a could not have been present in the resistant product because it contains the Ceacam1 a HincII site. The possibility that these RE-resistant (RER) N-domains were derived from Ceacam9 and Ceacam11 also seemed unlikely because the primers chosen for amplification did not match the corresponding sequences in these two genes.
The undigested 314-bp PCR products remaining from the sequential HincII/XbaI digestion were cloned into the pCR2 TA cloning vector (Invitrogen), and forward (T7) and reverse (SP6) primers were used to obtain insert sequences off both strands of DNA from several clones. To verify the accuracy of the PCR-derived inserts, sequences obtained from clones determined by RE digestion to contain the Ceacam1 a and the Ceacam10 N-domains were shown to be identical to published sequences for these two N-domain isoforms (4,6,8). From the HincII/XbaI-resistant clones, three (designated Nx, Ny, and Nz) were identified that contained 4-24 nucleotide changes relative to the Ceacam1 a N-domain and had nucleotide changes in the cleavage sites for HincII and XbaI. RE-RT-PCR analysis using total RNA from small intestine, colon, heart, lung, liver, kidney, brain, and prostate as template and the same primer set used for genomic PCR and total RNA from liver (Fig. 1A) did not produce any product that was resistant to RE digestion (data not shown), suggesting that these new N-domain variants were not expressed at mRNA levels detectable by RT-PCR/RE analysis.

Legend fragment: (83) for the low and high similarity groups of human and rat CEACAM1 N-domains. Footnote 1, variability for each amino acid was calculated as D/F, where F is the frequency of the most common amino acid at a given position divided by the total number of N-domains being compared and D is the number of different amino acids at a given position (83). Footnote 2, Cricket Graph was used to apply a 3-channel binomial smoothing function to each data set.

Sequence Alignment Delineates Groups of N-domains in Rat and Human Displaying High and Low Similarity to CEACAM1-Comparative amino acid sequence analysis of rat CEACAM1 N-domains defined two groups with high (CEACAM1 b , CEACAM10, Nx, Ny, and Nz) and low (CEACAM9 and CEACAM11) sequence similarity (Fig. 2A). When sequence variability in five amino acid increments was determined for the high similarity group N-domains (Fig. 3), two regions of sequence variability (V-regions) were detected. Moreover, these V-regions were located adjacent to or within the regions corresponding to binding sites identified previously for opa proteins (55,56) and MHV (57-59) and adhesive domains mediating intercellular adhesion (25,28,33) (Fig. 3). N- and C-terminal V-regions were also found in the low similarity group, but the second V-region appeared to extend into the A1 domain. To determine whether human CEACAM1 genes exhibited similar features, the same analysis was performed on human (h) CEACAM1 N-domains. As was found in the rat, human N-domains could be divided into groups with low and high amino acid similarity to hCEACAM1 (Fig. 2B). Scanning for sequence differences in five amino acid increments also identified two V-regions in the high similarity group that were in approximately the same location as their rat counterparts (Fig. 3) and a third N-terminal V-region that corresponded to a small peak on the shoulder of the first V-domain in the rat. In contrast, well defined V-regions could not be discerned in the low similarity group (Fig. 3).
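The variability measure used for this analysis (D/F, with D the number of different amino acids at a position and F the frequency of the most common amino acid divided by the number of N-domains compared) can be sketched as follows. This is a minimal illustration that assumes pre-aligned, equal-length sequences; the function names are hypothetical, and the 3-point kernel with weights (1, 2, 1)/4 is an assumed reading of the "3-channel binomial smoothing function", whose exact form is not given in the text.

```python
from collections import Counter


def position_variability(aligned):
    """Variability D/F at each column of an alignment.

    D = number of different residues in the column.
    F = count of the most common residue / number of sequences.
    """
    n = len(aligned)
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    values = []
    for i in range(length):
        counts = Counter(seq[i] for seq in aligned)
        d = len(counts)
        f = counts.most_common(1)[0][1] / n
        values.append(d / f)
    return values


def binomial_smooth(values):
    """Assumed 3-point binomial smoothing, weights (1, 2, 1) / 4.

    End points are left unchanged because they lack one neighbor.
    """
    smoothed = list(values)
    for i in range(1, len(values) - 1):
        smoothed[i] = (values[i - 1] + 2 * values[i] + values[i + 1]) / 4
    return smoothed
```

A fully conserved column gives the minimum value of 1.0, so V-regions appear as runs of positions rising well above that baseline.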
Variable Sites in the Rat High Similarity N-domain Group Are at Amino Acid Altering Positions-Casual inspection of the rat sequence data for the high similarity group (Fig. 2A) indicated that most of the variable sites among the six rat N-domain sequences were at amino acid altering (nonsynonymous) positions. If simple purifying selection were operating in the N-domain, one would expect more variation at the synonymous (no change in amino acid) sites. To quantify this, patterns of synonymous differences per synonymous site (dS) and nonsynonymous differences per nonsynonymous site (dN) were analyzed using MEGA 1.02 (46). This confirmed that amino acid variation per amino acid-altering site is greater than the synonymous variation per synonymous site. This difference was significant at the 5% level (Table I). The patterns of synonymous and nonsynonymous differences were also examined for the larger data set of the six rat sequences plus two homologous mouse sequences and a single human sequence. As before, the variation at amino acid-altering sites was absolutely greater than that for synonymous sites, but the difference was not statistically significant (Table I).

Table footnote fragment: Table II. Footnote 2, variable domains depicted in Fig. 3 for rat CEACAM1 a , 1 b , Nx, Ny, and Nz.
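The distinction between synonymous and nonsynonymous differences can be illustrated with a simple codon-by-codon tally. This sketch is not the dN/dS estimate computed with MEGA: it does not normalize by the number of synonymous and nonsynonymous sites and does not resolve mutational pathways for codons that differ at more than one position. The function name and the input sequences in the usage note are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def tally_codon_differences(seq_a, seq_b):
    """Count differing codons as (synonymous, nonsynonymous).

    Both inputs must be in-frame coding sequences of equal length.
    A differing codon pair is synonymous if both encode the same residue.
    """
    if len(seq_a) != len(seq_b) or len(seq_a) % 3:
        raise ValueError("sequences must be in frame and of equal length")
    synonymous = nonsynonymous = 0
    for i in range(0, len(seq_a), 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous
```

For example, `tally_codon_differences("ATGAAAGCT", "ATGAAGGAT")` counts the AAA/AAG pair (both Lys) as synonymous and the GCT/GAT pair (Ala versus Asp) as nonsynonymous.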
Recombination Is Involved in the Generation of Rat but Not Human N-domains-By using all six rat N-domains in the high similarity group as the sample, DNAsp2.2 software identified five recombination events at the sites shown in Fig. 4, A and B. By sequentially removing and adding sequences from the analysis, it was apparent that pairs of sequences that include Ceacam1 b contribute a number of sites identified as having experienced recombination (Fig. 4A). For example, Ceacam1 a , -10, Nx, Ny, and Nz showed one recombination event between positions 140 and 370, but if the N-domains for Ceacam1 a were exchanged for Ceacam1 b , four recombination events were identified. At the very least, these data indicate that recombination has played a significant role in the evolutionary history of this locus. Surprisingly, when the same analysis was performed on the group of human CEACAM1 sequences with low sequence variability, no evidence of recombination was detected with any combination of N-domains from hCEACAM1, -3, -5, and -6.

mAb Raised against CEACAM1 a Show Differential Reactivity with CEACAM1 a and -1 b -To determine whether the naturally occurring amino acid sequence differences in the high similarity group of rat N-domains resulted in structural or functional changes, we focused on the CEACAM1 a and -1 b N-domains. We first determined whether pAb 669 and mAb 362.50 would show differential reactivity with CEACAM1 b . Radioimmunoprecipitation analysis was carried out on cell lysates of 125I surface-labeled COS cells stably transduced with retroviruses carrying either the Ceacam1 a -4L or Ceacam1 b -4S cDNAs. mAb 324.5 against TuAg1/TagE4, an Ig-like transmembrane glycoprotein, was used as negative control (39). As shown in Fig. 5A, pAb 669 immunoprecipitated both the CEACAM1 a -4L and CEACAM1 b -4S isoforms from transduced COS cells, whereas normal rabbit serum controls were negative. 
CEACAM1 a -4L produced in COS cells was also immunoprecipitated by mAb 362.50, but this mAb and the negative control mAb 324.5 failed to immunoprecipitate CEACAM1 b -4S (Fig. 5A). As expected, COS-1 and six other cell lines expressing CEACAM1 a -4L (BOSC23, PA317, M-HuCC, 1682A, PC-3, and 253T cells) were strongly labeled by pAb 669 (Fig. 5B) or mAb 362.50 (Fig. 5C). COS-1 (Fig. 5D) and the same six cell lines expressing CEACAM1 b -4S were also positive with pAb 669. In contrast, neither COS-1 nor any of the six cell lines expressing CEACAM1 b -4S displayed detectable levels of reactivity in IIF assays with mAb 362.50 (Fig. 5E) or mAb 9.2 (not shown). Because the amino acid sequences of CEACAM1 a and -1 b differ only in the N-terminal Ig and cyto domains, these results suggested that variations in primary sequence were directly or indirectly responsible for differences in the conformation and/or accessibility of mAb epitopes.
The Binding Sites for mAb 9.2 and mAb 362.50 Are Located in the N-terminal Ig Domain-To define further the recognition domain for the two mAbs, immunoblot analysis using CEACAM1 a -4L proteins with various domains deleted (Fig. 6A) was performed. Western blots of SDS-PAGE gels loaded with Sf9 insect cell lysates and labeled with pAb 669 showed that the lysates contained similar amounts of the CEACAM1 a -4L proteins (Fig. 6B). Immunoblot analysis also showed that mAb 9.2 reacted with full-length CEACAM1 a -4L expressed in Sf9 cells (Fig. 6C). When tested with CEACAM1 a -4L deletion mutants, mAb 9.2 did not show detectable reactivity with ΔN, suggesting that the epitopes for mAb 9.2 were located in this domain. Similar results were obtained with mAb 362.50 (data not shown). Further analysis with mAb 9.2 showed that this antibody was unable to recognize the ΔA1, B, A2 deletion mutant and displayed decreased reactivity with the ΔA1 mutant (Fig. 6C), suggesting that mAb 9.2 recognized an epitope near the C terminus of the N-domain.

FIG. 7. Mapping of epitopes recognized by mAb 9.2 using peptide arrays. Membranes containing an overlapping series of 10 amino acid peptides that spanned the first Ig domain were used to define linear epitopes recognized by mAb 9.2. A and C, autoradiograms of representative arrays that had been incubated with radioiodinated mAb 9.2 or mAb 362.50, respectively. Consensus binding patterns for each mAb developed from four different arrays are shown in B and D. The different shadings indicate the relative intensity of the binding to each peptide spot. E, autoradiogram of an array corresponding to epitope B following incubation with radioiodinated mAb 362.50. The primary sequences of the six overlapping peptides spanning the B epitope region in the N-domain from CEACAM1 a -4L (sequences 1A-1F) and CEACAM1 b -4S (sequences 2A-2F) are indicated above each spot in the array. mAb 362.50 reacted strongly with peptides 1B/1C and 2B/2C but showed different patterns of reactivity with the remaining peptides.
Sequences Recognized by mAb 9.2 and mAb 362.50 Are Located in the C and G β-Strands-High resolution mapping of the epitopes recognized by mAb 9.2 and mAb 362.50 was carried out by using overlapping arrays of 10 amino acid peptides that spanned the CEACAM1 a -4L N-domain (Table II). When binding assays were performed on four separate arrays from three different suppliers using 125I-labeled mAbs and a one-step labeling protocol, the strongest and most consistent reactivity for mAb 9.2 mapped to peptides 18-20, 26-28, and 52-56 (Fig. 7, A and B). The defined epitope sequences of YWYKGT (epitope A), YIRSDN (epitope B), and VQFRVYPA (epitope C) (Table II) corresponded, respectively, to the rat C β-strand, C′ and part of the C′C″ loops, and G β-strand (Fig. 2A). The strongest binding for mAb 362.50 mapped to single peptides with strong to moderate reactivity (peptides 40, 52, and 56) and two overlapping series of peptides (peptides 18-20 and 25-29). mAb 362.50 showed the most consistent reactivity with peptides 19-20 and 26-28 (Fig. 7, C and D). These two series defined overlapping sequences, respectively, of YWYKGTTL (epitope A′) and YIRSDN (epitope B) (Table II). To determine whether variations in the primary sequence of the mAb 362.50 epitope in CEACAM1 a and CEACAM1 b were responsible for the differential reactivity seen in IIF and IP assays, a peptide array composed of 13-mer peptides spanning epitope B (peptide 25-31) from CEACAM1 a -4L and CEACAM1 b -4S was synthesized and probed with radioiodinated mAb 362.50. As shown in Fig. 7E, mAb 362.50 strongly reacted with two N-domain peptides in CEACAM1 a (1B and 1C) and their corresponding peptides in CEACAM1 b (2B and 2C) but showed opposing reactivities with the 1A/2A and 1D/2D peptides. These findings indicated that the differences in primary sequence had produced subtle differences in the location of epitope B.

FIG. 8 legend (partial). (Table III). Peptide 8 containing the B epitope showed the highest adhesion blocking activity. C, blocking the aggregation of COS-1 cells with N-domain peptides (Table III).
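The way a linear epitope is narrowed down from a run of consecutive reactive spots on an overlapping peptide array can be sketched as follows. The 10-residue peptide length is taken from the text; the 2-residue offset between neighboring peptides, the function names, and the alphabet test string are assumptions made only for illustration, since the actual array offset and layout are not stated here.

```python
def overlapping_peptides(sequence, length=10, step=2):
    """Return (start_index, peptide) pairs tiling a sequence.

    length follows the 10-mer arrays described in the text.
    step is a hypothetical offset between neighboring peptides.
    """
    last_start = len(sequence) - length
    return [
        (start, sequence[start:start + length])
        for start in range(0, last_start + 1, step)
    ]


def shared_core(peptides, first, last):
    """Residues common to every peptide from index first through last.

    The core runs from the start of the last peptide to the end of the
    first one. An empty string means the peptides share no residues.
    """
    start_first, peptide_first = peptides[first]
    start_last, _ = peptides[last]
    offset = start_last - start_first
    if offset >= len(peptide_first):
        return ""
    return peptide_first[offset:]


# Placeholder sequence (the alphabet), not a CEACAM1 sequence.
demo = overlapping_peptides("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
core = shared_core(demo, 0, 2)  # residues shared by the first three peptides
```

With 10-mers offset by two residues, three consecutive reactive peptides share a six-residue core, which is the same length as epitopes A and B as defined above; this agreement is suggestive only and does not establish the offset that was actually used.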
Epitope B Is Located within the Major Adhesive Site in the CEACAM1 a N-domain-In previous reports (25,28,33,55-59), the adhesive sites for CEA and the binding domains for MHV and opa proteins were localized to the C β-strand and CC′ loop domains. Because one of the mAb epitopes was within this region, it was of interest to determine whether the two mAbs could block cell adhesion. When we tested the ability of mAb 362.50 and mAb 9.2 to block CEACAM1 a -4L-mediated cell aggregation, both mAbs were able to inhibit, to some extent, the adhesion of Sf9 cells expressing full-length CEACAM1 a -4L, whereas the control mAb 324.5, which was against TuAg1 (60), had no effect (Fig. 8A). Significant net inhibition (72%) of cell aggregation was observed with 2 µg of mAb 9.2, whereas only 24% net inhibition was obtained with a similar amount of mAb 362.50. These results suggested that mAb 9.2 was recognizing an epitope close to or within the adhesion sequence.
To determine which mAb 9.2 epitope was involved in cell-cell adhesion, peptides corresponding to mAb 9.2 and mAb 362.50 epitopes with extensions toward the N and C terminus of the N-domain were synthesized and used for blocking of CEACAM1 a -4L-mediated aggregation of Sf9 cells (Table III). Of the eight peptides tested, only peptide 9.2B, containing epitope B, could completely inhibit adhesion at 30 µg/ml. This result suggested that sequences around epitope B are critical for adhesion. To define further the adhesion epitope, three peptides with deletions from the C terminus (peptide 9.2B-1, 8.0, and L1) and two with deletions from the N terminus (9.2B-3 and 1.0) of peptide 9.2B were synthesized and tested for their effect on CEACAM1 a -4L-mediated adhesion. As shown, peptides 8.0 (Fig. 8B) and 9.2B-1 (Table III) maintained adhesion blocking activity. In contrast, peptides 1.0 (Fig. 8B) and 9.2B-3 (Table III) were no longer able to block adhesion, indicating that the three N-terminal amino acids but not the five C-terminal amino acids of 9.2B are essential for adhesion blocking.
When these experiments were repeated using COS-1 cells expressing either CEACAM1 a -4L or CEACAM1 b -4S, peptide 9.2B with the B epitope again showed the highest blocking activity for both isoforms (Fig. 8C) despite the differences in primary sequence of the B epitope in CEACAM1 a -4L and CEACAM1 b -4S. The levels of expression of the two isoforms (dark line) relative to cells labeled by IIF without a primary anti-CEACAM1 antibody (light dashed line) are shown in Fig. 8D. Taken together, these results define the adhesion epitope for CEACAM1 to be PDSEIARYIRS, a sequence encompassing the entire C′ β-strand and part of the CC′ and C′C″ loops (Fig. 2A). In addition, the results show that, despite the differences in amino acid sequence, the 9.2B peptide also blocked aggregation mediated by CEACAM1 b -4S, a finding in keeping with the ability of mAb 362.50 to recognize the B epitope in both the 1 a and 1 b alleles.
Removal of O-Linked Glycans Does Not Restore mAb Reactivity with CEACAM1 b -Previous studies have shown that glycosylation can alter the conformation and binding properties of membrane receptors and Ig-like proteins including mouse Ceacam1 (61). To determine whether the amino acid sequence differences between CEACAM1 a and CEACAM1 b N-domains altered patterns of glycosylation, we searched for sites of O-glycosylation by using the rules proposed by Pisano et al. (43). By this method, a total of nine possible sites for addition of O-linked sugars were identified in the CEACAM1 b N-domain (Fig. 9A), only two of which were also present in CEACAM1 a . Analysis using the NetOGlyc 3.0 Prediction Server, a collection of artificial neural networks that recognize O-glycosylation sites based on sequence context and surface accessibility (62), identified a single site unique to the CEACAM1 b N-domain at Thr-91 (Fig. 9A). In addition, three N-glycosylation sites shared by CEACAM1 a and CEACAM1 b N-domains were identified (Fig. 9A) by using the consensus sequence NX(S/T), where X is anything but Pro (63). These findings raised the possibility that the differential use of common glycosylation sites in or near mAb epitopes or occupation of unique glycosylation sites created by differences in primary sequence might contribute to altered mAb reactivity. Analysis of the three epitope sites showed that none of the seven potential O-glycosylation sites unique to the CEACAM1 b N-domain resided within epitope C for mAb 9.2. However, sequence analysis identified an additional O-glycosylation site (not shown) and a single N-glycosylation site at the N terminus of the A1-domain (Fig. 9A) that were located in close proximity to epitope C. Epitope B for mAb 362.50, on the other hand, overlapped a region that not only differed in primary sequence in the two N-domains but was also flanked by the O-glycosylation site at Thr-91 unique to CEACAM1 b . In addition, mAb 362.50 epitope B contained a single N-glycosylation site common to both isoforms.
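The N-glycosylation consensus NX(S/T), with X being any residue except Pro, translates directly into a pattern search. The following is a minimal sketch; the function name and the test string in the usage note are hypothetical, and a match indicates only a potential site, not actual glycan occupancy. O-glycosylation prediction by the Pisano rules or NetOGlyc is not reproduced here.

```python
import re

# N-X-(S/T) sequon with X != Pro. The lookahead lets overlapping
# sequons (for example N-N-S-S) be reported individually.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_n_glycosylation_sequons(protein):
    """Return (1-based position of Asn, sequon) for each potential site."""
    return [
        (match.start() + 1, match.group(1))
        for match in SEQUON.finditer(protein.upper())
    ]
```

For the made-up string `"MNGTANPSNNSS"`, the search reports NGT at position 2, skips NPS because of the proline, and reports the overlapping NNS and NSS at positions 9 and 10.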
To determine whether the loss of mAb reactivity involved differential O-glycosylation, CEACAM1 a -4L and CEACAM1 b -4S isolated on immobilized pAb 669 from Nonidet P-40 extracts of transfected COS cells or Fischer 344 hepatocytes were subjected to deglycosylation by sequential neuraminidase and O-glycanase digestion. Removal of sialic acid with neuraminidase is required in order for O-glycanase to cleave O-linked oligosaccharides (64). Immunoblot analysis revealed that CEACAM1 a -4L from transfected COS cells reacted strongly with pAb 669 both before and after digestion with neuraminidase alone or with neuraminidase and O-glycanase (double digestion) (Fig. 9B). When immunoblots were repeated with mAb 9.2, reactivity with both single and double digested CEACAM1 a -4L proteins was also detected (Fig. 9B), but the level of binding was much less than with pAb 669. In contrast, mAb 9.2 showed no reactivity with undigested, neuraminidase-digested, or double digested CEACAM1 b -4S (Fig. 9C). CEACAM1 a -4L and CEACAM1 b -4S (Fig. 9, B and C) visualized in blots with pAb 669 also showed an initial decrease in size following digestion with neuraminidase and a further reduction following digestion with O-glycanase, indicating that the two isoforms were O-glycosylated when expressed in transfected COS cells. However, CEACAM1 a -4L and CEACAM1 a -4S from isolated hepatocytes, which demonstrated strong reactivity with pAb 669 and mAb 9.2, were decreased in size by neuraminidase but did not show a further decrease following digestion with O-glycanase (Fig. 9D), a result consistent with previous studies identifying N- but no O-linked glycans on CEACAM1 a (65). Decreases in size were also observed with single or double digested fetuin, a glycoprotein with O-linked sugars known to be susceptible to O-glycanase, confirming that the two enzymes were active under the conditions used for digestion (data not shown).

DISCUSSION
In this report, we present evidence that N-domains from members of the CEACAM1 branch of the CEA family from both human and rat contain two Ig-like V-regions. These V-regions were similar in size and location to those described previously by Kodelja et al. (30) for N-domains from the pregnancy-specific glycoprotein (PSG) branch of the CEA family. The overall similarity of the CEACAM1 V-regions in the low similarity groups of rat and human was also comparable with that reported for PSG V-regions, but these V-regions were much lower in sequence conservation than V-regions from the high similarity rat and human N-domain groups. This can be seen by comparing the range of variability (y axis) in the 1st and 2nd panels in the top row and the 2nd and 3rd panels in the bottom row of Fig. 3. According to Zimmermann (3), the PSG family in the rat most likely arose after separation of rodent and primate orders and then underwent extensive expansion before mouse/rat speciation. Zimmermann (3) further suggests that Ceacam1 a , Ceacam1 b , and Ceacam10 are relatively recent arrivals that arose by gene duplication after mouse/rat speciation. These rat Ceacam1 family members may thus have had less time than the PSG family to diverge and undergo expansion. The same argument may also apply to the closely related human V-domains (hCEACAM1, -3, -5, and -6).
Out of a total of 26 altered codons in Ceacam1 N-domains, 88% (23/26) had base changes in the first and second nucleotide. Moreover, 92% of the base changes resulted in amino acid replacement (nonsynonymous substitutions). Quantification of patterns of synonymous and nonsynonymous differences further showed that the amino acid altering (nonsynonymous) mutations were becoming established in the population more frequently than "silent" (synonymous) mutations. If this locus were evolving under a simple model of purifying selection, one would expect that amino acid mutations would have a negative impact on the functional aspects of the gene product and that synonymous mutations would be more common. An alternative model that is more consistent with the data is one where amino acid changes have some positive effect, a conclusion similar to that of Hughes and Nei (66) and more recent studies of proteins involved in antigen recognition (67). Significant in this regard is the observation that a number of the amino acids involved in the Ig fold (67) were conserved in all of the N-domains. In addition, the β-strand locations in the N-domains of CEACAM11 and CEACAM1, which have only a 48% sequence similarity, are virtually identical based on secondary structure predictions using the PSA Sequence Analysis Server from the Biomolecular Engineering Research Center. Similar findings obtained from a comparison of human and rat N-domains led Rudert et al. (68) to conclude that there is little functional constraint on the primary amino acid sequence except for key amino acids needed for the Ig fold.
Positive change is generated in antibodies by somatic hypermutation in V-domains or untemplated base changes during VDJ recombination that modify antibody specificity or affinity, thereby allowing the immune system to respond to mutations in infectious agents (31). It seems likely that naturally occurring mutations in expressed CEACAM1 N-domains would have a similar effect: modification but not loss of adhesive activity.

FIG. 9 legend (partial). 1 and 4). Immunoblots of digested CEACAM isoforms resolved by SDS-PAGE on 7.5% gels were performed with both pAb 669 (lanes 1-3) and mAb 9.2 (lanes 4-6). Sizes of intact and deglycosylated proteins are indicated on the left. B and C, immunoblot of purified CEACAM1 a -4L and CEACAM1 b -4S, respectively. D, an immunoblot of CEACAM1 a -4L and CEACAM1 a -4S splice variants purified from Fischer rat hepatocytes. Note that both CEACAM1 a -4L and CEACAM1 b -4S from COS-1 cells but not hepatocytes show a decrease in size following O-glycanase digestion, indicating the presence of O-linked glycans on the former but not the latter cell type.
The fact that the N-domains for CEACAM1 a , -1 b , and -10 (69) all have adhesion activity is consistent with this idea. Carrying the analogy to antibodies a step further, amino acid sequence diversification of CEACAM1 N-domains should be manifested by functional modifications similar to those that occur in antibodies, namely alterations in affinity or specificity that are advantageous for a particular kind of epithelium. It is noteworthy that all of the rat CEACAM1 proteins with adhesive activity are in the group of high similarity N-domains with amino acid sequence similarities relative to CEACAM1 a ranging from 86 to 95%, values much higher than those for CEACAM9 and -11 (52 and 61%, respectively). Although Ceacam9 is highly conserved between mouse and rat, it does not appear to play an essential role in development because its loss in Ceacam9(-/-) mice had no effect on placental, embryonic, or postnatal development (70). Even less is known about the functional activity of rat CEACAM11 except by inference to the structurally identical mouse CEACAM11, the loss of which had no discernible consequence except for a small reduction in litter size (71). Whether the amino acid sequence variations in Nx, Ny, and Nz lead to altered binding properties remains to be determined using the approach described by Lin et al. (69) for CEACAM10. However, this type of analysis would be warranted only if expression of Nx, Ny, and Nz at the RNA or protein level can be demonstrated in future studies by carrying out a more exhaustive tissue analysis.
The same relationship between sequence similarity and functional activity held true for 3/4 members of the high similarity group (88% similarity) in human, all three of which mediate cell aggregation. No cell-cell adhesive activity has been reported for CEACAM3, another member of the human high similarity group (89%). However, the close match of its N-domain to CEACAM1 makes it highly likely that the CEACAM3 N-domain has cell adhesion activity, a supposition supported by its ability to bind Neisseria gonorrhoeae via its opa receptors (72). Although human CEACAM8, a neutrophil CAM in the low similarity group (73%), retains adhesive activity, its adhesion partners are limited to CEACAM6, a widely expressed neutrophil CAM that binds to human CEACAM1, -5, -8 (73), opa proteins (74), and itself. In contrast, CEACAM7 and CEACAM4, with similarities of 64 and 58%, cannot bind opa receptors (74) and have no documented cell-cell adhesion activity.
Taken together, these observations suggest that a direct relationship exists between the degree of similarity to CEACAM1 and functionality, i.e. the lower the similarity, the lower the functionality. This relationship also seems to hold true for the mouse hepatitis virus (MHV) receptor activity of mouse CEACAM1 a , CEACAM1 b , and CEACAM2 with sequence similarities of 100, 75, and 59%, respectively. When tested for MHV binding, the efficiency of CEACAM2 as a receptor was found to be 10-100-fold lower than that of CEACAM1 (58,75). In virus neutralizing assays, the inhibition of MHV-A59 infectivity by CEACAM1 a was 4-fold and 1000-fold greater than that by CEACAM1 b and CEACAM2, respectively. Thus, the decrease in similarity relative to CEACAM1 was paralleled by a decrease in the efficiency of MHV binding.
Aside from direct effects resulting from differences in primary structure, sequence diversification could also alter binding properties of N-domains indirectly by creating, eliminating, or changing the usage of sites for post-translational modifications, the end result being subtle changes evidenced in the present study by loss of mAb binding. Diminished mAb binding to CEACAM1 b , for example, could be caused in part by differential glycosylation. This would be consistent with the strong reactivity of both mAb 9.2 and pAb 669 with hepatocytes and the much weaker reactivity of mAb 9.2 relative to pAb 669 with immunoblots prepared from CEACAM1 a -positive COS-1 cells (Fig. 9B), a difference that could result from differential N-glycosylation or O-glycosylation at Thr-91 in CEACAM1 b . Based on recent reports, it is clear that relatively small changes in O- or N-linked oligosaccharides are sufficient to alter conformation and mAb binding (76-79). NMR studies by Huang et al. (76) showed that addition of N- or O-linked sugars to a 24-residue peptide from the human immunodeficiency virus glycoprotein 120 envelope had major effects on local conformation, induced minor changes at more distant sites, and enhanced or reduced mAb binding to an epitope on this peptide. Similar effects could account for the ability of CEACAM1 b -4S to mediate aggregation when expressed in COS-1 cells but not in Sf9 cells, where altered glycosylation shifts the molecular mass of CEACAM1 b -4S from 105 to 70 kDa (4).
In the present studies, glycosidase digestion unexpectedly revealed the presence of O-glycans on both forms of CEACAM1 in COS-1 cells but not in rat hepatocytes, the latter a finding consistent with previous reports (65). This suggested that either O-glycosylation was not involved in the loss of mAb binding or, alternatively, that differences in the pattern of O- and possibly N-glycosylation were primarily responsible, an alternative that will be examined in future studies. Marked cell type-specific differences in N-linked glycans on CEACAM1 have been described by Kannicht et al. (80), but we believe this is the first evidence for tissue-specific O-glycosylation of CEACAM1.
Of the three peptides tested for their ability to block aggregation, only peptide 9.2, containing epitope B, could completely inhibit adhesion (Table III), suggesting that epitope B was within or proximal to the adhesive domain. Further analysis with variants of peptide 9.2 containing C- and N-terminal deletions (Table III) defined the adhesion epitope for rat CEACAM1 to be PDSEIARYIRS, a sequence encompassing the entire C′ β-strand and part of the CC′ and C′C″ loops. The adhesive sites for [...] (33), hCEACAM1 (28), and the docking sites for MHV (57-59) and opa proteins (81,82) are located primarily in the C β-strand and the CC′ loop domain (Fig. 2B). It seems likely that the shift in the rat adhesive domain to the C′ β-strand stems at least in part from primary sequence-related differences in conformation. More important, despite the differences in their N-domains, CEACAM1 a -4L, CEACAM1 b -4S, and CEACAM10 were all capable of mediating cell adhesion, providing another example in the Ig family of diversification without loss of function. Indeed, the major adhesive epitopes for human, mouse, and rat were located in the middle of the first V-region, exactly the location one would predict if the object of diversification was to alter adhesive properties to produce changes in affinity or specificity that are favorable in certain tissues but have a neutral effect in others.
The partial loss of mAb 9.2 reactivity with mutants lacking the A1 domain suggested that epitope C in the C-terminal G β-strand was dependent on A1 for its correct presentation, a finding similar to that reported for the binding of virus to the MHV-A59 receptor in mice (57). Epitope C was the only one of the three epitopes that was recognized primarily by mAb 9.2 (Fig. 7), suggesting it was responsible, in part, for the stronger adhesion blocking activity of mAb 9.2 relative to mAb 362.50. At odds with this conclusion was the inability of the peptide containing the C epitope to block cell aggregation (Table III, peptide 9.2C). However, a peptide containing epitope C would be unable to destabilize intercellular homophilic adhesion if binding between B epitopes was much stronger than between C epitopes. This viewpoint is consistent with results from mAb binding studies with a series of truncated CEACAM1(a) N-domain peptides, which indicated that each of the three epitopes was necessary but not sufficient for mAb binding.³ It can be surmised that deletion of any one of the epitope regions causes conformational changes in the remaining epitopes that greatly diminish antibody binding. The same may hold true for peptides, where effective blocking of cell adhesion may depend on peptide-specific variations in conservation of tertiary structure, a variable that may be more critical for adhesion blocking than for antibody binding. Alternatively, the adhesive epitope may be upstream from peptide 9.2, another possibility currently under investigation.
Another seeming inconsistency is the lack of reactivity of both mAbs with the CEACAM1(b) N-domain. Peptide array analysis showed that despite the differences in primary sequence, mAb 362.50 bound to the B epitope region from both CEACAM1(a) and CEACAM1(b). Additionally, binding of mAb 9.2 with epitope C in the CEACAM1(b) and CEACAM1(a) N-domains should be virtually identical because the amino acid sequence for epitope C is exactly the same. A possible explanation for these results is that because of their proximity to V-domains, the small differences in conformation produced by minor differences in sequence may be sufficient to alter binding affinity, a possibility consistent with the subtle difference in the location of epitope B in the CEACAM1(a) and CEACAM1(b) isoforms. Regardless of the reason, these results strongly suggest that relatively minor differences in sequence may have significant effects on adhesive and antibody binding properties of naturally occurring CEACAM1 N-domain variants.
In summary, we have identified three unique N-domains (Nx, Ny, and Nz) that are present in the rat genome. In keeping with the current nomenclature, we propose that these should be designated as Ceacam12, Ceacam13, and Ceacam14. We have also shown that in the rat and human CEACAM1 families, there are subgroups with high and low sequence similarity. We present evidence that shows for the first time that recombination plays a significant role in the generation of diversity in the rat but not the human group of high similarity N-domains. We also show that CEACAM1 N-domains in both humans and rats harbor two variable regions that contain or are situated adjacent to adhesive domains. In addition, we have delineated cell type-specific differences in O-glycosylation, a new modification that could alter the binding characteristics of CEACAM1 N-domains. Finally, we have defined a primary cell-cell adhesive epitope in the rat CEACAM1(a) N-domain that differs from the principal adhesion site defined for mouse CEACAM1 and MHV. Moreover, a peptide corresponding to this epitope blocked adhesion mediated by both the CEACAM1(a) and CEACAM1(b) alleles, showing that sequence diversification did not significantly alter this domain.