Characterization of cadherin-24, a novel alternatively spliced type II cadherin.

Cadherins comprise a superfamily of calcium-dependent cell-cell adhesion molecules. Within the superfamily are six subfamilies including type I and type II cadherins. Both type I and type II cadherins are composed of five extracellular repeat domains with conserved calcium-binding motifs, a single pass transmembrane domain, and a highly conserved cytoplasmic domain that interacts with beta-catenin and p120 catenin. In this study, we describe a novel cadherin, cadherin-24. It is a type II cadherin with a 781-codon open reading frame, which encodes a type II cadherin protein complete with five extracellular repeats containing calcium-binding motifs, a transmembrane domain, and a conserved cytoplasmic domain. Cadherin-24 has the unusual feature of being alternatively spliced in extracellular repeat 4. This alternative exon encodes 38 in-frame amino acids, resulting in an 819-amino-acid protein. Sequence analysis suggests the presence of beta-catenin and p120 catenin-binding sequences, and immunoprecipitation experiments confirm the ability of both forms of the novel cadherin to associate with alpha-catenin, beta-catenin, and p120 catenin. In addition, aggregation assays show that both forms of cadherin-24 mediate strong cell-cell adhesion.

Cadherins comprise a superfamily of calcium-dependent cell adhesion molecules with six subfamilies including type I and type II cadherins (1). Type I cadherins like E-, P-, and N-cadherin are the transmembrane components of cellular structures known as adherens junctions. Adherens junctions connect adjacent cells to one another and link the cadherin to the actin cytoskeleton. Cadherins are involved in morphogenesis of tissues such as the neural tube (2-4), and their misexpression has been implicated in human malignancies (5-12).
Cadherins are comprised of five extracellular repeat domains with conserved calcium-binding motifs, a single pass transmembrane domain, and a highly conserved cytoplasmic tail. The first extracellular repeat of type I cadherins mediates homotypic interactions and contains a conserved histidine-alanine-valine (HAV) sequence (13). The conserved calcium ion-binding sequences LDRE, DXNDN, and DXD coordinate the calcium ions that bridge the extracellular repeats (1). Binding of calcium to these motifs provides rigidity and confers the proper adhesive strength necessary for cell-cell adhesion (14). A conserved stretch of hydrophobic peptides constitutes the transmembrane domain, and the cytoplasmic domain binds directly to β-catenin and p120 catenin with the former indirectly linking the cadherin to the actin cytoskeleton via interactions with α-catenin (15-17).
Like type I cadherins, type II cadherins are comprised of five extracellular repeats with conserved calcium-binding motifs, a single pass transmembrane domain, and a highly conserved cytoplasmic tail (1). Unlike type I cadherins, the first extracellular repeat of type II cadherins does not contain the HAV sequence although some contain QAV (18). Type II cadherins are not as well characterized as type I cadherins, although their aberrant expression in cancer has also been described (19-22).
In this study, we report the characterization of a novel human type II cadherin, which has been designated cadherin-24. The shorter isoform of cadherin-24 encodes a typical type II cadherin, complete with five extracellular repeats containing conserved calcium-binding motifs, a transmembrane domain, and a conserved cytoplasmic tail. Homology studies predict that cadherin-24 is 57% identical to human cadherin-11, a type II cadherin expressed by osteoblasts and some breast cancer cell lines (21-23). Like cadherin-11, cadherin-24 exists as two alternatively spliced forms. However, unlike cadherin-11, the alternative splice site in cadherin-24 is in the extracellular domain, and the longer, alternatively spliced form binds β-catenin. Cadherin-24 is the first cadherin shown to exist as two full-length alternatively spliced functional cadherins capable of binding catenins. Peptide sequence analysis suggests the presence of β-catenin and p120 catenin-binding sequences in the cytoplasmic domain of cadherin-24, and immunoprecipitation experiments confirm the ability of cadherin-24 to associate with β-catenin and p120 catenin.

EXPERIMENTAL PROCEDURES
RNA Preparation-Total RNA was isolated from MDA-MB-231 cells using the RNA Isolator kit (Genosys Biotechnologies, Inc., The Woodlands, TX).
cDNA Cloning-Single strand cDNA was synthesized using a GeneAmp RNA PCR kit (PerkinElmer Life Sciences). Marathon-ready human mammary gland cDNA was purchased from Clontech Laboratories, Inc. cDNAs were amplified using PCR and degenerate primers as described for amplification of cadherins and protocadherins (24). Larger cDNAs were generated using 5′ and 3′ rapid amplification of cDNA ends (RACE) using a kit from Clontech and gene-specific primers. Standard sequencing of PCR products and full-length constructs was done at the Genomics Core Research Facility (University of Nebraska-Lincoln; Lincoln, NE). Oligonucleotides were synthesized by the
* This work was supported by National Institutes of Health Grants GM51188 (to M. J. W.) and DE12308 (to K. R. J.). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™
The Genscan algorithm (25) was used to predict the open reading frame and amino acid sequence of cadherin-24 based on partial 5′ and 3′ RACE products. The gene-specific primers NC40 5′-GCCCCTCCACCCCAGCCAGCTCAT-3′ and NC43 5′-AGCTGGCCAGGAGCTGCAGAGTCACACAC-3′ were designed based on the nucleotide sequence and the predicted open reading frame. PCR fragments were amplified using the FailSafe™ PCR system (Epicentre Technologies, Madison, WI), subcloned, and sequenced. An expressed sequence tag (EST) (GenBank™ accession number AL137477) was obtained from RZPD Deutsches fur Genomforschung GmbH (Berlin, Germany) to assemble the cDNA encoding the predicted open reading frame of cadherin-24. A 2X-birch profilin tag (26) was added to the C terminus of cadherin-24 by PCR and subcloning into a modified version of pSPUTK (27).
Transfection and Retroviral Infection-To express human cadherin-24 in mammalian cells, the 2X-Cbirch-tagged construct was ligated into the shuttle vector pMS and subcloned into the retroviral expression vector pLZRS-MS-IRES-neomycin (28,29). Phoenix packaging cells (29) were transfected with pLZRS-MS-neomycin-cad24-2X-Cbirch using a calcium phosphate kit (Stratagene, La Jolla, CA). Transfectants were selected with 1 μg/ml puromycin (Sigma). Phoenix transfectants were passaged twice in antibiotic-free medium prior to viral harvest. For virus production, cells at 50% confluence were incubated in a 100-mm dish with 5 ml of Dulbecco's modified Eagle's medium containing 10% fetal bovine serum at 32°C for 24 h. Medium was collected, filtered through a 0.45-μm filter (Nalgene, Rochester, NY), supplemented with 4 mg/ml polybrene (Sigma), and used immediately. For infection, cells were plated at 10⁵ cells/100-mm dish and infected 12-16 h later. ... and replaced with fresh medium. Stable cell populations were selected in 1 g/ml G418 (Invitrogen) for 7-10 days.

FIG. 1. Our sequence is identical to the sequence in the November 2002 freeze of the human genome (genome.ucsc.edu/). Additional 5′ and 3′ sequences that were derived from databases are discussed under "Results" but are not presented here. The preprotein region containing the putative signal peptide and furin cleavage site is underlined. The predicted extracellular domain is divided into five subdomains (EC1-EC5). A putative transmembrane domain is underlined with a wavy line. Three potential N-linked glycosylation sites are shown by asterisks. The shaded nucleotide and amino acid sequences indicate the insertion in the long form.

For aggregation assays, cells were trypsinized to generate single cell suspensions and resuspended at a density of 2 × 10⁵ cells/ml. 5,000 cells (20 μl) were placed on the inside cover of a 100-mm dish and allowed to aggregate at 37°C for 18 h. The cells were triturated, remaining aggregates were observed using a Zeiss Axiovert 200M equipped with an ORCA-ER (Hamamatsu) digital camera, and images were collected using OpenLab software (Improvision Inc., Boston, MA).
Detergent Extraction of Cells-Confluent monolayers were rinsed three times with phosphate-buffered saline and extracted in TNE (10 mM Tris-acetate, pH 8.0, 0.5% Nonidet P-40, 1 mM EDTA) containing 2 mM phenylmethylsulfonyl fluoride and 2 mM sodium orthovanadate. The cells were placed on ice, scraped, and triturated vigorously for 10 min. Insoluble material was pelleted by centrifugation at 14,000 × g for 15 min at 4°C, and the supernatant was used immediately or stored at −80°C.
Immunoprecipitation and Immunoblot-All polypropylene tubes were rinsed with 0.1% Nonidet P-40 and dried prior to use in immunoprecipitations. 300 μl of cell extract was added to 300 μl of hybridoma-conditioned medium and gently mixed at 4°C for 30 min. 50 μl of packed anti-mouse IgG affinity gel (ICN Biochemical Co., Costa Mesa, CA) was added, and mixing was continued for 30 min. Immune complexes were washed five times with TBST (10 mM Tris-HCl, pH 7.5, 150 mM NaCl, 0.05% Tween 20) containing 2 mM sodium orthovanadate, resuspended in 50 μl of 2× Laemmli sample buffer, and boiled for 5 min, and the proteins were resolved by SDS-PAGE (31). Proteins were electrophoretically transferred overnight to nitrocellulose membranes and immunoblotted as described previously (32).
Immunofluorescence-Cells were grown on glass coverslips to 80% confluence, fixed in 1% paraformaldehyde for 30 min, and permeabilized in methanol at −20°C for 5 min. After three 5-min washes in serum-free culture medium, the coverslips were blocked in 10% goat serum in culture medium for 30 min and processed as described previously (32).
Antibodies-The 4A6 mouse monoclonal antibody against the birch epitope was a kind gift from Dr. Manfred Rudiger (Zoological Institute, Braunschweig, Germany). Mouse monoclonal antibodies to α-catenin (1G5) and β-catenin (5H10) have been described previously (33). The mouse monoclonal antibody to p120 catenin (pp120) was obtained from BD Transduction Laboratories.
Tissue Expression Screening-Relative expression levels of cadherin-24 mRNA were estimated by reverse transcription PCR using a human multiple tissue cDNA panel (Clontech Laboratories, Inc.) and the FailSafe™ PCR System (Epicentre) with gene-specific primers NC42 5′-CCCCTGCCAGCCCAATCAGATACTCCATCC-3′ and NC43, described above. PCR products were resolved on a 1% agarose gel.
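
As a quick sanity check on the gene-specific primers listed in these procedures, the following minimal sketch computes the length, GC fraction, and reverse complement of NC40, NC42, and NC43. It assumes that the hyphens printed inside the NC40 and NC43 sequences are line-break artifacts and removes them; the helper names are illustrative only and are not part of the original methods.

```python
# Primer sequences as printed in the text (5' to 3'). Hyphens inside the
# printed NC40 and NC43 sequences are ASSUMED to be line-break artifacts
# and have been removed here.
PRIMERS = {
    "NC40": "GCCCCTCCACCCCAGCCAGCTCAT",
    "NC42": "CCCCTGCCAGCCCAATCAGATACTCCATCC",
    "NC43": "AGCTGGCCAGGAGCTGCAGAGTCACACAC",
}

# Translation table mapping each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of bases in seq that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), f"GC={gc_fraction(seq):.2f}", reverse_complement(seq))
```

All three primers are GC-rich, which is in keeping with the high GC content the authors mention when describing the 3′ end of the transcript.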

RESULTS
Molecular Cloning of Cadherin-24-We identified cadherin-24 in MDA-MB-231 breast cancer cells using degenerate primers designed to amplify cadherins and protocadherins (24). A PCR product of ~450 bp was amplified that had homology to but was distinct from cadherin-11. To isolate a full-length cDNA clone for cadherin-24, gene-specific primers were designed, and 5′ and 3′ RACE products were amplified from human mammary gland cDNA. Multiple attempts at more complete amplification of the 3′ end of cadherin-24 failed, presumably due to its high GC content. However, an EST was available (GenBank™ accession number AL137477) that included the necessary 3′ sequence to complete the transcript. Lastly, the 5′ PCR fragments representing both the long and short forms were ligated to a 3′ cDNA fragment to complete the cDNA clones. The final cDNAs were completely sequenced. The deduced amino acid sequences indicated that the cDNAs encode two isoforms of a novel cadherin (Fig. 1).
We employed PCR to more carefully examine the region of alternative splicing using cDNA from MDA-MB-231 and primers NC42 and NC43 (see "Experimental Procedures"). Three bands (Fig. 2) were excised and sequenced. The slowest migrating band was not related to cadherin-24 and was presumed to be an artifact. Sequence analysis confirmed that the other two bands were cadherin-24. The longer form contained an insertion of 114 bp in the fourth extracellular repeat of the predicted amino acid sequence (Fig. 1). Data from the Human Genome Project confirmed that the 114-bp insert is an authentic exon flanked by well defined intron boundary sequences.
cDNA and Amino Acid Sequence of Cadherin-24-The nucleotide sequences and the deduced amino acid sequences of the two distinct cDNA clones are shown in Fig. 1. The short form contains a 2,346-bp open reading frame that encodes a putative 781-amino-acid protein, whereas the long form includes an open reading frame of 2,460 bp, which encodes a putative 819-amino-acid protein. We thus termed the cDNA clones cadherin-24 short form and cadherin-24 long form. The predicted start codon has a purine in the −3 position in accordance with the Kozak criteria (34). Cadherin-24 contains a hydrophobic signal sequence and the postulated furin cleavage site of cadherin precursor polypeptides (35,36). The deduced amino acid sequence of mature cadherin-24 displays homology with the cadherin family as it includes: 1) an extracellular domain comprised of five cadherin-specific repeats (Fig. 1, arrows); 2) a transmembrane domain (Fig. 1, underlined); and 3) a cytoplasmic tail complete with amino acid sequences previously reported to bind catenins (37,38). There are three potential N-linked glycosylation sites in the extracellular domain that are pointed out by asterisks.
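
The relationship among the reported open reading frame lengths, protein lengths, and the 114-bp insertion can be checked with simple arithmetic. The sketch below assumes that each stated open reading frame length includes the 3-bp stop codon, which is the reading under which the published numbers agree; the function name is illustrative.

```python
def encoded_residues(orf_bp: int) -> int:
    """Amino acids encoded by an ORF whose length (in bp) includes the stop codon.

    ASSUMPTION: the reported ORF length counts the terminal stop codon.
    """
    if orf_bp % 3 != 0:
        raise ValueError("an open reading frame must be a multiple of 3 bp")
    return orf_bp // 3 - 1  # subtract the stop codon


SHORT_ORF_BP = 2346  # short form, from the text
LONG_ORF_BP = 2460   # long form, from the text

insert_bp = LONG_ORF_BP - SHORT_ORF_BP  # size of the alternative exon
insert_residues = insert_bp // 3        # in-frame amino acids it adds

if __name__ == "__main__":
    print("short form residues:", encoded_residues(SHORT_ORF_BP))
    print("long form residues:", encoded_residues(LONG_ORF_BP))
    print("insert:", insert_bp, "bp =", insert_residues, "residues")
```

Because 114 is divisible by 3, the alternative exon adds residues without shifting the downstream reading frame, which is why the long form retains the transmembrane and cytoplasmic domains.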
Comparison of Cadherin-24 with Other Cadherins-Fig. 3 compares the amino acid sequences of cadherin-24, cadherin-11, cadherin-8, N-cadherin, E-cadherin, and P-cadherin. Cadherin-24 shows a total identity of 57% with cadherin-11, 55% with cadherin-8, 36% with N-cadherin, 33% with E-cadherin, and 31% with P-cadherin. The first two extracellular repeats of cadherin-24 show the highest homology among the compared cadherins (Table I). The extracellular domain of cadherin-24 includes the characteristic cadherin consensus sequences DXD, DRE, and DXNDN that are believed to be involved in Ca²⁺ binding (39). The N-terminal WV in EC1 is conserved, and the classical HAV sequence is replaced with QAV. All 4 cadherin conserved cysteine residues are present in EC5.
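
The text does not state how the percent identities were computed. As an illustration only, the sketch below implements one common definition: identical residues divided by the number of alignment columns in which both sequences have a residue. The two short aligned strings are hypothetical toy data, not cadherin sequences, and other definitions (for example, dividing by the full alignment length) would give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned sequence has a gap.

    This is one of several definitions in use; it is NOT necessarily the
    one used to produce the values reported in the paper.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical pre-aligned toy sequences (gap shown as "-").
    print(round(percent_identity("WVIPP-Q", "WVLPPAQ"), 1))
```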
Expression of Cadherin-24-We analyzed tissue-specific expression of cadherin-24 mRNA in human tissues by reverse transcription-PCR using gene-specific primers flanking the alternative exon insertion point. Fig. 4 shows that cadherin-24 short form was highly expressed in all tissues examined, whereas cadherin-24 long form was expressed in brain, kidney, lung, pancreas, and placental tissue. This is in contrast to MDA-MB-231 human mammary tumor cells, where there was approximately equal expression of the long and short forms (Fig. 2).
Functional Analysis of Cadherin-24-To determine whether cadherin-24 is a functional cadherin that mediates cell-cell adhesion, birch-tagged constructs of either the long form or the short form were transduced into the cadherin-negative A431D cell line. The expression level was examined by immunoblot and immunofluorescence (Fig. 5). Cadherin-24-2X-birch short form and cadherin-24-2X-birch long form each were detected as a doublet with molecular weights of ~110,000 and 105,000 (Fig. 5A). The upper band is probably not fully processed and likely retains the prosequence. This is commonly seen when cadherins are exogenously expressed in A431D cells (32). The transfectants exhibited elevated levels of β-catenin and α-catenin, suggesting that both cadherin-24 short form and cadherin-24 long form interact with and stabilize the catenins. Immunofluorescence analysis showed that both the long form and the short form of cadherin-24 were localized to cell-cell borders (Fig. 5B). The diffuse signal seen in the perinuclear region probably reflects partially processed intracellular forms of cadherin-24 corresponding to the higher molecular weight bands seen in Fig. 5A, lanes 2 and 3. To confirm that both forms of cadherin-24 interact with catenins, the cadherin was immunoprecipitated from extracts of transfected cells using an antibody against the birch tag. Fig. 6A shows that α-catenin, β-catenin, and p120 catenin all co-immunoprecipitate with both the long form and the short form of cadherin-24. A431D cells express two isoforms of p120, and both forms co-immunoprecipitate with both cadherin-24 proteins. To determine whether the cadherin-24 isoforms could mediate cell-cell interactions, we performed aggregation assays using the control A431D cells and the cells transduced with each form of cadherin-24. Fig. 6B shows that both the long form and the short form cause the cells to form large aggregates, whereas the control A431D cells do not aggregate.
Thus cadherin-24 is expressed as two alternatively spliced forms, and each form is a fully functional cadherin that mediates cell-cell interactions.

TABLE I Amino acid sequence comparisons among human cadherins
Amino acid sequences for the extracellular (EC1-5), transmembrane, and cytoplasmic domains of cadherin-24 (presented in Fig. 3) were compared with the corresponding domains of cadherin-11, cadherin-8, N-cadherin, E-cadherin, and P-cadherin, and the percent identity was calculated. In addition, the overall identities among the mature proteins were calculated.

Genomic Organization of Cadherin-24-Fig. 7 shows the genomic structure of cadherin-24. The gene encoding human cadherin-24 lies on the long arm of chromosome 14 in a gene-rich segment near the centromere and is transcribed toward the centromere. The closest neighbor of cadherin-24 on the 5′ side is acinus (GenBank™ accession code NM_014977). Acinus and cadherin-24 are transcribed in the same direction with the 5′ end of the cadherin-24 transcript only ~1 kb from the 3′ end of acinus. The 3′ end of the cadherin-24 transcript lies only ~12 kb from the 5′ end of PSMB5 encoding the proteasomal β5 subunit (NM_002797). The cadherin-24 gene is predicted to consist of 14 exons. The most 5′ and the most 3′ exons are non-coding. All the introns start with GT and end with AG. Although our longest 5′ RACE product extended into exon 1, the EST BX248750 adds additional sequence to the 5′ end and includes the most 5′ sequence in the databases. A number of ESTs, including GenBank™ accession number AL137477, end in a poly(A) tail, which is located just 3′ of an AATAAA consensus polyadenylation signal (40), suggesting that these ESTs define the 3′ end of the mRNA.

DISCUSSION
The type I and type II cadherins comprise a large family (1). Deletions of coding exons in, for example, E-cadherin have been associated with pathological conditions (41,42). However, only a few cases have been reported where multiple protein products are produced from a given cadherin under normal conditions. Two transcripts encoding different protein isoforms of human cadherin-11 have been described that are predicted to contain identical extracellular domains but completely different cytoplasmic tails (23). The cytoplasmic tail of the longer isoform contains typical p120 catenin and β-catenin-binding regions (GenBank™ accession code NM_001797), whereas that of the shorter isoform does not (NM_033664). The transcript encoding the shorter protein contains an extra exon relative to the transcript encoding the more typical cadherin. The extra exon interrupts codon 632, which is within the predicted transmembrane domain of each isoform. Although a number of other cadherins, including human cadherin-24, contain an exon-intron boundary in a similar position within the transmembrane domain, no other cadherins have been reported to be alternatively spliced in this position. Recently, the shorter isoform of cadherin-11 has been reported to affect cell behavior (21).

FIG. 5. Expression and localization of transduced cadherin-24. As shown in A, lysates of control uninfected A431D cells (c), cells expressing the short form of cadherin-24 (sf), or the long form of cadherin-24 (lf) were resolved by SDS-PAGE and immunoblotted for cadherin-24 using the anti-birch tag antibody (lanes 1-3), anti-β-catenin (lanes 4-6), or anti-α-catenin (lanes 7-9). The positions of the size markers are indicated on the left. As shown in B, A431D cells expressing cadherin-24 short form (a and b) or long form (c and d) were grown on glass coverslips and processed for immunofluorescence using the anti-birch tag antibody. Phase micrographs (a and c) and corresponding fluorescence micrographs (b and d) are shown. Bar = 10 μm.

FIG. 6. Cadherin-24 is a functional cadherin. As shown in A, lysates of transduced A431D cells expressing the short form of cadherin-24 (sf) or the long form of cadherin-24 (lf) were immunoprecipitated with antibodies against the birch tag, resolved by SDS-PAGE, and immunoblotted with antibodies against the tag (lanes 1 and 2), β-catenin (lanes 3 and 4), α-catenin (lanes 5 and 6), or p120 catenin (lanes 7 and 8). The positions of the size markers are indicated on the left. As shown in B, transduced A431D cells were trypsinized completely to obtain single cells, suspended in Dulbecco's modified Eagle's medium supplemented with 10% calf serum, plated (5,000 cells in 20 μl) on the inside cover of a 100-mm dish, and allowed to aggregate at 37°C for 18 h in a CO₂ incubator.

Two transcripts for rat cadherin-22, also known as rat PB-cadherin, have also been reported (GenBank™ accession numbers D83348 and D83349). The transcripts encode proteins that diverge completely in sequence after 23 residues of their putative cytoplasmic tails. Thus, as with human cadherin-11, the longer isoform has binding sites for both p120 catenin and β-catenin, whereas the shorter isoform does not. We used each transcript to search the November 2002 freeze of the rat genome (genome.ucsc.edu/) and found that the entire sequence of the more typical transcript maps to rat chromosome 3. However, the sequence unique to the variant transcript maps to rat chromosome 6. In addition, the point where the two transcripts diverge does not correspond to a normal exon-intron boundary. These data raise the possibility that the variant transcript may be a chimeric cDNA rather than an alternatively spliced transcript.
In addition to cadherin-24, there have been a few other reports of cadherins that are alternatively spliced in the extracellular domain. Recently, a secreted form of chicken cadherin-7 was described that was due to alternative splicing at an exon-intron boundary ~55 amino acids into extracellular repeat 5 (43). The variant transcript contained an exon that is normally skipped. After 14 codons, the variant exon contained a termination codon; thus, the resulting polypeptide was predicted to be secreted. The authors found that the truncated protein inhibited cell-cell adhesion mediated by cadherin-7. Since type II cadherins can interact heterophilically (44), the secreted protein has the potential to affect the adhesive interactions between a variety of type II cadherins.
Similar to the case of chicken cadherin-7, two transcripts encoding rat cadherin-8 that diverge from one another in extracellular repeat 5 have been described (45). One transcript encodes a cadherin with a typical structure (GenBank™ accession number AB010436), and the other encodes a protein that diverges from the first after ~20 amino acids into extracellular repeat 5 (AB010437). The variant protein terminates after a further 19 amino acids, suggesting that it may be secreted. When we analyzed the sequences of the two transcripts using the November 2002 freeze of the rat genome, we found that the variant transcript diverged precisely at an exon-intron boundary, but the variant transcript continued into the adjacent intron throughout the remainder of its sequence. It will be interesting to determine whether the variant cDNA represents an incompletely processed transcript or a secreted form of the protein since a secreted form of cadherin-8 would be expected to interfere with cell-cell adhesion mediated by both cadherin-8 and cadherin-11 (44).
Human cadherin-6 (also known as K-cadherin) has been reported to be alternatively spliced in the extracellular domain (46). These authors reported several variants of human cadherin-6 and characterized a variant called cadherin-6/2 that was missing a portion of extracellular repeats 3 and 4. Using the sequence in GenBank™ accession number NM_004932 to represent the typical human cadherin-6 transcript and the November 2002 freeze of the human genome, we found that cadherin-6/2 is missing exon 7. Since exon 7 contains 254 nucleotides, splicing exon 6 to exon 8 would be expected to change the reading frame, and a stop codon would be encountered 15 codons after the splice junction. Thus, the expected result would be a secreted protein. However, cadherin-6/2 mediated cell adhesion when transfected into L-cells. In addition to missing exon 7, cadherin-6/2 was also missing nucleotide 34 of exon 8, putting the reading frame back into that of cadherin-6, so that cadherin-6/2 also contained an intact cytoplasmic tail. A search of the EST databases did not identify any additional cadherin-6 sequences missing exon 7 or nucleotide 34 of exon 8. It would be interesting to see whether other tissues that express cadherin-6 produce variant transcripts.
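
The reading-frame argument for cadherin-6/2 reduces to modular arithmetic: a deletion preserves the downstream frame only when the total number of missing nucleotides is a multiple of 3. The following minimal sketch works through the numbers given in the text; the function name is illustrative.

```python
def frame_offset(deleted_nt: int) -> int:
    """Return the reading-frame offset (0, 1, or 2) caused by deleting nucleotides.

    An offset of 0 means the downstream reading frame is preserved.
    """
    return deleted_nt % 3


EXON7_NT = 254        # length of cadherin-6 exon 7, from the text
EXTRA_DELETED_NT = 1  # the single missing nucleotide (position 34 of exon 8)

if __name__ == "__main__":
    # Skipping exon 7 alone shifts the frame.
    print("exon 7 skipped:", frame_offset(EXON7_NT))
    # Skipping exon 7 plus one further nucleotide removes 255 nt = 85 codons.
    print("exon 7 + 1 nt:", frame_offset(EXON7_NT + EXTRA_DELETED_NT))
```

Only the short stretch between the splice junction and the second deletion is translated out of frame, so a frame-restoring deletion within 34 nucleotides of the junction is reached before the stop codon predicted 15 codons downstream.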
The open reading frames of the two transcripts of cadherin-24 contain 781 and 819 codons. The algorithm of Nielsen et al. (47) predicted signal peptidase cleavage between amino acids 16 and 17. The mature proteins are likely to start with Ser-45, which is the residue immediately downstream of a consensus prohormone convertase cleavage site. As with other type II cadherins, the mature extracellular domains are predicted to contain five extracellular repeats. The extracellular portion of mature cadherin-24 contains three potential N-linked glycosylation sites at amino acids 446, 548, and 563 of the open reading frame of the shorter splice form.
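
Potential N-linked glycosylation sites are conventionally recognized as the sequon Asn-X-Ser/Thr, where X is any residue except proline. The text does not say how the three sites were identified, so the sketch below is only an illustration of such a scan; the test sequence is a hypothetical toy peptide, not cadherin-24.

```python
import re

# Zero-width lookahead so that overlapping sequons are all reported.
_SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list:
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != Pro)."""
    return [match.start() + 1 for match in _SEQUON.finditer(protein)]


if __name__ == "__main__":
    # Hypothetical toy peptide: NAS and NGT are sequons, NPT is excluded.
    toy_peptide = "MKNASAANPTQQNGTL"
    print(find_sequons(toy_peptide))
```

A match marks a potential site only; whether a given asparagine is actually glycosylated depends on factors such as membrane topology and local structure, which a sequence scan cannot assess.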
The alternative sequence in the long form of cadherin-24 is inserted just C-terminal to Leu-454. Based upon the x-ray structure of the extracellular portion of Xenopus C-cadherin, this position corresponds to the end of β strand F in EC4 (48). Just upstream of the site of insertion in cadherin-24 lies Glu-453. This residue aligns with Asp-414 in mature Xenopus C-cadherin, which is involved in coordinating calcium ion number 3 between EC3 and EC4. See Boggon et al. (48) for the numbering of the residues in C-cadherin. This calcium ion is unique in the x-ray structure of C-cadherin in that the side chain of Gln-397, rather than the backbone carbonyl, is involved in coordinating it. The insertion of 38 amino acids at this site in cadherin-24 has the potential to disrupt the binding of one or more of the calcium ions between EC3 and EC4, and as a result, the potential to disrupt the structure at the EC3-EC4 boundary. Recent data suggest that multiple cadherin extracellular repeats may be involved in homophilic binding (49). If this is the case, the insertion of 38 residues at the EC3-EC4 boundary of cadherin-24 may be expected to have an impact on how cells adhere to one another. However, the long splice form of cadherin-24 still localizes to cell-cell borders and mediates cell aggregation (Figs. 5 and 6).
In this report, we have identified for the first time a cadherin that is alternatively spliced in the extracellular domain, producing a longer splice form that is active in cell adhesion and retains its catenin-binding sites. Interestingly, the insertion in the longer form may alter its ability to bind calcium, which would be predicted to interfere with cadherin function. However, experimental data show that both forms mediate cell aggregation (Fig. 6). We have used PCR to show that cadherin-24 is widely expressed in normal tissues and that the more common transcript is the short form missing the insert in extracellular domain 4. Since we identified cadherin-24 in the MDA-MB-231 human breast cancer cell line, it will be interesting to determine whether the alternatively spliced form is expressed in tumors. We are currently developing antibody reagents to distinguish the two forms for such studies.

FIG. 7. Genomic structure of cadherin-24. Exons are shown as boxes, and introns are shown as lines. The stop and start codons are indicated. Exons 1 and 14 are non-coding, and exon 9 is the alternatively spliced exon. The start sites for each cadherin extracellular repeat (EC) and for the cytoplasmic tail (CT) are indicated. The arrowheads in exons 2 and 13 indicate the beginning and end of the cDNA sequence shown in Fig. 1. Bar = 1 kb.