A new alternatively spliced exon between v9 and v10 provides a molecular basis for synthesis of soluble CD44.

The numerous isoforms of murine CD44 contain a common peptide region that is encoded by exons 1-5, 16-18, and 20 and variant regions derived from exons 6-15, usually referred to as v1-v10. We have obtained evidence for expression of an additional exon between v9 (or exon 14) and the exon previously termed v10 (or exon 15). Thus, we now number the variant exons as follows: v1-v9 (exons 6-14), v10 (exon 15), and v11 (exon 16); the remaining 3′-exons become exons 17–21 (newly numbered exons are underlined). The new exon, now termed exon v10, contains 93 base pairs and can be internally spliced; the 5′-region is termed v10a, and the 3′-region, v10b. Stop codons are positioned in v10a such that translated protein would be truncated prior to the transmembrane domain and secreted as a soluble protein. We have also found that the previously described v9 exon (now termed v9a), which is 90 base pairs in length, is actually the 5′-region of a longer exon of 142 base pairs (the 3′-region is termed v9b) and thus arises by internal splicing of the longer exon. Using reverse transcription-polymerase chain reaction, four different cDNAs for CD44 isoforms that use different combinations of the new exonic sequences have been found. The mRNAs containing the new exonic sequences are restricted in their expression; to date, we have demonstrated their presence in murine G8 myoblasts in culture and in embryonic muscle and cartilage tissues in vivo. Of these new isoforms, the predominant, full-length amplified product is encoded by exons 1-5, exon 13 (v8), the 5′-part of exon 14 (v9a), exon 15 (v10), exon 16 (v11), exons 17–19, and exon 21. When COS-7 cells were transfected with v10-containing cDNA constructs, the cells secreted low molecular weight soluble CD44 into their medium. Thus, the stop codons within the new exon v10 provide a molecular basis for de novo synthesis of soluble CD44 isoforms.

CD44 is a multifunctional cell-surface glycoprotein that is expressed as multiple isoforms derived from a single gene. Previous studies have shown that the mouse CD44 gene includes 20 exons, at least 12 of which can be alternatively spliced to generate many isoforms (1,2). Almost all CD44 transcripts so far characterized contain the products of exons 1-5 and 16-17, which encode the extracellular domain of the standard form of CD44 (CD44s). The predominant CD44 isoform is CD44s, which is the product of exons 1-5, 16-18, and 20. CD44s is widely distributed, and many studies have shown that CD44s is a major cell-surface receptor for hyaluronan (2-5). The hyaluronan-binding domain is at the N-terminal end of the protein and is present in all CD44 isoforms despite their varying capacity to bind hyaluronan (6,7). Hyaluronan-CD44s interactions are thought to mediate endocytosis of hyaluronan (8-10) as well as various aspects of cell aggregation (11), pericellular matrix assembly (12,13), and cell migration (14). CD44 isoforms containing variant exons are mainly restricted to tumor cells, activated lymphocytes, and proliferating or morphogenetically active epithelia (15-20), but their physiological functions therein are not yet established.
In similar fashion to many other adhesion molecules, soluble CD44 has been detected in substantial amounts in serum, lymph, and synovial fluid from a variety of species (21-26). Very little is known about the cellular source and molecular identity, with respect to expression of variant exons, of circulating CD44. The molecular weight of soluble CD44 varies over a wide range, in part due to glycosylation (23-26). Variant domains have also been detected in human sera (26,27). Studies of soluble CD44 from cell culture medium have indicated that soluble CD44 can be generated, at least in part, by proteolytic cleavage of membrane-bound CD44. Elevated levels of soluble CD44 in the sera of tumor patients have been reported (24-26). Although the ability of soluble CD44-Ig fusion protein to block tumor development suggests that soluble CD44 can antagonize the effects of membrane-bound CD44 (30), the exact function of soluble CD44 in vivo remains uncertain.
In the process of studying the expression of CD44 isoforms during mouse limb development, we have discovered a new exon containing 93 base pairs inserted between the originally reported v9 and v10 exons. We have also found that the originally defined v9 exon is, in fact, part of a larger exon containing a second domain with 52 base pairs additional to those originally described. We have identified several new mRNA transcripts of CD44 containing various combinations of products of these exons in mouse embryonic cartilage and muscle tissues as well as in cultured murine G8 myoblasts. All of these transcripts contain stop codons (arising from within the new exon) that would result in translation of truncated CD44 isoforms lacking transmembrane and cytoplasmic domains. Therefore, the newly described exon provides a molecular basis for de novo synthesis of soluble CD44 by murine cells.
Reverse Transcription-Polymerase Chain Reaction (RT-PCR)-Total RNA was isolated from murine G8 myoblasts and various mouse embryonic tissues using TRIzol reagent (Life Technologies, Inc.) according to the manufacturer's instructions. First strand cDNA was synthesized from 5 μg of total RNA using Superscript II RNase H⁻ reverse transcriptase (Life Technologies, Inc.) according to the manufacturer's instructions.
PCR was performed on the cDNA template using the primers indicated in Table I and Taq polymerase (Perkin-Elmer). The cycling conditions were as follows: 35 cycles at 94°C for 30 s, 55°C for 30 s, and 72°C for 1 min, followed by a final 5 min at 72°C. Ten μl of the PCR products were analyzed by 1% agarose gel electrophoresis. The remaining portion of the products was used for subcloning.
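As a quick check on the cycling program described above, the hold times can be summed in a few lines. This is a minimal sketch and not part of the original protocol: the names `CYCLE_STEPS`, `N_CYCLES`, `FINAL_EXTENSION_S`, and `total_hold_seconds` are illustrative, and ramp times between temperatures (which depend on the instrument) are ignored.

```python
# Hold-time arithmetic for the PCR program given in the text.
# All names are illustrative; ramp times between steps are not included.

CYCLE_STEPS = [
    ("denature", 94, 30),  # (label, temperature in deg C, hold in seconds)
    ("anneal", 55, 30),
    ("extend", 72, 60),
]
N_CYCLES = 35
FINAL_EXTENSION_S = 5 * 60  # final 5 min at 72 deg C


def total_hold_seconds(steps=CYCLE_STEPS, n_cycles=N_CYCLES,
                       final_s=FINAL_EXTENSION_S):
    """Return the summed hold time of the whole program, in seconds."""
    per_cycle = sum(hold for _label, _temp, hold in steps)
    return n_cycles * per_cycle + final_s


if __name__ == "__main__":
    seconds = total_hold_seconds()
    print(f"{seconds} s = {seconds / 60:.0f} min of hold time")
```

Under these assumptions the program holds for 35 × 120 s plus 300 s; actual run time on a thermal cycler would be longer because of ramping.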
cDNA Cloning and Sequencing-The PCR products were directly ligated into the pCRII cloning vector (Version 2.3, Invitrogen, San Diego, CA) according to the manufacturer's instructions and used to transform bacteria. The nucleotide sequences of selected cloned inserts were determined by the double-stranded DNA/dideoxy chain termination method using Sequenase (U. S. Biochemical Corp.) following the manufacturer's instructions.
Transient Transfection of COS-7 Cells and Analysis of Products-CD44 cDNAs were inserted into the pCR 3-Uni eukaryotic expression vector (Invitrogen) at the HindII/XhoI site. Clones containing the different inserts were picked, and their authenticity and orientation were checked by DNA sequencing. COS-7 cells were maintained in RPMI 1640 medium containing 10% fetal bovine serum and nonessential amino acids (Life Technologies, Inc.). Transient transfection was performed with the different CD44 constructs and Lipofectamine reagent according to the manufacturer's instructions (Life Technologies, Inc.). Culture medium from cells transfected with each construct was collected 72 h after transfection. In each case, one-tenth of the medium from a single 100-mm dish (1 ml) was precipitated with 5 ml of cold acetone, and the pellets were dissolved in SDS-polyacrylamide gel electrophoresis sample buffer. The cell layer from each 100-mm dish was extracted with 50 mM Tris-HCl buffer, pH 7.4, containing 150 mM NaCl, 5 mM EDTA, 1% Triton X-100, 0.1% SDS, 2 mM phenylmethylsulfonyl fluoride, 2 μg/ml leupeptin, and 0.5 unit/ml aprotinin, and one-fifth of each extract was used for SDS-polyacrylamide gel electrophoresis. A 10% SDS-polyacrylamide gel was run under nonreducing conditions, and the separated proteins were transferred to Hybond-ECL nitrocellulose membranes (Amersham Corp.). After transfer, the membranes were blocked in 1% bovine serum albumin, 1% nonfat milk, 5 μg/ml goat IgG in 0.2% Tween 20, 0.05 M Tris-HCl, and 0.15 M NaCl, pH 8.0, and then incubated with monoclonal antibody against CD44 (KM201, American Type Culture Collection) at 4°C overnight. After extensive washing with 0.2% Tween 20, 0.05 M Tris-HCl, and 0.15 M NaCl, pH 8.0, the membranes were incubated with goat anti-rat secondary antibody conjugated with horseradish peroxidase for 30 min at room temperature, followed by ECL detection reagents (Amersham Corp.) according to the manufacturer's instructions.

Detection of a Potentially New CD44 Isoform in Mouse Embryonic Cartilage and Muscle-Total RNA was isolated from many different tissues of embryonic and adult Swiss Webster mice, and RT-PCR was performed with a variety of primers. When primers 1r (exon 16 antisense primer) and v8f (exon v8 sense primer) (Table I) were used, a single major product was obtained in most cases, e.g. in E9-E14 limbs and E15 lung bud, tooth rudiment, nasal epithelium, and skin (Fig. 1, lanes 2-7 and 12-15). Subcloning and sequencing confirmed that this product had an expected composition, i.e. exons v8-v10 and a portion of exon 16. However, an additional product of larger size was detected in embryonic hyaline cartilage (Fig. 1, lane 8, arrowhead) and muscle (lane 10, arrowhead) on days E15 and E17, respectively. Hypertrophic cartilage from E15 and E18 (containing small amounts of bone marrow) also produced this transcript (Fig. 1, lanes 9 and 11). This product was shown, as presented below, to contain an additional transcribed insert of 93 bp between v9 and v10.
New Exonic Sequence between v9 and v10 of Murine CD44-To facilitate characterization of the potentially new isoform of CD44, we employed murine fetal G8 myoblasts since larger amounts of material could be obtained more easily than from embryonic tissues. Four major RT-PCR products were revealed using the antisense primer from exon 16 (1r) with the sense primer from v8 (v8f), and three major amplified products were obtained using primers from exon 16 (1r) and v9 (v9f). These products were subcloned into the pCRII vector and sequenced. When the sense primer from v8 was used, three of the major amplified products were found to contain v8, v8-v9, or v8-v9-v10 joined to exon 16. However, the fourth and largest major product contained v8-v9-v10 and a novel 93-bp sequence between v9 and v10 joined to exon 16. Likewise, when the sense primer from v9 was used, two of the products included v9 and v9-v10, and a third major amplified product included v9, v10, and the same 93-bp sequence between v9 and v10 as described above.
The nucleotide sequence of the additional 93-bp insert between the original v9 and v10 exons is given in Fig. 2. The predicted amino acid sequence is also shown; a stop codon appears after the 12th amino acid in this sequence. These results suggested the possibility that new exonic sequence may reside between exons v9 and v10 and that this exonic sequence may give rise to transcripts encoding truncated CD44 protein.
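The statement that a stop codon appears "after the 12th amino acid" amounts to reading the insert codon by codon, in the frame set by the upstream exon, and counting sense codons before the first TAA, TAG, or TGA. A minimal sketch of that scan follows. The sequence used is a made-up 93-nucleotide placeholder, not the sequence of Fig. 2, and the function name `codons_before_stop` and the assumption that the insert begins in frame 0 are illustrative only.

```python
# Count sense codons that precede the first in-frame stop codon.
# TOY_INSERT is a hypothetical placeholder, NOT the real CD44 insert.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons_before_stop(seq, frame=0):
    """Return the number of codons read before the first in-frame stop.

    Returns None if no stop codon is found in the chosen frame.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA letters
    count = 0
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return count
        count += 1
    return None


# 12 alanine codons, one stop codon, then filler up to 93 nucleotides.
TOY_INSERT = "GCT" * 12 + "TGA" + "GCT" * 18

if __name__ == "__main__":
    print(len(TOY_INSERT), codons_before_stop(TOY_INSERT))
```

The `frame` argument matters in practice: whether a given triplet is read as a stop depends on the phase in which the preceding exon ends, so the same insert can terminate at different positions when it follows different upstream sequence.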
To confirm the presence of new exonic sequence between v9 and v10, we first ensured that the results obtained above were not due to contamination by genomic DNA. Total RNA from G8 myoblasts was treated with or without DNase-free RNase at 37°C for 5 min. After phenol/chloroform extraction and ethanol precipitation, RT-PCR was performed with the antisense primer from invariant exon 16 (1r) and the sense primer from v8 (v8f) or v9 (v9f). Neither the v8-v9-v10- and v9-v10-containing products (~460 and ~360 bp, respectively, in Fig. 3A, lanes 2 and 3) nor the longer products containing the new 93-bp insert (lanes 2 and 3, arrows) were detected after RNase treatment (lanes 4 and 5), indicating that genomic DNA was not the template for any of these products.

Table I, footnote a: r, reverse orientation; f, forward orientation.

FIG. 1. Expression of novel CD44 isoform mRNAs in mouse embryonic tissues. The reverse complement primer 1r and the sense primer v8f were used for RT-PCR (see Table I).
To confirm these results further, purified genomic DNA from Balb/c mouse kidney was used in the PCR with primers from within v9 and v10 (v9f and v10r in Table I). The PCR product was ~1.1 kilobases in size (Fig. 3B, lane 2), and sequencing indicated that this product contained the expected portions of exon v9 (90 bp) and exon v10 (55 bp) as well as the intron between v9 and v10 (~1 kilobase). Thus, since the unexpected products obtained by RT-PCR (Fig. 3A, arrows) contained only ~100 bp more than the largest expected products, the intron between v9 and v10 is too large to be the source of these longer RT-PCR products. These results further indicate that additional exonic sequence between exons v9 and v10 is expressed by G8 myoblasts.
Detection of Additional Murine CD44 Isoforms with New Inserts-Three additional minor CD44 isoforms containing new inserts were detected by cloning RT-PCR products derived from using primers 1r and v9f (Table I) with total RNA from G8 myoblasts, embryonic muscle, and embryonic cartilage (these products are not readily seen in Figs. 1 and 3A). The additional nucleotide sequences, predicted amino acid sequences, and deduced structures of the relevant region of these isoforms are given in Fig. 4. A total of 145 additional base pairs were recovered between v9 and v10. For ease of discussion, we have defined the first 52 bp immediately following the v9 exon sequence as part I of the new inserts, the next 65 bp as part II, and the last 28 bp as part III (Fig. 4A). The combined sequence of parts II and III is identical to that given in Fig. 2 for the 93-bp insert that was originally identified. Thus, the major isoform contains v9, parts II and III of the new inserts, and v10 (Fig. 4B); this isoform is expressed by G8 myoblasts and in embryonic muscle and cartilage (Figs. 1 and 3A, arrows). The three additional isoforms contain the following: (a) v9, parts I and II of the new inserts, and v10 (Fig. 4C), detected in embryonic hypertrophic cartilage (with bone marrow) and G8 myoblasts; (b) v9, part II of the new inserts, and v10 (Fig. 4D), detected in G8 myoblasts; and (c) v9, parts I-III of the new inserts, and v10 (Fig. 4E), detected in embryonic muscle. All four isoforms contain stop codons, either after the 12th additional amino acid in the case of isoforms with inserts beginning at part II (Fig. 4, B and D) or after the 24th additional amino acid in the case of isoforms with inserts beginning at part I (Fig. 4, C and E). Consequently, all of these isoforms would encode CD44 proteins truncated before the transmembrane domain.
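The insert lengths of the four isoforms follow directly from the three part lengths defined above. The bookkeeping can be sketched as below; the dictionary and function names (`PART_LENGTHS_BP`, `ISOFORM_PARTS`, `insert_length`) are illustrative labels introduced here, while the part lengths and the part composition of each isoform are taken from the text.

```python
# Insert lengths for the four isoforms, computed from the part lengths
# stated in the text. Names are illustrative labels only.

PART_LENGTHS_BP = {"I": 52, "II": 65, "III": 28}

# Parts of the new inserts present in each isoform (panel of Fig. 4).
ISOFORM_PARTS = {
    "4B (major)": ("II", "III"),
    "4C": ("I", "II"),
    "4D": ("II",),
    "4E": ("I", "II", "III"),
}


def insert_length(parts):
    """Return the total insert length, in bp, for a tuple of part names."""
    return sum(PART_LENGTHS_BP[p] for p in parts)


if __name__ == "__main__":
    for panel, parts in ISOFORM_PARTS.items():
        print(f"Fig. {panel}: parts {'+'.join(parts)} = "
              f"{insert_length(parts)} bp")
```

The totals agree with the 93-bp insert identified first (parts II and III) and with the 145 bp recovered in all.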
Full-length Murine CD44 Transcripts Containing New Inserts-To determine the arrangement of expressed exons occurring around the inserts, full-length cDNAs containing new inserts were recovered by two RT-PCRs using total RNA from G8 myoblasts. In the first reaction, the sense primer from the extreme 5′-end of CD44 exon 1 (3f) and the antisense primer from the 3′-end of part II of the new inserts (4r) were used. In the second reaction, the sense primer from the 5′-end of part II of the new inserts (5f) and the antisense primer from the 3′-end of CD44 exon 20 (6r) were used. One major product and several minor products were amplified in these RT-PCRs. Subcloning and sequencing of the RT-PCR products indicated that the major amplified product contained CD44 exons 1-5, v8, v9, parts II and III of the new inserts, v10, exons 16-18, and exon 20 (Fig. 5A). Several other minor amplified products were characterized and are shown in Fig. 5 (B-E). Thus, we have recovered several full-length CD44 transcripts containing the new inserts described above.

FIG. 2. Nucleotide and predicted amino acid sequences of the predominant insert between v9 and v10. The unexpected RT-PCR products from G8 myoblasts and from embryonic muscle and cartilage were subcloned and sequenced. An insert (93 bp in length) between v9 and v10 was identified, and its sequence is shown here. This sequence contains a stop codon (***) after the 12th amino acid in the predicted peptide sequence.

FIG. 3. Expression of novel CD44 isoforms by G8 myoblasts. A, RT-PCR products obtained when total RNA template from G8 myoblasts was treated with (lanes 4 and 5) or without (lanes 2 and 3) DNase-free RNase prior to RT-PCRs. Lane 1, DNA ladder; lanes 2 and 4, antisense primer 1r and sense primer v8f; lanes 3 and 5, primers 1r and v9f. B, PCR using mouse genomic DNA as template. Lane 1, DNA ladder; lane 2, antisense primer v10r and sense primer v9f.

FIG. 4. Sequence and arrangement of new inserts between v9 and v10. RT-PCR products from G8 myoblasts and embryonic tissues (E15 cartilage, E17 muscle, and E18 hypertrophic cartilage) were subcloned and sequenced. The largest insert identified between v9 and v10 contained 145 bp; its sequence is shown in A. Various shorter inserts were obtained that contained portions of this sequence. On the basis of the data obtained, the new inserts were subdivided into parts I (boldface letters in A), II (underlined), and III (thickly underlined). The double-struck block letters in A indicate the consensus motifs for splicing sites. The most prominent insert contained parts II and III (B; 93 bp in length) (also see Fig. 2). Other inserts contained parts I and II (C; 117 bp), part II (D; 65 bp), or, as mentioned above, parts I-III (E; 145 bp). All these isoforms contain stop codons (***), leading to amino acid sequences truncated before the transmembrane domain of previously identified CD44 isoforms. The predicted amino acid sequences of the various new inserts are shown in B-E.
Secretion of Soluble CD44 from Cells Transfected with cDNA Constructs Containing the New Exon-COS-7 cells were transiently transfected with cDNA constructs for CD44s and for the major new CD44 isoform, i.e. containing exons 1-5, v8, v9, parts II and III of the new inserts, v10, exons 16-18, and exon 20, as described above. Fig. 6 shows the Western blot analysis of the translated products found in the cell layers and media from these cultures. Clearly, a significant amount of secreted soluble CD44, of smaller size than CD44s, was detected in the medium from cells transfected with the construct containing the new exonic insert, whereas no secreted CD44 was detected in the medium from the cells transfected with CD44s. COS-7 cells were also transfected with cDNA for a minor CD44 variant, containing exons 1-5, v7-v9, parts II and III of the new inserts, v10, exons 16-18, and exon 20. Expression was lower than in the cells described above, but these cells also secreted significant amounts of soluble CD44 (data not shown).
Exon/Intron Organization around New Inserts between v9 and v10 of Murine CD44-Mouse genomic DNA was used in the PCR to recover CD44 genomic sequences between v9 and v10. The salient sequences and their relationship to exon/intron boundaries are shown in Fig. 7. By comparing these genomic sequences with cDNA sequences for the new inserts, we conclude that part I of the new inserts is a hitherto unrecognized portion of exon v9 since this 52-bp sequence is connected directly to the 3′-end of the v9 genomic sequence (Fig. 7). Therefore, we redefine the original exon v9 (1) as v9a and the additional 52 bp revealed in this study as v9b. Parts II and III of the new inserts derive from a new exon between v9 and the original v10 exon (Fig. 7). This new exon is separated from exon v9 by an intron of 170 bp and from the originally defined v10 exon by an intron of ~0.7 kilobase. Both introns have GT/AG consensus sequences for RNA splicing. Thus, we redefine the new exon as v10 or exon 15 of CD44; parts II and III are termed v10a and v10b. The previous v10 exon becomes v11 (exon 16), and the previous exons 16-20 become exons 17-21. New numbers assigned to exons are underlined to facilitate further discussion.

DISCUSSION

We have identified a new CD44 exon between v9 (exon 14) and the exon previously termed v10 or exon 15 (1). Thus, we now number the variant exons as v1-v9 (exons 6-14), v10 (exon 15), and v11 (exon 16), and we number the remaining 3′-exons as exons 17-21. We have also found that the previously described v9 exon (now termed v9a) is actually the 5′-region of a longer exon and thus arises by internal splicing of this longer exon. The new exon, now termed exon v10, contains 93 base pairs and can be internally spliced. Transcripts for CD44 isoforms containing all or part of this new exon include a stop codon positioned so that translated protein would be truncated prior to the transmembrane domain and thus would be secreted as soluble protein.
Therefore, transcription and translation of this new exon provide a mechanism whereby soluble CD44 can be synthesized de novo via alternative splicing. COS-7 cells transfected with CD44 cDNA constructs containing the new v10 exon secreted soluble CD44 of lower molecular weight than CD44s, confirming the above supposition. Recently, we have found that human tumor cells also express a CD44 transcript with an insert between v9 and v10 that contains a stop codon; however, the arrangement of exonic sequences in the human genome is somewhat different from that in the mouse.

FIG. 5. Full-length CD44 transcripts containing the new inserts. Total RNA from G8 myoblasts was used to recover full-length CD44 transcripts containing the new insert; the primers used (3f, 4r, 5f, and 6r) are given in Table I.

FIG. 6. [...] (lanes 1 and 2); a CD44 variant containing exons 1-5, v7-v9, parts II and III of the new inserts, v10, exons 16-18, and exon 20 (lanes 3 and 4); or vector only (lanes 5 and 6). In each case, one-fifth of the cell layer extract (lanes 1, 3, and 5) and one-tenth of the medium (lanes 2, 4, and 6) were analyzed by Western blotting for reactivity with antibody KM201 against CD44. Arrowheads indicate the positions of migration of 203-, 118-, 86-, 52-, and 31-kDa markers.

FIG. 7. Exon/intron organization of the new inserts. Genomic DNA sequence between the original v9 and v10 exons was obtained by PCR using mouse genomic DNA with primers derived from v10 and v9 (v10r and v9f in Table I). The exon (cDNA) sequences are shown in upper-case letters, and the intron sequences in lower-case letters. The lengths of intron sequences are calculated from PCR and partial sequencing data. The sequencing data indicate that part I of the new inserts (now termed v9b) is directly connected to the original v9 exon (now termed v9a), while parts II and III of the new inserts derive from a new exon (now termed v10) that is separated from v9b by an intron of 170 bp. An intron of ~0.7 kilobase lies between the newly defined v10 and v11 (formerly v10) exons. Both introns are flanked by the GT/AG consensus sequence for RNA splicing (thickly underlined).

There are many examples of production of soluble forms of transmembrane cell adhesion molecules, in most cases arising via shedding. Although the role of these soluble forms is not established, they are often regarded as potential antagonists of their membrane-bound forms. Soluble CD44 has been detected in serum from several species and in cell culture medium from various cell lines (21-29). Evidence has been presented showing that, in culture medium, soluble CD44 is a proteolytic product of membrane CD44 (26,28,29). In the case of human CD44, variant exon v1 contains a stop codon and thus could serve to generate soluble CD44; however, no evidence for its expression has been published (1). It has been proposed that soluble CD44 may act as an antagonist of membrane CD44, especially in tumorigenesis and immune responses (22,26,30). An example of the potentially antagonistic action of soluble CD44 is the demonstration that soluble CD44-Ig fusion protein antagonizes the function of membrane CD44 in melanoma growth in vivo (30).
Synthesis of soluble CD44 via alternative splicing to include the new v10 exon provides a mechanism of production that is subject to rigorous cellular control and thus may be important in the regulation of CD44 activity.
Among the numerous embryonic tissues we have tested, only cartilage and muscle (and possibly bone marrow) express the new v10 exon. The predominant isoform containing the new exon in these tissues is encoded by exons 1-5, exon 13 (v8), the first part of exon 14 (v9a), and exon 15 (the new v10 exon), followed by the newly numbered exons 16 (v11), 17-19, and 21. The functional relevance of expression of this and other new v10-containing isoforms during cartilage and muscle development is not known, but their restricted expression implies that they have an important role. Cartilage and muscle both arise from precursor cells that exhibit prominent hyaluronan-dependent pericellular matrices that are lost preceding differentiation (31,32). These matrices are dependent on interaction of hyaluronan with cell-surface CD44 (12,13), and thus, soluble CD44 produced in these differentiating tissues may compete with membrane-bound CD44, causing displacement of hyaluronan from the cell surface and consequent loss of the pericellular matrices. An intermediate stage in cartilage and muscle differentiation, after loss of these pericellular matrices but prior to final differentiation, is mesodermal condensation. The process of condensation is caused, at least in part, by cross-bridging of the mesodermal cells by hyaluronan bound to the cell surface (33,34). An alternative possibility, then, would be that soluble CD44 produced during differentiation inhibits hyaluronan-mediated cross-bridging of the condensed mesoderm, allowing further differentiation to proceed. Further work will be required to determine the precise role of soluble CD44 in these processes.
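The relationship between the original and revised exon numbering used for the predominant isoform above can be sketched as a simple mapping: exons up to 14 keep their numbers, the new exon takes number 15, and every exon that followed it shifts up by one. The function and variable names below (`old_to_new`, `renumber_transcript`, `PREDOMINANT_OLD`) are illustrative and not part of the original work.

```python
# Map exon numbers from the original 20-exon scheme to the revised
# 21-exon scheme. Names are illustrative labels only.

NEW_EXON_NUMBER = 15  # the new v10 exon, inserted after exon 14 (v9)


def old_to_new(old_exon):
    """Return the revised number for an exon of the original scheme."""
    if not 1 <= old_exon <= 20:
        raise ValueError(f"original scheme has exons 1-20, got {old_exon}")
    # Exons 1-14 are unchanged (exon 14 is now subdivided into v9a/v9b);
    # original exons 15-20 become exons 16-21.
    return old_exon if old_exon < NEW_EXON_NUMBER else old_exon + 1


def renumber_transcript(old_exons, includes_new_exon=False):
    """Renumber a transcript, optionally adding the new exon 15."""
    new_exons = [old_to_new(e) for e in old_exons]
    if includes_new_exon:
        new_exons.append(NEW_EXON_NUMBER)
    return sorted(new_exons)


# Predominant new isoform in original numbering: exons 1-5, v8 (13),
# v9 (14), original v10 (15), exons 16-18, and exon 20.
PREDOMINANT_OLD = [1, 2, 3, 4, 5, 13, 14, 15, 16, 17, 18, 20]

if __name__ == "__main__":
    print(renumber_transcript(PREDOMINANT_OLD, includes_new_exon=True))
```

Applied to CD44s (original exons 1-5, 16-18, and 20), the same mapping gives exons 1-5, 17-19, and 21.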