Alternative splicing of the unique "PLUS" domain of chicken PG-M/versican is developmentally regulated.

We investigated the occurrence of alternatively spliced forms (V0, V1, V2, and V3) of PG-M/versican, a large chondroitin sulfate proteoglycan, in developing chicken retinas using the reverse transcription-polymerase chain reaction. We characterized the PLUS domain, which is apparently unique to the chicken molecule and is regulated by alternative splicing. PG-M in chicken retinas consisted of four forms with (V0, V1, V2, and V3) and two forms without (V1 and V3) the PLUS domain (PG-M+ and PG-M-, respectively). The four forms of PG-M+ were found in all samples examined, but the occurrence of the two PG-M- forms was regulated developmentally. Genomic analysis revealed that the PLUS and CS-α domains are encoded by a single exon and that this exon has an internal alternative 5'-splice donor site, allowing alternatively spliced forms that do not include the 3'-end of the exon. Sequences corresponding to the chicken PLUS domain (plus) were not found in mouse and human and may have disappeared during evolution. Sequence similarity suggests that the PLUS domain corresponds to the keratan sulfate attachment domain of aggrecan and that it has a distinct function in the chicken eye.

PG-M, a large chondroitin sulfate proteoglycan, is a major extracellular matrix molecule located in the mesenchymal cell condensation regions of developing chicken limb buds (1). Its expression, however, is regulated in an inverse relationship to that of aggrecan, and PG-M disappears after cartilage development (2). PG-M is also transiently expressed in various embryonic tissues during morphogenesis and differentiation (3). Therefore, PG-M may play some regulatory roles in many biological events.
Our cDNA studies on the core proteins of mouse PG-M revealed four mRNA species designated PG-M(V0), PG-M(V1), PG-M(V2), and PG-M(V3) in order of length (4,5). All have hyaluronan-binding domains at the amino terminus and two epidermal growth factor (EGF)-like domains, a lectin-like domain, and a complement regulatory protein (CRP)-like domain at the carboxyl terminus. The amino- and carboxyl-terminal regions show binding activity for hyaluronan (1,6) and a C-type lectin-like activity (7), respectively. However, they have different chondroitin sulfate attachment regions in the middle of the core proteins. The differences are generated by alternative and simultaneous usage of the two different domains for the chondroitin sulfate attachment region (CS-α and CS-β).
Versican was first identified in human fibroblasts by a cDNA study (8,9). Homology analysis of the deduced amino acid sequence demonstrated that versican corresponds to the core protein of PG-M(V1) (4). Other forms (V0, V2, and V3) of human versican have since been identified (5,10).
Although there are four forms of mouse and human PG-M/versican, PG-M(V2) and PG-M(V3) have not yet been identified in chicken (11). We reported that the chondroitin sulfate attachment region of chicken PG-M(V1) is longer than that of mouse PG-M(V1) (4), suggesting an extra domain between the hyaluronan-binding domains and the CS-α domain in chicken PG-M. Whether or not this domain is a single exon and whether or not its expression is regulated by alternative splicing remain to be examined.
In this study, we investigated the occurrence of multiple forms (V0, V1, V2, and V3) of PG-M in the developing chicken retina and found alternative splicing for this domain, which we have named the "PLUS" domain. Although the significance of diverse alternative splicing for PG-M is not known, each form of PG-M may have a unique function in this developing organ. Because PG-M and aggrecan are structurally similar, the PLUS domain might be related to the keratan sulfate attachment domain of aggrecan. We discuss the evolutionary significance of this finding.

EXPERIMENTAL PROCEDURES
RNA Isolation-Total RNAs were obtained from whole eyes of chicken embryos (White Leghorn) on days 5, 7, and 9 (designated E5, E7, and E9, respectively). Total retinal RNA was obtained from chicken embryos on days 14 and 20 (designated E14 and E20, respectively) and from adult chicken. RNA was extracted using guanidinium thiocyanate (12).
cDNA Libraries-Human fetal brain, fetal liver, cerebral cortex, and skeletal muscle and mouse brain, embryonic stem cell, and skeletal muscle cDNA libraries were obtained commercially (CLONTECH, Palo Alto, CA).
RT-PCR Amplification-Primers for RT-PCR amplifications were chosen from the published sequences of chicken PG-M to detect the specific portion of each splicing form. Reverse transcription was performed using three antisense primers (see Fig. 1A, e, j, and m; and Table I) and SuperScript II RNase H- reverse transcriptase (Life Technologies, Inc.) as recommended by the manufacturer. The first PCR amplifications were carried out using pairs of outer primers (see Fig. 1A, a, d, f, i, and l; and Table I). The second PCR was performed using the first PCR products as templates and pairs of inner primers (see Fig. 1A, b, c, g, h, and k; and Table I). Conditions for PCR amplification were as described (5). We amplified chicken genomic DNAs (CLONTECH) using the LA PCR kit (Takara Biomedicals, Kyoto, Japan) as recommended by the manufacturer. The final products were resolved by electrophoresis on a 1.2% (w/v) agarose gel or on 2% (w/v) NuSieve (3:1; FMC Corp. BioProducts, Rockland, ME).
DNA Sequencing-PCR products were purified with the EasyPrep PCR Product Prep kit (Pharmacia Biotech, Uppsala). Purified DNAs were sequenced as described (5). The sequencing primers were identical to those used for the above RT-PCR amplifications.
Sequence Similarity Analysis of the PLUS Domain-The sequence of the PLUS domain was compared with the data base compiled by the European Bioinformatics Institute using the GENETYX-MAC computer program (Software Development Co., Tokyo). The deduced amino acid sequence was compared with other protein sequences in the data base compiled by the National Biomedical Research Foundation and the European Bioinformatics Institute.

RESULTS
Development- and Age-dependent Expression of the PG-M PLUS Domain in the Chicken Retina-The second RT-PCR amplifications performed using the inner primer pairs on E14 and E20 retinal cDNAs and on retinal cDNAs from adult chicken (1-year-old) generated one or two products (Table I and Fig. 1 (A and B)). Two products indicated the presence of two transcripts, with and without an exon sequence of ~400 nucleotides corresponding to the PLUS domain. The shorter transcripts without this sequence were found in PG-M(V1) and PG-M(V3) of E14 retina and in PG-M(V1) of adult retina. PG-M+ and PG-M- refer to PG-M with and without the PLUS domain, respectively. Four forms (V0, V1, V2, and V3) of PG-M+ were detected in all retinas (Fig. 1B). However, the occurrence of PG-M- was developmentally regulated. PG-M-(V1) was detected in E14 and adult retinas (Fig. 1B, lanes 3 and 15), but not in E20 retina (lane 8). PG-M-(V3) was detected in E14 retina (Fig. 1B, lane 5), but not in E20 and adult retinas (lanes 10 and 17). Since the primer pair b and c gave only a band corresponding to the product containing the PLUS domain (Fig. 1B, lanes 6, 11, and 18), no forms were present in which the exon for the hyaluronan-binding region was directly spliced to that for the CS-α domain. A summary of the variation of PG-M forms expressed in E14, E20, and adult retinas is shown in Table II.
Analysis of PG-M Forms Expressed in Chicken Embryonic Whole Eyes-We further examined the PG-M forms in E5, E7, and E9 whole eyes using RT-PCR to determine the relevance of PG-M- to the developmental stage. We also examined the presence of PG-M-(V0) and PG-M-(V2). Since the retinas of these early embryos were too small to isolate, we analyzed whole eyes. The results revealed the presence of all forms (V0, V1, V2, and V3) of PG-M+ and two forms (V1 and V3) of PG-M- (Fig. 1C). These expression profiles were the same as those in E14 retina. The variation of PG-M forms expressed in E5, E7, and E9 whole eyes is summarized in Table II.
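The expression pattern described in the two sections above can be written down as a small lookup table. The following Python sketch is only an illustration transcribed from the prose (not from Table II itself); the names `MINUS_FORMS_BY_STAGE` and `stages_expressing` are hypothetical. All four PG-M+ forms were detected at every stage, so only the PG-M- forms vary.

```python
# PG-M- forms detected at each stage, as stated in the text.
# E5, E7, and E9 were whole eyes; E14, E20, and adult were retinas.
# PG-M-(V0) and PG-M-(V2) were not detected at any stage.
MINUS_FORMS_BY_STAGE = {
    "E5": {"V1", "V3"},
    "E7": {"V1", "V3"},
    "E9": {"V1", "V3"},
    "E14": {"V1", "V3"},
    "E20": set(),
    "adult": {"V1"},
}


def stages_expressing(form: str) -> list:
    """Return the stages (in developmental order) where a PG-M- form was detected."""
    return [stage for stage, forms in MINUS_FORMS_BY_STAGE.items() if form in forms]


print(stages_expressing("V1"))  # stages with PG-M-(V1)
print(stages_expressing("V3"))  # stages with PG-M-(V3)
```

Querying the table this way makes the developmental restriction explicit: PG-M-(V3) is confined to the embryonic samples up to E14, whereas PG-M-(V1) is absent only at E20.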
Alternative Splicing of the PLUS Domain in Chicken PG-M-We compared the DNA sequences of the PCR products of PG-M+ and PG-M-. The results revealed alternative splicing of the PLUS domain, which was located between the hyaluronan-binding B' domain (nucleotide 1183) and the CS-α domain (nucleotide 1598) for PG-M+(V0) and PG-M+(V2), between the hyaluronan-binding B' domain and the CS-β domain (nucleotide 4379) for PG-M+(V1), and between the hyaluronan-binding B' domain and the EGF-like domain (nucleotide 9905) for PG-M+(V3) (Fig. 2A). The results also showed that the PLUS domain consisted of 414 nucleotides (nucleotides 1184-1597 for PG-M+(V0)) (Fig. 2B). New termination codons and shifts of reading frames were not identified in these junctional regions (Fig. 2A). A computer-assisted sequence similarity search for the PLUS domain in nucleic acid and protein data bases did not identify other genes with significant homology.
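The coordinates quoted above can be checked with simple arithmetic. The Python sketch below (the function name `inclusive_length` is hypothetical) computes the length of the inclusive span 1184-1597 and confirms that it is a multiple of three, which is consistent with the observation that removing the PLUS domain does not shift the reading frame.

```python
def inclusive_length(start: int, end: int) -> int:
    """Number of nucleotides in the 1-based inclusive span start..end."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


# PLUS domain coordinates in PG-M+(V0), as given in the text.
plus_len = inclusive_length(1184, 1597)

# A length divisible by 3 leaves the downstream reading frame unchanged
# when the segment is included or skipped.
in_frame = plus_len % 3 == 0

print(plus_len, in_frame)
```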
Location of the Exon Coding for the PLUS Domain in the Chicken PG-M Gene-To confirm the presence of the exon for the PLUS domain, which we named plus, in the chicken PG-M gene, we performed PCR studies on chicken genomic DNA. Primers specific to exons for the hyaluronan-binding B' domain and the CS-α domain were used together with internal primers to the exon for the PLUS domain to determine the position of the plus exon in the PG-M gene. Analysis of the PCR products indicated that the exon was ~12 kilobases downstream of the exon for the B' domain (Fig. 3A, lane 2), but adjacent to that for the CS-α domain (lane 4). We then sequenced the PCR product amplified with the primer pair p and q. The results showed no intron between the PLUS and CS-α domain-encoding sequences, suggesting that they are encoded by a single exon (Fig. 3B). We named the combined domain "PLUS-α." The nucleotide sequence of the boundary region between the two domains is shown in Fig. 3B. Although there was an exon terminus-like sequence (AAG) in the 3'-terminal portion of the PLUS domain and a splicing donor site-like sequence in the 5'-terminal portion of the CS-α domain (Fig. 3B, boldface letters), there was no typical acceptor site-like sequence in the 3'-terminal portion of the PLUS domain. Such a sequence causes similar alternative splicing in other genes and is termed an internal alternative 5'-splice donor site (13-15).
Absence of the PLUS Domain in Human and Mouse cDNA Libraries-Sequences corresponding either to an internal alternative 5'-splice donor site or to the PLUS domain in human or mouse PG-M/versican have not been described (16,17). To confirm that there is no PLUS domain in human and mouse PG-M, we amplified the relevant cDNAs from several cDNA libraries of various human and mouse tissues using appropriate primers (Table I and Fig. 4). The products showed a single band or no band (Fig. 4), confirming that cDNAs for the PLUS domain were absent in those cDNA libraries (no PG-M+ forms). Although cDNA libraries of mouse and human retinas were not examined, cDNAs for PG-M+ were found in cDNA libraries of various chicken tissues corresponding to those of the human and mouse tissues tested in the above experiment. Therefore, the PLUS domain may be unique to chicken PG-M.

PCR products obtained from E7 and E9 whole eyes were essentially the same as those obtained from E5 eye.

Table II. Summary of various forms of PG-M expressed in chicken whole eyes and retinas at different developmental stages
Shown are the PG-M forms detected in E5, E7, and E9 whole eyes. All of them showed the same patterns. Also shown are PG-M forms detected in E14, E20, and adult retinas. The plus signs indicate the presence of each PG-M form.
Column groups: whole eyes; retinas.

DISCUSSION
Mechanisms of Characteristic Alternative Expression of the PLUS Domain-This study revealed that there is an internal alternative 5'-splice donor site at the boundary of the PLUS and CS-α domains in the exon for the PLUS-α domain (Fig. 3B). The absence of a typical 3'-splice acceptor site at the boundary explains why the V0 and V2 forms of PG-M- are absent (Fig. 5). The exon for the PLUS-α domain functions as a single exon in the expression of the V0 and V2 forms of PG-M+, but this exon is spliced out in the expression of the V1 and V3 forms of PG-M- (Fig. 5). An internal alternative 5'-splice donor site in the exon for the CS-α domain functions like the beginning of an intron ("pseudo intron") in the expression of the V1 and V3 forms of PG-M+. During the splicing, the plus sequence remains as an exon for the V1 and V3 forms of PG-M+ (Fig. 5). Since we found that PG-M+(V1) is the major form of PG-M in 10-day chicken embryonic fibroblasts (11), this splicing event does not seem to be rare.
This study showed that all forms (V0, V1, V2, and V3) of PG-M+ were present in all samples examined. On the other hand, the V1 and V3 forms of PG-M- are expressed in a developmentally regulated manner and tend to be expressed at the earlier stages (Table II), suggesting that alternative splicing skipping the exon for the PLUS-α domain is regulated developmentally. The size of the intron between exon VI (B' domain) and exon VII (CS-α domain) in human or mouse PG-M is ~6 kilobases (16,17). This study also showed that the size of the intron between the exons for the B' and PLUS-α domains is ~12 kilobases in chicken PG-M/versican. Considering the difference, a region of the gene containing the PLUS domain sequence and an intron of ~6 kilobases might have been removed during evolution by some mechanism. The sequence similarity between the PLUS domain (414 nucleotides and 137 amino acids) and the first part of the mouse or human CS-α domain (the same numbers of nucleotides and amino acids) is fairly low, not only in nucleotide sequences (48.6 and 48.8% identity to human and mouse, respectively), but also in amino acid sequences (19.0 and 10.3% identity to human and mouse, respectively), which supports the above notion. However, it is still possible that the PLUS domain is not a separately defined domain, but is simply an alternatively spliced part of the chicken CS-α domain.
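For readers unfamiliar with how identity percentages of this kind are obtained, a minimal Python sketch follows. It assumes one common convention (identical non-gap positions divided by the alignment length) applied to two already aligned sequences; the text does not state which convention or alignment settings GENETYX-MAC used, and the sequences below are invented toy strings, not PG-M data. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Gaps are written as '-'. A column counts as identical only when both
    sequences carry the same residue (a gap never matches).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Toy example (invented sequences): 6 identical columns out of 8.
print(percent_identity("ACGTACGT", "ACGAAC-T"))
```

Other conventions (for example, dividing by the length of the shorter sequence or ignoring gapped columns) give different values, so percentages are comparable only when computed the same way.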
Possible Functions of the PLUS Domain-Sequence similarity analysis has not identified any nucleotide or amino acid sequences in the data bases similar to those for the PLUS domain, except for some identity in the nucleotide sequence to the KS domain of aggrecan as discussed below. Since the PLUS domain was detected in chicken PG-M/versican, but not in human or mouse PG-M/versican as far as we investigated in available cDNA libraries, it is likely that this domain is unique to chicken PG-M/versican. Genomic analysis of the human and mouse PG-M/versican genes has shown that the total number of exons is identical (15 exons) (16,17). Comparisons of nucleotide sequences of the cDNAs and deduced amino acid sequences among chicken, human, and mouse PG-M/versican suggested that chicken PG-M/versican may have the same number of exons because the PLUS and CS-α domains are derived from a single exon (PLUS-α) (Fig. 3A).
Aggrecan contains several domains that are highly homologous to PG-M/versican (18-23). This structural identity suggested that the PLUS domain might correspond to the KS attachment domain of aggrecan. Comparisons of the nucleotide and amino acid sequences between the PLUS domain and the KS attachment domain of human, rat, mouse, or chicken aggrecan revealed significant identity among their nucleotide sequences (44.1, 47.6, 57.4, and 51.6% to human, rat, mouse, and chicken domains, respectively) and amino acid sequences (20.0, 23.5, 23.5, and 40.0% to human, rat, mouse, and chicken domains, respectively). A comparison of the frequency of serine plus threonine residues (potential O-glycosylation sites) to the total amino acid residues of the respective domains between chicken PG-M/versican and human aggrecan also revealed significant similarity with respect to the potential for O-glycosylation between the PLUS domain of chicken PG-M/versican and the KS attachment domain of human aggrecan (Table III). Furthermore, the phylogenetic tree (Fig. 6) constructed as described by Saitou and Nei (24) suggests that the chicken KS domain is more closely related to the PLUS domain than it is to the human, mouse, and rat KS domains. The distance between a pair of sequences is the sum of the branch lengths. Thus, the PLUS domain of PG-M/versican could be considered to correspond to the KS attachment domain of aggrecan. Interestingly, PG-M/versican regulates molecular forms by alternative splicing of the PLUS, CS-α, and CS-β domains, while aggrecan does so by alternative splicing of the EGF and CRP domains (21,25,26).
With regard to comparisons of the PLUS domain of PG-M/versican with the KS domain of aggrecan, two reports describe the relationship between exon boundaries and the functional domains of aggrecan (27,28). According to Valhmu et al. (27), the KS domain is composed of two regions, KS-1 and KS-2. The former is encoded by exon 11 and is well conserved among various animal species (bovine, mouse, rat, and chicken), while the latter is composed of variable numbers of poorly conserved hexapeptide repeats and is encoded by the 5'-end of the large exon 12, which also encodes the CS-1 and CS-2 domains of aggrecan. Considering our finding that the PLUS and CS-α domains of PG-M/versican are encoded by a single exon, the PLUS domain appears to be rather comparable to the KS-2 domain. However, Li and Schwartz (28) seemed to limit the definition of the KS domain to the sequence encoded by exon 11 of the chicken gene.
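The serine-plus-threonine comparison mentioned above (Table III) amounts to a simple residue count. The Python sketch below shows one way to compute that frequency from a one-letter amino acid sequence; the function name `ser_thr_percent` is hypothetical and the peptide is an invented toy string, not a sequence from PG-M/versican or aggrecan.

```python
def ser_thr_percent(protein: str) -> float:
    """Percentage of residues that are serine (S) or threonine (T).

    Ser and Thr are counted because they are potential O-glycosylation sites.
    """
    seq = protein.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    ser_thr = seq.count("S") + seq.count("T")
    return 100.0 * ser_thr / len(seq)


# Toy peptide (invented): 2 Ser + 2 Thr in 10 residues.
print(ser_thr_percent("MSTPTSAGLE"))
```

A high Ser + Thr fraction indicates only the potential for O-linked glycosylation; whether sites are actually substituted has to be determined experimentally.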
Roles of Multiple Forms of PG-M-Chondroitin sulfate proteoglycans in the retina have been extensively studied (29-44), and possible functions have been suggested (34,35,44-47). We demonstrated not only the presence of various forms of PG-M in the developing chicken retina, but also their developmental stage- and age-dependent variations by immunofluorescent staining with polyclonal and monoclonal antibodies to PG-M

Fig. 4 (legend, partial). PCR products were derived from cDNA libraries of the following human and mouse tissues: lanes 2 and 12, human fetal brain; lanes 3 and 13, human fetal liver; lanes 4 and 14, human adult cerebral cortex; lanes 5 and 15, human adult skeletal muscle; lanes 7 and 17, mouse adult brain; lanes 8 and 18, mouse embryonic stem cell; lanes 9 and 19, mouse adult skeletal muscle.

Fig. 6. Phylogenetic tree of the PLUS domain and the KS attachment domains of aggrecan based upon nucleotide sequences. The tree was constructed by the neighbor-joining method (24). The distance between a pair of sequences is the sum of the branch lengths. Nucleotide positions of the KS domains are compared: nucleotides 2171-2353 for chicken (26), nucleotides 2095-2313 for human (22), nucleotides 2161-2343 for rat (18), and nucleotides 2194-2391 for mouse (21).