Genomic Structure and Expression of the Mouse Growth Factor Receptor Related to Tyrosine Kinases (Ryk)*

We report the genomic organization of the mouse orphan receptor related to tyrosine kinases (Ryk), a structurally unclassified member of the growth factor receptor family. The mouse RYK protein is encoded by 15 exons distributed over a minimum of 81 kilobases. Genomic DNA sequences encoding a variant protein tyrosine kinase ATP-binding motif characteristic of RYK are unexpectedly found in two separate exons. A feature of the gene is an unmethylated CpG island spanning exon 1 and flanking sequences, including a TATA box-containing putative promoter and single transcription start site. Immunohistochemical examination of RYK protein distribution revealed widespread but developmentally regulated expression, which was spatially restricted within particular adult organs. Quantitative reduction of Southern blotting stringency for the detection of Ryk-related sequences provided evidence for a retroprocessed mouse pseudogene and a more distantly related gene paralogue. Extensive cross-species reactivity of a mouse Ryk kinase subdomain probe and the cloning of a Ryk orthologue from Caenorhabditis elegans demonstrate that Ryk and its relatives encode widely conserved members of a novel receptor tyrosine kinase subfamily.


From the Ludwig Institute for Cancer Research, P. O. Box 2008, Royal Melbourne Hospital, Victoria 3050, Australia
Members of the receptor tyrosine kinase (RTK)¹ family of transmembrane signal transduction proteins participate in the regulation of diverse cellular activities such as mitogenesis, motility, differentiation, survival, metabolism, adhesion, fasciculation, morphogenesis, and oncogenesis (1). Structurally, and usually functionally, related RTKs are grouped into distinct subfamilies largely on the basis of characteristic combinations of protein motifs in their ligand-binding extracellular domains. Unique in this classification system is the RYK (for related to tyrosine kinase) receptor subfamily which, aside from two leucine-rich motifs in the mammalian orthologues and a possible protease cleavage site common to all RYKs, comprises members devoid of the recognizable extracellular protein modules characteristic of other RTKs (2-4).
The cytoplasmic protein tyrosine kinase (PTK) activity of RTKs is classically activated by homodimerization of receptor protomers mediated by the binding of an extracellular growth factor ligand (5). More recently, heterodimerization (6, 7), extracellular matrix (8), and ligand-independent (9) routes of RTK activation have been recognized as physiologically significant modes of signal generation. Twelve conserved peptide sequence motifs, or subdomains, are the signature of the PTK catalytic domain, including some 13 invariant residues, which participate in the phosphotransfer reaction (10, 11).
A subclass of RTKs that lack demonstrable phosphotransferase activity has emerged recently whose members display substitutions in one or more of the conserved and catalytically important PTK motifs. These include the human CCK4 (12) and chicken KLG (13) orthologues, ERBB3 (14), the human and mouse Eph-like proteins HEP (15) and MEP (16), and the human RYK protein (3). The likely functional importance of RYK subfamily-specific amino acid substitutions to otherwise universally conserved PTK motifs is underscored by their occurrence in such phylogenetically diverse species as Drosophila (Doughnut 2 and Derailed, Ref. 4), human (3), mouse (2) and zebrafish Ryk.³ Although largely functionally uncharacterized, transmembrane proteins such as RYK which bear an apparently catalytically inactive PTK domain may well modulate or relay growth-regulatory cues present in the extracellular environment to cytoplasmic signaling and/or effector molecules. For instance, inactivation of the PTK activity of the Caenorhabditis elegans Eph receptor VAB-1 results in weak mutant phenotypes (relative to extracellular domain mutations), suggesting that a subset of RTKs may execute kinase-independent functions (17). These may include recruitment of heterologous protein kinase-competent receptor subunits, as in the case of a kinase-defective EGFR mutant still capable of activating mitogen-activated protein kinase (18). That mice homozygous for a deletion of exons encoding the PTK domain of VEGFR1 (Flt-1), an RTK essential for embryonic angiogenesis, exhibit normal vessel development and are fully viable suggests that the primary biological role of this receptor is in ligand binding rather than signal transduction (19).
Identification of derailed in a P element enhancer trap screen for fasciculating neurons demonstrated a requirement for growth cone pathfinding cues transduced by this RYK subfamily member during Drosophila nervous system development (4). Furthermore, defective somatic muscle insertion site selection into the epidermis of derailed mutants suggests that the mechanism for muscle insertion target recognition is biochemically similar to that used in axon pathfinding (20). A pleiotropic role for ERBB3 in cardiac and central nervous system development has been demonstrated in ErbB3-null mice (21,22). However, a role for other PTK-inactive RTKs in growth regulation remains to be demonstrated.
Here we report the structure of the mouse Ryk transcription unit, including the likely location of its promoter, as identified by an unmethylated CpG island. The unusual organization of domains in the mouse RYK protein extends to the genomic structure that displays features unique within the RTK/growth factor receptor gene family. Southern blotting results suggestive of the existence of a retroprocessed mouse Ryk pseudogene are also presented. Immunohistochemical analysis has been undertaken to define the spatial distribution of RYK expression within individual adult mouse organs. Conserved structural elements that define the RYK subfamily are identified in five metazoan proteins, including one from C. elegans, indicating an early evolutionary origin of these RTK-like molecules. Consistent with subsequent duplications of the ancestral vertebrate genome, the mouse, human, and zebrafish genomes also contain putative Ryk subfamily paralogues that we find to be phylogenetically conserved molecules with likely roles in cellular growth regulation.
DNA Sequencing-Double-stranded plasmid DNA was sequenced by the dideoxy chain termination method (PRISM™ Ready Reaction DyeDeoxy™ Terminator Cycle Sequencing kit; Perkin-Elmer). A 1.8-kb XbaI-BglI fragment flanking the presumptive initiator methionine was sequenced on both strands by generating a set of nested 200-300-bp deletions in both directions using an Exo-Size Deletion Kit (New England Biolabs, Beverly, MA; Ref. 26).
Cloning of Ceryk-A search of a C. elegans database (ACeDB) using the Doughnut extracellular domain revealed homology with an open reading frame on genomic cosmid C16B8 designated C16B8.1 (27), here referred to as Ceryk. To verify the exon-intron structure predicted by the Genefinder program (28), oligonucleotides to putative Ceryk exons were used to amplify the entire coding sequence, and flanking regions, in overlapping RT-PCR products. Reverse transcription of mixed stage C. elegans total RNA (~5 μg) was performed using a first strand synthesis for RT-PCR kit (Amersham); cDNA amplification by touchdown PCR was performed as described (29). Detection of sorting signals and cleavage sites in the N termini of the RYK proteins was performed using SignalP, and transmembrane domains were identified using PSORT.
PCR-Amplification of human RYK exons from genomic DNA (~100 ng) was performed as described above for PCR of C. elegans cDNA. Oligonucleotides were designed to amplify a region of the human RYK gene homologous to that contained in the mouse Ryk Zoo blot probe, 5′-CATTGATGACACACTTCAAG-3′ (human cDNA nt 1539-1558) and 5′-CACATCACTAGCGCTAGAG-3′ (human cDNA nt 1698-1679), which amplify a 159-bp product from human cDNA. The oligonucleotide probe 5′-CTTTCAAGAGCCATCCAAC-3′ (human cDNA nt 1661-1643) is fully internal to the primers used in the PCR.
Sequence Analysis of the Ryk Promoter-Identification of putative transcription factor binding elements was performed by searching position-weighted nucleotide distribution matrices using MatInspector (30). The CpG island was analyzed with the EGCG program CpGplot using the algorithms of Gardiner-Garden and Frommer (31).
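As an illustration of what a search with position-weighted matrices involves, the following Python sketch scores every window of a sequence against a small matrix and keeps the windows above a similarity cutoff. It is a simplified stand-in only: the count matrix `TATA_COUNTS`, the function names, the toy sequence, and the normalization (window score divided by the best attainable score) are assumptions made here for illustration, and MatInspector's own core and matrix similarity calculations differ in detail.

```python
from typing import Dict, List, Tuple

# Hypothetical count matrix for a TATA-like element (one column per position).
# The numbers are illustrative only and are not taken from any matrix library.
TATA_COUNTS: Dict[str, List[int]] = {
    "A": [2, 18, 1, 19, 14, 18],
    "C": [3, 0, 1, 0, 0, 0],
    "G": [1, 0, 0, 0, 0, 1],
    "T": [14, 2, 18, 1, 6, 1],
}


def similarity(site: str, counts: Dict[str, List[int]]) -> float:
    """Score a site as (sum of observed counts) / (sum of column maxima)."""
    width = len(counts["A"])
    if len(site) != width:
        raise ValueError("site length must equal matrix width")
    zero = [0] * width  # used for ambiguous bases such as N
    got = sum(counts.get(base, zero)[i] for i, base in enumerate(site.upper()))
    best = sum(max(counts[b][i] for b in "ACGT") for i in range(width))
    return got / best


def scan(seq: str, counts: Dict[str, List[int]],
         threshold: float = 0.8) -> List[Tuple[int, str, float]]:
    """Return (0-based start, site, score) for windows scoring above threshold."""
    width = len(counts["A"])
    hits = []
    for start in range(len(seq) - width + 1):
        site = seq[start:start + width].upper()
        score = similarity(site, counts)
        if score > threshold:
            hits.append((start, site, round(score, 3)))
    return hits


# Toy sequence (hypothetical, not the Ryk promoter).
print(scan("GGGTATAAAGGG", TATA_COUNTS))
```

Only one strand is scanned here; a fuller search would also score the reverse complement of each window.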
Production of Monoclonal Antibodies against the Human RYK Extracellular Domain-A purified FLAG-tagged version of the human RYK extracellular domain (RYK-EX-FLAG) was used to immunize female BALB/c mice prior to fusion of spleen cells with the mouse myeloma P3X63Ag.653 (NS-1). Hybridoma supernatants were screened for monoclonal antibodies to the RYK extracellular domain by an enzyme immunoassay. Antibodies chosen for further analysis (RYK-1, RYK-2, and RYK-3) were subcloned twice and isotyped as described (32).
Immunohistochemistry-Tissue sections were stained with a two-step immunoperoxidase technique as described (33), with the modification that the RYK-1 monoclonal was detected with a goat anti-mouse IgM-horseradish peroxidase secondary reagent (Dako, Copenhagen, Denmark).

RESULTS
Structure of the Mouse Ryk Gene-The mouse RYK protein is encoded by 15 exons covering a minimum of 81 kb (Fig. 1a). Twenty-four independent phage clones isolated from a 129/Sv genomic lambda library and two bacterial artificial chromosome (BAC) clones identified by amplification with the primer pair BAC.01 and BAC.02 (Fig. 1c) were analyzed to define the organization of mouse Ryk. A 4-kb XbaI fragment containing exon 1, a CpG island, the transcription start site, and putative promoter sequences (Fig. 1, a and b) was subcloned from BAC.Ryk.2. The exact size of intron 1 (>19 kb) has not been determined.
Two possible translation initiation codons (M1 and M2; see

Six independent groups have now cloned and sequenced the mouse Ryk cDNA (2, 37-41). When these sequences are aligned (not shown), nucleotide differences that predict amino acid substitutions are apparent. The major differences are found in the 5′ region of the cDNA and most likely reflect difficulties in sequencing this highly G + C-rich region. No nucleotide ambiguities were encountered in our genomic sequencing of this portion of the open reading frame, the sequence of which always matched a majority of published cDNAs, consistent with the interpretation that differences between the reported sequences are the result of sequencing errors. The long 5′-untranslated region published by Kelman et al. (39) appears to be part of a chimeric cDNA clone given that the sequence is unrelated to genomic DNA sequence in Fig. 2b beginning at position 1763 and extending upstream. Additionally, our BAC.Ryk clones do not hybridize with oligonucleotides complementary to the unique 5′-untranslated end reported by Kelman et al. (39). The kinase activation loop of mouse RYK, bounded by subdomains VII and VIII of the PTK catalytic core (10), is encoded by exon 13 and contains a conserved tyrosine residue at position 454 (Ref. 2; see Fig. 6) which is often involved in catalytic autoregulation of diverse PTK superfamily members (42). The two alternative polyadenylation signals reported by Kelman et al. (39), both of which perfectly match the consensus sequence AATAAA, reside in exon 15 (Fig. 1a).
Sites for rare-cutting enzymes recognizing CpG-containing hexanucleotide sequences are shown in boldface. The 0.7-kb intron 1-derived probe used in Southern blotting is indicated. The BssHII site in brackets is either incompletely methylated or polymorphic in genomic DNA from W9.5 ES cells (see text and Fig. 3b). b, DNA sequence of the region surrounding exon 1. The boundaries of a CpG island are indicated by curved arrows and a TATA box plus other putative transcription factor binding motifs, with core sequences underlined, are boxed. Only the single best potential binding

The sequences of mouse Ryk exon-intron boundaries are presented in Table I. The mouse Ryk transcription start site (see below) defines the 5′ boundary of exon 1. All introns conform to the 5′-gt . . . ag-3′ motif, with the exception of exon 8 which has a modified 3′ splice acceptor site (Table I).
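The 5′-gt . . . ag-3′ rule for intron boundaries can be checked mechanically. The following Python sketch is a minimal illustration, assuming each intron is supplied as a plain string running from the first intronic base to the last; the function name and the two toy sequences are hypothetical and are not taken from Table I.

```python
def follows_gt_ag_rule(intron: str) -> bool:
    """Return True if the intron begins with GT and ends with AG."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Hypothetical toy introns, for illustration only.
toy_introns = {
    "canonical": "gtaagtcctttctcttttcag",
    "variant_acceptor": "gtgagtatttctccttttgg",
}

for name, seq in toy_introns.items():
    print(name, follows_gt_ag_rule(seq))
```

A boundary that fails this check, like the second toy sequence, would be flagged for closer inspection of its acceptor site.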
Identification of a CpG Island Containing the Putative Mouse Ryk Promoter-The DNA sequence flanking exon 1 is shown in Fig. 2b. No consensus sequence for a splice acceptor site associated with this exon was found in the upstream region. This and the location of the transcription start site (see "Discussion") suggest that the promoter resides at least partly within this sequence. The presence of a CpG island in this area, recognizable as a high density of sites for restriction enzymes with G + C-rich recognition sequences containing one or more CpG dinucleotides (Fig. 2a; Ref. 43), is a landmark for the Ryk promoter since a CpG island was always found to cover the whole or part of the promoter in a survey of 102 island-containing genes expressed in a wide variety of tissues (44).
CpG islands are defined as sequences >200 bp in size with a moving average G + C content >50% and an observed/expected CpG dinucleotide frequency >0.6 (31, 44). The 859-bp Ryk CpG island lies between nucleotides 1469 and 2347, is characterized by elevated G + C content (75%) and an observed/expected CpG ratio of 1.004 (Fig. 3a), and is unmethylated at CpG residues (Fig. 3b). Mouse genomic DNA prepared from low passage embryonic stem cells (W9.5 line, derived from mouse 129/Sv C3-+c+p, Jackson Laboratory stock number JR0090; Ref. 45), which express Ryk mRNA, was subjected to Southern blot analysis with methylation-sensitive restriction enzymes. Digestion with XbaI, or double digestion with XbaI and either BssHII or EagI (both of which are blocked by methylation of either of the two CpG dinucleotides in each of their recognition sequences (43)), was used to assess the methylation status of four CpG dinucleotides within the putative island. When XbaI-digested DNA was hybridized with the 0.7-kb EcoNI-XbaI probe (see Fig. 2a), the expected 4-kb XbaI fragment was detected (Fig. 3b). However, this fragment was not present in doubly digested DNA, demonstrating the presence of internal, unmethylated sites cleavable by BssHII and EagI. Although the 0.7-kb probe hybridized with the expected 2.2-kb BssHII-XbaI fragment, an additional 1.4-kb BssHII-XbaI fragment was detected. This smaller fragment is generated by cleavage at a BssHII site 1.4 kb upstream of the most 3′ XbaI site (Fig. 2a) and reflects either clonal variation in methylation of CpGs in the sequence 5′-GCGCGC-3′, as previously reported for CpGs in DNA flanking the mouse liver aprt CpG island (46), or an allelic polymorphism. Allelic variation in simple sequence length polymorphisms has been reported in the 129 mouse substrains and ES cell lines derived from them (47), demonstrating that alleles are still segregating in these supposedly highly inbred mice.
As judged by the status of four CpG dinucleotides within the Ryk CpG island, this region of DNA is unmethylated and is expected to constitutively maintain this property in expressing and nonexpressing cell types at all stages of development (48,49). The unmethylated or variably methylated status of the BssHII site suggests that CpG dinucleotides immediately surrounding the island may also be protected from methylation.
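The sequence criteria used above to define a CpG island (length >200 bp, G + C content >50%, observed/expected CpG >0.6) can be expressed as a short calculation. The Python sketch below is illustrative only: it applies the thresholds to a whole sequence rather than to the moving window used by CpGplot, computes the observed/expected ratio as (number of CpG × N) / (number of C × number of G), and uses hypothetical function names and toy sequences.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG = (#CpG * N) / (#C * #G)."""
    seq = seq.upper()
    n_c, n_g = seq.count("C"), seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (n_c * n_g)


def is_cpg_island(seq: str, min_len: int = 200,
                  min_gc: float = 0.5, min_ratio: float = 0.6) -> bool:
    """Apply the three thresholds to the whole sequence (no moving window)."""
    return (len(seq) > min_len
            and gc_fraction(seq) > min_gc
            and cpg_obs_exp(seq) > min_ratio)


# Toy sequences (hypothetical, not the Ryk sequence).
for name, seq in {"cg_repeat": "CG" * 150, "acgt_repeat": "ACGT" * 75}.items():
    print(name, round(gc_fraction(seq), 3), round(cpg_obs_exp(seq), 3),
          is_cpg_island(seq))
```

Because the whole sequence is scored at once, this sketch cannot locate island boundaries; doing so requires sliding a window along the sequence as CpGplot does.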
motif for each of the transcription factors shown is boxed (core similarity >0.70, matrix similarity >0.8; see "Experimental Procedures"). The transcription start site is marked by a square arrow, the antisense 35-mer used for primer extension by Yee et al. (40) is represented by a half-arrow, and alternative translation initiation methionines (M1 and M2) are circled. The exon 1 3′ splice donor site (type I) is boldfaced. SRF, SRE-binding factor; NRF-2, nuclear respiratory factor 2; ISRE, interferon-stimulated response element; Tal-1, T-cell acute lymphocytic leukemia protein 1; ER, estrogen receptor; STAT, signal transducer and activator of transcription; AML-1, acute myeloid leukemia protein 1; T3R, thyroid hormone receptor; HNF-3β, hepatocyte nuclear factor 3β; IRF-1, interferon regulatory factor 1; EGR, early growth response protein; AhR/ARNT, aryl hydrocarbon receptor/aryl hydrocarbon receptor nuclear translocator; RREB-1, Ras-responsive element binding protein 1; NGFI-C, nerve growth factor-induced protein C; Gfi-1, growth factor independence protein 1.

Sequences immediately flanking the transcription start site (see "Discussion") display potential binding motifs for a variety of transcriptional modulators (Fig. 2b). The MatInspector program (30) was used to search the sequence for high-quality matches to a data base of position-weighted nucleotide distribution matrices. DNA elements potentially mediating the response to growth-regulatory stimuli through the sequence-specific binding of transcription factors are particularly abundant in the putative Ryk promoter (Fig. 2b). These include response elements for transcription factors activated by mitogenic (e.g. Ets family, AP-1, n-Myc, Egr family, E2F, RREB-1, NGFI-C, c-Myb), anti-mitogenic (IRF-1, ISRE), metabotropic (T3 receptor, aryl hydrocarbon receptor and nuclear translocator), and differentiation (HNF-3β, GATA-1, Nkx-2.5, MyoD) signals.
This is consistent with the likely function of RYK as a cell-surface growth factor receptor or co-receptor, the expression of which is expected to be under stringent spatial and temporal control (see "Immunohistochemical Analysis of RYK Expression in the Mouse," below).
Identification of Mouse Ryk-related Sequences by Southern Blotting-A screen of the mouse genome by Southern blotting using mouse Ryk cDNA-derived probes was initiated as a first step in resolving the issue of whether mammals have one or more Ryk gene paralogues (see Ref. 41). Restriction fragment length polymorphism mapping in the mouse (50) and human cytogenetic analysis (3)

By using a gradient of hybridization and washing stringencies originally employed to detect novel human RTKs by Southern blotting of genomic DNA (25, 53), we have detected two classes of mouse genomic DNA fragments related to the Ryk gene characterized herein. The first class is detectable at high stringency (Fig. 4c, arrows) using a Ryk kinase domain probe. However, only the hybridization signals expected from the structure of the mouse Ryk gene (Fig. 1a) are observed when genomic DNA digests are hybridized at high stringency with a probe to the Ryk extracellular domain (Fig. 4a). The nature of this first class of Ryk-related hybridization signals (high signal strength of the fragments, in particular the 3.8-kb EcoRI and 2.2-kb XbaI fragments, at high stringency) is consistent with a contiguous organization of kinase domain exons comprising a partially retroprocessed pseudogene (Ryk). At reduced stringency, hybridization signals not present at high stringency are visible using both extracellular and kinase domain probes from the Ryk cDNA (Fig. 4, b and d, arrowheads). Using a v-erbB PTK domain probe, this level of hybridization stringency has proven to be suitable for discriminating between RTKs highly related to the probe sequence, such as subfamily members, while more distantly related members of the PTK superfamily are not detected (25, 53). The detection of DNA fragments at reduced stringency with probes to the extracellular and kinase domains of mouse Ryk further suggests that the hybridization signals represent a genuine subfamily member.
We predict that this second class of fragments is derived from a novel mouse member of the Ryk RTK subfamily. Screening of genomic DNA libraries enriched for Ryk, which we have found to be under-represented in supposedly complete genomic DNA libraries, and Ryk-related fragments is underway.
Immunohistochemical Analysis of RYK Protein Expression in the Mouse-Analysis of Ryk mRNA expression in a large sample of mouse and human tissues has been reported by us and others (2, 37-41, 54-57). These analyses show that Ryk mRNA is distributed ubiquitously. In order to assess the spatial distribution of the functional gene product, we generated an anti-RYK monoclonal antibody and performed immunohistochemistry on sections of embryonic and adult mouse tissues. The RYK-1 antibody was raised against the extracellular domain of human RYK (Fig. 5a) and is specific for the extracellular domain of this receptor (Fig. 5, b and c). The RYK-1 monoclonal antibody has been used in Western blotting analysis of cell lines overexpressing a variety of RTK extracellular domains (e.g. EGFR, VEGFR1, VEGFR2, VEGFR3, and Tie2) with no evidence of cross-immunoreactivity. The dual Western blotting signal from MCF-7 lysate at approximately 90 and 47 kDa (Fig. 5c) is also observed in reduced immunoprecipitates of metabolically labeled MCF-7 cells using an antiserum raised against the C terminus of human RYK (3) and is consistent with proteolytic processing of the receptor extracellular domain at the tetrabasic KRRK site. Disulfide bonding between Cys-156 and Cys-191 of the human receptor, which represent conserved residues that flank the KRRK motif in all RYK subfamily members (see Fig. 6), may be an important post-translational modification. The RYK-1 antibody cross-reacts with both mouse and zebrafish RYK but not with Drosophila Derailed.⁴

RYK protein expression was detected in a large proportion of adult mouse tissues and on cells from a variety of origins. In the kidney, strongest RYK staining is seen in tubules immediately under the cortex (Fig. 5d), which gradually declines in strength toward the pelvic region. Glomeruli within all regions of the kidney are uniformly unstained with the RYK-1 antibody. RYK expression is limited to epithelial cells lining the renal tubules.
Strong and uniform RYK expression is evident in liver lobules, associated with the membranes of hepatocytes (Fig. 5f). The adrenal medulla shows no reactivity with the RYK-1 antibody, whereas high RYK expression is seen in the adrenal cortex (Fig. 5h). Strong and uniform RYK expression is present on embryonic day 14 myocardium (Fig. 5j); this represents the earliest expression of RYK that we have observed in the mouse (embryonic days 8-12 are unstained; not shown). Adult spleen shows staining with the RYK-1 antibody restricted to the red pulp (Fig. 5l). Within this compartment, megakaryocytes show membrane-localized immunoreactivity, whereas erythroblasts and lymphocytes are free of staining. In addition, reticular fibers and endothelial cells lining the venous sinuses are stained. The adult small intestine displays strong staining with the RYK-1 antibody on the epithelium and connective tissue/vascular core of villi, whereas the glandular crypts are unstained (Fig. 5n). The mucosa of the large intestine is stained on connective tissue surrounding intestinal glands, which are themselves free of RYK-1 immunoreactivity (Fig. 5p).
To explore the range of evolutionary conservation of Ryk subfamily members, low stringency Southern blot hybridization of a mouse exon 13 probe to a "Zoo blot" was performed. Extensive and strong hybridization of the Ryk-specific kinase subdomain probe to genomic DNA from a wide range of metazoans was observed (Fig. 7a). The 3.6-kb mouse HindIII fragment represents the expected hybridization signal from mouse Ryk, and the strong ~0.7-kb signal, which is also seen at high stringency in Fig. 4c, is most likely derived from a mouse pseudogene. Weaker hybridization to other mouse genomic HindIII species is also visible (at ~1.3, ~1.5, and ~1.8 kb, see also Fig. 4d). To identify the origins of the two strongest hybridization signals from human genomic DNA (Fig. 7a), PCR was employed to map the organization of human RYK sequences recognized by the Zoo blot probe. Using primers to human RYK corresponding to those used to produce the mouse exon 13 Zoo blot probe, a 159-bp PCR product, which was uninterrupted by introns and not cleavable by HindIII (3), was amplified from genomic DNA (Fig. 7b). This result indicates that the mouse Zoo blot probe recognizes a single human RYK exon resistant to cleavage by HindIII within the limits of homology to the probe. Thus it is likely that the signals at 3.2 and 4.3 kb (Fig. 7a, human) reflect hybridization to human RYK and its pseudogene. Hybridization to RYK-related sequences in the human genome is also evident as weaker signals (e.g. at ~1.8 kb; Fig. 7a). Mapping of exon-intron boundaries of the zebrafish ryk gene predicts the mouse Zoo blot probe to detect only one fragment, such that the two strong zebrafish hybridization signals (Fig. 7a) can be confidently assigned to different loci.

⁴ J. L. Bonkovsky, personal communication.
Hybridization of the kinase domain probe to distinct genomic DNA fragments was observed in all species tested, with the exception of Escherichia coli (negative control). After prolonged exposure, three autoradiographic signals were visible from Drosophila genomic DNA (not shown). Ryk gene orthologues and paralogues, both characterized and uncharacterized, therefore seem to be widely conserved over the phylogenetic scale, from mammals (human, mouse, horse, dog, cow, sheep, and rabbit), birds (chicken), through lower vertebrates (Xenopus, zebrafish, and carp), to invertebrate metazoans (Drosophila and C. elegans).

DISCUSSION
We have determined in detail the genomic organization of the mouse member of the growing subfamily of growth factor receptors related to tyrosine kinases (Ryk). Although all subfamily members are currently orphan receptors, RYK molecules are expected to function in the transduction of growth-regulatory information across the plasma membrane by virtue of their prototypical RTK topology, as has been demonstrated for all other RTK subfamilies (5).
The mouse Ryk transcription unit resides on chromosome 9 (39, 50) and spans a minimum of 81 kb of genomic DNA. A large first intron is a feature of the mouse Ryk gene which is shared with zebrafish ryk and Drosophila derailed (4 . The GXGXXG motif vital for ATP binding in PTKs is modified to XXGXXG in the RYK subfamily. This motif is invariably encoded by a single exon; however, in the mouse and C. elegans Ryk genes a type I splice donor site interrupts the codon for the second consensus glycine residue, and a type 0 splice site separates the fourth and fifth residues of motif I in the zebrafish ryk gene (data not shown). Other mouse RYK PTK subdomains, with the exception of IX, are encoded by single exons, as is usually the case for other RTK genes. This supports the view that the RYK kinase-like domains from phylogenetically diverse species are evolutionarily related.
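The distinction between splice site types drawn above can be made explicit with a small calculation. The Python sketch below assumes the usual convention that a type 0 site falls between two codons, a type I site after the first nucleotide of a codon, and a type II site after the second; positions are counted from the first nucleotide of the six-residue motif, which is taken to start in frame. The function name and the example positions are hypothetical choices made to mirror the two cases described in the text.

```python
from typing import Tuple

PHASE_TO_TYPE = {0: "0", 1: "I", 2: "II"}


def split_position(coding_nt_upstream: int) -> Tuple[int, int]:
    """Return (phase, residue) for an intron placed after N coding nucleotides.

    For phase 0 the residue is the last complete codon before the intron;
    for phases 1 and 2 it is the codon interrupted by the intron.
    """
    if coding_nt_upstream < 0:
        raise ValueError("nucleotide count must be non-negative")
    phase = coding_nt_upstream % 3
    residue = coding_nt_upstream // 3 + (1 if phase else 0)
    return phase, residue


# Hypothetical intron positions within a six-codon motif (18 nt).
for label, nt in [("after nt 7", 7), ("after nt 12", 12)]:
    phase, residue = split_position(nt)
    where = "interrupts residue" if phase else "follows residue"
    print(f"{label}: type {PHASE_TO_TYPE[phase]}, {where} {residue}")
```

Under these assumptions an intron after nucleotide 7 is a type I site interrupting the third motif residue (the second consensus glycine of GXGXXG), and an intron after nucleotide 12 is a type 0 site separating the fourth and fifth residues.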
We speculate that the significance of exon splitting of RYK subdomains I and IX, together with unusual kinase subdomain sequences, may be that the ancestral Ryk gene arose very early in metazoan evolution and has since been subjected to selective pressure to maintain a modified PTK activity that relies heavily on atypical residues at normally conserved positions. Molecular modeling of the mouse RYK PTK domain indicates that the nucleotide-binding cleft is particularly large and may therefore indicate a preference for a phosphodonor substrate other than ATP.⁵ This prediction is currently being tested.
Primer extension analysis performed by Yee et al. (40) allowed us to identify the transcription start site at nucleotide 1627, within the CpG island (Fig. 2b). This result is consistent with the finding that almost every widely expressed gene transcribed by RNA polymerase II that has been examined, such as mouse Ryk, has a CpG island at the 5′ end that includes the transcription start site (Refs. 31 and 44; CpG island data base 4.0).⁶ Assignment of the transcription start site leads us to predict the synthesis of two mouse Ryk mRNA species of 2.3 and 2.6 kb. Two polyadenylation sites in exon 15, which appear to be utilized at equal frequencies in most tissues (2, 37-41), define the predicted alternative 3′-mRNA ends. By using RT-PCR, we can find no evidence for alternative splicing of the Ryk primary transcript in 3T3 L1 cells.⁷ Wide variation in the reported lengths of mouse Ryk transcripts could be due to the inconsistent use of RNA size standards (i.e. RNA ladders versus 28 S and 18 S rRNA species) in different laboratories. The most accurate sizing of Ryk transcripts seems to have been reported by Maminta et al. (61), where an RNA ladder has been used to estimate mRNA lengths of 2.1 and 2.6 kb. These correspond well with Ryk mRNA sizes predicted here by identification of the transcription start site and alternative polyadenylation sites. However, extensive and stable secondary structure in the G + C-rich 5′ end of the Ryk mRNA, which survives denaturing conditions to variable extents according to the particular methods employed by different laboratories, remains an alternative explanation for the apparent distortion of transcript lengths relative to RNA standards.
The G + C-rich nature of the 5′ end of the mouse Ryk mRNA is a common characteristic of transcripts encoding growth-regulatory proteins. The likely mRNA secondary structure has been proposed to function in the attenuation of translation as an extra level of gene regulation (62). This region of the Ryk gene is also highly enriched for the CpG dinucleotide relative to the bulk mammalian genome, where CpG depletion to 20% of the frequency expected from base composition has resulted from methylation of CpG and the high mutability of 5MeCpG to TpG or CpA (CpG suppression; Refs. 48 and 49). CpG dinucleotides in the Ryk promoter are protected from genomic DNA methylation and are present at the frequency predicted by base composition alone (i.e. no CpG suppression). These properties identify the sequence spanning exon 1 of the mouse Ryk gene as a CpG island 859 bp in size (Fig. 3).
A transcriptionally active CpG island represents a domain of "open" chromatin structure characterized by core histone hyperacetylation, histone H1 depletion, and a nucleosome-free region (63). Constitutive binding of transcription factors is essential for the maintenance of a methylation-free CpG island (46, 64). Virtually all sequenced genes with widespread expression patterns are associated with a CpG island (Ref. 44; CpG island data base 4.0),⁸ and mouse Ryk can now be added to this class of genes. The human RYK gene is likely to be marked by a CpG island given that approximately 80% of islands are common to mouse and man (65) and human RYK also shows widespread expression of an mRNA with a G + C- and CpG-rich 5′ domain (3, 57). Other RTK genes with known CpG islands include mouse and human FGFR3 (66), human EGFR (67), human NTRK1/TRKA (Ref. 68; 423 bp CpG island in exon 1 detected with CpGplot; data not shown), human FLT1/VEGFR1 and FLT3 (69), human and mouse KIT (59), and the human PDGFαR (CpG island data base 4.0).⁸

Our immunohistochemical analysis of mouse RYK further defines the expression pattern of this unusual receptor. Whereas Northern blot and RNase protection assays of Ryk mRNA indicate near ubiquitous expression (2, 38-41, 55, 56), Wang et al. (57) have reported localization of RYK mRNA to the epithelial and stromal compartments of human tissues such as ovary, brain, lung, colon, kidney, and breast by in situ hybridization. The immunohistochemical staining results presented here confirm localization of the functional Ryk gene product to the tubular epithelium in kidney and to the stromal compartment of the large intestine and indicate that mouse RYK is localized to the stroma and epithelium of villi in the small intestine. However, organs including embryonic day 14 heart and adult liver show strong and homogeneous expression of RYK throughout. Furthermore, the spatially restricted localization of mouse RYK to distinct parenchymal compartments of organs such as the adrenal gland and spleen suggests that specific differentiated cell types not related in embryonic origin may require the signal transduced by RYK.

FIG. 6. Deduced amino acid sequence of C. elegans RYK and alignment with other members of the RYK subfamily of proteins. Alternative ATG translation start codons are marked in the mouse sequence (*). N-terminal signal sequences and their predicted sites of cleavage are shown (ƒ). The transmembrane domains (TM), mammalian leucine-rich motifs (LRMs), putative tetrabasic cleavage sites (TBC), conserved cysteine residues (q), Hanks kinase motifs (I-XI), potential autoregulatory tyrosine residue homologous to that in pp60src (crossed circle), boundaries of the protein tyrosine kinase
We have demonstrated the existence of multiple Ryk subfamily members in the mouse genome by Southern blotting: the canonical Ryk (2), a likely partial retroprocessed pseudogene derived from Ryk kinase domain exons, and a Ryk-related gene detectable at quantitatively reduced stringency using Ryk cDNA-derived probes to the extracellular and PTK domains.
The human genome appears to share with the mouse the feature of a Ryk pseudogene, as well as the Ryk-related gene. We have detected Ryk subfamily-like sequences in the genomes of mammals, chicken, fish, and Drosophila by reduced stringency Southern blotting and in a nematode worm by cDNA amplification and sequencing. Although the stringency conditions used for Southern blot hybridization were relaxed, more stringent and extensive washing was performed, suggesting that the level of cross-species nucleotide sequence conservation in the kinase-like domain is high.
Southern blot analysis of the Danio rerio (zebrafish) genome showed hybridization signals representing the ryk locus plus a related sequence of unknown identity. From the simple nematode worm C. elegans, we have sequenced a Ryk cDNA orthologue, Ceryk, which predicts a transmembrane protein demonstrating structural conservation of RYK subfamily-specific features. These include a compact extracellular domain containing a putative basic protease cleavage site flanked by universally conserved cysteine residues. Post-translational processing of the human RYK exodomain into αβ disulfide-linked subunits analogous to the c-MET/HGF receptor (70), mouse STK/RON (71), the insulin receptor (72), and the insulin-like growth factor 1 receptor (73) is supported by our Western blotting and immunoprecipitation data (Fig. 5c and Ref. 3), although we have found human RYK to be inconsistently processed in this manner. In the predicted CeRYK protein, a protein tyrosine kinase-like domain with strongest sequence homology to other RYKs, distinctive RYK-specific amino acid substitutions in PTK subdomains I and VII (11), and a C terminus conforming to the consensus -Y(V/I)-COOH, representing a possible PDZ domain ligand (74), are identifiable in the intracellular domain (Fig. 6). No likely candidate C. elegans mutant phenotypes mapping to this CELC16B8 chromosomal locus are known.
FIG. 7. Phylogenetic conservation of Ryk subfamily molecules. a, zoo blot of genomic DNAs digested with HindIII and hybridized with a 159-bp probe encoding mouse Ryk exon 13. The expected 3.6-kb hybridization signal is present in the lane containing mouse DNA, in addition to a smaller fragment of ~700 bp, probably derived from a pseudogene. All species gave rise to observable hybridization signals (lane containing Drosophila DNA required longer exposure, not shown) with the exception of E. coli (negative control). The two strong signals in the zebrafish lane represent distinct genes (see text). b, PCR analysis of human RYK exons encoding sequences homologous to the mouse exon 13 probe. PCR primers flanking the human cDNA sequence homologous to the mouse zoo blot probe amplify an intronless 159-bp product from human genomic DNA, not cleavable by HindIII, which hybridizes to an internal oligonucleotide (lower panel, 10-min exposure). M, 100-bp ladder; G, human genomic DNA template; ᮎ, negative control (no template).
The RYK proteins are structurally unique in at least two significant features. Few recognizable protein motifs are present in an unusually short extracellular domain, and phylogenetically conserved intracellular substitutions in at least two highly conserved PTK sequence elements responsible for cooperatively binding the Mg2+-ATP phosphate donor complex at the active site may abrogate, or more likely modify, catalytic activity. These changes perhaps reflect involvement of RYK in a unique cell-surface signal transduction complex, where it may interact with a novel family of extracellular ligands and/or be recruited into a heterodimeric PTK-competent receptor complex. Alternatively, RYK could conceivably function to attenuate signaling from such a complex by failing to transactivate the protein tyrosine kinase activity of the partner receptor in competition with homodimeric receptor formation. As a third possibility, atypical substrate contacts mediated by the nucleotide binding cleft of RYK subfamily proteins, comprising variant Hanks' motifs I and VII, may indicate an altered specificity for the phosphate, sugar, and/or divalent cation moieties of the phosphodonor complex, as proposed by Kelman et al. (39). We are further investigating the function of RYK in embryonic and early postnatal development, and its potential role in cancer, through the generation and analysis of Ryk-deficient mice and a screen for ligands of the RYK receptor.