Structure and Promoter Analysis of the Human unc-33-like Phosphoprotein Gene: E-BOX REQUIRED FOR MAXIMAL EXPRESSION IN NEUROBLASTOMA AND MYOBLASTS*

The human unc-33-like phosphoprotein (hUlip/CRMP-4) is a member of a family of developmentally regulated genes that are highly expressed in the nervous system. Mutations in the C. elegans unc-33 gene lead to worms with abnormal movements. The hUlip gene encodes a 570-amino acid protein with 98% homology to its murine (Ulip) (Byk, T., Dobransky, T., Cifuentes-Diaz, C., A. (1996) J. Neurosci. 16, 688–701) and rat (CRMP-4) (Wang, L. H., Strittmatter, S. M. J. Neurosci. 16, 6197–6207) counterparts (Gaetano, C., T., and Thiele, C. J. (1997) J. Biol. Chem. 272, 12195–12201). The hUlip gene was isolated from a human genomic library. It contains 15 exons, including an exon defined by an anaplastic oligodendroglioma expressed sequence tag, and spans at least 61.7 kilobases. hUlip lacks sequences corresponding to the first six exons found in unc-33. unc-33 exons correspond to homologous hUlip exons

To study the genetics of animal behavior, Caenorhabditis elegans were mutagenized, and mutants were identified that had uncoordinated movements and partial defects in their egg-laying behavior (5,6). The unc-33 mutations were unique in that they were associated with alterations in axonal outgrowth and guidance of several classes of neurons yet did not involve any defects in neural cell lineages, numbers, or positioning (7,8). The gene responsible for this defect encoded a novel intracellular protein, unc-33, that was highly expressed within neuronal processes during early embryogenesis (1). The first putative mammalian homologue of unc-33 was identified functionally as a 62-kDa protein, collapsin response mediator protein (CRMP-62), that mediated collapsin signaling (9). During neural development, collapsins participate in axonal pathfinding by stimulating growth cone collapse and preventing neurite extension. The identification of other unc-33-related genes followed; TOAD-64 (turned on after cell division) was identified as a 64-kDa protein that was highly expressed in the brain, particularly in the cerebral granular neurons, after cessation of cell proliferation (10); Ulip (unc-33-like phosphoprotein) (2) was identified as a highly expressed anomalous protein in the brain detected by anti-stathmin antibodies; hUlip was identified as an anonymous gene induced during the neuronal differentiation of human neuroblastoma cells (4); 5,6-dihydropyrimidine amidohydrolase (an enzyme involved in uracil and thymine catabolism) and several genes termed 5,6-dihydropyrimidine amidohydrolase-related proteins (DRP-1-3) (11) were found to be homologous by sequence analysis.
A systematic study indicated that there are at least four mammalian homologs of unc-33 (Ulip/hUlip/CRMP-4/DRP-3; Ulip2/CRMP-2/CRMP-62/TOAD-64/DRP-2; CRMP-1; and CRMP-3) that are expressed in distinct yet partially overlapping patterns during the development of the nervous system (3). CRMP-1 and Ulip (CRMP-4) have the most restricted pattern of expression, being turned on in E15 and off in P15 rat brain tissue. Ulip/hUlip/CRMP-4/DRP-3 is highly expressed in fetal but not adult neural tissues by in situ analysis and evaluation of total RNA, yet it is not detected at comparable levels in fetal or adult muscle tissue (3,4). However, when poly(A)+ Northern blots of normal tissues are examined, Ulip/hUlip/DRP-3/CRMP-4 is detected in newborn rat muscle and in the adult human heart and skeletal muscles (2,11).
The hUlip gene encodes a 62-kDa protein with no signal sequence, transmembrane domain, or classical protein interaction domains. However, hUlip contains a number of consensus phosphorylation sites that may serve as substrates for signaling molecules such as Cdk, protein kinase C, and proline-directed kinases (4). Increases in phosphorylation and dephosphorylation of hUlip and Ulip are detected upon induction of neuritogenesis by retinoids in neuroblastoma cells or by nerve growth factor in PC12 cells (2). Collapsin-induced neurite retraction is blocked in cells injected with antibodies to CRMP-62 (9), and experiments indicate that CRMP-62 is involved in G protein-mediated transduction of the collapsin signal (9) as well as other non-G protein-dependent effects on neurite extension and growth cone morphology. Functional studies in other tissues have not been reported.
* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact. The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™
To evaluate the developmental regulatory mechanism of hUlip, we have isolated the hUlip genomic sequence and studied its structure, chromosomal location, 5′-flanking sequences, and expression. We demonstrate that two promoters exist in the 5′ region of the hUlip gene and that an evolutionarily conserved MyoD/myogenin site is necessary for optimal expression of hUlip in vitro.

EXPERIMENTAL PROCEDURES
Isolation and Characterization of hUlip Genomic Clones-Two million phage clones from a human leukocyte genomic library in EMBL3 were screened using a ³²P-labeled 1.8-kb XhoI-EspI fragment of the hUlip cDNA (4), which encompasses the entire coding region. Five positive clones were identified, and phage DNA was isolated with a Lambda DNA isolation kit (Qiagen). For the characterization of selected genomic clones, phage DNA was analyzed by restriction endonuclease digestion with EcoRI or BamHI and subjected to Southern blot analysis to identify genomic DNA fragments that hybridized with a ³²P-labeled hUlip cDNA fragment. These fragments were subcloned into pBluescript SK(−) (Stratagene) and analyzed by restriction endonuclease digestion and sequencing. Synthetic oligonucleotide sequencing primers used to define exon/intron boundaries are listed in Table I.
Identification of a Genomic Fragment Containing hUlip Exon 2-To identify the putative hUlip exon 2 in the human genome, a PCR-based genome walking strategy was used (12). Sets of sense and antisense primers specific for the exon 2 sequence were paired with adaptor primers to amplify human genomic fragments, with adaptor-ligated genomic DNA as template (Genomewalker Kit, CLONTECH). The amplified products were subcloned into the pCR2.1 vector (Invitrogen), and their sequences were verified.
Sequence Analysis of hUlip 5′-Flanking Region-A 2.5-kb BamHI fragment of phage clone 4 that contained 2.3 kb 5′ to hUlip exon 1 was subcloned into pBluescript SK(−), and the entire nucleotide sequence was determined in both orientations (GenBank™ accession number AF246692). A computer search for putative cis-elements was performed with TESS (Transcription Elements Search Software; University of Pennsylvania) (13). Additionally, a search for putative transcriptional start sites was executed using a neural network algorithm (14,15).
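As an illustration of what a cis-element search does at its simplest, the sketch below scans a sequence for the E-box consensus CANNTG. It is not the TESS algorithm and does not reproduce its scoring; the function name `find_eboxes` and the test fragment are hypothetical, and only the CCAGCTGGC site embedded in the fragment is taken from the text. Because the reverse complement of CANNTG is again CANNTG, scanning one strand is sufficient for this motif.

```python
import re

# E-box consensus CANNTG; the lookahead allows overlapping sites to be reported.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")

def find_eboxes(seq):
    """Return (0-based offset, site) for every CANNTG match in seq."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in EBOX.finditer(seq)]

# Hypothetical fragment built around the CCAGCTGGC site quoted in the text.
fragment = "ttgaCCAGCTGGCatcc"
print(find_eboxes(fragment))
```

A dedicated tool would additionally weight matches against position matrices for individual factors; the regular expression only flags candidate sites.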
Fluorescence in Situ Hybridization-Phage clone hUlip-1 DNA was labeled with biotin-14-dATP using the BioNick™ kit (Life Technologies, Inc.). The labeled DNA was purified over a Bio-Spin 6 column (Bio-Rad) and coprecipitated with 10 μg of human Cot-1™ DNA (Life Technologies). Fluorescence in situ hybridization followed the method of Pinkel (16). Biotin-labeled phage DNA (1.5 μg) in a 10-ml hybridization mix (50% formamide, 2× SSC (pH 6.3), 10% dextran sulfate) was hybridized overnight to normal human peripheral blood lymphocyte metaphase chromosomes. The slides were then washed three times in 50% formamide, 2× SSC at 45°C for 20 min each. The hybridization signal was detected by two layers of fluorescein isothiocyanate-conjugated avidin (Vector) and amplified with one layer of anti-avidin antibody (Vector). Slides were counterstained with 1 μg/ml 4′,6-diamidine-2′-phenylindole dihydrochloride (Roche Molecular Biochemicals) in an antifade solution. A Zeiss Axiophot microscope equipped with a cooled charge-coupled device camera (Photometrics) was used for image acquisition. Image analysis was carried out with IP Laboratories Spectrum software (Signal Analytics). Fractional length measurements were made using the Fract Length 1 Probe Macro in the NIH Image software (Life Sciences Division, Los Alamos National Laboratory).
Cell Culture-C2C12 myoblasts were passaged subconfluently in DMEM (Life Technologies, Inc.) supplemented with 20% fetal bovine serum. For myogenic differentiation, parallel myoblast cultures were allowed to reach confluency, at which time they were induced to differentiate by feeding with DMEM supplemented with 2% horse serum and ITS (insulin, transferrin, and selenium; Life Technologies, Inc.). The neuroblastoma cell line SMS-KCNR was passaged in RPMI 1640 supplemented with 10% fetal bovine serum. Differentiation was induced by adjusting the medium to 5 μM all-trans-retinoic acid, and cells were harvested after 4 days of this treatment.
Transient Transfection Analysis-The promoter region of the hUlip gene consisting of 2.3-kb 5′-flanking sequences and the first exon containing the AUG translational initiator was cloned into the promoterless luciferase reporter vector pGL2-basic (Promega). Initially, a 2.5-kb BamHI fragment from the genomic subclone 4B2.5 was subcloned into the BglII site of pGL2-basic, creating −2300UlipLuc. Subsequently, a series of 5′ nested deletions was derived from −2300UlipLuc using available restriction sites. The construct −2300UlipLuc was cut with XhoI in the pGL2 multicloning site to create a fixed 5′ boundary followed by digestion with one of the following restriction enzymes to produce progressively larger deletions (numbers in parentheses are relative to the putative exon 1 transcriptional start site): SacI (−1180), BstXI (−940), NotI (−540), NruI (−290), PvuII (−150), SmaI (−20), or PstI (+120). Each of the double-digested reactions was treated with T4 DNA polymerase to remove overhangs, gel-purified, and recircularized with ligase. Insertion mutations were made in a putative E-box sequence at position −150 by cutting −2300UlipLuc with PvuII and using the terminal transferase activity of Taq polymerase to add bases before recircularizing with ligase (17). Two mutants were manufactured by this technique, converting the wild type E-box CAGCTG to CAGGCCTG (E-GCUlipLuc) and CAGTACTG (E-TAUlipLuc).
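The effect of the two insertion mutations on the E-box can be checked with a short sketch. It assumes that the net result of each mutagenesis is a two-base insertion at the blunt PvuII cut (CAG^CTG), which is what the mutant sequences quoted above imply; the helper names `insert_at_pvuii` and `has_ebox` are hypothetical, and the flanking bases come from the CCAGCTGGC site reported for the promoter.

```python
PVUII_SITE = "CAGCTG"  # PvuII cuts blunt between CAG and CTG

def insert_at_pvuii(seq, extra):
    """Insert `extra` at the blunt PvuII cut (CAG^CTG) of the first site in seq."""
    cut = seq.index(PVUII_SITE) + 3
    return seq[:cut] + extra + seq[cut:]

def has_ebox(seq):
    """True if any 6-base window matches the E-box consensus CANNTG."""
    return any(seq[i:i + 2] == "CA" and seq[i + 4:i + 6] == "TG"
               for i in range(len(seq) - 5))

wild_type = "CCAGCTGGC"  # MyoD/myogenin site with flanking bases
print("wild type", wild_type, has_ebox(wild_type))
for name, extra in (("E-GCUlipLuc", "GC"), ("E-TAUlipLuc", "TA")):
    mutant = insert_at_pvuii(wild_type, extra)
    print(name, mutant, has_ebox(mutant))
```

Under these assumptions both insertions move the CA and TG half-sites two bases further apart, so neither mutant retains a CANNTG match.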
The above constructs and pCMVβgal were used to co-transfect C2C12 myoblasts and the neuroblastoma cell line NGP in parallel. DNA was introduced into the cells by the calcium phosphate coprecipitation method (18). One day before transfection, C2C12 myoblasts and NGP neuroblastoma cells were plated at 0.5 × 10⁵ and 0.5 × 10⁶ cells/well, respectively, on six-well plates using DMEM supplemented with 10% fetal bovine serum for C2C12 and RPMI 1640 supplemented with 10% fetal bovine serum for NGP cells. Four hours before transfection, the cells were fed with 2.5 ml of DMEM supplemented with 10% fetal bovine serum. Thirty-six micrograms of DNA, including 3 μg of CMVβgal, was resuspended in 750 μl of 250 mM CaCl2. This solution was added dropwise, with vortexing, to 750 μl of 2× HEPES-buffered saline (280 mM NaCl, 50 mM N-2-hydroxyethylpiperazine-N′-2-ethanesulfonic acid, 1.5 mM Na2HPO4·7H2O, pH 7.1) and kept for 30 min at room temperature. The suspension was then added at 240 μl/well to six-well plates. After 24 h, the medium was removed, and the plate was washed with 0.25 mM EDTA/phosphate-buffered saline and fed with DMEM (C2C12) or RPMI 1640 (NGP) supplemented with 10% fetal bovine serum. Cell extracts were prepared 24 h later using 200 μl of reporter lysis buffer (Promega) per well. For determination of luciferase activities, 40 μl of cell lysate was aliquoted into a Microlite 2 96-well plate (Dynatech Laboratory Inc., Chantilly, VA), and the values of luciferase activities were determined using 100 μl of luciferase injection buffer (8.8 mM Tricine, 33 mM dithiothreitol, 5 mM ATP, 0.1 mM EDTA, 4 mM MgSO4, 0.6 mM luciferin, 0.3 mM coenzyme A, pH 7.8) in a luminometer (Wallac). For the β-galactosidase assay, 10 μl of supernatant was mixed with 100 μl of β-galactosidase assay buffer (0.1 M NaHPO4, pH 8.0, 1 mM MgCl2) containing 1 μg/ml AMPDG buffer (Tropix, Bedford, MA) and incubated for 45 min at room temperature in darkness. Activities were then determined by injecting β-galactosidase injection buffer (0.2 M NaOH, 10% emerald enhancer (Tropix, Bedford, MA)) and quantitation in a luminometer.
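The amount of DNA delivered per well follows from the volumes of the calcium phosphate precipitate. The sketch below works through that arithmetic, assuming the volumes are in microliters and the DNA amounts in micrograms (36 μg total, 3 μg of the β-galactosidase control plasmid, 750 + 750 μl of precipitate, 240 μl added per well); the function name `dna_per_well` is hypothetical.

```python
def dna_per_well(total_dna_ug, cacl2_ul, hbs_ul, aliquot_ul):
    """Micrograms of DNA delivered per well from a pooled precipitate."""
    total_volume_ul = cacl2_ul + hbs_ul
    return total_dna_ug * aliquot_ul / total_volume_ul

total = dna_per_well(36, 750, 750, 240)           # all plasmid DNA
control_plasmid = dna_per_well(3, 750, 750, 240)  # beta-galactosidase control share
wells = (750 + 750) / 240                         # wells served by one precipitate

print(f"{total:.2f} ug DNA per well")
print(f"{control_plasmid:.2f} ug control plasmid per well")
print(f"{wells:.2f} wells per precipitate")
```

With these figures one precipitate covers a six-well plate with a small excess, which is consistent with the plate format described.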

RESULTS
Isolation of hUlip Genomic DNA Sequences and Determination of Exon Structure-Previously, we identified a 1.71-kb open reading frame in a 5.5-kb hUlip cDNA, which encoded p62 hUlip (4). To determine the genomic structure of hUlip, we screened an EMBL3 leukocyte genomic library and found three overlapping human genomic DNA fragments (clones 1, 3, and 7) spanning 29 kb of DNA. These fragments were cloned, characterized, and found to correspond to 1.62 kb of hUlip cDNA including the 3′-untranslated region (Fig. 1A). The hUlip phage clone 4, whose insert size was 10.5 kb, contained a portion of the hUlip 5′-untranslated region, a region encoding the initiation methionine, an additional 12 amino acids, and a splice donor site. Eighty-nine base pairs of hUlip coding region sequence were not identified in the phage clones. Several library screens using the cDNA fragment corresponding to the putative exon 2 as well as the 5′ intronic region of exon 3 and the 3′ intronic region of exon 1 failed to isolate a genomic fragment containing exon 2 in the EMBL3 library. Therefore, we employed the PCR-based walking strategy to identify exon 2 sequences in the human genome, and two successive rounds of PCRs with the sets of sense or antisense primers each amplified 1.5-kb products. Sequence analysis reveals that these products overlap one another and contain the missing 89 bp of hUlip cDNA sequences (data not shown). Putative RNA splice acceptor and donor sites detected in these PCR products are compatible with the exon 1 donor and exon 3 acceptor splice sites. From these results, we conclude that these PCR products contain hUlip exon 2. Table I lists the oligonucleotide sequences used to define the intron/exon boundaries of the hUlip gene, and the sequences at the intron/exon junctions are given in Table II.
The 1.71-kb open reading frame of the hUlip cDNA is distributed over 14 exons, and sequence analysis of the exon/intron boundaries reveals that all splice donor and acceptor sites follow the GT-AG splice junction consensus (19) (underlined in Table II).
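The GT-AG rule can be stated compactly: each intron begins with GT at the donor site and ends with AG at the acceptor site. A minimal sketch of that check follows; the function name `follows_gt_ag` and the two intron sequences are hypothetical and are not taken from Table II.

```python
def follows_gt_ag(intron):
    """True if an intron sequence starts with GT and ends with AG."""
    intron = intron.upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")

# Hypothetical intron sequences (not taken from Table II).
introns = {
    "intron_a": "gtaagtccttgacctttctgtag",
    "intron_b": "gcaagtccttgacctttctgtag",  # non-canonical GC donor
}
for name, seq in introns.items():
    print(name, follows_gt_ag(seq))
```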
While this manuscript was in preparation, a search of the unfinished high throughput genomic sequence data base (htgs) identified a human chromosome 5 "working draft" sequence, consisting of five ordered contigs (accession number AC011373) that encompassed our hUlip genomic sequence. Alignment with this sequence definitively places the exon 2 PCR products between flanking genomic clone 3 and 4 sequences. Using these sequences in conjunction with sequences obtained from our overlapping genomic clones allowed the calculation of intron and exon sizes shown in Table II. Additionally, a search of EST data bases revealed an anaplastic oligodendroglioma sequence (accession number A1570709) that overlaps with exons 1-3 and extends 5′ for 127 bp. Alignment of this expressed sequence tag with genomic sequences defines an upstream exon (designated exon AO) (Fig. 1A) and suggests that a cognate promoter also exists to transcribe this sequence in anaplastic oligodendroglioma.
Since the hUlip gene product shares a significant number of conserved amino acid residues with the protein sequence of the C. elegans unc-33, we compared the genomic structure of hUlip and unc-33 (Fig. 1B). In the worm, there are three unc-33 mRNA transcripts of 3.8 kb (exons I-X), 3.3 kb (exons V-X) and 2.8 kb (exons VII-X). The 2.8-kb unc-33 mRNA contains the 3′-half of exon VII as well as exons VIII-X, and this transcript corresponds to sequences conserved in all mammalian unc-33 family genes including the hUlip cDNA (1). The 5′ regions of the unc-33 transcripts are distinct, while the 3′ regions are identical, with all predicted transcripts being translated in the same open reading frame. The unc-33 exon VII and VIII and exon IX and X splice sites are identical to the hUlip exon 2 and 3 and exon 12 and 13 splice sites, respectively. hUlip exons 1-3 include the protein subdomain region A (Fig. 1A and Table III), which contains a region highly conserved with unc-33. This region is functionally important for the collapsin-induced Ca2+ influx response (9). An overall comparison of the exonic structure of hUlip with unc-33 reveals four highly homologous regions (Table III). Amino acids encoded by hUlip exons 2 and 3 (region A), exons 6 and 7 (region B), and exons 10 and 11 (region C and the D region within C) showed significant sequence similarities of 41-56% with unc-33 (Fig. 1A). These regions correspond to domains that were previously identified (Fig. 1A) (2,4). An additional conserved domain encoded by hUlip exon 14 (region E, Fig. 1A and Table III) shows a 41.4% homology with unc-33.
Chromosomal Location-hUlip phage clone 1 was used to determine the chromosomal location of hUlip. Specific hybridization was detected on the long arm of chromosome 5 in all 20 metaphases examined. Fractional length measurements (n = 30) yielded an average FLpter value of 0.823 ± 0.021, corresponding to a regional assignment of 5q32 (Fig. 2). In support of our finding, DRP-3 has been mapped to chromosome 5 by radiation hybrid mapping by the Whitehead Institute Center for Genome Research.
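FLpter expresses the position of a hybridization signal as a fraction of the chromosome length, measured from the terminus of the short arm (pter). The sketch below shows how an average and spread could be derived from individual measurements. The measurement values are hypothetical, the function name `flpter` is illustrative, and the use of a sample standard deviation for the spread is an assumption, since the text does not state which statistic follows the ± sign.

```python
from statistics import mean, stdev

def flpter(signal_from_pter, chromosome_length):
    """Fractional length: distance of the signal from pter / total length."""
    if chromosome_length <= 0:
        raise ValueError("chromosome length must be positive")
    if not 0 <= signal_from_pter <= chromosome_length:
        raise ValueError("signal must lie on the chromosome")
    return signal_from_pter / chromosome_length

# Hypothetical measurements in arbitrary pixel units: (signal position, length).
measurements = [(81.0, 100.0), (83.5, 100.0), (124.2, 150.0), (99.6, 120.0)]
values = [flpter(s, length) for s, length in measurements]
print(f"FLpter = {mean(values):.3f} +/- {stdev(values):.3f} (n = {len(values)})")
```

Because the value is a ratio, it is independent of the degree of chromosome condensation in any one metaphase, which is why measurements from different spreads can be pooled.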
Promoter Region of hUlip-A 2.5-kb BamHI fragment containing exons AO and 1 was sequenced to study the putative promoter region of the hUlip gene (Fig. 3, A and B). The outlined letters represent exonic sequence from hUlip cDNA. The shaded boxes designate extended exon 1 and exon AO sequences derived from an anaplastic oligodendroglioma EST clone (accession number A1570709). 5′-Rapid amplification of cDNA ends was used to map transcriptional start sites in RNA from RA-treated neuroblastoma cells. Several putative start sites were found, although all of these were 3′ (within exon 1) to the 5′-end of a hUlip cDNA from a fetal brain cDNA library (data not shown). The 5′-end of the Ulip cDNA corresponds to −136 (relative to A(+1)TG) in the hUlip gene (Fig. 3B). Additionally, this position corresponds to a putative start site predicted by a neural network algorithm (14,15). A search for a −30 TATA box correlating with the 5′ terminus of known exon AO sequence did not reveal a TATA sequence. Although a consensus TATA sequence is found at −1166 bp, it seems unlikely that it is required for expression of exon AO sequences, since the size of the hUlip mRNA has been estimated to be between 5.5 and 5.8 kb by Northern analysis, which corresponds to the sum total of known exon sequences (4). The 2.8-kb unc-33 mRNA, which is most homologous to the hUlip mRNA, does not contain a TATA box in its 5′ region (1). These findings along with the known characteristics of TATA-less promoters suggest that a primary start site exists at −136 with many heterogeneous starts occurring downstream in neuroblastoma cells. In anaplastic oligodendroglioma, presumably an upstream promoter is used, giving rise to a larger primary transcript beginning with exon AO, which is spliced to exon 1.
Given that Ulip is expressed in nervous system tissue during development and is selectively expressed in adult muscle, the presence of a consensus MyoD/myogenin binding site at −288 bp in the putative hUlip promoter may be important. To determine if the MyoD/myogenin site is conserved evolutionarily, the corresponding 5′ region of the murine Ulip gene was sequenced (GenBank™ accession number AF246693). The sequence MGD 1 (5′-CACTCGCACCCTCCTCCACT-3′) located −238 bp upstream of the putative initiation codon was used as a sequencing primer. This primer was used in a dideoxy sequencing reaction with the mUlip genomic subclone MG 5 to obtain a sequence encompassing the putative transcriptional start site (of the previously published mUlip cDNA; accession number X87817) and a consensus E-box located further 5′. Alignment of the sequence with the corresponding human Ulip genomic sequence by the GCG Lite version of the program Gap (Genetics Computer Group) demonstrates a region of near identity between +1 and −120, including identical E-box sequences represented by the boxed nucleotides in Fig. 3C. The MyoD/myogenin site as well as the c-Myc, Oct-1, and Ets-1 binding sites are conserved both in sequence and spatial relationship to one another.
hUlip Expression and Promoter Analysis-Some members of the unc-33 family are expressed in adult heart and skeletal muscle (2,11). Consistent with this finding, we detect differential expression of hUlip mRNA in poly(A)+ Northern blots from heart and skeletal muscle and low levels of expression in other adult tissues including brain, kidney, lung, liver, placenta, and pancreas (Fig. 4A). As a prerequisite to promoter analysis by transient transfection, we evaluated the expression of hUlip message in total RNA isolated from C2C12 myoblasts and KCNR neuroblastoma cells (Fig. 4B). Although it appears that hUlip may be more highly expressed in myoblasts compared with the neuroblastoma cells, 5-fold more myoblast total RNA was evaluated. PhosphorImager analysis, normalized to total RNA, indicated that in total RNA the neuroblastoma cells express almost twice the level of Ulip mRNA that is detected in myoblasts (Fig. 4B). Additionally, myogenic differentiation of myoblasts into myotubes produced a down-regulation of hUlip expression to almost undetectable levels by Northern analysis. Consistent with our previous results, retinoic acid-induced neurogenic differentiation of KCNR neuroblastoma cells showed a 2.5-fold increase in hUlip expression after 4 days of treatment (4).
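The comparison between the two cell lines depends on dividing each band intensity by the amount of RNA loaded. The sketch below illustrates how a stronger raw band in the myoblast lane can correspond to lower expression per microgram of RNA. All counts and loads are hypothetical; only the 5-fold difference in loaded RNA and the roughly 2-fold outcome are taken from the text, and `signal_per_ug` is an illustrative name.

```python
def signal_per_ug(band_signal, rna_loaded_ug):
    """Band intensity divided by the amount of total RNA loaded."""
    if rna_loaded_ug <= 0:
        raise ValueError("RNA load must be positive")
    return band_signal / rna_loaded_ug

# Hypothetical PhosphorImager counts and loads; only the 5-fold load
# difference between the lanes is taken from the text.
myoblast = signal_per_ug(band_signal=2500.0, rna_loaded_ug=25.0)
neuroblastoma = signal_per_ug(band_signal=950.0, rna_loaded_ug=5.0)

print(f"myoblast: {myoblast:.0f} counts/ug")
print(f"neuroblastoma: {neuroblastoma:.0f} counts/ug")
print(f"neuroblastoma / myoblast = {neuroblastoma / myoblast:.2f}")
```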
Since the hUlip gene is active in both neuroblastoma and myogenic cells, the neuroblastoma cell line NGP and C2C12 myoblasts were chosen as model systems for hUlip promoter analysis. The 2.5-kb hUlip promoter was cloned upstream of the luciferase reporter gene. Nested deletions and point mutations were generated from the promoter using the construct −2300UlipLuc (Fig. 5). Luciferase activity was normalized to β-galactosidase activity, and the −2300UlipLuc activity was set to 100% in both cell lines such that relative comparison can be made between cell lines. Computer data base analysis shows a perfect MyoD/myogenin binding site (CCAGCTGGC) in close proximity (−150) to the transcriptional start site. This motif could have implications for muscle and neuroblastoma expression, since factors other than myogenic transcription factors such as NeuroD and Myc and other members of the bHLH family are known to bind to E-boxes (20). Insertional mutagenesis of the hUlip E-box created clones E-GCUlipLuc and E-TAUlipLuc that disrupted the core binding site. Activity was decreased approximately 2-fold from wild type levels in both C2C12 myoblasts and NGP cells with both mutations. This demonstrates that in both neuroblastoma and myoblasts, the E-box element is necessary for full activity of the promoter.
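The normalization described above can be written out as two steps: divide each luciferase reading by the β-galactosidase reading from the same lysate, then scale so that the full-length construct equals 100%. The following sketch uses hypothetical light-unit values for a single cell line; the function name `relative_activity` and the numbers are illustrative and are not data from Fig. 5.

```python
def relative_activity(readings, reference):
    """Normalize luciferase to beta-gal, then scale so the reference is 100%."""
    ratios = {name: luc / bgal for name, (luc, bgal) in readings.items()}
    ref = ratios[reference]
    return {name: 100.0 * ratio / ref for name, ratio in ratios.items()}

# Hypothetical (luciferase, beta-galactosidase) light units for one cell line.
readings = {
    "-2300UlipLuc": (8000.0, 400.0),
    "E-GCUlipLuc": (4500.0, 450.0),
    "E-TAUlipLuc": (3800.0, 380.0),
}
for name, pct in relative_activity(readings, "-2300UlipLuc").items():
    print(f"{name}: {pct:.0f}%")
```

Setting the reference to 100% separately in each cell line removes differences in transfection efficiency and absolute promoter strength, so only the relative effect of each deletion or mutation is compared.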
Deletion of far 5′-flanking sequences to −940 in the hUlip promoter showed little or no difference from the wild type. Deletion from −940 through −540 showed a 2-fold reduction of activity in both myoblast and neuroblastoma cells. This deletion removes the region that includes the possible promoter/start site for exon AO and may be responsible for the decrease in activity if it is contributing to the transcriptional output in neuroblastoma and myoblasts. This interval also contains several sites for ubiquitous factors such as AP-2, Sp1, NF1, Ap1, and C/EBP. Tissue-specific binding sites for MBF1 and MEF-2 also appear in this region. Between −540 and −290, there is an approximately 10-20% decrease in both myoblast and neuroblastoma cells, corresponding to the possible deletion of more putative Sp1 binding sites. The deletion to −150 disrupts the MyoD/myogenin binding site and shows no significant decrease in neuroblastoma expression but gives a 30% decrease in myoblast expression. This indicates that elements contributing to myoblast-specific expression reside in the region between −290 and −150 such as the MyoD/myogenin site. Further deletion to −20 gives 20-25% loss of activity in both neuroblastoma and myoblast cells to levels no more than 2-fold above background, consistent with the activity expected of the basal promoter.
DISCUSSION
Previously, we identified hUlip as a cDNA up-regulated during retinoic acid-induced neuritogenesis of neuroblastoma (4). hUlip is a member of the TUC (TOAD/Ulip/CRMP) family of proteins, which are involved in intracellular signaling regulating neuronal outgrowth and axonal guidance (2). In this report, we characterize the structure, chromosomal location, and expression and delineate the promoter sequences sufficient for expression of the hUlip gene in culture.
To determine the genomic structure of the hUlip gene, we screened a human leukocyte library. A genomic sequence, encompassing all sequences included in the hUlip cDNA, was generated by genomic walking. This sequence matches a high throughput genomic sequence that appeared in the data base while this manuscript was in preparation. A search of the EST data bases revealed an overlapping sequence that extends the 5′ terminus of the hUlip cDNA. Comparison of the anaplastic oligodendroglioma EST and hUlip cDNA to genomic sequences demonstrated that the hUlip gene is encoded by at least 15 exons spanning 61.7 kb. The genomic structures of hUlip and human CRMP-1, which also contains 14 coding region exons, exhibit identical intron-exon boundaries within the coding region (21). Furthermore, the lengths of the introns are very similar between hUlip and mouse Ulip, which was also cloned and partially sequenced (data not shown). A disproportionately large intron exists between exons 1 and 2 in both hUlip and mouse Ulip at 26.8 and 18 kb, respectively. A disproportionately large intron also exists between CRMP-1 exons 1 and 2, indicating that there are genomic structural similarities among these genes.
Figure legend (beginning missing): ... 2), we analyzed the hUlip genomic 5′ sequences for known transcription factor binding sites, which are in boldface type. The sequence coordinates are given to the left, with −1 designating the first nucleotide immediately 5′ of the initiator codon. The mouse sequence is a composite of the 5′ portion of a Ulip cDNA reported by Byk et al. (2) (GenBank™ accession number X87817, presented in lowercase letters with a filled triangle located below its 5′ terminus) and the mUlip gene (shown in uppercase letters) (GenBank™ accession number AF246693). hUlip cDNA sequences are indicated by outline lettering, and the amino acid sequence is shown below the respective codons. Boxed sequences designate regions of the Ulip promoter that are highly conserved.
Comparison of the hUlip locus with C. elegans unc-33 reveals that hUlip exon 1-14 coding sequences are homologous to worm exons VII-X. The 5′ worm exons I-VI show no relationship to any hUlip sequences. Additionally, the worm exon VII/VIII and IX/X junctions are identical to hUlip 2/3 and 12/13, respectively. hUlip exons 1-3, including the conserved exon 2/3 junction, have high similarity with unc-33. Aside from structural similarities, this region is functionally important, since antibody to a peptide encoded by exons 1 and 2 blocks the collapsin-induced Ca2+ influx response (9). The evolutionary conservation of this region in unc-33 and hUlip genes supports the functional importance of this region in the unc-33/hUlip protein, which may be to participate in collapsin signaling. The 5′-half of unc-33 exon IX is highly conserved with hUlip. This region does not appear to function in the collapsin response, because antibodies to peptides made from this region do not block collapsin-mediated Ca2+ flux, and this region was also not included in the original cDNA functionally identified that mediated collapsin-induced Ca2+ flux (9). This region is most highly conserved among unc-33, hUlip, human dihydropyrimidinase, and the D-hydantoinase enzyme from Bacillus stearothermophilus and Pseudomonas putida. Amino acids encoded by exons 6 and 7, exons 10 and 11, and exon 14 also have a significant homology of 41-56% with unc-33. A high degree of sequence conservation often implies functional importance; thus, it is possible that these regions may encode another functional activity of these proteins independent from their activity in the mediation of collapsin signals. Hamajima et al. (11) have raised the possibility that these proteins may also function as amidohydrolases, although no such activity has been reported.
In the C. elegans unc-33 mutants, mutational insertions occur within exon VII just 5′ to the start of the 2.8-kb unc-33 mRNA that is homologous with hUlip and other members of the unc-33-like genes identified in mammals. These mutations result in deletion of the highly charged N-terminal region of the products encoded by the 3.8- and 3.3-kb unc-33 mRNAs, although the protein-encoding sequences in the 2.8-kb unc-33 mRNA are intact. It is not known whether alterations in one or all of the proteins encoded by these unc-33 mRNAs are needed for the uncoordinated phenotype. While it is possible that the loss of the products of the 3.8- and 3.3-kb unc-33 mRNAs is key, it is also possible that the altered regulation of the 2.8-kb unc-33 mRNA caused by mutations in its putative promoter region contributes to alterations in the expression of this gene that result in the uncoordinated phenotype of the worm.
The finding of expression of hUlip in adult muscle suggests that unc-33-related molecules may also play a role in muscle development or function. If this is true, then the original unc-33 defect in C. elegans may not be restricted solely to premature termination of axonal processes, which raises the possibility that alterations in muscle development or function may contribute to the uncoordinated movements characteristic of these mutant worms. Interestingly, our expression analysis of hUlip mRNA during in vitro myogenesis shows hUlip message decreasing with differentiation into myotubes. This expression pattern may be explained by the lack of innervation or other environmental factors in vitro that are otherwise present during normal development in vivo. Retinoic acid-induced differentiation of KCNR cells induces the slow accumulation of hUlip mRNA with the first detectable increase at 48 h and increasing to 2.5-fold after 96 h of treatment (4). The slow accumulation of hUlip mRNA and our previous studies showing no significant increase in hUlip transcription at 2 days (4,22) indicate that RA does not significantly regulate hUlip transcription. In preliminary studies, transient transfection analysis of the hUlip 2.5-kb promoter also indicates that RA does not significantly alter hUlip transcription in vitro (data not shown).
RNase protection assays and 5′-rapid amplification of cDNA ends with neuroblastoma RNA suggest that the start site of transcription is heterogeneous and occurs near the 5′ terminus of exon 1 in neuroblastoma cell line KCNR (data not shown). The predicted molecular weight of a message beginning in this vicinity is consistent with the size observed by Northern analysis. Sequence inspection shows that exon 1 5′-flanking sequences are GC-rich and do not contain an associated TATA box. These data support the hypothesis that hUlip transcription in neuroblastoma can initiate from an intragenic TATA-less promoter coincident with exon 1.
Given the high expression of hUlip in nervous system tissue during development, the presence of a consensus MyoD/myogenin binding site at −288 bp in the putative promoter region of hUlip suggested that it also participated in muscle-specific expression. However, both Drosophila and vertebrate neurogenesis use a core differentiation network of transcription factors that strongly resembles the myogenic network of transcription factors. It is possible that this site also interacts with MASH, neurogenin, or NeuroD (20). The importance of the MyoD site for hUlip transcription is also supported by our finding that the 5′ region of the murine Ulip gene has a region of near identity between −120 and +1, including identical E-box sequences. In addition to the MyoD/myogenin site, c-Myc, Oct-1, and Ets-1 binding sites are conserved between the murine and human Ulip genes, both in sequence and in spatial relationship to one another. Such conservation of sequence and spatial relationship may suggest functional importance. Insertional mutagenesis of this MyoD/myogenin site decreased promoter activity approximately 2-fold from wild type levels in both C2C12 myoblasts and NGP cells. This suggests that factors present in both C2C12 and NGP cells interact with this site and contribute to the activity of the hUlip promoter. As mentioned above, these factors could be specific for each cell type, or a ubiquitous factor such as USF might be responsible for the observed activity.
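As an illustration of how candidate E-box sites such as the one discussed above can be located, the following is a minimal sketch that scans a sequence for the general E-box consensus CANNTG and reports positions relative to the transcription start site. The function name, the coordinate convention (no position 0), and the toy sequence are assumptions made for this example; the toy sequence is not the hUlip promoter, and this is not the method used in this study.

```python
import re

# General E-box consensus CANNTG (N = any base). The lookahead lets
# overlapping matches be reported as well.
EBOX = re.compile(r"(?=(CA[ACGT]{2}TG))")


def find_eboxes(seq, tss_index):
    """Return (position, motif) pairs for every E-box in seq.

    tss_index is the 0-based index of the +1 nucleotide (hypothetical
    convention for this sketch). Upstream positions are negative and
    there is no position 0, matching the usual promoter numbering.
    """
    seq = seq.upper()
    hits = []
    for match in EBOX.finditer(seq):
        offset = match.start() - tss_index
        position = offset + 1 if offset >= 0 else offset
        hits.append((position, match.group(1)))
    return hits


# Toy sequence for illustration only (NOT the hUlip promoter).
toy = "GGGCCCAGCTGGGCGCCACGTGCCGA"

# Treat the base immediately after the toy sequence as +1, so every
# reported position is upstream (negative).
for position, motif in find_eboxes(toy, len(toy)):
    print(position, motif)
```

A match to CANNTG indicates only a candidate site; which bHLH factor, if any, binds it in a given cell type has to be established experimentally, as with the mutagenesis described above.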
Deletion of the interval between −940 and −540 showed a 2-fold reduction of activity in both cell lines. This interval contains sites for several transcription factors, including Sp1, NF1, AP-1, MBF1, MEF-2, and C/EBP. The MBF1 and MEF-2 transcription factors are involved in myogenic differentiation. Interestingly, MEF-2C, a MEF-2 isoform, is highly expressed in the brain and may play a role in hUlip transcription (23). MEF-2 family members can synergize with bHLH factors such as MyoD or MASH. The neuronal and muscular pattern of MEF-2 expression complements the dual pattern of bHLH family members, reinforcing the similarity between neuronal and muscular transcriptional mechanisms. An AP-2 site also appears in this sequence; AP-2 is a marker for cells of neural crest lineages, of which neuroblastoma is a derivative, and is also regulated by retinoic acid (24). Future studies will be aimed at delineating the involvement of these factors in hUlip expression. An alternative explanation for the decrease in transcription between −940 and −540 is that the putative promoter/exon AO is deleted in this construct, eliminating its contribution to hUlip transcription. Curiously, deletion to −150 disrupts the MyoD/myogenin binding site but shows no significant decrease in neuroblastoma expression compared with −290, yet it gives a 30% decrease in myoblast expression. This indicates that elements contributing only to myoblast-specific expression, such as the MyoD/myogenin site, reside in the region between −290 and −150. Alternatively, the E-box could be necessary only to mediate the effects of upstream elements and have no transactivation activity on its own in neuroblastoma cells. Deletion to −20 results in a decrease to 2-fold over background and is consistent with the deletion of basal promoter elements such as binding sites for Sp1 and C/EBP.
In summary, we have presented the isolation and structural analysis of the human Ulip gene and initial characterization of its mechanism of expression. Although Ulip functional and expression studies have previously demonstrated its association with the neuronal phenotype (2), we have shown that in adult tissues it is most highly expressed in cardiac and skeletal muscle tissues. Additionally, we have ascertained that the transcriptional mechanism of hUlip expression relies on the presence of an E-box motif in both myoblast and neuroblastoma cells. Studies are in progress to further analyze hUlip function by gene knockout and the pattern of its expression by β-galactosidase knock-in. This will allow a better understanding of the role of hUlip in neuronal and muscular development and function.