Peripheral nervous system-specific genes identified by subtractive cDNA cloning.

An improved method for constructing and screening subtractive cDNA libraries has been used to identify 46 mRNA transcripts that are expressed selectively in neonatal rat dorsal root ganglia (DRG) as judged by Northern blots and in situ hybridization. Sequence analysis demonstrates that both known (e.g. peripherin, calcitonin gene-related peptide, myelin P0) and novel identifiable transcripts (e.g. C-protein-like, synuclein-like, villin-like) are present in the library. Half of the transcripts (23) are undetectable in liver, kidney, heart, spleen, cerebellum, and cerebral cortex. Of the DRG-specific transcripts, 12 contain putative open reading frames that show no identity with known proteins. The construction of such a subtractive library thus provides us with both known and novel markers, and identifies new predicted DRG-specific proteins. In addition, the DRG-specific clones provide probes to define the regulatory elements that specify peripheral nervous-system-specific gene expression.

Tissue-specific gene expression specifies cell fate and function. The transcriptional regulatory events that define the complement of genes expressed by particular cell types are therefore of considerable interest. One approach to understanding the development of specialized cells is to search for vertebrate homologues of key regulators defined by genetic studies of simple organisms such as Caenorhabditis elegans or Drosophila melanogaster (see, e.g., Ghysen and Dambly-Chaudiere (1993) and Jan and Jan (1994)). Thus, homologues of the Drosophila achaete-scute genes and the C. elegans unc-86 gene that specify peripheral neuron fate are known to be expressed in the peripheral nervous system of vertebrates (Anderson, 1993; Ninkina et al., 1993), where they may have a related function, for example in the specification of sympathetic neuron development (Guillemot et al., 1993). A number of molecules, invariably transcription factors, that specify particular cell fates have been identified in this way (He and Rosenfeld, 1991). A complementary approach to identifying key regulators relies upon the isolation of mRNA transcripts that are selectively expressed in defined cell types, followed by the identification of the regulatory elements that specify the pattern of their expression. Thus, two homeodomain proteins (Lmx-1 and Cdx-3) that bind to the insulin enhancer were isolated by screening expressed proteins that bound to regulatory elements of the insulin gene (German et al., 1992). Similarly, a transcription factor selectively expressed in olfactory neurons (Olf-1) has been identified, because its cognate binding sequence was found upstream of a number of olfactory neuron-specific genes (Wang and Reed, 1993).
We are studying the mechanisms that specify the development of a peripheral sensory neuron phenotype and, in particular, the development and function of small unmyelinated sensory neurons. Subsets of these cells respond to tissue damage, and thus play a critical role in the initiation of some forms of pain perception (Scott, 1993). As a first step toward the definition of critical regulatory steps in sensory neuron development and function, we have developed a modified method for subtractive library construction and a novel difference screening method to isolate a battery of transcripts exclusively or selectively expressed in rat dorsal root ganglia. By subtracting liver, kidney, cortex, and cerebellum mRNA transcripts from cDNA generated from rat DRG, we have found 23 transcripts that are expressed in DRG, but not heart, liver, spleen, cortex, or cerebellum. We present a classification of these transcripts into known, identifiable, and novel transcripts by DNA sequencing and demonstrate the selectivity of their expression by Northern blotting and in situ hybridization.

MATERIALS AND METHODS
cDNA Synthesis from DRG-derived Poly(A)+ RNA-DRG from all spinal levels of neonatal Sprague-Dawley male and female rats were frozen in liquid nitrogen. RNA was extracted using guanidine isothiocyanate and phenol/chloroform extraction (Sambrook et al., 1991). Poly(A)+ RNA was then isolated by oligo(dT)-cellulose chromatography (Aviv and Leder, 1972). cDNA was generated using 0.5 μg of DRG poly(A)+ mRNA, oligo(dT)/NotI primer adapters, and SuperScript reverse transcriptase (Life Technologies, Inc.). One half of the cDNA was labeled by including 2 MBq of [32P]dCTP (Amersham Corp.) in the reverse transcriptase reaction, and subsequently purified on Qiagen-5 tips.
Enrichment of DRG-specific cDNA-Poly(A)+ RNA from other tissues (10 μg) was incubated with 10 μg of photoactivatable biotin (Clontech) in a total volume of 15 μl and irradiated at 4°C for 30 min with a 250-watt sunlamp. The photobiotin was removed by extraction with butanol and the cDNA co-precipitated with the biotinylated RNA without carrier RNA (Sive and St. John, 1988).
Hybridization was carried out at 58°C for 40 h in 20% formamide, 50 mM MOPS, pH 7.6, 0.2% SDS, 0.5 M NaCl, 5 mM EDTA. The total volume of the reaction was 5 μl, and the reaction was carried out under mineral oil, after an initial denaturation step of 2 min at 95°C. 100 μl of 50 mM MOPS, pH 7.4, 0.5 M NaCl, 5 mM EDTA containing 20 units of streptavidin (Life Technologies, Inc.) was then added to the reaction mixture at room temperature, and the aqueous phase retained after two phenol/chloroform extraction steps. After sequential hybridization with biotinylated mRNA from liver and kidney, followed by cortex and cerebellum, an 80-fold concentration of DRG-specific transcripts was achieved.
* We thank the Wellcome Trust for generous support. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequences reported in this paper have been submitted to the GenBank/EMBL Data Bank with accession numbers X90475 and X86789.
One-third of the 1-2 ng of residual cDNA was then G-tailed with terminal deoxynucleotidyl transferase at 37°C for 30 min. PCR was used to amplify the cDNA using an oligo(dT)/NotI primer adapter and oligo(dC) primers starting with the sequence AATTCCGA(C)10. Amplification was carried out using two cycles of 95°C for 1 min, 45°C for 1 min, 72°C for 5 min, followed by two cycles of 95°C for 1 min, 58°C for 1 min, 72°C for 5 min. The resulting product was then separated on a 2% Nu-Sieve agarose gel, and material running at a size of greater than 0.5 kb was eluted and further amplified with 6 cycles of 95°C for 1 min, 58°C for 1 min, and 72°C for 5 min. This material was further separated on a 2% Nu-Sieve agarose gel, and the material running from 6 kb on the gel was eluted and further amplified using the same PCR conditions for 27 cycles. The amplified DNA derived from this high molecular weight region was then further fractionated on a 2% Nu-Sieve gel, and cDNA from 0.5 to 1.5 kb, and from 1.5 to 5 kb, were pooled.
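The cycle numbers in the amplification protocol above can be tallied in a short sketch. It assumes ideal doubling at every cycle, which real reactions do not achieve, so the result is a theoretical upper bound and not a measured yield. The names `PCR_STAGES`, `total_cycles`, and `max_fold_amplification` are illustrative and are not part of the protocol.

```python
# Cycle counts copied from the amplification steps described in the text.
# Stage labels are descriptive only.
PCR_STAGES = [
    ("initial cycles, 45 C annealing", 2),
    ("initial cycles, 58 C annealing", 2),
    ("after first size selection (>0.5 kb)", 6),
    ("after second size selection", 27),
]


def total_cycles(stages):
    """Sum the cycle counts over all amplification stages."""
    return sum(cycles for _, cycles in stages)


def max_fold_amplification(cycles, efficiency=1.0):
    """Theoretical fold increase after `cycles` rounds of PCR.

    `efficiency` is the assumed fraction of templates copied per cycle
    (1.0 means perfect doubling); it is a modelling assumption.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    n = total_cycles(PCR_STAGES)
    print(f"total cycles: {n}")
    print(f"upper bound on fold amplification: {max_fold_amplification(n):.3g}")
```

Because material is discarded at each gel-purification step, the true overall amplification of any given transcript is expected to be well below this bound.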
Library Construction-λ-Zap II (Stratagene) was cut with NotI and EcoRI and dephosphorylated. DRG cDNA was digested with Klenow enzyme in the presence of dGTP and dCTP to construct an EcoRI site from the oligo(dC) primer (see above) at the 3′ end of the cDNA, and cut with NotI for directional cloning. The cDNA was ligated into λ-Zap II for 16 h at 12°C, and packaged with Gigapack gold (Stratagene). 0.1% of the packaged DNA was plated onto Escherichia coli BB4 cells to calculate the number of independent clones generated.
Differential Screening-The library was then plated at a low density (10³ clones/12 × 12-cm² dish) and screened using three sets of 32P-labeled cDNA probes and multiple filter lifts. The filters blotted with bacteria were briefly dried and replica-plated on fresh agar plates to increase the quantity of phage and the subsequent hybridization signals. The probes were derived from (a) cortex and cerebellum poly(A)+ RNA, (b) DRG poly(A)+ RNA, and (c) subtracted cDNA from DRG. The two mRNA probes were labeled with [32P]dCTP using a reaction mixture containing 2-5 μg of RNA, 50 μl of 5× RT buffer, 25 μl of 0.1 M dithiothreitol, 12.5 μl of 10 mM dATP, dGTP, and dCTP, 30 pM oligo(dT), 75 μl of [32P]dCTP (30 MBq; Amersham), 25 μl of 100 μM dCTP, 2 μl of RNasin (2 units/μl), and 2 μl of SuperScript reverse transcriptase (Life Technologies, Inc.) in a final volume of 250 μl. The reaction was incubated at 39°C for 60 min, and the RNA subsequently destroyed by adding 250 μl of water, 55 μl of 1 M NaOH, and incubating at 70°C for 20 min. The reaction mixture was neutralized with acidified Tris base (pH 2.0) and precipitated with carrier tRNA (Boehringer Mannheim) with isopropanol.
The subtracted and amplified double-stranded DRG cDNA was random-prime labeled with [32P]dATP (Life Technologies, Inc. Multiprime kit). Replica filters were then prehybridized for 4 h at 68°C in hybridization buffer (see below). Hybridization was carried out for 20 h at 68°C in 4× SSC, 5× Denhardt's solution containing 150 μg/ml salmon sperm DNA, 20 μg/ml poly(U), 20 μg/ml poly(C), 0.5% SDS, 5 mM EDTA. The filters were briefly washed in 2× SSC at room temperature, then twice with 2× SSC with 0.5% SDS at 68°C for 15 min, followed by a 20-min wash in 0.5% SDS, 0.2× SSC at 68°C. The filters were autoradiographed for up to 1 week on Kodak X-Omat film.
Clones that hybridized with DRG probes but not cortex and cerebellum probes were picked and excised into Bluescript according to the maker's instructions. The plasmids were plaque-purified and finally cross-hybridized with each other. Unique clones were further analyzed by Northern blotting and sequencing.
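The picking rule can be written out as a small decision function. This is a sketch under stated assumptions: hybridization on each replica filter is reduced to a yes/no call, a clone counts as DRG-positive if it lights up with either DRG-derived probe, and any signal with the cortex and cerebellum probe rejects it. The class name `FilterSignals`, the function `classify`, and the clone labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FilterSignals:
    """Yes/no hybridization calls for one plaque on the three replica filters."""
    clone_id: str            # hypothetical label, not a clone name from the paper
    cortex_cerebellum: bool  # probe (a): cortex and cerebellum poly(A)+ RNA
    drg: bool                # probe (b): DRG poly(A)+ RNA
    subtracted_drg: bool     # probe (c): subtracted, amplified DRG cDNA


def classify(signals: FilterSignals) -> str:
    """Apply the assumed picking rule to one plaque."""
    if signals.cortex_cerebellum:
        return "rejected"  # also expressed in brain tissue used for probe (a)
    if signals.drg and signals.subtracted_drg:
        return "picked: both DRG probes"
    if signals.drg:
        return "picked: DRG poly(A)+ probe only"
    if signals.subtracted_drg:
        return "picked: subtracted probe only"
    return "rejected"  # no DRG signal at all


if __name__ == "__main__":
    plaques = [
        FilterSignals("p1", cortex_cerebellum=False, drg=True, subtracted_drg=True),
        FilterSignals("p2", cortex_cerebellum=True, drg=True, subtracted_drg=False),
        FilterSignals("p3", cortex_cerebellum=False, drg=True, subtracted_drg=False),
    ]
    for plaque in plaques:
        print(plaque.clone_id, "->", classify(plaque))
```

In practice the calls were made by eye from autoradiographs, so signal intensity and background would also enter the decision; the boolean reduction here only illustrates the logic.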
Northern Blots and in Situ Hybridization-Total RNA (20-30 μg) from neonatal rat tissues, extracted with guanidine isothiocyanate/phenol chloroform, was separated on 1.2% agarose-formaldehyde gels and capillary blotted onto Hybond-N (Amersham) (Ninkina et al., 1993). The amounts of RNA on the blot were roughly equivalent, as judged by ethidium bromide staining of ribosomal RNA or by hybridization with the ubiquitously expressed L27 ribosomal protein transcripts (Le Beau et al., 1991). Each Northern blot contained DRG, cortex, cerebellum, liver, kidney, spleen, and heart RNA. Probes (50 ng) were labeled with [32P]dATP (Amersham) by random priming. Filters were prehybridized in 50% formamide, 5× SSC containing 0.5% SDS, 5× Denhardt's solution, 100 μg/ml boiled salmon sperm DNA, 10 μg/ml poly(U), and 10 μg/ml poly(C) at 45°C for 6 h. After 36 h of hybridization in the same conditions, the filters were briefly washed in 2× SSC at room temperature, then twice with 2× SSC with 0.5% SDS at 68°C for 15 min, followed by a 20-min wash in 0.5% SDS, 0.2× SSC at 68°C. The filters were autoradiographed for up to 1 week on Kodak X-Omat film. In situ hybridization on fresh frozen 10-μm sections was carried out using digoxigenin-labeled cRNA transcripts of about 300 base pairs (Schaeren-Wiemers and Gerfin-Moser, 1993).

RESULTS
Generation of Clones-The approach used to generate DRG-specific clones is schematized in Fig. 1. Neonatal rat DRG RNA was used for the generation of oligo(dT)-primed cDNA, which was sequentially hybridized with photobiotin-labeled RNA from liver and kidney, and subsequently from cortex and cerebellum. The DNA/RNA hybrids were removed by phenol extraction when complexed with streptavidin, and the small amount of residual cDNA was then used to generate double-stranded DNA (Sive and St. John, 1988). After addition of poly(G), the polymerase chain reaction was used to amplify the remaining cDNA. In order to obtain a representative library, a crucial step in the protocol was to size-fractionate the cDNA on agarose gels and use a multistep PCR amplification for the equivalent amplification of different size transcripts (see "Discussion").
The products of this multi-step PCR protocol were then directionally cloned into λ-Zap II. The resulting library was screened with three radiolabeled probes: reverse-transcribed DRG mRNA, a reverse-transcribed mixture of cortex and cerebellum mRNA, and the final cDNA derived from the PCR amplification. Only those clones that hybridized with the DRG poly(A)+ probes, and not with the cortex and cerebellum probes, were selected. This method of differential screening allowed us to identify 91 clones that were likely to be expressed selectively in DRG.
The 91 putative DRG-specific clones were rescued into Bluescript, plaque-purified, and their insert sizes analyzed. Inserts ranged in size from 0.3 to 4 kb (Table I). In order to eliminate redundant clones, the inserts were random-prime labeled with [32P]dCTP and cross-hybridized with each other. This analysis showed that there were, in fact, 46 distinct clones present in the library.
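Collapsing cross-hybridizing isolates into distinct clones amounts to finding connected groups in a table of pairwise hybridization results. The sketch below shows one way to do that with a union-find structure. It assumes that cross-hybridization can be treated as transitive (if A hybridizes with B and B with C, all three are taken to derive from one transcript), which is a simplification; the function name and the clone labels are hypothetical and the example data are invented.

```python
def group_cross_hybridizing(clones, positive_pairs):
    """Group clones that are linked by positive cross-hybridization.

    clones: iterable of clone labels.
    positive_pairs: iterable of (clone_a, clone_b) pairs that cross-hybridized.
    Returns a list of groups, each a list of clone labels.
    """
    clones = list(clones)
    parent = {clone: clone for clone in clones}

    def find(clone):
        # Follow parent links to the group representative, compressing the path.
        while parent[clone] != clone:
            parent[clone] = parent[parent[clone]]
            clone = parent[clone]
        return clone

    for a, b in positive_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two groups

    groups = {}
    for clone in clones:
        groups.setdefault(find(clone), []).append(clone)
    return list(groups.values())


if __name__ == "__main__":
    # Invented example: five isolates, two positive cross-hybridizations.
    isolates = ["c1", "c2", "c3", "c4", "c5"]
    positives = [("c1", "c3"), ("c3", "c5")]
    distinct = group_cross_hybridizing(isolates, positives)
    print(len(distinct), "distinct clones:", distinct)
```

Applied to the full set of isolates, the number of groups returned would correspond to the number of distinct clones, provided that unrelated transcripts sharing repetitive sequence do not cross-hybridize under the conditions used.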
Distribution of Expression-In order to determine how many of the clones present in the library were DRG-specific, we used the inserts to probe Northern blots of a variety of rat tissues at high stringency. This approach enabled us to determine not only whether the clones were DRG-specific, but also the sizes of the transcripts that hybridized to probes derived from the difference library. This in turn allowed us to determine how many of the clones that we had isolated were full length, and whether transcripts of different sizes, corresponding to related or differentially spliced messages, were present. Northern blots were carried out with equivalent amounts of RNA derived from DRG, heart, spleen, liver, kidney, cortex, and cerebellum, by normalizing them to the level of 18 S and 28 S rRNA or the ubiquitous L27 ribosomal protein transcript. A representative series of Northern blots is shown in Fig. 2. The information obtained from a complete Northern analysis is shown in Table I, where the size of the isolated probes is compared with the size of the transcripts detected on Northern blots. Some of the 46 distinct clones that were isolated hybridized to multiple transcripts (e.g. clone i5).
In all, 46 different transcripts were analyzed by Northern blotting. Of these, 23 (50%) were apparently DRG-specific, while essentially all the clones are substantially enriched in DRG RNA compared to RNA from other tested tissues. A number of clones (13) of size up to 3.5 kb were either full-length or almost complete, suggesting that the entire coding region of the transcripts would be present. The Northern analysis gives no information on the cell types expressing the various DRG-specific transcripts. In some cases, therefore, cRNA transcripts labeled with digoxigenin-UTP were synthesized and used to analyze the distribution of expression of mRNA transcripts in sections of neonatal rats. Fig. 3 shows a number of examples (e.g. clone H7, a villin-like molecule) of transcripts that are clearly expressed in large diameter cell bodies corresponding to DRG sensory neurons, but not other cell types. Interestingly, some transcripts (e.g. clone G7) were expressed in subsets of sensory neurons (Fig. 3). The analysis of distribution of expression suggested that many clones were indeed selectively expressed in DRG and were suitable for further analysis.
Sequence Analysis-We sequenced the 5′ ends of all 23 clones that showed an apparent DRG-specific pattern of expression using primers from the Bluescript vector, in an attempt to identify protein coding regions of the transcripts. The results obtained are summarized in Table II. A number of transcripts could be aligned with known sequences in the data base. Thus we found that both neuronal (peripherin, CGRP) and some peripheral glial markers (myelin P0) were present in the library. This further supported the conclusion that the differential screening method was effective, and the subtractive library construction had resulted in DRG-specific clones. Interestingly, we found a number of clones that are normally associated with skeletal muscle, such as fast-troponin I, troponin T, myosin fetal heavy chain, skeletal muscle myosin light chains 1/3 and 2, parvalbumin, carbonic anhydrase III, skeletal muscle creatine kinase, and a C-protein-like molecule (Kojima et al., 1990). Although some trunk smooth muscle, like sensory neurons and glia, derives from migrating neural crest cells, the presence of high levels of these transcripts in neonatal DRG compared to other tissues was initially surprising. However, earlier immunocytochemical studies have shown that the protein products of transcripts such as troponin are indeed present at high levels in DRG neuronal cell bodies, and carbonic anhydrase is associated with a proprioceptive phenotype (Roisen et al., 1983; Mayeux et al., 1993). The clones analyzed fall into three classes: known proteins, proteins that are homologous to known proteins, and unknown transcripts. Clone E0 is a rat homologue of the skeletal muscle myosin-associated immunoglobulin superfamily molecule C-protein (Kojima et al., 1990; Einheber and Fischman, 1990). Clone E8 encodes a protein with a similar but not identical sequence to carbonic anhydrase III (Mayeux et al., 1993).
Clone H7 is a close homologue of the actin-bundling protein villin, found in secretory epithelial cells, and associated with cell shape determination (Otto, 1994). Clone D3 is a homologue of the major central nervous system phosphoprotein synuclein, which is associated with presynaptic membranes, but whose function remains unknown (Maroteaux et al., 1988, 1991; Nakajo et al., 1993; Jakes et al., 1994). Interestingly, clone D3 is also homologous to synuclein-like proteins associated with amyloid deposits found in Alzheimer's disease. We further analyzed the distribution of expression of this novel transcript in a variety of other tissues (Fig. 4 and Table II). This analysis confirmed an association between this transcript and sensory neurons, as the transcript is highly enriched in DRG compared with other tissues.
The synuclein family thus comprises both central and peripheral nervous system proteins. The complete sequence of the synuclein-like clone and a partial sequence of the rat C-protein-like clone are shown in Fig. 5.

DISCUSSION
Subtractive library production provides a powerful approach to defining tissue-specific transcripts that are likely to play a role in the specialized function of the cell types targeted. In addition, an analysis of the genomic organization of tissue-specific genes should define the motifs that play a critical role in the regulation of gene expression of the tissue of interest.
Here we describe the subtractive cloning and differential screening of a number of new DRG-specific transcripts, with the eventual aim of defining the critical steps in the specification of sensory neuron cell fate.
The construction of a representative library from relatively small amounts of tissue is problematic, and sensitive screening protocols are necessary to isolate interesting clones. The development of the photobiotin/streptavidin subtractive hybridization technique (Sive and St. John, 1988) is the basis for the construction of the subtractive library described here. Recently this approach has been successfully extended to identify transcripts specifically expressed even in very small numbers of cells (Wieland et al., 1990; Klar et al., 1994; Korneev et al., 1994). The protocols described here address the two problems of representative library production and sensitive screening by means of a number of technical innovations. First, a novel multi-step PCR amplification procedure was used to generate a library from picogram quantities of subtracted DRG mRNA. The major problem with the use of PCR to generate libraries is the over-representation of small transcripts, because of their more efficient amplification. One approach to overcoming this difficulty is to amplify larger transcripts in the pool of cDNA for more PCR cycles (Belyavsky et al., 1989). The approach described here was arrived at empirically and uses three separate PCR amplification steps. First, all the subtracted cDNA was amplified, and the fraction of a size >0.5 kb isolated and further amplified, in order to discard small transcripts containing unsubtracted material. The resulting products were separated on an agarose gel, and material running with an apparent size of 6-9 kb was isolated and further amplified. By isolating and amplifying this apparently high molecular weight fraction, which also contains relatively small amounts of cDNA of a size 0-6 kb, cDNA encoding a truly representative spectrum of transcripts was generated. This material was used for the production of a representative DRG-specific library and encoded a number of large DRG-specific partial transcripts of size up to 4 kb.
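The over-representation of small transcripts can be illustrated with a simple exponential model. The sketch below is not a description of the reaction kinetics in this study: it assumes that each template class amplifies with a fixed per-cycle efficiency and that both classes start at equal abundance. The efficiency values and the function name `short_to_long_ratio` are hypothetical.

```python
def short_to_long_ratio(cycles, eff_short, eff_long):
    """Ratio of short to long product after `cycles` of PCR.

    Each class is assumed to grow by a factor (1 + efficiency) per cycle,
    starting from equal amounts. Efficiencies lie between 0 and 1.
    """
    for eff in (eff_short, eff_long):
        if not 0.0 <= eff <= 1.0:
            raise ValueError("efficiencies must be between 0 and 1")
    return ((1.0 + eff_short) / (1.0 + eff_long)) ** cycles


if __name__ == "__main__":
    # Hypothetical efficiencies: 0.9 for short templates, 0.8 for long ones.
    for n in (4, 10, 37):
        ratio = short_to_long_ratio(n, eff_short=0.9, eff_long=0.8)
        print(f"{n:2d} cycles: short/long = {ratio:.2f}")
```

Under these assumed values a modest per-cycle difference compounds to a several-fold skew over a few dozen cycles, which is consistent with the rationale for enriching the larger size fraction before the final round of amplification.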
Another important factor in the production of a representative library is the choice of the "driver" RNA that is used to remove irrelevant transcripts. Both the spectrum of transcripts and their abundance in the subtracted library are influenced by the driver RNA, because the relative proportion of various DRG-specific genes may be altered by different levels of subtraction. Thus if skeletal muscle RNA had been included in place of one of the components of the driver RNA used here (cerebellum, cortex, liver, and kidney), then a different repertoire of DRG-specific genes would have been identified.
[Northern blot figure legend, partial] panel E, clone E0; panel F, clone A4; panel G, clone E2; panel H, clone H7 (villin-like). Northern blots were stripped and reprobed with L27 probes (LeBeau et al., 1990) to confirm that equivalent amounts of mRNA were present in each lane.
FIG. 4. Tissue-specific expression of clone D3 (DRG-enriched synuclein-like clone). Northern blot carried out using total rat RNA extracted from skeletal muscle (1), hypothalamus (2), hippocampus (3), DRG (4), kidney (5), spleen (6), cerebellum (7), ileum (8), spinal cord (9), lung (10), liver (11), cortex (12), and heart (13).
Screening of the library also involved technical modifications. The subtracted amplified DRG cDNA, as well as cDNA derived from DRG RNA, was used to screen the library. Because the subtracted DRG cDNA contained a relatively high proportion of DRG-specific clones, it provided sensitive probes for the identification of cognate clones in the library. In fact, all those clones that were detected both by DRG cDNA and the subtracted DRG DNA were effectively DRG-specific. Of the 19 clones that were detected only with the DRG poly(A)+-derived probes, all were relatively DRG-specific. Essentially all the clones isolated showed a DRG-enriched pattern of expression.
The differential screening and the confirmatory screening using both DRG-derived probes and subtracted DRG probes thus avoided the isolation of any artifactual clones.
Several of the clones that show relatively selective expression in DRG are also expressed in skeletal muscle. It is known that troponin and myosin are expressed at high levels in DRG neurons (e.g. Roisen et al. (1983)). Although such genes are clearly not useful as probes for the definition of regulatory sequences that specify expression solely in DRG, their regulatory elements must include positive signals for DRG expression. Increasing evidence suggests that developing skeletal muscle shows overlapping programs of gene expression with neuronal cell types. Thus myogenic cells have been identified in the developing neural tube (Tajbakhsh et al., 1994), and the myogenic gene Mef-2 has been identified in neural crest cells (Edmondson et al., 1994). Given the physical interactions among sensory neurons, motor neurons, and skeletal muscle, these overlapping repertoires of gene expression may underlie aspects of cell-cell recognition.
The analysis of tissue-specific cDNA libraries as described here provides complementary information to that obtained by the random sequencing of genomic DNA or expressed messages. The transcripts that are expressed are likely to play a role in the function of the tissue used, because of their selective expression. For example, the peripheral nervous system-specific synuclein, the central nervous system homologues of which are known to be abundantly expressed at presynaptic terminals (Maroteaux et al., 1988), could be associated with aspects of neurotransmitter release that are characteristic of sensory neurons, such as neuropeptide release.
In summary, we have successfully developed differential cloning and screening approaches to identify DRG-specific transcripts. The isolation of two markers associated with proprioceptive neurons, carbonic anhydrase (Mayeux et al., 1993) and parvalbumin (Zhang et al., 1993), both of which are ablated in trk-c null mutant animals that have lost large numbers of proprioceptive neurons (Ernfors et al., 1993), shows that the difference library encodes markers for specific subpopulations of sensory neurons, as well as for non-neuronal cell types present in DRG (e.g. myelin P0, a Schwann cell marker) (Lemke and Axel, 1985). The precise cell-type distribution of the majority of DRG-neuron-specific clones remains to be established. It will be particularly interesting to define those transcripts that arise in the early stages of neural crest cell commitment and that are expressed in sensory, but not sympathetic, neurons.