Spermatid-specific overexpression of the TATA-binding protein gene involves recruitment of two potent testis-specific promoters.

The gene encoding the TATA-binding protein, TBP, is highly overexpressed during the haploid stages of spermatogenesis in rodents. RNase protection analyses for mRNAs containing the previously identified first, second, and eighth exons suggested that most TBP mRNAs in testis did not initiate at the first exon used in somatic cells (here designated exon 1C). Using a sensitive ligation-mediated cDNA amplification method, 5′ end variants of TBP mRNA were identified, and the corresponding cDNAs were cloned from liver and testis. In liver, a single promoter/first exon is used to generate a steady-state level of roughly five molecules of TBP mRNA per diploid cell equivalent. In testis, we detect modest up-regulation of the somatic promoter and recruitment of at least five other promoters. Three of the alternative promoter/first exons, including 1C and two of the testis-specific promoter/first exons, 1D and 1E, contribute roughly equivalent amounts of mRNA which, in sum, account for greater than 90% of all TBP mRNA in testis. As a result, round spermatids contain an estimated 1000 TBP mRNA molecules per haploid cell. Testis TBP mRNA also exhibits several low abundance 5′ end splicing variants; however, all detected TBP mRNA leader sequences splice onto the common exon 2 and are expected to initiate translation at the same site within exon 2. The precise locations of the three major initiation exons are mapped on the gene. The identification of the strong testis-specific promoter/first exons will be important for understanding spermatid-specific tbp gene regulation.

The TATA-binding protein, TBP,¹ is required for transcription initiation by all three nuclear RNA polymerases (1). Somatic tissues contain from 1 to 5 TBP mRNA molecules per cell; these modest differences correspond qualitatively with relative differences in overall transcriptional activity in nuclei from different tissues (2, 3). In contrast, adult rodent testes exhibit roughly 350 molecules of TBP mRNA per cell (2). Testis-specific TBP mRNA overexpression arises primarily as a result of transcriptional up-regulation (2).
During puberty, the first stem cells to undergo spermatogenesis do so almost synchronously, such that different stages of puberty can be correlated to the first appearance of a specific spermatogenic stage in the testis (4). The onset of TBP overexpression during puberty correlates to the appearance of the first haploid cells. Moreover, as shown by immunocytochemistry, TBP protein overexpression occurs in the early haploid cells (2). Because these data suggest that TBP overexpression is restricted to only a subset of the cells in testis, we estimate that TBP mRNA levels in these cells are more than 1000-fold greater than those in somatic cell types. This high level of cell type-specific overexpression makes us suspect that TBP plays a role in spermatogenesis that differs from its ubiquitous functions in somatic cells.
With the goal of understanding the mechanisms regulating spermatid-specific tbp gene expression, we have performed a molecular analysis of the 5′ end of the tbp gene. Sumita et al. (5) previously reported the intron/exon structure of the mouse tbp gene from exons 2 to 8 (formerly designated exons 1-7). In brain, spleen, and liver, use of a single upstream promoter/first exon has been reported (6). Quantitative analyses presented here show that testis-specific tbp expression involves both modest up-regulation of the somatic tbp promoter and recruitment of at least 2 other major and 3 minor promoters. Using a sensitive method for amplifying and cloning cDNA 5′ ends, we characterize 10 TBP mRNA 5′ end variants in testis that differ in promoter usage and/or splicing. The relative contribution of each variant to total testis and liver TBP mRNA levels, the gene structure, and the entire 5′ end genomic sequence are reported.

EXPERIMENTAL PROCEDURES
Animals and RNA Preparation-Genomic sequencing was performed on mouse 129 ES cell genomic clones. All other samples were prepared from fresh tissues harvested from MORO mice or Sprague-Dawley rats as indicated. Total RNA was prepared by sedimentation through CsCl cushions as described previously (3); nuclear RNA was prepared from citric acid-prepared nuclei sedimented through two sucrose cushions followed by purification of RNA through CsCl cushions as described previously (2).
RNase Protection Assays-RNase protection assays were performed as described previously using the indicated amounts of sample RNA supplemented with yeast RNA to 100 µg (2). Control lanes contained probe and 100 µg of yeast RNA; probe control lanes contained roughly a 1:100 dilution of nondigested probe and 10 µg of yeast RNA carrier. Pseudo-pre-mRNA was transcribed from a mouse genomic clone extending from the SacI site 700 bp upstream of exon 1C to the SacI site in exon 3 using T3 RNA polymerase; yield was determined spectrophotometrically (7). The pseudo-pre-mRNA migrated as a single 5.3-kilobase band on agarose gels, thus validating molar comparisons for probes hybridizing to different regions of the pseudo-pre-mRNA. Pseudo-mRNA containing exons 2-8 was transcribed from a mouse cDNA clone containing sequences beginning near the 5′ end of exon 2 (37 bp upstream of the BglII site) and extending through the poly(A) tail. Pseudo-mRNAs containing exons 1D-2 or exons 1E-2 were transcribed from clones 62 and 36, respectively, of the rat 5′ end amplification products shown in Fig. 4.
Probes used in this paper were as follows. The BglII/EcoRI exon 2/3 probe was described previously (2). The BglII/StuI and Sau3AI/StuI probes were both transcribed from a subcloned genomic fragment extending from the SacI site at −700 to the StuI site 35 bp downstream of exon 1C linearized with BglII or Sau3AI, respectively. The Sa/X probe was transcribed from a fragment spanning from the SacI site at −700 to the XhoI site 343 bp downstream of exon 1C linearized with Sau3AI. The X/A probe was transcribed from a fragment spanning from the SacI site at −700 to the ApaI site 1280 bp downstream of exon 1C linearized with XhoI. The A/R1 probe was transcribed from a subclone of the 699-bp ApaI/EcoRI fragment. The R1/Bg probe was transcribed from a subclone of the 1043-bp EcoRI/BglII fragment. The exon 1C/2 probes were transcribed from clones 56 (mouse) or 1 (rat) of the 5′ cDNA amplification products; the exon 1D/2 probes were transcribed from clones 95 (mouse) or 62 (rat); and the exon 1E probes were transcribed from clones 132 (mouse) or 36 (rat).
Amplifying and Cloning TBP cDNA 5′ End Variants-General methods are described elsewhere.² cDNA was synthesized from 1 µg of total or poly(A)+-selected mRNA using a primer that hybridizes to mRNA sequences in tbp exon 6 (5′-CCATGTTCTGGATCTTGAAG-3′). A universal adapter was ligated to the 3′ end of the cDNA. TBP cDNAs were amplified by using an anti-adapter primer and, first, a primer specific for sequences in exon 4 (5′-GAAGTGCAATGGTCTTTAGGTCAAGTTTACAG-3′), followed by a 25-base primer (5′-CTCAGATCTTCCCTAGAGCATCCTC-3′), which spanned the BglII site (AGATCT) in exon 2. Amplified cDNAs were cloned into BamHI/NotI-cut pBluescript KS+ plasmids (Stratagene) using the BglII site in exon 2 and a NotI site in the adapter primer. Clones were sequenced using the BglII-containing exon 2 primer. Plasmids with no inserts (2% of the clones) would not sequence with this primer. False-primed cDNA products did not exhibit the remaining 68 bases of exon 2 but rather gave unrelated sequences (about 20% of the clones).
Reverse Transcriptase-mediated Polymerase Chain Reaction-cDNAs were synthesized using 1 µg of total or poly(A)+ mRNA. cDNAs (2% of each reaction) were amplified using the BglII exon 2 primer described above and primers specific to exon 1C (5′-GGCGGGTATCTGCTGGCGGTTTGGCT-3′) or exon 1D (5′-GGACCATCGCCTCGGCGGAGGTCCT-3′) using Taq polymerase and standard amplification conditions. Amplified products were separated by agarose gel electrophoresis and were visualized by ethidium bromide staining.
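The primer descriptions above can be checked mechanically. The following is a minimal sketch, not part of the original methods: it takes the five primer sequences as printed (with line-break hyphens removed), reports their lengths, and locates the BglII recognition sequence (AGATCT) in each. The dictionary keys and function names are hypothetical labels chosen for illustration.

```python
from typing import Dict, List

# Primer sequences copied from the methods text (line-break hyphens removed).
# The keys are hypothetical labels, not names used in the paper.
PRIMERS: Dict[str, str] = {
    "exon6_cDNA_synthesis": "CCATGTTCTGGATCTTGAAG",
    "exon4_first_round": "GAAGTGCAATGGTCTTTAGGTCAAGTTTACAG",
    "exon2_BglII": "CTCAGATCTTCCCTAGAGCATCCTC",
    "exon1C_RT_PCR": "GGCGGGTATCTGCTGGCGGTTTGGCT",
    "exon1D_RT_PCR": "GGACCATCGCCTCGGCGGAGGTCCT",
}

BGLII_SITE = "AGATCT"  # BglII recognition sequence

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_site(seq: str, site: str) -> List[int]:
    """Return 0-based start positions of every occurrence of site in seq."""
    width = len(site)
    return [i for i in range(len(seq) - width + 1) if seq[i:i + width] == site]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        sites = find_site(seq, BGLII_SITE)
        print(f"{name}: {len(seq)} nt, GC={gc_fraction(seq):.2f}, BglII at {sites}")
```

In this sketch the exon 2 primer comes out at 25 bases with a single BglII site, in agreement with the text. BglII-cut ends carry the same GATC overhang as BamHI-cut ends, which is why a BglII-digested product can be ligated into a BamHI-cut vector.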

RESULTS
Initiation at the Somatic Cell Promoter/First Exon-We previously showed that a transcriptional mechanism is primarily responsible for testis-specific overexpression of TBP mRNA (2). In that study, probes complementary to sequences within either the 5′ or 3′ regions of the protein coding sequences of TBP mRNA gave quantitatively indistinguishable results in all assays. Thus, we had no a priori reason to suspect that testis TBP mRNA differed qualitatively from somatic TBP mRNA. Recently, a cDNA clone from mouse brain revealed that the gene contains an additional upstream non-protein-coding exon (designated exon 1C; 6). Exon 1C is located 2888 base pairs upstream of the first protein-coding exon (designated exon 2). We initiated an analysis of the use of this promoter/exon in testis.
RNase protection assays using a genomic probe spanning sequences from roughly 200 bases upstream of exon 1C to 35 bases downstream of exon 1C (designated BglII/StuI probe, Fig. 1) revealed two clusters of protected fragments around 80 and 100 bases in length (Fig. 1A, clusters designated 1 and 2). Higher resolution conditions revealed that the same pattern was reiterated by fragments 35 bases longer (115 and 135 bases in length, designated 1* and 2*) which were roughly 5-fold less abundant (Fig. 1B). Similar patterns of protected fragments were observed in rat testis (Fig. 1A), mouse testis, and mouse spleen (Fig. 1B). We interpreted this pattern as representing two clusters of initiation sites represented in mRNA (clusters 1 and 2) and in pre-mRNA (clusters 1* and 2*); however, because the probe was uniformly labeled and spanned both the 5′ and 3′ ends of the exon, we could not be certain that the heterogeneity did not occur at the 3′ end of the exon. Therefore, we designed a second probe (designated Sau3AI/StuI probe) which was truncated just downstream of the predicted clustered initiation sites. This probe should give a precise 5′ end of the protected fragment and thus allow us to test for heterogeneity at the 3′ end of the exon. The results using this probe confirmed a precise 3′ end of the protected fragment (Fig. 1C) and thus, by inference, confirmed the interpretation that the multiple bands observed in Fig. 1, A and B, represent heterogeneity at the 5′ end of the mRNA.
Using the Sau3AI/StuI probe (Fig. 1C) or the BglII/StuI probe (not shown), the protected fragments that were 35 bases longer than the mRNA fragments (designated 1* and 2* in Fig. 1, A and B) were found to be over-represented in nuclear RNA preparations (Fig. 1C) and under-represented in polyadenylated mRNA preparations (not shown), as compared with the mRNA-specific signals. This corroborates the identity of these species as pre-mRNAs.
A clone of the entire genomic 5′ end region was transcribed in vitro to produce a synthetic pseudo-pre-mRNA. Using a standard curve generated with this pseudo-pre-mRNA, we determined that adult rat testis had 5-10 amol of exon 1C-containing mRNA, and 1 amol of exon 1C-containing pre-mRNA, per µg of total RNA (Fig. 1C). Moreover, we found that testis had only 10-fold more exon 1C-containing TBP mRNA per mass of total RNA than spleen (Fig. 1B). These quantitative values were unexpected because previously, using a probe complementary to sequences in exon 8, we had determined that rat testis contains between 20 and 40 amol of TBP mRNA per µg of total RNA, and spleen contains much less than 1 amol/µg of total RNA (2). Therefore, our data suggested that only about 30% of the TBP mRNA in testis, albeit possibly all TBP mRNA in spleen, contains exon 1C.
Sequences that always coexist on the same mRNA molecule will show a 1:1 stoichiometry; molar quantities of exons 1C and 8 in testis exhibit a 1:3 stoichiometry. This indicated that there must be at least two species of TBP mRNA in testis (those with exon 1C and those without); however, the point of divergence could lie anywhere between exons 1C and 8. To more precisely determine which sequences differed between TBP mRNA species in testis, the molar quantities of exon 1C- and exon 2- and 3-containing TBP mRNA in testis were determined by RNase protection (Fig. 2). The results showed that TBP mRNAs containing exons 2 and 3, like those containing exon 8, were 3-fold more abundant than mRNAs containing exon 1C in testis. This suggests that all testis TBP mRNAs might contain exons 2, 3, and 8, but they are divergent at their 5′ ends. We wished to determine the 5′ sequences of the 70% of testis TBP mRNAs that lack exon 1C. Because we could detect no pre-mRNA signals containing sequences upstream of exon 1C using the BglII/StuI probe (Fig. 1, A and B), we predicted that very little transcription initiated upstream of exon 1C. Thus we focused our investigation on the region between exons 1C and 2.
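The stoichiometry argument reduces to a ratio of molar concentrations. As a minimal sketch, using the values quoted in the legend to Fig. 2 (8 amol of exon 1C and 25 amol of exon 2/3 fusions per µg of total testis RNA), the fraction of TBP mRNA that carries exon 1C can be computed as follows. The function and variable names are hypothetical and chosen only for illustration.

```python
def fraction_containing(exon_amol_per_ug: float, total_amol_per_ug: float) -> float:
    """Fraction of transcripts that carry a given exon, from molar concentrations."""
    if total_amol_per_ug <= 0:
        raise ValueError("total concentration must be positive")
    return exon_amol_per_ug / total_amol_per_ug


# Values quoted in the legend to Fig. 2 (amol per microgram of total testis RNA).
EXON_1C = 8.0
EXON_2_3 = 25.0

with_1c = fraction_containing(EXON_1C, EXON_2_3)
without_1c = 1.0 - with_1c
fold_excess = EXON_2_3 / EXON_1C

if __name__ == "__main__":
    print(f"exon 1C-containing fraction: {with_1c:.2f}")
    print(f"fraction lacking exon 1C:    {without_1c:.2f}")
    print(f"exon 2/3 excess over 1C:     {fold_excess:.1f}-fold")
```

With these inputs roughly one-third of testis TBP mRNA carries exon 1C and roughly two-thirds does not, which matches the approximate 1:3 stoichiometry and the "about 30%" and "70%" figures used in the text.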
Sequences within the First Intron Hybridize to Testis-specific mRNAs, Suggesting the Presence of Additional Testis-specific Exons-A series of four long probes completely encompassing the region between exons 1C and 2 were used to search for evidence of alternative exons (Fig. 3). Due to UTP deprivation during high specific activity transcription and to radiolysis thereafter, it is nearly impossible to produce and maintain full-length high specific activity probes of this size, so we expected to find many smaller protected fragments on the gels. Therefore experiments were designed such that true exonic mRNA-derived signals could be distinguished from probe heterogeneity products based on three criteria. First, products arising from probe heterogeneity should appear upon hybridization to either cellular pre-mRNA or synthetic pseudo-pre-mRNA, whereas signals arising from spliced TBP mRNA should not appear in pseudo-pre-mRNA controls. Second, relative to pre-mRNA signals, signals corresponding to bona fide mRNA should be under-represented in nuclear as compared with total RNA preparations (Fig. 1C). Finally, signals corresponding to mRNA should be over-represented in polyadenylated mRNA preparations as compared with signals arising from pre-mRNA. Using the Sa/X probe (Fig. 3A), which contains 83 bases of exon 1C and 342 bases of sequence downstream of the exon 1C splice site, we were able to validate the method. Exon 1C-containing mRNA appears as an 83-base band which is under-represented in nuclear RNA preparations and is not detected using pseudo-pre-mRNA; unspliced pre-mRNA appears as a 425-base band which is abundant in nuclear RNA preparations and is indistinguishable from the signal obtained with pseudo-pre-mRNA. Interestingly, this probe also revealed a protected fragment about 350 bases long in total and nuclear RNA but not in pseudo-pre-mRNA (… not shown). Based on the size and distribution of this fragment, we suspect that it represents intronic sequences that have been excised from exon 1C. This species is present at about 0.2 amol/µg total RNA, which is 1/25 and 1/5 of the concentrations of exon 1C-containing TBP mRNA and pre-mRNA, respectively.

FIG. 1. … (2). B, the indicated amounts of total mouse spleen or testis RNA were assayed as in A. The sensitivity of this exposure allows identification of a reiteration of the strong clusters of protected fragments (1 and 2) at a position corresponding to fragments 35 bases longer (designated 1* and 2*). This is the expected size for pre-mRNAs. C, a probe truncated at the Sau3AI site (Sau3AI/StuI probe) defines a precise 3′ end for exon 1C and confirms the identity of mRNA- and pre-mRNA-specific signals. Comparison of equal masses of testis total and nuclear RNA shows the relative enrichment of pre-mRNA in nuclei. Comparison to the standard curve after correcting for protected fragment size differences indicated that the total RNA sample contained roughly 8 amol of TBP mRNA and 0.6 amol of TBP pre-mRNA, per µg of total RNA. The schematic at lower right shows the relative regions of contiguous hybridization for each probe on mRNA and pre-mRNA. Due to intronic sequences in each probe, the region of hybridization (protected fragment) is 35 bases longer with pre-mRNA as compared with mRNA.

FIG. 2. Concentrations of testis TBP mRNAs containing exon 1C or containing exons 2 and 3. Quantitative RNase protections were performed as in Fig. 1 using the indicated probes as depicted at the bottom of the figure. The Sau3AI/StuI probe hybridizes to exon 1C, whereas the BglII/EcoRI probe hybridizes to exons 2 and 3. With the latter probe, hybridization to exon 2 alone in pre-mRNA gives a 15-base shorter protected fragment which is apparent below the full-length protected fragment in the autoradiogram. Comparisons to the standard curves indicated that exon 1C was present at a concentration of 8 amol/µg total testis RNA; exon 2/3 fusions were present at a concentration of 25 amol/µg total testis RNA.
The Sa/X probe also yielded a cluster of protected fragments centered around 200 bases in length that fit the criteria for mRNA. Thus, we predicted that this region likely contains an alternate exon with multiple initiation sites. Indeed, these bands were subsequently found to arise from hybridization to TBP mRNAs containing the various initiation sites for exon 1D (see below).
With the X/A (Fig. 3B) and A/R1 (Fig. 3C) probes we also detected protected fragments that fit the criteria for alternate exons. Thus, within the region between XhoI and ApaI, we found evidence for exonic regions of 152 and 50 bases in length (Fig. 3B). Between ApaI and EcoRI, we found evidence for an exonic region of roughly 55 bases in length (Fig. 3C). The region between EcoRI and exon 2 showed no evidence of containing exonic sequences (not shown).
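The three criteria used above to separate exonic signals from pre-mRNA and probe heterogeneity products amount to a small decision rule. The sketch below is one hypothetical way to encode that rule; the class, field, and label names are assumptions made for illustration, and the boolean inputs stand in for judgments that were made by inspecting autoradiograms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Band:
    """A protected fragment scored against the three criteria (hypothetical model)."""
    size_nt: int
    in_pseudo_pre_mrna: bool                    # criterion 1: seen with pseudo-pre-mRNA control
    depleted_in_nuclear: bool                   # criterion 2: under-represented in nuclear vs total RNA
    enriched_in_poly_a: Optional[bool] = None   # criterion 3: None means not assayed


def classify(band: Band) -> str:
    """Assign a band to one of three illustrative categories."""
    if band.in_pseudo_pre_mrna:
        # Anything reproduced by the unspliced synthetic control cannot be
        # specific to spliced mRNA.
        return "pre-mRNA or probe heterogeneity"
    if band.depleted_in_nuclear and band.enriched_in_poly_a is not False:
        return "candidate exonic mRNA"
    return "unassigned"


# Bands described for the Sa/X probe in the text.
EXON_1C_BAND = Band(size_nt=83, in_pseudo_pre_mrna=False, depleted_in_nuclear=True)
PRE_MRNA_BAND = Band(size_nt=425, in_pseudo_pre_mrna=True, depleted_in_nuclear=False)
BAND_350 = Band(size_nt=350, in_pseudo_pre_mrna=False, depleted_in_nuclear=False)

if __name__ == "__main__":
    for band in (EXON_1C_BAND, PRE_MRNA_BAND, BAND_350):
        print(band.size_nt, classify(band))
```

Under this rule the 83-base band scores as exonic and the 425-base band as pre-mRNA, as in the text. The 350-base fragment satisfies neither pattern and is left unassigned here; the authors interpret it separately as a probable excised intronic species.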
Cloning TBP mRNA 5′ End Variants from Liver and Testis-Our data indicated that additional promoters and alternate first exons were used in testis. Since the complexity observed by RNase protection suggested that there were multiple "missing exons," we wished to perform an exhaustive search that could recover all possible mRNA 5′ end variants. Therefore, a sensitive ligation-mediated method for amplifying and cloning the 5′ ends of cDNAs² was used to clone TBP cDNA 5′ ends using rat testis, mouse testis, and mouse liver RNA preparations. This method is analogous to a "RACE"; however, it is more efficient at recovering rare cDNA ends. Of 144 clones, 74% contained TBP cDNAs. Restriction digestion revealed that, although exons 1C and 2 contain no PvuII site, 20% of the positive clones contained an internal PvuII site (Fig. 4), strongly suggesting the presence of a novel first exon. Moreover, Southern blots indicated that only 28% of the clones contained exon 1C (Fig. 4).
The Southern blots revealed a great amount of size complexity in exon 1C-containing clones (Fig. 4). This was taken as further evidence that exon 1C initiates at multiple sites (see below). The size heterogeneity also indicated that the method had recovered a very large number of distinct clones, and thus, the resultant library should have sufficient complexity to allow recovery of clones representing even very rare mRNA species.
Of the 119 clones that were sequenced, 88 contained a TBP cDNA as evidenced by having non-primer-encoded TBP exon 2 sequences (Fig. 4). Six different first exons and several splicing variations were cloned from testis, whereas only two of these variants were cloned from liver. In sum, 10 different mRNA "types" (this does not include different initiation site variants within a cluster; see below) were identified. No species-specific differences in exon or initiation site usage were observed between rat and mouse. Of the six first exons recovered, three (1C, 1D, and 1E) were recovered numerous times; the other three were cloned only once each. However, as these latter three first exons were precisely spliced onto exon 2, we consider them rare, but bona fide, alternate first exons.

FIG. 3. … To aid identification of exonic fragments, pseudo-pre-mRNA controls were included, nuclear and total RNA preparations were compared, and total and poly(A)+-enriched preparations were compared. The Bg/St probe showed no evidence for mRNA or pre-mRNA species containing sequences from upstream of exon 1C (<0.1 amol/µg total RNA; Fig. 1). A-C, bands corresponding to putative exonic sequences are demarcated by dark arrowheads on the autoradiograms and are indicated at the right of each panel. A, the Sa/X probe yielded one major (425 bases long) and one minor protected fragment (roughly 310 bases long) with pseudo-pre-mRNA. Both bands also appear at similar levels in total and nuclear testis RNA preparations and thus are attributed to pre-mRNA. Exon 1C appears as an 83-base band which is enriched in total as compared with nuclear RNA preparations. A series of fragments centered around 200 bases in length are also enriched in total as compared with nuclear RNA and thus likely represent mRNA sequences. The 350-base fragment seen with total and nuclear RNA but not with pseudo-pre-mRNA (…
Mapping the tbp Gene Promoters and Exons-To locate the new exons, we sequenced the entire 5416-base pair genomic region from the SacI site 700 base pairs upstream of exon 1C to the PstI site in exon 3. Within this region, we were able to map four of the six first exons (exons 1B-1E) and one alternate intervening exon (see below). Two of the three rare first exons were not found and almost certainly lie further upstream than we have sequenced. Since these exons were not mapped on the gene, it is possible that each is composed of more than one exon. Because these exons, as well as exon 1B, represent less than 2% each of total TBP mRNA in testis (data not shown), we have not investigated them further.
Relative Contribution of TBP mRNA Variants to Total Liver and Testis TBP mRNA Pools-RNase protection probes specific for exons 1B, 1C, 1D, and 1E were used to quantitate the relative abundance of mRNAs containing each exon in liver and testis. Liver was chosen as the representative somatic tissue specifically because, of all the somatic tissues analyzed to date, liver contains the most TBP mRNA per cell (2). Thus, the probability of detecting rare TBP mRNAs is higher in liver than in other somatic tissues. In liver, only the probe to exon 1C gave a detectable signal (Fig. 5A). This was unexpected, as 37% of the cDNAs that we cloned from liver contained exon 1E. We are not sure why exon 1E-containing cDNAs were recovered from liver; we suspect that a stochastic bias might have favored production, amplification, or cloning of rare exon 1E-containing cDNAs from liver. In testis, probes to exons 1C, 1D, and 1E all gave strong signals (Fig. 5A). Quantitative comparisons of the amounts of TBP mRNAs containing exon 1C, exon 1D, or exon 1E indicated that, in adult testis, all three of these alternate first exons are similarly abundant (data not shown). mRNAs containing exon 1B were too rare to detect in either testis or liver (less than 0.1 amol/µg RNA; not shown).

FIG. 4. Cloning TBP cDNA 5′ end variants. TBP cDNA 5′ ends were synthesized, amplified, and cloned. 144 individual clones were cut with PvuII, separated electrophoretically, stained with ethidium bromide, and photographed. Southern blots of the gels were probed with the Sau3AI/StuI RNase protection probe (Fig. 1) which is specific for exon 1C (Southern blots are shown below the ethidium bromide-stained gels). Marker lanes contain HinfI-cut pBR322 and HindIII/EcoRI-cut DNA markers. Below is listed the origin and identity of the clones. The abbreviations used are: r, rat; m, mouse; B, C, D, and E, exons 1B, 1C, 1D, and 1E, respectively; U, clones 97 and 99 contain different unmapped TBP first exons. Hyphens (-) denote clones that did not contain a TBP insert, including both clones with no insert and those with inserts not having non-primer-encoded TBP exon 2 sequences. Clones with no exon designation were not sequenced. Footnotes are as follows: 1, clone 32 was a mixed colony containing one clone with exon 1C and one with exon 1E; 2, clone 74 contained the 5′ region of exon 1D spliced to an alternate intervening exon (designated * in Fig. 6), which was in turn spliced to the 3′ end of exon 1E, and finally to exon 2; 3, clones that initiated at an internal position in exon 1D (see Fig. 8); 4, clones that used the alternate upstream splice donor site in exon 1D; 5, clones that initiated at an internal position in exon 1E (see Fig. 8).

FIG. 5. Quantitation of testis and liver levels of mRNAs containing exons 1C fused to 2 (1C/2), 1D/2, 1E/2, and 2/3. A, RNase protection analysis using exon-specific probes. 100 µg of total rat liver RNA (liver RNA:DNA ratio = 5) was compared with equal DNA equivalents (40 µg; testis RNA:DNA ratio = 2) and serial dilutions of total rat testis RNA using probes specific for the indicated exons. The probes differed in length and in specific radioactivity, and thus, signal intensities in the various panels cannot be compared directly. These comparisons should be directed to Figs. 2 and 6. B, RT-PCR analysis of exon 1C/2 and 1D/2 fusions in testis, liver, and brain. Exon 1C/2-containing mRNA could be detected in equal amounts (1 µg) of total and poly(A)+ mRNA from all tissues; exon 1D/2-containing mRNA could be detected in both total and poly(A)+ testis mRNA samples but not in either total or poly(A)+ mRNA preparations from non-testis tissues. Parallel control RT-PCR reactions using yeast RNA for the reverse transcriptase reaction and either primer set for PCR gave no detectable signal, thus confirming specificity (not shown). A Southern blot of the exon 1D/2 gel using the exon 1D-specific RNase protection probe showed that the band detected contained exon 1D and confirmed that detectable levels of exon 1D-containing PCR product did not exist in any of the non-testis samples (not shown). The abbreviations used are: B, brain; L, liver; T, testis.
Quantitative comparison of liver RNA to serial dilutions of testis RNA by RNase protection revealed the relative testis specificity of each exon type (Fig. 5A). A probe to exons 2 and 3 confirmed that overall TBP mRNA levels are 25-fold higher per equal DNA equivalent of tissue in testis (top panel). mRNA containing exon 1C was only 4-fold more abundant in testis than in liver. In contrast, although mRNAs containing exons 1D or 1E could not be detected in liver, dilutions of testis RNA indicated that our assay had the ability to detect 1/125 of the testis-specific signal. Thus, up-regulation of mRNAs containing exons 1D or 1E in testis is greater than 125-fold in magnitude. Similar RNase protection comparisons for all three first exons in testis, liver, brain, lung, and thymus confirmed that exons 1D and 1E were testis-specific (not shown). In a final attempt to detect exon 1D-containing TBP mRNA in liver, we used a reverse transcriptase-mediated polymerase chain reaction (RT-PCR) assay (Fig. 5B). The results showed that, whereas exon 1C/2 mRNAs could be detected in liver and testis polyadenylated mRNA samples and in liver, testis, and brain total RNA samples, exon 1D/2 mRNAs could only be detected in the testis samples. In conclusion, exons 1D and 1E appear to be truly testis-specific, whereas exon 1C is used in all tissues and is up-regulated modestly (4-fold above liver or 20-fold above thymus levels) in testis.
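The comparison "per equal DNA equivalent" and the lower bound on testis specificity both follow from simple arithmetic. The sketch below uses the RNA:DNA ratios and RNA masses given in the legend to Fig. 5 and the 1/125 detection limit quoted above. The 5-fold dilution step is an assumption made here because 125 is 5 cubed; the text does not state the dilution factor. Function names are hypothetical.

```python
from typing import List


def dna_equivalent_ug(rna_ug: float, rna_to_dna_ratio: float) -> float:
    """Mass of DNA (µg) corresponding to a mass of total RNA for a given tissue."""
    return rna_ug / rna_to_dna_ratio


def dilution_series(start_ug: float, factor: float, steps: int) -> List[float]:
    """RNA input at each step of a serial dilution, starting from the undiluted sample."""
    return [start_ug / factor ** i for i in range(steps + 1)]


def minimum_fold_enrichment(detectable_fraction: float) -> float:
    """Lower bound on enrichment when the reference tissue gives no signal.

    If a signal is still seen after diluting the test sample to the given
    fraction, and the undiluted reference sample shows nothing, the test
    sample must exceed the reference by more than 1/fraction.
    """
    return 1.0 / detectable_fraction


LIVER_DNA_EQ = dna_equivalent_ug(100.0, 5.0)   # 100 µg liver RNA, RNA:DNA = 5
TESTIS_DNA_EQ = dna_equivalent_ug(40.0, 2.0)   # 40 µg testis RNA, RNA:DNA = 2
TESTIS_SERIES = dilution_series(40.0, 5.0, 3)  # assumed 5-fold steps down to 1/125
MIN_FOLD = minimum_fold_enrichment(1.0 / 125.0)

if __name__ == "__main__":
    print(f"liver DNA equivalent:  {LIVER_DNA_EQ:.1f} µg")
    print(f"testis DNA equivalent: {TESTIS_DNA_EQ:.1f} µg")
    print("testis RNA inputs (µg):", [round(x, 2) for x in TESTIS_SERIES])
    print(f"exon 1D/1E enrichment: > {MIN_FOLD:.0f}-fold")
```

Both tissue samples correspond to the same mass of DNA in this calculation, which is what allows the signals to be read as per-cell-equivalent comparisons.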
tbp Initiation Sites and Promoter Sequences-The ligation-mediated cDNA amplification method allows precise identification of the 3′ nucleotide of the first-strand cDNA synthesis product.² Thus, assuming that the template RNA was intact and the reverse transcriptase was processive, the sequence of the cDNA clone allows precise identification of the initiation nucleotide. The initiation sites thus identified on 5 clones containing exon 1C confirmed that this exon initiates at two clusters of sites as suggested by RNase protection (Fig. 1). Exons 1D and 1E were also found to initiate at multiple sites. For these two exons, two zones of initiation separated by 130 or 55 bases, respectively, were identified (Fig. 6 and see below).
Although the sequence of the cDNA 5′ end clones presented above allowed precise identification of the transcription initiation sites of individual mRNAs, it was possible that, after amplification and cloning, individual clones might have become either over- or under-represented in the population. Thus, we wished to confirm the relative frequencies of initiation site usage by direct primer-extension analysis. Primer extension on poly(A)+-selected rat testis mRNA using either of two primers specific for sequences in exon 2 confirmed the initiation sites (Fig. 7). Samples containing poly(A)+-selected liver RNA confirmed that the signals were testis-enriched, as expected for TBP mRNA-specific signals. The sizes of the individual products corresponded to those predicted from the cDNA 5′ end clones in Fig. 4 and to the sizes of RNase protection products mapped in Figs. 1 and 3. The start sites are diagrammed in Fig. 8.
The regions upstream of all of the initiator exons are TATA-less, which is consistent with their having clustered initiation sites (8). Roughly 35 bases upstream from the major site of internal initiation for exon 1E is the sequence TATAT, which bears some resemblance to a TATA box (Fig. 8) (9). However, the imprecise initiation noted in this region is reminiscent of TATA-less promoters, and thus it is likely that this sequence is not sufficient to direct transcription initiation to a single nucleotide. Comparison of the entire tbp 5′ region with published data bases revealed no striking similarities (excluding repetitive sequences) with other published sequences. Indeed, with the exception of an SP1 binding site in the exon 1C promoter (6), the sequences upstream of all of the TBP initiation sites bear little obvious resemblance to previously identified promoters (see below).

FIG. 6. Gene structure, initiation site usage, and splicing patterns for the tbp gene in liver and testis. At top is shown a map of exon (heavy lines), intron, and 5′-nontranscribed (light lines) regions of the TBP gene between the SacI site 700 base pairs upstream of exon 1C and the BglII site in exon 2. The entire region depicted has been sequenced and submitted to GenBank. Note, the restriction enzyme sites are only for reference to probes and sequences in other figures in this paper; several of the enzymes shown will also cut at one or more other positions in the region shown. Below, bent arrows above the exons represent regions of transcription initiation; arrow thickness suggests relative rates of initiation at each region. In liver, only transcripts initiated at exon 1C and spliced onto exon 2 could be detected by RNase protection. However, cDNA clones containing exon 1E/2 fusions were obtained from liver, making it possible that these exist at a low level in liver (see text). Therefore exon 1C/2 fusions are estimated to represent greater than 95% of the TBP mRNA in liver. In testis, the eight mRNA types depicted as well as two others containing unmapped first exons (see text) were identified. The relative abundance of each mRNA type in testis as determined by RNase protection is depicted at the right.

DISCUSSION
TBP protein is required for all nuclear transcription initiation and, thus, is a fundamental component of all cells. Unexpectedly for a gene with such ubiquitous and apparently well defined functions, the tbp gene is highly overexpressed in the early haploid stages of spermatogenesis. Because spermatogenesis is a very complex process that involves interactions between many cell types (10), we suspect that a full understanding of the roles of TBP in spermatogenesis can only come through manipulation of TBP expression in animals. Identification of the spermatid-specific regulatory mechanisms for the tbp gene is a requisite step on the path to manipulating TBP expression in transgenic animals and, ultimately, to understanding the roles of TBP overexpression in spermatogenesis.
A molecular analysis of the 5′ end of the tbp gene is presented. Our results show that liver uses almost exclusively a single promoter/first exon and produces predominantly a single species of TBP mRNA. In contrast, testis initiates transcription at no fewer than 3 major and 3 minor first exons and produces at least 10 different TBP mRNA types (6 abundant and 4 rare). This work precisely localizes the testis-specific transcription initiation sites and testis-specific exons on the tbp gene. This indicates where to focus a search for the spermatid-specific regulatory mechanisms, and more importantly, it indicates which sequences to target in a rational mutagenesis of putative spermatid-specific regions of the tbp gene in transgenic animals.
Testis-specific TBP mRNAs. What is the purpose of the testis-specific TBP mRNA heterogeneity? One possibility was that the different mRNAs might generate distinct protein products. However, only mRNAs initiated at the more upstream region of exon 1E have a translational start codon that could produce an alternate protein (72 amino acids long, not in the TBP reading frame), and the ATG for this polypeptide is in a poor context for initiation (11). Thus, we predict that all 10 types of TBP mRNA characterized here give rise only to normal TBP protein. A similar situation has been reported for the cytochrome c gene, which produces a testis-specific mRNA that is predicted to use the same open reading frame as somatic cytochrome c mRNA (12). It is possible that the upstream ATGs noted on some of the TBP mRNA 5′ end variants might play a role in translational regulation (see below).
A second possibility is that each individual promoter is favored by a specific subset of nuclei types. Indeed, somatic tissue nuclei appear to use exclusively exon 1C for initiation. It was possible, for example, that exons 1C, 1D, and 1E were each preferred by germ cells at a specific developmental stage. However, we have analyzed the developmental onset of accumulation of mRNAs containing exons 1C, 1D, and 1E in testis, and we cannot detect differential temporal accumulation of any of these mRNAs (not shown).
A third possible reason for generating different mRNA species might be to allow different post-transcriptional regulation of TBP protein accumulation. Recently, a study on the copper-zinc superoxide dismutase gene showed that, like for tbp, testis-specific expression involves recruitment of two testis-specific promoters in addition to the somatic promoter (13). Of the three different mRNA types that accumulate in testis, one (arising from one of the testis-specific promoters) is sequestered as ribonucleoprotein particles; the other two are predominantly polysomal (13). A fraction of the TBP mRNA in whole testis is also sequestered as ribonucleoprotein particles (not shown). Presumably, these stored mRNAs are translated at a later time. Different sequences in the 5′-nontranslated leader of TBP mRNAs resulting from different first exon usage, possibly including the presence of upstream ATGs, might target mRNAs to be either translated immediately or stored for later use.³

The numerous splicing variations that arise in testis are another curiosity. We can currently find no rational explanation for why transcripts initiating at the upstream region of exon 1D must use two splice donor sites (positions +68 and +179 in Fig. 8). Spermatids might simply provide a "sloppy" or "promiscuous" splicing environment, such that otherwise cryptic splice donor and acceptor sites can be used. Indeed, numerous genes have been shown to exhibit splicing patterns in testis that are not found in somatic tissues (e.g. Refs. 14-16). In the case of TBP mRNA, where all of the alternatively spliced mRNAs are expected to produce the same protein, such alternate splicing would not be deleterious.

³ E. E. Schmidt and U. Schibler, submitted for publication.

FIG. 7. Primer extension analysis of TBP mRNA start sites. Primer extension assays were performed using two end-labeled primers (designated 1 and 2) specific for sequences in TBP exon 2. Primer 1 was the BglII-containing primer; primer 2 hybridizes 39 bases farther downstream on the mRNA, yielding a 39-base longer extension product for each initiation site. Both primers gave similar patterns of extension products that differed between primers by 39 bases, as expected. Poly(A)+-selected RNA from rat testis (T) or liver (L) were used. Liver A+ RNA gave little or no detectable extension, corroborating specificity. Using primer 1, several faint low molecular weight products were observed; however, these appear at similar intensities using either testis or liver A+ RNA (shown) as well as with testis, liver, or yeast total RNA (not shown), indicating that these products were nonspecific. All of the bands detected in the testis A+ RNA lanes shown were also detected on long exposures using total testis RNA (not shown); controls using various length pseudo-mRNAs containing any of the major TBP first exons gave only products of the predicted lengths, corroborating the processivity of the reverse transcriptase on TBP mRNA 5′ end sequences. The testis lanes exhibit four major clusters of products and, below these, a weaker and more diffuse cluster of products. Products in the two largest clusters map to the predicted start sites for transcription initiation at the 5′ end of exon 1D. Products in the lower clusters map to the sizes expected for initiation at the 5′ end of exons 1C and 1E (longer products), or for initiation at the internal positions of exons 1D and 1E (shorter products). Non-extended primers appear at the bottom of the autoradiogram; primer 2 was five bases shorter than primer 1, and therefore most of primer 2 ran off the gel in the experiment shown. Markers were, from top, 1111, 510, 448, 399, 347, 301, 224, 157, and 75 bases long.

... strain- and species-specific differences in exonic sequences were negligible. Exonic sequences are shown by capital letters, arbitrarily assigning position +1 to the most upstream cDNA that was cloned and sequenced for each initiation exon. For each initiation exon, 200 bases of upstream promoter sequence are shown. Initiation sites are indicated by vertical lines above the sequence; line thickness reflects the number of independent cDNA clones initiating at each site that were sequenced to the initiation site. For exon 1C, the initiation sites determined for sequenced clones map to the two clusters of initiation sites predicted in Figs. 1 and 2. The underlined region upstream of exon 1C represents the putative SP1 binding site (6). The 3′ end of each first exon is labeled with an inverted black triangle and with the position of the splice site relative to +1 for that exon. In each case, the first 10 bases of exon 2 are shown; these are underlined by a stippled box. For exon 1D, the alternative splice donor site for clones 31, 32, 34, and 74 is shown. The PvuII site noted by restriction mapping and sequencing of exon 1D cDNA clones from both MORO mice and Sprague-Dawley rats (Fig. 4) did not exist in the mouse 129 cell genomic clone; this sequence polymorphism is indicated. For exon 1E, the TATA-like sequence upstream of the internal initiation zone is boxed. The alternate splice acceptor site for clone 74 is indicated. At bottom, the sequence of the alternate intervening exon for clone 74 is listed; this sequence was spliced between the exon 1D alternate donor and the exon 1E alternate acceptor sites.
Testis-specific Transcription of the tbp Gene. Comparison to public data bases indicates that the tbp gene 5′ end exhibits no notable sequence similarities to previously described spermatid-specific promoters. A search for putative transcription factor recognition sequences⁴ revealed little evidence of what factors might be regulating this gene. For example, although the entire 5′ region of tbp is peppered with putative recognition sites for the "testis-determining factor," SRY (frequency, 1 site per 168 bases over 5416 base pairs; 17), a similar frequency (1 site in 281 bases) was found in the 111,400-base pair 0-2.4-min region of the Escherichia coli genome. Thus, the concentration of putative SRY sites in the tbp gene does not appear to differ significantly from that found in an arbitrary and physiologically irrelevant sequence.
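The site-frequency comparison above amounts to counting consensus matches per unit length in the region of interest and in a control sequence. The following is a minimal sketch of that calculation, not the search tool used in this study: the function names, the toy sequence, and the motif `AACAAT` are illustrative placeholders (the SRY consensus actually searched is not specified here), and matches are counted on both strands as an assumption. The reported frequencies imply roughly 32 putative sites in the 5416-base pair tbp region and roughly 396 in the 111,400-base pair E. coli region.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_sites(seq: str, motif: str) -> int:
    """Count motif matches on both strands, allowing overlapping matches.

    Note: a palindromic motif is counted once per strand, i.e. twice.
    """
    # A zero-width lookahead lets findall report overlapping occurrences.
    pattern = re.compile("(?=" + "".join(IUPAC[b] for b in motif.upper()) + ")")
    seq = seq.upper()
    forward = len(pattern.findall(seq))
    reverse = len(pattern.findall(reverse_complement(seq)))
    return forward + reverse


def bases_per_site(seq: str, motif: str) -> float:
    """Average number of bases per putative site (infinite if none found)."""
    n = count_sites(seq, motif)
    return len(seq) / n if n else float("inf")


def implied_sites(region_length: int, reported_bases_per_site: float) -> int:
    """Number of sites implied by a reported '1 site per N bases' frequency."""
    return round(region_length / reported_bases_per_site)


if __name__ == "__main__":
    toy = "GGAACAATCCATTGTTGG"  # placeholder sequence, not tbp
    print(count_sites(toy, "AACAAT"), bases_per_site(toy, "AACAAT"))
    print(implied_sites(5416, 168), implied_sites(111400, 281))
```

Running the same function on the target region and on an unrelated control sequence of comparable or greater length gives the two densities being compared; a degenerate consensus can be supplied directly in IUPAC notation.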
Putative binding sites were noted for two other spermatid-enriched transcription factors: a site for the SRY-related Sox-5 protein (18) and several putative cAMP-response elements (CREs; 19), which are the binding sites for CREM and CREB. The one putative Sox-5 site is located roughly 850 base pairs downstream of the exon 1E promoter. Five putative CREs are found as follows: one roughly 165 base pairs upstream of exon 1B, one at the major point of initiation for exon 1C, one roughly 220 base pairs and one roughly 35 base pairs upstream of exon 1E, and one roughly 200 base pairs downstream of the promoter of exon 1E. CREB is expressed in many testis cell types, whereas CREM is predominantly restricted to the germ cells (20). Two recent reports show that CREM-deficient mice cannot complete spermatogenesis (21,22). However, in one of these reports, TBP mRNA was used as a control for an RNase protection experiment, and its expression did not appear diminished in the CREM-deficient mice (22). Thus, it appears that the major CRE-binding protein in spermatids is dispensable for TBP overexpression. Binding sites for other putative spermatid-specific transcription factors such as Tet-1 (23) or Zfy-1 (24) have not yet been selected from random sequences, and thus remain largely undefined. It remains possible that, despite a lack of obvious sequence identity, the tbp gene might share testis-specific regulatory signals with other spermatid-specific genes. Accurate delineation of which cis-regulatory sequences are important for testis-specific TBP overexpression will require a functional analysis of these sequences in transgenic animals.
A final point to consider is the reason for the rare promoter/first exons in testis. The mRNAs arising from these exons in testis are at least 10-fold less abundant than TBP mRNA in somatic cells and are predicted to yield the same protein product. Thus, we suspect that these transcripts do not have a unique function. Rather, they might result from promiscuous transcription initiation in spermatids. Spermatids contain greatly elevated levels of all measured components of the basal RNA polymerase II transcription machinery (2).³ A model has been proposed for how increased levels of the transcription machinery should decrease promoter stringency and thus promote transcription initiation at sequences that would otherwise not be recognized as promoters (25). This model might explain many cases of spermatid-specific gene expression, including the rare TBP mRNAs.
In summary, testis-specific up-regulation of the tbp gene involves recruitment of two very strong testis-specific promoter/first exons. This work will be important for further resolving the signals regulating spermatid-specific TBP expression and ultimately for understanding why spermatids contain an estimated 1000 TBP mRNA molecules per haploid cell, compared with 1 to 5 molecules per cell in somatic cell types.