Transcript Abundance in Yeast Varies over Six Orders of Magnitude

In the current era of functional genomics, it is remarkable that the intracellular range of transcript abundance is largely unknown. For the yeast Saccharomyces cerevisiae, hybridization-based complexity analysis and SAGE analysis showed that the majority of yeast mRNAs are present at one or fewer copies per cell; however, neither method provides an accurate estimate of the full range of low abundance transcripts. Here we examine the range of intracellular transcript abundance in yeast using kinetically monitored, reverse transcriptase-initiated PCR (kRT-PCR). Steady-state levels of the transcripts encoded by all 65 genes on the left arm of chromosome III and by 185 transcription factor genes are quantitated. Abundant transcripts encoded by glycolytic genes, previously quantitated by kRT-PCR, are present at a few hundred copies per cell, whereas genes encoding physiologically important transcription factors are expressed at levels as low as one-thousandth of a transcript per cell. Of the genes assessed, only those at the silent mating type loci, HML and HMR, are transcriptionally silent. The results show that transcript abundance in yeast varies over six orders of magnitude. Finally, kRT-PCR, cDNA microarray, and high density oligonucleotide array assays are compared for their ability to detect and quantitate the complete yeast transcriptome.

Measurements of the intracellular range of transcript abundance relied initially on hybridization-based complexity analysis and more recently on serial analysis of gene expression (SAGE). For the yeast Saccharomyces cerevisiae, hybridization-based complexity analysis (1) and SAGE analysis (2) showed that 75% of poly(A) mRNA is encoded by only 20% of yeast genes. SAGE analysis also showed that 75% of yeast genes are expressed at 1 or fewer copies per cell (2). Because of technical limitations, neither method provides an accurate estimate of the range of low abundance transcripts encoded by the majority of yeast genes.
We previously demonstrated the accuracy, sensitivity, and reliability of kRT-PCR for quantitating mRNA levels in complex mixtures of total cellular RNA over a wide range of relative transcript abundance (3-5). In contrast to second order hybridization-based complexity analysis or SAGE analysis, where signal to noise decreases exponentially with decreasing transcript abundance, signal to noise is constant for kRT-PCR; only the PCR cycle number at which product accumulation is detected varies with transcript abundance. For highly expressed yeast metabolic genes, mRNA levels determined by kRT-PCR are in good agreement (within 2-fold) with those determined by Northern blotting, enzyme activity measurements, and SAGE (5). For highly repressed genes, fold repression values measured by kRT-PCR and by enzyme activity agree within 2-fold down to transcript levels of 0.01 copy per cell (5). Here we employ kRT-PCR to assess the full range of transcript abundance in yeast using selected subsets of the yeast transcriptome and total cellular RNA isolated from early log phase cultures of strain BY4742 (derived from strain S288C) grown in YPD medium.
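The statement that only the detection cycle varies with transcript abundance can be illustrated with the idealized textbook model of kinetic PCR, in which product is multiplied by a constant factor each cycle. The sketch below is not the data analysis procedure of reference 5; the function names and the assumption of a constant per-cycle amplification efficiency are hypothetical simplifications introduced here.

```python
import math


def relative_abundance(ct_target, ct_reference, efficiency=1.0):
    """Fold abundance of a target transcript relative to a reference.

    Assumes (hypothetically) that product is multiplied by (1 + efficiency)
    in every cycle, so a transcript detected one cycle later started with
    (1 + efficiency)-fold fewer template molecules.
    """
    return (1.0 + efficiency) ** (ct_reference - ct_target)


def cycles_for_fold_difference(fold, efficiency=1.0):
    """Number of extra cycles needed to detect a transcript that is
    `fold` times less abundant, under the same idealized model."""
    return math.log(fold) / math.log(1.0 + efficiency)


if __name__ == "__main__":
    # A transcript detected 10 cycles after the reference is 2**10-fold rarer.
    print(relative_abundance(ct_target=30, ct_reference=20))
    # Six orders of magnitude correspond to roughly 20 cycles at perfect doubling.
    print(round(cycles_for_fold_difference(1e6), 1))
```

Under this model a million-fold range of abundance spans only about 20 cycles, which is why the assay can cover such a wide range without a change in signal to noise.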

EXPERIMENTAL PROCEDURES
Each kRT-PCR assay was performed in a 20-µl reaction containing: 50 mM Tricine buffer, pH 8.3, 110 mM potassium acetate, 13% glycerol, 0.3 mM each of dATP, dCTP, and dGTP, 0.05 mM dTTP, 0.5 mM dUTP, 2.4 mM Mn(OAc)2, 2.5 µM ethidium bromide, 0.25 µM primers, 4 units of rTth DNA polymerase, 2 units of uracil N-glycosylase, and 120 ng of total yeast cellular RNA template. Yeast strains were grown to early log phase in YP medium containing 2% glucose. Total cellular RNA was extracted from glass bead-disrupted cells and treated with RNase-free DNase I to eliminate residual genomic DNA (5). Primer pair design, instrumentation, and data analysis were as described previously (5). Transcript copy numbers per cell reported here are the average of 15 independent kRT-PCR assays for each mRNA.

RESULTS
To assess the range of expression levels for genes along a single chromosome, transcripts from all 65 genes and computationally annotated ORFs on the left arm of yeast chromosome III were quantitated. Steady-state transcript copy number varied by more than four orders of magnitude, from GLK1 (glucokinase) at 17 copies per cell to YCL069 at 0.001 copy per cell (Fig. 1). With the exception of low abundance transcripts encoded by six ORFs located between the telomere and the HML locus (0.04-0.001 transcript per cell), no obvious clustering of genes encoding high versus low abundance transcripts was detected. Most of the abundant transcripts (2.2-17 copies per cell), which are encoded by 12 genes on the left arm of chromosome III, code for metabolic enzymes.
In contrast to mRNAs encoding metabolic enzymes, those encoding transcription factors should be representative of low abundance transcripts. As expected, steady-state levels for 185 transcription factor genes varied over four orders of magnitude, with the majority (82%) present at 1 or fewer transcripts per yeast cell (Fig. 2). The transcript of GCN4, which encodes a leucine zipper transcription factor involved in general amino acid control, was the most abundant at 7 copies per cell. The relatively high level of GCN4 mRNA may compensate for the fact that translation of this mRNA in vivo is under strong negative control (6). The transcript of SPT15, which encodes TBP (TATA-binding protein), required for all nuclear transcription (7), was present at 0.8 copy per cell, whereas the SUA7 transcript, encoding TFIIB, required for all transcription by RNA polymerase II (8), was present at 4 copies per cell. Transcripts encoding TAFs (TBP-associated factors) (9), subunits of the SRB-mediator complex (9), and the swi-snf complex (9) ranged from 2 to 0.1 copies per cell, with the majority present at 1 to 0.5 transcripts per cell. Transcripts encoding eight members of the basic helix-loop-helix family of transcription factors (TYE7, INO4, RTG3, PHO4, INO2, RTG1, HMS1, and CBF1) ranged from 3 to 0.09 transcripts per cell. IXR1, GAT1, GLN3, SWI6, and SPT3 were expressed at very low levels (0.0009, 0.002, 0.004, 0.005, and 0.005 transcripts per cell, respectively).
To fully assess low level transcription in yeast cells, expression of the mating type-specific regulatory genes was determined using total cellular RNA isolated from haploid and diploid cells (Fig. 3). Expression of the regulatory genes present at the silent mating type cassette HMLα measured in mating type a haploid cells, or at the silent mating type cassette HMRa measured in mating type α haploid cells, was not detected using the kRT-PCR assay (less than 0.0001 transcript per cell). Similarly, α1 expression in diploid cells was not detected using the kRT-PCR assay. Expression of the regulatory genes at the MAT locus was very similar for the strains derived from S288C or D273-10B. Unexpectedly, expression of a1 and α2 in diploid cells was 4- to 5-fold lower than in mating type a haploid cells or mating type α haploid cells, respectively.
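The "orders of magnitude" figures quoted for the two gene sets follow directly from the highest and lowest copy numbers given in the text. A minimal sketch of that arithmetic is shown below; the helper name `orders_of_magnitude` is introduced here for illustration only.

```python
import math


def orders_of_magnitude(highest, lowest):
    """Dynamic range between two copy numbers, in powers of ten."""
    return math.log10(highest / lowest)


# Left arm of chromosome III: GLK1 (17 copies/cell) to YCL069 (0.001 copy/cell).
chr3_range = orders_of_magnitude(17, 0.001)

# Transcription factor set: GCN4 (7 copies/cell) to IXR1 (0.0009 copy/cell).
tf_range = orders_of_magnitude(7, 0.0009)

if __name__ == "__main__":
    print(f"chromosome III left arm: {chr3_range:.2f} orders of magnitude")
    print(f"transcription factors:   {tf_range:.2f} orders of magnitude")
```

With these values the chromosome III set spans slightly more than four orders of magnitude and the transcription factor set slightly fewer than four.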
Hybridization-based cDNA microarray and high density oligonucleotide array technologies are widely used for transcript profiling. To assess the detection limits and accuracy of the cDNA microarray assay for quantitating the full range of yeast transcript abundance, cDNA microarray raw fluorescence data (10) were plotted versus transcript copy number per cell as determined by kRT-PCR assay for 275 yeast transcripts (Fig. 4a). These transcripts include: 25 relatively high abundance mRNAs coding for enzymes involved in glycolysis, gluconeogenesis, ethanol synthesis, and glycerol metabolism, previously quantitated by kRT-PCR (5); 65 transcripts encoded by genes on the left arm of chromosome III; and the 185 transcription factor transcripts described above. Both sets of data were for early log phase cultures of a derivative of strain S288C grown in YPD medium (10). The plot recapitulates a second order hybridization curve where most of the change in fluorescence occurs between 1 and 100 transcript copies per cell. Below 1 transcript per cell, the fluorescence data are scattered, and the large changes in transcript copy number per cell measured by kRT-PCR assay are not reflected in the raw fluorescence data. A similar curve is obtained when expression levels measured by high density oligonucleotide arrays for log phase cultures of a derivative of strain S288C grown in YPD medium (11) are plotted versus transcript copy number per cell determined by kRT-PCR assay (Fig. 4b). An exponential curve fit to these latter data displays a breakpoint at about 2 transcript copies per cell (Fig. 4c). Transcript copy numbers per cell, determined by kRT-PCR, are in reasonable agreement with both array analyses down to 2 copies per cell. Below 2 transcripts per cell, however, there is little correspondence between raw fluorescence determined by cDNA microarray or expression level determined by high density oligonucleotide array and transcript copy number per cell determined by kRT-PCR.
Furthermore, transcripts displaying significant deviation (scatter) from the exponential curve fit for the cDNA microarray data (Fig. 4a) are different from those displaying significant deviation from the exponential curve fit for the high density oligonucleotide array data (Fig. 4b).

DISCUSSION
The results presented here using kinetically monitored RT-PCR extend previous hybridization complexity (1) and SAGE (2) analyses to reveal the full range of transcript abundance in yeast. Yeast transcript abundance ranged from a few hundred copies per cell for glycolytic mRNAs to one-thousandth of a transcript per cell for transcripts encoding some of the transcription factors. No transcripts were detected from the silent mating type loci in haploid cells or from α1 in diploid cells (less than 0.0001 transcript per cell). Consistent with these latter observations, the HMLα and HMRa loci are subject to active transcriptional silencing mechanisms (12), and α1 expression in diploid cells is repressed by the diploid cell-specific a1/α2 repressor (13). Thus, specialized silencing or repression mechanisms are necessary to reduce transcription below the detection level of the kRT-PCR assay. Taken together, these results show that yeast transcript abundance varies over six orders of magnitude. The physiological importance of low level transcripts is illustrated by the α1 transcript (0.003 and 0.008 transcript per cell in the S288C and D273-10B α haploid strains, respectively), which encodes a transcription factor required for expression of α-specific genes (13). The levels of transcripts encoded by IXR1, GAT1, GLN3, SWI6, and SPT3 (0.0009, 0.002, 0.004, 0.005, and 0.005 transcript per cell, respectively) are comparable with or below those observed for the telomere proximal ORFs described above for chromosome III.
Remarkably, the level of the IXR1 transcript is comparable with the level of "readthrough" transcript observed for the intergenic region between the divergently transcribed yeast GAL1 and GAL10 genes (0.00075 copy per cell) (5). This latter steady-state concentration would be obtained if a transcript with a half-life of 10 min were synthesized once during a 2-h cell division cycle. Because cellular proteins are typically more stable than the mRNAs that encode them and because the cytoplasm and nucleoplasm are shared during cell division, very low steady-state levels of mRNA can direct synthesis of physiologically important proteins. IXR1 encodes a high mobility group domain protein that binds cisplatin-DNA adducts and represses transcription of the yeast COX5b gene (14). GAT1 and GLN3 encode members of the GATA transcription factor family involved in activation of transcription of genes subject to nitrogen catabolite repression (15). The SWI6 gene product is involved in regulation of transcription at the G1/S boundary of the mitotic cell cycle (16). Finally, SPT3 encodes a subunit of the SAGA complex involved in histone acetylation (17).

FIG. 3. Transcript copy number per cell for the yeast mating type-specific (MAT) regulatory genes in a haploid, α haploid, and a/α diploid yeast cells. The organization of the mating type-specific regulatory genes a1, a2, α1, and α2 is indicated for the two silent mating type cassettes HMLα and HMRa as well as the MATα locus. The X and Z1 sequences are identical at all three loci whereas Yα and Ya sequences are unique. Unique primer pairs were used to quantitate the α1, α2, and a1 transcripts. Because the a2 transcript is identical to sequences within the α2 transcript, a2 transcript levels were quantitated in haploid cells utilizing a primer pair that quantitates a2 plus α2 transcript. Mating type-specific transcripts were quantitated in strains S273-6B, S273-29A, and S273-6BX29A, which are derived from the parental strain S273-10B. Mating type-specific transcripts were also quantitated in strains YPH499, YPH500, and YPH501, which are derived from the parental strain S288C.

FIG. 4. Comparison of the range of yeast transcript copy number determined by kRT-PCR assay and hybridization-based array technologies. a, raw fluorescence data for 275 yeast transcripts, determined by cDNA microarray analysis (10), were plotted against transcript copy number per cell determined for the same yeast transcripts by kRT-PCR. b, expression level data for 275 yeast transcripts, determined by high density oligonucleotide array (11), were plotted against transcript copy number per cell determined for the same yeast transcripts by kRT-PCR assay. c, an expanded version of the data shown in b is shown for transcripts in the range of 10 to 0.001 copies per cell. For each panel, an exponential curve fit of the data is shown.
The kRT-PCR assay and the hybridization-based array technologies were compared with respect to transcript detection and quantitation of the 250 yeast transcripts reported here and 25 abundant transcripts previously quantitated by kRT-PCR (5) using published array data for the same transcripts (10, 11). The comparison reveals some significant limitations for array-based detection and quantitation of yeast transcripts present at one or fewer copies per cell. The analysis shows that these limitations are imposed by the second order nature of hybridization kinetics. Whereas the array technologies offer the advantage of relatively high throughput analysis of large numbers of transcripts, kRT-PCR offers clear advantages for monitoring mRNA levels over the complete range of intracellular levels. This is an important consideration because the majority of yeast transcripts are present at one or fewer copies per cell. These detection limits are not restricted to yeast, however, because the intracellular range of transcript abundance in bacteria as well as metazoans is likely to be comparable with that in yeast when adjusted for cell volume and RNA content.
The sensitivity of the kRT-PCR assay is ideally suited for assessing whether or not computationally annotated ORFs revealed by genomic sequences are expressed. More generally, kRT-PCR measures the full range of change in transcript level under different genetic, physiological, or developmental conditions, whereas the array technologies are likely to underestimate the magnitude of such changes. This latter point is illustrated for yeast genes that are subject to carbon catabolite repression. Derepression of the yeast phosphoenolpyruvate carboxykinase (PCK1), fructose bisphosphatase (FBP1), and iso-1-cytochrome c (CYC1) genes measured by kRT-PCR for cells grown on a nonfermentable carbon source versus glucose was 1000-, 200-, and 40-fold, respectively (5). Derepression of these same genes measured by cDNA microarray after diauxic shift (the switch from growth on glucose to nonfermentative growth on the products of glycolysis, ethanol and glycerol) was 14-, 13-, and 3-fold, respectively (10). The levels of these transcripts are above 1 copy per cell after diauxic shift; however, the fully repressed levels of these mRNAs are well below the detection limit of the cDNA microarray assay. Thus, large changes in gene expression can be underestimated or entirely missed by microarray assay depending on the abundance range over which a particular transcript varies.
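The way a detection floor compresses apparent fold changes can be illustrated with a deliberately crude model in which an assay reports any level below its floor as the floor itself. This is a sketch under stated assumptions, not a model of hybridization kinetics: the floor of 2 copies per cell is taken from the approximate breakpoint described for the array data, whereas the repressed and derepressed copy numbers are hypothetical values chosen only to give a 1000-fold true change.

```python
def apparent_fold_change(repressed, derepressed, detection_floor):
    """Fold change reported by an assay that cannot read below
    `detection_floor` (levels under the floor are read as the floor).

    This clamping model is a hypothetical simplification.
    """
    low = max(repressed, detection_floor)
    high = max(derepressed, detection_floor)
    return high / low


# Hypothetical transcript: 0.02 copy/cell repressed, 20 copies/cell derepressed.
REPRESSED = 0.02
DEREPRESSED = 20.0
ARRAY_FLOOR = 2.0      # approximate breakpoint noted for the array data
KRT_PCR_FLOOR = 0.0001  # detection limit quoted for the kRT-PCR assay

true_fold = apparent_fold_change(REPRESSED, DEREPRESSED, KRT_PCR_FLOOR)
array_fold = apparent_fold_change(REPRESSED, DEREPRESSED, ARRAY_FLOOR)

if __name__ == "__main__":
    print(f"fold change with a 0.0001 copy/cell floor: {true_fold:.0f}")
    print(f"fold change with a 2 copy/cell floor:      {array_fold:.0f}")
```

In this illustration a 1000-fold change is reported as 10-fold once the repressed level falls below the floor, and a change that occurs entirely below the floor would be reported as no change at all.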