Reproducibility of Oligonucleotide Microarray Transcriptome Analyses

Assessment of reproducibility of DNA-microarray analysis from published data sets is complicated by the use of different microbial strains, cultivation techniques, and analytical procedures. Because intra- and interlaboratory reproducibility is highly relevant for application of DNA-microarray analysis in functional genomics and metabolic engineering, we designed a set of experiments to specifically address this issue. Saccharomyces cerevisiae CEN.PK113-7D was grown under defined conditions in glucose-limited chemostats, followed by transcriptome analysis with Affymetrix GeneChip arrays. In each of the laboratories, three independent replicate cultures were grown aerobically as well as anaerobically. Although variations introduced by in vitro handling steps were small and unbiased, greater variation from replicate cultures underscored that, to obtain reliable information, experimental replication is essential. Under aerobic conditions, 86% of the most highly expressed yeast genes showed an average intralaboratory coefficient of variation of 0.23. This is significantly lower than previously reported for shake-flask-culture transcriptome analyses and probably reflects the strict control of growth conditions in chemostats. Using the triplicate data sets and appropriate statistical analysis, the change calls from anaerobic versus aerobic comparisons yielded over 95% agreement between the laboratories for transcripts that changed by over 2-fold, leaving only a small fraction of genes that exhibited laboratory bias.

Together with other system-wide analytical techniques, genome-wide transcriptional analysis with oligonucleotide microarrays is rapidly transforming modern biological science (1). This transformation is reflected in the growing quantity of research and review articles published in the last few years that deal specifically with this topic.
Initially, the majority of microarray experiments focused on the yeast Saccharomyces cerevisiae as the experimental system, because the quality of its sequence information is extremely high and therefore appropriate for whole genome-coverage array design (2,3). Following several papers that were focused solely on the use of microarrays in profiling global gene expression, more recent articles have reflected their use as an integrated tool for research of cellular physiology. This includes identification of target genes for functional analysis (4) as well as in the developing field of integrative whole-cell analyses (5).
In practical terms, DNA microarray experiments are expensive and generate vast amounts of data from which relevant changes in transcript levels need to be identified. To optimize experimental efficiency, it is of critical importance to minimize experimental noise, standardize handling protocols, and perform appropriate experimental replication and statistical analyses. The Affymetrix oligonucleotide GeneChip arrays (6) have been designed to overcome some concerns of reproducibility and hybridization specificity by including multiple (~16) pairs of oligonucleotides per transcript to measure both specific and nonspecific hybridization. Furthermore, all procedures associated with probe preparation, hybridization, and washing have been standardized (7). In the literature, reports concerning the quantitative accuracy of microarray results (8,9) and the quality of the associated protocols (10,11) have set commercial oligonucleotide array experiments in a favorable light. On the question of data interpretation, however, many research articles still apply arbitrary rules to decide whether transcript levels differ (change calls) between arrays of an experiment. Furthermore, there is uncertainty that arises from the exchange of data among different laboratories, for example via freely accessible databases, because no direct interlaboratory comparisons have been performed on the reproducibility of DNA-array data.
In addition to analytical procedures, the techniques used for cultivation and sampling may contribute to experimental variation. Many of the available microarray data for S. cerevisiae have been obtained with shake-flask cultures (for a selection of literature see Refs. 12-22), where it is known that important cultivation conditions (dissolved oxygen concentration, metabolite concentrations, pH, etc.) change over time. These problems are augmented when complex media, such as the commonly used yeast extract-peptone-based media for yeast cultivation (23), are used. In such media, the sequential use of available substrates (in particular nitrogen sources) is likely to increase experimental variation between replicate cultures. This is combined with the fact that in shake flasks growth can only be studied at the maximum specific growth rate (μmax, h⁻¹) for the given set of environmental conditions. This limits the applicability of transcriptome analysis, because, in industrial applications of microorganisms as well as in their natural environments, growth is nearly always limited by nutrient availability. Moreover, because specific growth rate is known to have a drastic impact on gene expression in S. cerevisiae (24), changes in the environmental conditions or introduction of targeted genetic modifications that affect specific growth rate will indirectly impact on genome-wide expression levels. These factors complicate data interpretation.
Chemostat cultivation is a laboratory technique that has been especially developed to grow microorganisms and cell cultures of higher eukaryotes under constant, carefully controlled conditions. Automated adjustments made in response to on-line monitoring enable strict control of culture pH, dissolved oxygen concentration, and temperature, all of which are known to affect transcription (13,18,25). Furthermore, by controlling the rate at which a single growth-limiting nutrient is supplied, the specific growth rate (μ, h⁻¹) can be maintained at a chosen, constant value. This allows for transcriptome comparisons between different environmental conditions or between different strains at an identical, submaximal specific growth rate.
The aim of the present study was to critically evaluate the reproducibility of DNA-microarray transcriptome analyses in nutrient-limited chemostat cultures of S. cerevisiae, using commercially available oligonucleotide arrays. We based this study on a previous experiment (25) by comparing multiply replicated aerobic and anaerobic chemostat cultures. This earlier work involved comparison of only single arrays from each condition and revealed 359 genes that differed by more than 3-fold between aerobic and anaerobic growth. By expanding this analysis to include two different laboratories that ran independent chemostat cultures and independent microarray analyses, we also investigated the interlaboratory reproducibility of this microarray experiment. The primary goal of this study was to examine the methodology and reproducibility of microarray analyses and not to biologically interpret the different transcript profiles in anaerobic and aerobic cultures. The data sets used for this study are available at www.cbs.dtu.dk/yeast/.

MATERIALS AND METHODS
Strain and Maintenance-This study was performed with the prototrophic laboratory strain S. cerevisiae CEN.PK113-7D (MATa) (26). The two laboratories involved in this study independently obtained this strain from the EUROSCARF strain collection (Frankfurt, Germany) courtesy of Dr. P. Kötter. Upon arrival, the yeast was grown in shake-flask cultures and stored frozen with glycerol in small aliquots as described previously (27). These frozen stock cultures were used to inoculate precultures for chemostat cultivation.
Chemostat Cultivation-Steady-state chemostat cultures were grown in Applikon laboratory fermentors of 1-liter working volume as described in detail elsewhere (28). In brief, the cultures were fed with a defined mineral medium containing glucose as the growth-limiting nutrient (29). The dilution rate (which equals the specific growth rate) in the steady-state cultures was 0.10 h⁻¹, the temperature was 30°C, and the culture pH was 5.0. Aerobic conditions were maintained by sparging the cultures with air (0.5 liter·min⁻¹). The dissolved oxygen concentration, which was continuously monitored with an Ingold model 34-100-3002 probe, remained above 80% of air saturation. For anaerobic cultivation, the reservoir medium was supplemented with Tween 80 and ergosterol as described previously (29). Anaerobic conditions were maintained by sparging the medium reservoir and the fermentor with pure nitrogen gas (0.5 liter·min⁻¹). Furthermore, Norprene tubing and butyl rubber septa were used to minimize oxygen diffusion into the anaerobic cultures (30).
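The statement that the dilution rate equals the specific growth rate follows from the standard biomass balance for an ideal chemostat. A brief sketch, assuming a perfectly mixed culture with sterile feed and negligible cell death; the symbols X (biomass concentration), F (feed flow rate), and V (working volume) are introduced here for illustration and do not appear in the original text:

```latex
\begin{align}
\frac{dX}{dt} &= \mu X - D X, \qquad D = \frac{F}{V} \\
\frac{dX}{dt} = 0,\ X > 0 \;&\Longrightarrow\; \mu = D = 0.10\ \mathrm{h^{-1}}
\end{align}
```

Under these assumptions, a 1-liter working volume at D = 0.10 h⁻¹ corresponds to a feed rate of about 0.1 liter per hour.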
DNA Microarrays-For a detailed description of the Affymetrix GeneChip, see Ref. 6. Briefly, genes are represented on the arrays as a panel of spots, with each spot containing a different 25-mer oligonucleotide sequence that is complementary to part of a transcript (perfect match oligonucleotide). In addition, each perfect match sequence is accompanied by a neighboring spot that contains an oligonucleotide with a single nucleotide different from its partner (mismatch oligonucleotide). The difference between the signals from these "probe pairs" is combined for the whole "probe set" to give a value called the "average difference." This is a specific measure of transcript abundance in the sample. Most genes are represented by 16 probe pairs, but if unique sequences are limited for a gene, an incomplete probe set is used. The Affymetrix S98 yeast microarrays contain probe sets representing 9335 distinct transcription features, of which 6383 were nominated yeast genes due to assignment of either a standard yeast open reading frame abbreviation (e.g. YAL001c) or a known function (e.g. SUC4 encoding invertase) (31).
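The "average difference" described above can be illustrated with a minimal sketch. It assumes a plain arithmetic mean of the perfect match minus mismatch differences and ignores any outlier handling that the vendor software may apply; the function name and the example intensities are hypothetical.

```python
from statistics import mean

def average_difference(pm, mm):
    """Mean of (perfect match - mismatch) over the probe pairs of one probe set.

    Simplified illustration: a plain mean, with no outlier exclusion.
    """
    if len(pm) != len(mm) or not pm:
        raise ValueError("pm and mm must be non-empty and of equal length")
    return mean(p - m for p, m in zip(pm, mm))

# Hypothetical intensities for a probe set with four probe pairs.
pm_signals = [520.0, 610.0, 480.0, 555.0]
mm_signals = [120.0, 150.0, 90.0, 135.0]
avg_diff = average_difference(pm_signals, mm_signals)
print(avg_diff)
```

A probe set with the usual 16 probe pairs is handled in the same way; an incomplete probe set simply contributes fewer differences to the mean.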
Sampling and RNA Isolation-Samples from the chemostat cultures were taken as rapidly as possible to limit any potential changes in transcript profiles during the procedure. 40-60 ml of culture broth was sampled directly from the chemostat into a beaker containing 200 ml of liquid nitrogen. With vigorous stirring, the sample froze instantly. The frozen sample was then broken into small fragments and transferred to a 50-ml centrifuge tube. The sample was then thawed at room temperature, ensuring that its temperature remained as close to 0°C as possible. Cells were pelleted (5000 rpm at 0°C for 4 min), resuspended in 2 ml of ice-cold AE buffer (50 mM sodium acetate, 10 mM EDTA, pH 5.0), and aliquoted into 5 Eppendorf tubes. This corresponded to ~20 mg of dry weight per tube. For each array, total RNA was extracted from a single tube using the hot-phenol method (32) or the FastRNA kit, Red (BIO 101, Inc., Vista, CA).
Probe Preparation and Hybridization to Arrays-mRNA extraction, cDNA synthesis, cRNA synthesis and labeling, as well as array hybridization were performed as described in the Affymetrix users' manual (7). Briefly, poly(A)+ RNA was enriched from total RNA in a single round using the Qiagen Oligotex kit. Double-stranded cDNA synthesis was carried out incorporating the T7 RNA-polymerase promoter in the first round. This cDNA was then used as template for in vitro transcription (ENZO BioArray High Yield IVT kit), which amplifies the RNA pool and incorporates biotinylated ribonucleotides required for the staining procedures after hybridization. 15 μg of fragmented, biotinylated cRNA was hybridized to Affymetrix yeast S98 arrays at 45°C for 16 h as described in the Affymetrix users' manual (7). Washing and staining of arrays were performed using the GeneChip Fluidics Station 400 and scanning with the Affymetrix GeneArray Scanner.
Data Acquisition and Primary Analysis-Acquisition and quantification of array images as well as primary data analysis were performed using the Affymetrix software packages: Microarray Suite version 4.0.1, MicroDB version 2.0, and Data Mining Tool version 2.0. Microsoft Excel was used for further statistical analyses.
All arrays were globally scaled to a target value of 150 using the average signal from all gene features using Microarray Suite version 4.0.1. When pairwise comparisons were performed (using Microarray Suite version 4.0.1), a transcript was considered "changed" when a call of Increase or Decrease was made, the -fold change was at least 2, and the higher of the two average difference values was called present.
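The scaling step and the three criteria for a "changed" transcript can be sketched as follows. This is an illustration under stated assumptions, not the Microarray Suite implementation: the scaling uses a plain mean of all signals (the software may use a different averaging scheme), the call and -fold change are taken as values already produced by the software, and all function names and numbers are hypothetical.

```python
def scale_array(signals, target=150.0):
    """Multiply every signal by one factor so that the array mean equals `target`."""
    factor = target / (sum(signals) / len(signals))
    return [s * factor for s in signals]

def is_changed(call, fold_change, avg_diff_a, avg_diff_b,
               present_a, present_b, min_fold=2.0):
    """Apply the three pairwise-comparison criteria described in the text."""
    # Criterion 3: the higher of the two average difference values is called present.
    higher_is_present = present_a if avg_diff_a >= avg_diff_b else present_b
    return (
        call in ("Increase", "Decrease")      # criterion 1: software change call
        and abs(fold_change) >= min_fold      # criterion 2: at least 2-fold
        and higher_is_present
    )

scaled = scale_array([100.0, 200.0, 300.0])
print(scaled)
print(is_changed("Increase", 2.5, 80.0, 200.0, False, True))
```

Keeping the three criteria in one predicate makes it easy to tighten the threshold, for example `min_fold=3.0`, without touching the other two conditions.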
Statistical Comparison of Data from Replicate Experiments-The Significance Analysis of Microarrays (SAM version 1.12) add-in to Microsoft Excel was used for comparisons of replicate array experiments (33). SAM assesses the difference between two mean values when taking into account the standard errors of those means. The significance of that difference is estimated by comparing it against the probability of its occurrence by chance alone. The model of chance occurrence is generated by permutation of the input data, rather than a predetermined model (e.g. a normal distribution), as is used by the t test.
The SAM algorithm was used instead of the standard t test, because it showed a better ability to scale down to small numbers of replicates. This was determined by comparing the significant change calls made by SAM and the t test for triplicate arrays against the change calls made by the t test using the sextuplicate arrays (p < 0.005; the lowered p threshold was used to reduce the expected number of false positives, which increases linearly with the number of t tests performed; see for review, Ref. 34). This conclusion concerning SAM was also reached in a recent study by Lönnstedt and Speed (35) using a simulated data set.
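The permutation idea behind SAM can be illustrated for a single transcript. The sketch below is not the SAM add-in: it computes a SAM-style relative difference (difference of means divided by a pooled standard error plus a small constant) and compares it against every possible relabeling of the six values. The constant `s0`, the function names, and the signal values are assumptions made for illustration; SAM itself works across all genes simultaneously.

```python
from itertools import combinations
from math import sqrt
from statistics import mean

def relative_difference(a, b, s0=1.0):
    """Difference of means divided by (pooled standard error + s0)."""
    na, nb = len(a), len(b)
    ma, mb = mean(a), mean(b)
    ss = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    s = sqrt((1 / na + 1 / nb) * ss / (na + nb - 2))
    return (mb - ma) / (s + s0)

def permutation_p_value(a, b, s0=1.0):
    """Fraction of label assignments with |d| at least as large as observed."""
    pooled = list(a) + list(b)
    observed = abs(relative_difference(a, b, s0))
    hits = total = 0
    for idx in combinations(range(len(pooled)), len(a)):
        chosen = set(idx)
        perm_a = [pooled[i] for i in idx]
        perm_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small relative tolerance guards against floating-point round-off.
        if abs(relative_difference(perm_a, perm_b, s0)) >= observed * (1 - 1e-9):
            hits += 1
    return hits / total

aerobic = [100.0, 110.0, 105.0]      # hypothetical triplicate signals
anaerobic = [300.0, 320.0, 310.0]
p = permutation_p_value(aerobic, anaerobic)
print(p)
```

With triplicates there are only 20 ways to split six values into two groups of three, so a single-transcript permutation test of this kind cannot return a value below 0.1. This may be one reason why methods designed for few replicates use information from many genes at once.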

RESULTS
Non-biological Sources of Experimental Variation-Many in vitro handling steps are required for transcriptome analyses with Affymetrix GeneChips. Each of these processing steps has the potential to generate non-biological variability in the data. To quantify the potential sources of error, we carried out a set of replicate preparations on a single sample. To ensure any variability measured was of non-biological origin, our two laboratories performed the same set of preparations on cells from different growth conditions. Cells used by laboratory A were grown aerobically while those used by laboratory B were grown anaerobically.
To test the combined reproducibility of hybridization, washing, staining, amplification, and microarray scanning, a single pool of fragmented cRNA was used for hybridization to two arrays within each laboratory (Fig. 1; arrays 1 and 1a, and 11 and 11a). This yielded 36 and 10 difference calls (based on the criteria outlined for pairwise comparisons under "Materials and Methods") in laboratory A and laboratory B, respectively. By visual inspection of the scanned images for each of these changes, ~10% of these calls were shown to be due to microscopic faults on the arrays (e.g. scratches, bright spots, or darkened areas).
The variation introduced by cell breakage, mRNA preparation, cDNA synthesis, and in vitro transcription was assessed by hybridizing two arrays with different cRNA preparations from a single chemostat sample (Fig. 1, arrays 2 and 2a, and 12 and 12a). Once again, this comparison was performed with aerobically grown cells in laboratory A and anaerobically grown cells in laboratory B. These comparisons yielded 80 and 106 difference calls, respectively. In all comparisons, less than 20% of the changes exceeded 3-fold (this category included all changes that originated from array faults).
On a genomic scale (6383 designated yeast genes and open reading frames were included in the comparisons), the numbers for variability introduced by the sample preparation steps were satisfyingly low, indicating that they were highly reproducible. When the identities of these differentials were compared for the two laboratories, we found the variation introduced to be unbiased because only five genes were found to be common to any two comparisons. Two of these (YAL005c and YNL140c) could be explained by incomplete probe sets on the arrays.
Intralaboratory Reproducibility-To assess the experimental variation that results from replicate cultivation, each laboratory grew three independent steady-state chemostat cultures under aerobic as well as under anaerobic conditions (Fig. 1). To quantify the variation introduced by replicate cultivation, all possible pairwise comparisons from within growth conditions (15 each) were performed. On average, this resulted in 402 ± 170 (range, 66-869) difference calls (using a change threshold of 2.0) for independent replicate cultures grown under the same experimental conditions. These experiments illustrate that, even after extensive precautions to standardize cultivation conditions, any comparison of two conditions that is based on a single array from each condition could result in extremely high numbers of difference calls. Such potential high false-discovery rates are not acceptable in most biological experiments and demonstrate that replication of experiments is a prerequisite for meaningful application of microarray analysis.
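The figure of 15 comparisons per growth condition is the number of distinct pairs that can be drawn from six arrays. A minimal sketch of the enumeration follows; the array labels are hypothetical, and the counting function uses a simple ratio test on positive signals as a stand-in for the full change-call criteria given under "Materials and Methods".

```python
from itertools import combinations

def count_fold_differences(x, y, threshold=2.0):
    """Count transcripts whose (positive) signals differ by at least `threshold`-fold."""
    return sum(1 for a, b in zip(x, y) if max(a, b) / min(a, b) >= threshold)

# Six replicate arrays for one growth condition (labels are placeholders).
replicates = [f"culture_{i}" for i in range(1, 7)]
pairs = list(combinations(replicates, 2))
print(len(pairs))

# Toy signals for three transcripts on two arrays.
n_calls = count_fold_differences([100.0, 50.0, 30.0], [210.0, 60.0, 10.0])
print(n_calls)
```

Running the counter over every pair in `pairs` and summarizing the counts would give the mean, standard deviation, and range in the form reported above.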
To visualize the variability for each individual S. cerevisiae transcript within triplicate experiments performed in a single laboratory, the coefficient of variation (standard deviation divided by the mean) for each transcript was plotted as a function of its average transcript level (Fig. 2). This representation revealed several important features concerning intralaboratory variation. First, for the majority of transcripts, the variability measured in each laboratory was very low; second, at low signal intensities, the signal-to-noise ratio decreased; third, some genes with high average-signal intensities still exhibited large variability; and fourth, the signals from anaerobically cultivated cells were subject to slightly higher variability than those from aerobically cultivated cells.
Based on these observations, we deemed the 900 transcripts with the lowest expression to be poorly reproducible (illustrated by an upturn in the trend line representing the coefficient of variation; Fig. 2). These genes gave a signal intensity on the array of less than 6% of the average intensity for a yeast gene. When the average coefficient of variation was calculated for the remaining 5483 (86%) "measurable" yeast genes, the values were 0.23 and 0.20 for aerobic cultures grown in laboratory A and laboratory B, respectively, whereas the corresponding values for anaerobic cultures were 0.27 and 0.29.
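The calculation described above can be sketched in a few lines: rank transcripts by average signal, drop the lowest-expressed ones, and average the coefficient of variation over the rest. The sketch assumes the sample standard deviation (the text does not state which form was used); the function names and toy data are hypothetical.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)

def average_cv_of_measurable(replicates_by_gene, n_excluded):
    """Average CV after dropping the `n_excluded` transcripts with the lowest mean signal."""
    ranked = sorted(replicates_by_gene.values(), key=mean)
    kept = ranked[n_excluded:]
    return mean(coefficient_of_variation(v) for v in kept)

# Toy triplicate signals; g1 plays the role of a poorly reproducible low-signal transcript.
toy = {
    "g1": [1.0, 2.0, 3.0],
    "g2": [90.0, 100.0, 110.0],
    "g3": [180.0, 200.0, 220.0],
}
avg_cv = average_cv_of_measurable(toy, n_excluded=1)
print(avg_cv)

# Share of the 6383 designated genes that remain after excluding 900.
measurable_fraction = (6383 - 900) / 6383
print(round(measurable_fraction * 100))
```

In the toy data the excluded transcript has a coefficient of variation of 0.5 while the two retained ones have 0.1, mirroring the upturn at low signal intensity described for Fig. 2.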
The closeness of these values indicates that chemostat cultivation and microarray analysis were carried out with a similar level of reproducibility in the two laboratories. This, however, did not yet exclude the possibility that some, or even many, transcripts were present at reproducibly different levels in the two laboratories.
Interlaboratory Reproducibility-One of the key goals of transcriptome analysis is to identify biologically meaningful differences in gene expression under varying experimental conditions or in different microbial strains. The simplest way of assessing these differences is using change calls. This requires an appropriate statistical tool to decide on the significance of gene expression changes between growth conditions. To investigate whether the change calls between the aerobic and anaerobic cultures found in laboratories A and B were consistent, we used the statistical analysis software package Significance Analysis of Microarrays (SAM (33)). Specifically, SAM was used to evaluate the interlaboratory agreement of observed changes between the aerobic and anaerobic cultures, by investigating the anaerobic/aerobic comparison from each laboratory separately.
FIG. 1. Experimental design for assessing reproducibility of oligonucleotide-microarray analysis on chemostat cultures. Both laboratories performed biological and non-biological replicate microarray experiments covering both aerobic and anaerobic cultivation conditions. Two pairs of microarrays were run to assess the non-biological variability from the in vitro handling steps. These were replicate microarrays from a single source of cRNA (arrays 1 and 1a in laboratory A and arrays 11 and 11a in laboratory B) and replicate microarrays from a single source of cells (arrays 2 and 2a in laboratory A and arrays 12 and 12a in laboratory B). Furthermore, to assess the reproducibility of independent replicate cultures, three aerobic and three anaerobic replicates were done in each laboratory.
A graphical representation of the -fold changes found in the two laboratories already indicated generally good agreement (Fig. 3). Not surprisingly, the agreement worsened at low -fold change values, with a substantial number of genes changed in opposite directions (found in the upper left and lower right quadrants of the graph). This observation was quantitatively supported by the statistical analysis. The consistency of the change calls in the two laboratories was over 95% for genes with a -fold change above 2.0, but strongly decreased at lower -fold changes (Table I). Above 3.0-fold change, less than 3% of the change calls (14 transcripts in total) were laboratory-specific.
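One way to express the consistency of change calls between two laboratories is sketched below. The definition is an assumption made for illustration (the fraction of transcripts called changed in at least one laboratory that received the same call in both) and may differ from the one underlying Table I; the call labels and gene names are hypothetical.

```python
def call_agreement(calls_a, calls_b, no_change="NC"):
    """Fraction of transcripts changed in at least one lab that have identical calls in both.

    Returns None when no transcript is called changed in either laboratory.
    """
    genes = [
        g for g in calls_a.keys() | calls_b.keys()
        if calls_a.get(g, no_change) != no_change
        or calls_b.get(g, no_change) != no_change
    ]
    if not genes:
        return None
    same = sum(
        1 for g in genes
        if calls_a.get(g, no_change) == calls_b.get(g, no_change)
    )
    return same / len(genes)

lab_a = {"g1": "up", "g2": "down", "g3": "up", "g4": "NC"}
lab_b = {"g1": "up", "g2": "down", "g3": "NC", "g4": "NC"}
agreement = call_agreement(lab_a, lab_b)
print(agreement)
```

Applying the same function separately to transcripts grouped by -fold change would show how agreement varies across -fold change ranges.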
In a further analysis, randomly picked combinations of two aerobic and two anaerobic array data sets were considered for genes that exhibited a 1.5- to 2-fold change in transcript level. Statistical analysis by SAM revealed that, at these low -fold changes, pairs of data sets originating from one laboratory agreed on average in only 42 and 47% of cases for laboratory A and laboratory B, respectively. When "mixed" pairs of data sets were considered, poorer consistency was observed because agreement dropped to 20% on average. This indicates that, at low -fold changes, some laboratory bias occurred. In a similar interlaboratory comparison of the triplicate data sets for each cultivation condition using SAM, 246 differentials were observed between laboratories for aerobic conditions and 90 differentials for anaerobic conditions. In both cases, at least two-thirds of these differences fell between 1.5- and 2-fold change. Because these interlaboratory differences increase with decreasing -fold change values, they highlight the caution with which such small changes should be treated.
Despite the problems outlined above, many genes with relatively low -fold changes were in good agreement between the two laboratories due to similar absolute transcript abundance. This is illustrated in Fig. 4, which compares data from the two laboratories for the transcripts encoding enzymes of the tricarboxylic acid cycle and branched-chain amino acid biosynthesis.

DISCUSSION
Intralaboratory Reproducibility of Transcriptome Analysis-Using the procedures for RNA extraction and microarray analysis recommended by the manufacturer of the equipment used in this study (7), the sum of the variations introduced by all handling steps was very low. More importantly, our results indicated that the direction and magnitude of these variations were not reproducible, implying that they can be eliminated by replication of experiments. These findings extend the conclusion of Baugh et al. (10) that a single round of amplification and labeling by the in vitro transcription step was the source of low, unbiased variation.
In contrast to sample handling, independent culture replication was found to introduce significant experimental variability. The present study involved the use of a standardized commercial system for DNA-microarray analysis, standardized protocols for chemostat cultivation, and a single S. cerevisiae strain. From triplicate cultures under these optimized conditions, 14% of the genome was expressed below the reliable detection limit of this technique, and analysis of the remaining 5483 genes showed an average experimental variation of 0.20-0.23 and 0.27-0.29 for aerobic and anaerobic chemostat cultures, respectively (Fig. 2).
FIG. 2. [...] 1 to 6383) on the x-axes were generated by ranking the genes by increasing average transcript level. A list of the 900 transcripts with the lowest average expression in all conditions can be found at www.cbs.dtu.dk/yeast/.
Because a total of 12 independent cultivations, run in two different laboratories, was taken into account in this study, the reported experimental variations can be taken as reliable indicators for the reproducibility of transcriptome analysis using chemostat cultures. The higher experimental variation that was observed in the anaerobic cultures as compared with the aerobic cultures may reflect the technical difficulties in maintaining "true" anaerobic conditions in laboratory fermenters (30). Because the biosynthetic requirements of yeasts for oxygen can be extremely small (30,36), minute leakages of oxygen into the cultures might already have a significant impact on gene expression.
The experimental variation for each transcript, derived from independent replicate experiments, enables the application of statistical algorithms to evaluate whether transcript levels have changed as a function of cultivation conditions or genetic interventions. This issue lies at the center of DNA microarray studies for metabolic engineering and functional analysis. We found our data from triplicate arrays to be most informative when using the statistical algorithm SAM in combination with a lower -fold change threshold of 2.0. For data sets with higher average experimental variation than found in the present study, fewer genes would meet the criterion established here when using SAM and the -fold change cutoff of 2.0 would be too low for reliably assigning biologically meaningful change calls.
The experimental variability of independent replicate cultures would lead to the identification of many false positives if culture replication were not performed. Previous data published from one of our laboratories also addressed the question of genome-wide transcriptional differences between anaerobically and aerobically grown cells (25). In that earlier study, only two arrays were used, one for the aerobic condition and one for the anaerobic condition. That data set was compared with the full comparison in the present study (encompassing six aerobic cultures and six anaerobic cultures), in both cases applying an arbitrary -fold change threshold of 2.0. If it is assumed that the six-by-six comparison provides the "true" transcriptional response of S. cerevisiae, 257 of the 818 change calls reported in the previous publication were "false positives." When the threshold for change calls was raised to 3.0, 65 of 259 change calls were identified as "false." As expected, an increase in the -fold change threshold is accompanied by an increase in the accuracy of the change calls, but at the cost of ignoring many genes that should have been called changed. These data graphically illustrate the need for experimental replication in DNA-microarray studies.
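The gain in accuracy at the higher threshold can be made explicit by converting the reported counts into fractions. The short calculation below uses only the numbers given in the paragraph above; the function and variable names are illustrative.

```python
def false_call_fraction(false_calls, total_calls):
    """Share of single-array change calls not confirmed by the replicated comparison."""
    return false_calls / total_calls

at_2_fold = false_call_fraction(257, 818)
at_3_fold = false_call_fraction(65, 259)

print(f"2.0-fold threshold: {at_2_fold:.1%} of calls false")
print(f"3.0-fold threshold: {at_3_fold:.1%} of calls false")
```

Raising the threshold from 2.0 to 3.0 thus lowers the false-call fraction from roughly 31% to roughly 25%, while the total number of change calls drops from 818 to 259.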
Interlaboratory Reproducibility of Transcriptome Analysis-At a time when formulation of hypotheses and experimental design are increasingly based on transcriptome data from other laboratories compiled in electronic databases (37), interlaboratory reproducibility of DNA-microarray data is a key issue. Both in terms of absolute transcript levels and in terms of change calls in the aerobic/anaerobic comparison, a very good interlaboratory consistency of the results was found in our two laboratories.
Although considerable effort was invested in standardizing the yeast strain, cultivation conditions, and analytical procedures, the presence of a small subset of genes that exhibited laboratory-dependent transcript levels indicated that some parameters were not perfectly reproduced. The (mostly unknown) roles of the proteins encoded by these transcripts did not enable us to identify a plausible explanation for these differences. Although our two laboratories obtained the CEN.PK113-7D strain from the same culture collection, we cannot exclude the possibility that minor mutations occurred during propagation and storage. Other factors that might, at least in theory, contribute to these differences include non-obvious differences in medium composition, for example as a result of obtaining chemicals with marginally different purities from different suppliers, or differences in the equipment for chemostat cultivation. A caveat that arises from our experiments is that, even after rigorous standardization of experimental conditions, transcriptome analysis in different laboratories may lead to a low but significant number of laboratory-specific results. The occurrence of (apparent) laboratory bias in the change calls became more prominent at low -fold changes between the aerobic and anaerobic transcript levels. However, employing SAM in combination with a lower threshold of 2.0 for change calls, a value that is often employed intuitively in microarray analyses (38-40), resulted in over 95% consistency in the change calls between the aerobic and anaerobic cultures.
To further explore the consistency of data obtained in different laboratories, our results were compared with two other array studies performed via similar (but not identical) experimental approaches. To this end, the ten most strongly induced genes during aerobic growth (Table II) and anaerobic growth (Table III) are listed, accompanied by data found in the literature for genome-wide aerobic/anaerobic comparisons. ter Linde et al. (25) used the same yeast strain as used in the present study and identical chemostat cultivation conditions. However, they used a different version of the GeneChip microarrays, applied only one array per cultivation condition, and used a different sampling protocol. Kwast et al. (41) compared aerobic and anaerobic transcript profiles on gene filters, with a different yeast strain grown on galactose in batch cultures.
Table footnotes: d, average and standard deviation measurements for three independent chemostats; e, opp, change in the opposite direction from data reported here; f, ND, genes whose expression was not reliably detectable; g, NS, genes whose change was deemed not significant by the statistical criterion as published (41).
The results obtained in our two laboratories exhibit similar rankings when transcripts were ordered by magnitude of their -fold change. Furthermore, the rankings are relatively well conserved in the study of ter Linde et al. (25) and for the genes with higher transcript abundance under anaerobic conditions in the study of Kwast et al. (41). However, in the comparison of the genes that are higher during aerobic growth, there is a large divergence between the data reported here and that of Kwast et al. (41). This is most likely to be due to a combination of differences in strain background and media composition. This observation indicates that physiological interpretation of transcriptome data should always take into account the context of experimental design.
Chemostat Cultivation and Transcriptome Analysis-To compare the reproducibility of transcriptome analysis in chemostat cultures with that in shake-flask cultures, we analyzed data from two laboratories that used the same commercial set-up for DNA-microarray analysis and made extensive datasets on independent replicate shake-flask cultures available on the World Wide Web (43-45). Following the same procedure as illustrated in Fig. 2, again eliminating the 900 transcripts with the lowest expression level, the experimental variations for these shake-flask-based datasets were calculated at 0.45 and 0.32, respectively (calculations not shown). These values are higher than those found in the chemostat studies, thus supporting the notion that chemostat cultivation offers a more rigorous control of cultivation conditions than shake-flask cultivation. This enhanced reproducibility, in combination with the possibility to study the impact of genetic modifications and environmental conditions at a fixed specific growth rate, makes chemostat cultivation a useful tool for quantitative transcriptome analysis in functional genomics and metabolic engineering. The comprehensive dataset for aerobic and anaerobic chemostat cultures that has been compiled in the present study and is available via the World Wide Web may serve as a useful set of reference data for other yeast researchers who intend to work with chemostat cultures of the CEN.PK113-7D strain.