J. Biol. Chem., Vol. 280, Issue 15, 15084-15096, April 15, 2005
Global Gene Expression Profiling in Escherichia coli K12
EFFECTS OF OXYGEN AVAILABILITY AND ArcA*
| ABSTRACT |
|---|
|
|
|---|
| INTRODUCTION |
|---|
|
|
|---|
Expression of E. coli genes involved in oxygen utilization is down-regulated as oxygen is depleted, and in a reciprocal fashion, expression of genes encoding alternative anaerobic electron transport pathways or genes needed for fermentation is switched on. Many of these metabolic transitions are controlled at the transcriptional level by the activities of the fumarate nitrate reductase global regulatory protein FNR and/or the two-component ArcAB regulatory system (4, 5). The role of the FNR protein in the global control of E. coli gene expression has been profiled in response to anaerobiosis (1). Based on this analysis of whole genome transcription data, it was estimated that the expression of over one-third of the genes expressed during growth under aerobic conditions is altered when E. coli cells transition to an anaerobic growth state and that the expression of half of these genes is modulated either directly or indirectly by FNR. Thus, the fnr gene family was estimated to be 10-fold larger than the 70 members previously recognized as members of the fnr gene regulatory network (6, 7).
The ArcAB (aerobic respiratory control) two-component regulatory system is recognized as a second global regulator of anaerobic gene regulation (3, 6, 8). The ArcAB system is composed of a classical OmpR-like receiver regulator, ArcA, and a membrane-associated sensor transmitter protein, ArcB (6). Together, these components have been shown to regulate expression of oxygen-requiring pathways, including the tricarboxylic acid cycle (e.g. sdhCDAB, icd, fumA, mdh, gltA, acnA, and acnB), and the aerobic cytochrome oxidase complexes (9–18). ArcAB is also known to be required for proper expression of certain catabolic genes for pyruvate utilization and sugar fermentation (19–21).
In this genome-based study, we have identified additional E. coli genes under oxygen control that are differentially expressed in response to the ArcA global regulatory protein. This was accomplished by the use of DNA microarrays to analyze gene expression profiles in E. coli cells cultured at steady-state growth rates under aerobic (+O2) or anaerobic (-O2) growth conditions and in cells cultured under anaerobic growth conditions in the presence (-O2, +ArcA) or absence (-O2, -ArcA) of the ArcA protein, i.e. in otherwise isogenic arcA+ and arcA- strains. These experiments show that about one-half of the genes whose expression levels are affected by aerobic to anaerobic transitions are also affected by the ArcA protein. Thus, the number of E. coli genes differentially regulated by the ArcA protein is much larger than the 30 (5) or 100 (22) genes/operons previously recognized. The results of the gene expression profiling experiments further show that as many as two-thirds of the genes whose expression levels are affected by the ArcA protein are also affected by the FNR protein (1).
| MATERIALS AND METHODS |
|---|
|
|
|---|
[α-33P]dCTP (2–3000 Ci/mmol) was obtained from PerkinElmer Life Sciences. DNA filter arrays (Panorama E. coli gene arrays) were obtained from Sigma. SYBR Gold was purchased from Molecular Probes, Inc. All other chemicals were obtained from Sigma. All reagents and baked glassware used in RNA manipulations were treated with diethyl pyrocarbonate prior to their use.
Bacterial Strains and Growth Conditions. E. coli strains MC4100 (F- araD139 Δ(argF-lac)U169 rpsL150 relA1 flb-5301 deoC1 ptsF25 rbsR) (23) and PC35 (MC4100 arcA::kan) (15) were used in this study. Cells were grown in MOPS medium (24) containing 40 mM glucose. Aerobic cultures were grown as described previously (1) in 125-ml Erlenmeyer flasks with constant aeration. Anaerobic cultures were grown in 15-ml anaerobic tubes fitted with butyl rubber stoppers (15). The same medium was made anaerobic by flushing with O2-free N2 gas for 20 min and then dispensed anaerobically into N2-flushed tubes. Cultures of the indicated strain were inoculated from overnight cultures grown under identical conditions (15). Cells were grown to A600 = 0.5–0.6 (mid-exponential growth phase) and harvested as described previously (1, 25).
Total RNA Isolation, cDNA Synthesis, and Target Labeling Conditions. Total RNA was isolated from 10-ml cultures; cDNA was synthesized and labeled with [α-33P]dCTP; and filters were hybridized exactly as described by Hung et al. (25). Stripping and reusing filters four times as described here results in a <3% increase in variance (26).
Data Acquisition. The commercial software package DNA Array-Vision obtained from Research Imaging Inc. was used to grid the 16-bit image file obtained from a PhosphorImager, to record the pixel density of each of the 18,432 addresses on each filter, and to perform the background subtractions. 8580 of the addresses on each filter were spotted with duplicate copies of each of the 4290 E. coli open reading frames (ORFs). The remaining 9852 empty addresses were used for background measurements. Because the backgrounds were constant, a global average background measurement was subtracted from each experimental measurement, although local background calculations are possible.
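The global background subtraction described above can be sketched in a few lines. This is only an illustration, not the DNA Array-Vision procedure: the function name and the toy intensities are hypothetical, and flooring negative values at zero is an assumption based on the statement in the statistical methods that signals below background are recorded as zero.

```python
from statistics import mean

def background_subtract(spot_pairs, empty_addresses):
    """Subtract one global average background from duplicate spot intensities.

    spot_pairs: dict mapping an ORF name to its two spot intensities.
    empty_addresses: pixel densities measured at the empty addresses.
    Values that fall below the background are floored at zero (assumption).
    """
    background = mean(empty_addresses)
    return {
        orf: [max(value - background, 0.0) for value in pair]
        for orf, pair in spot_pairs.items()
    }

# Address bookkeeping taken from the text: 4290 ORFs spotted in duplicate.
TOTAL_ADDRESSES = 18432
SPOTTED = 2 * 4290
EMPTY = TOTAL_ADDRESSES - SPOTTED

# Hypothetical toy intensities, for illustration only.
spots = {"geneA": (150.0, 170.0), "geneB": (90.0, 110.0)}
empties = [95.0, 100.0, 105.0]
corrected = background_subtract(spots, empties)
```

The address counts reproduce the figures given in the text (8580 spotted and 9852 empty addresses per filter).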
Experimental Design. The experiments described here (Fig. 1) were performed at the same time as our previously reported experiments profiling gene expression levels in the presence or absence of oxygen and FNR (1). The data for strain MC4100 (ArcA+) grown aerobically (Experiment 1, Filters 1 and 2) and anaerobically (Experiment 2, Filters 3 and 4) have been reported by Salmon et al. (1). For Experiment 3, Filters 5 and 6 were hybridized with random hexamer-generated 33P-labeled cDNA fragments complementary to each of three independently prepared RNA preparations (RNA 25–27) obtained from three individual cultures of strain PC35 (arcA-) grown under anaerobic conditions. These three 33P-labeled cDNA target preparations were pooled prior to hybridization to the full-length ORF probes on the filters (Experiment 3, Replicate 1, Filters 5 and 6). Following PhosphorImager analysis, these filters were stripped and again hybridized with pooled 33P-labeled cDNA target fragments complementary to each of another three independently prepared RNA preparations (RNA 28–30) from the same strain (PC35; Experiment 3, Replicate 2). This procedure was repeated one more time with Filters 5 and 6 with yet another independently prepared pool of cDNA targets (Experiment 3, Replicate 3; RNA 31–33). The data for the fourth replicate of this experiment were lost.
Statistical Analyses. Data processing and statistical methods implemented in the Cyber-T software used for the analysis and interpretation of the data obtained from the DNA microarray experiments described in this study were the same as those described previously by Salmon et al. (1). For each target signal, a background subtracted estimate of the expression level was obtained and scaled to total counts on the membrane by dividing each individual gene expression value by the total of all target signals on the membrane. Thus, each normalized gene level is expressed as a fraction of the total mRNA hybridized to each DNA array. For any given measurement, a value greater than zero (indicating an expression level) or a zero (indicating an expression level lower than background) was obtained. Only those genes exhibiting an expression level greater than zero in all replicates were used for statistical analysis. These gene expression level measurements were analyzed by a regularized t test based on a Bayesian statistical framework (25–29). For analysis of the data reported here, we ranked the mean gene expression levels of the replicate experiments in ascending order, used a sliding window of 101 genes, and assigned the average S.D. of the 50 genes ranked below and above each gene as the Bayesian S.D. for that gene. The p values for each gene measurement based on a regularized t test with a confidence value of 10 are reported in the Supplemental Material. A comprehensive discussion of the use of a regularized t test and the modifications applicable to the analysis of DNA microarray data of the type presented here is described in detail elsewhere (26).
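The normalization and sliding-window steps described above can be sketched as follows. This is not the Cyber-T code: the function names are hypothetical, windows are simply truncated at the ends of the ranking (an assumption), and the variance formula is a commonly cited form of the regularized estimate that is assumed here rather than taken from this text.

```python
from statistics import mean

def normalize_to_total(values):
    """Express each background-subtracted signal as a fraction of the filter total."""
    total = sum(values)
    return [v / total for v in values]

def background_sd(means, sds, half_window=50):
    """Average the S.D. of the genes ranked just below and above each gene.

    Genes are ranked by mean expression in ascending order. With the default
    half_window the window spans 101 genes including the gene itself.
    Requires at least two genes.
    """
    order = sorted(range(len(means)), key=lambda i: means[i])
    result = [0.0] * len(means)
    for rank, gene in enumerate(order):
        lo = max(0, rank - half_window)
        hi = min(len(order), rank + half_window + 1)
        neighbours = [sds[order[r]] for r in range(lo, hi) if r != rank]
        result[gene] = mean(neighbours)
    return result

def regularized_variance(sample_sd, n, prior_sd, confidence=10):
    """Blend the observed variance with the neighbourhood (prior) variance.

    Assumed form: (v0 * s0^2 + (n - 1) * s^2) / (v0 + n - 2),
    where v0 is the confidence value and s0 the Bayesian S.D.
    """
    numerator = confidence * prior_sd ** 2 + (n - 1) * sample_sd ** 2
    return numerator / (confidence + n - 2)
```

With a confidence value of 10 and only a few replicates, the neighbourhood estimate dominates, which is the point of the regularization: a gene with an accidentally small sample S.D. does not receive an inflated t statistic.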
Gene measurements containing zero expression values in one or more replicates were set aside. Among this set of genes, those with zero expression values for all replicates in one experiment and all values greater than zero for all measurements of another experiment were identified. Because these gene measurements could not be analyzed with a t test, the significance of these results was evaluated by ranking these genes in ascending order according to their coefficients of variance of the four greater than zero measurements of each experiment.
Cyber-T employs a mixture model-based method described by Allison et al. (30) for the computation of the global false positive and false negative levels inherent in a DNA microarray experiment (25, 26). With this method, described by Hung et al. (25), we estimated the rates of false positives and false negatives as well as true positives and true negatives at any given p value threshold. In other words, we obtained a posterior probability of differential expression PPDE(p) value for each gene measurement and a PPDE(<p) value at any given p value threshold based on the experiment-wide global false positive level and the p value exhibited by that gene (25, 26). In most instances, PPDE(<p) values are reported below and in Tables I–VIII. However, both PPDE(p) and PPDE(<p) values are given for each gene in the Supplemental Material.
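To make the distinction between PPDE(p) and PPDE(<p) concrete, the sketch below assumes the simplest mixture of this kind: p values of unchanged genes are uniform on (0, 1), and p values of differentially expressed genes follow a single Beta(r, 1) distribution. The parameters lambda0 (fraction of unchanged genes) and r are taken as already fitted; the function names and the numbers in the usage lines are hypothetical and are not values from this study.

```python
def ppde_at(p, lambda0, r):
    """Posterior probability of differential expression at exactly p."""
    alt = (1.0 - lambda0) * r * p ** (r - 1.0)  # Beta(r, 1) density, weighted
    null = lambda0 * 1.0                        # uniform density, weighted
    return alt / (alt + null)

def ppde_below(p, lambda0, r):
    """Posterior probability of differential expression among all p values below p."""
    alt = (1.0 - lambda0) * p ** r  # Beta(r, 1) cumulative distribution, weighted
    null = lambda0 * p              # uniform cumulative distribution, weighted
    return alt / (alt + null)

# Hypothetical fitted parameters, for illustration only.
point_value = ppde_at(0.25, lambda0=0.5, r=0.5)
cumulative_value = ppde_below(0.25, lambda0=0.5, r=0.5)
```

PPDE(<p) describes the whole group of genes passing a threshold, so it is the quantity to use when estimating how many false positives a list of genes contains.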
Data Accession. All raw and processed data for the experimental results reported here are provided in tabular format as Excel files in the Supplemental Material.
| RESULTS AND DISCUSSION |
|---|
|
|
|---|
|
Differential Gene Expression in the Absence of Oxygen and in the Presence and Absence of the ArcA Global Regulatory Protein
A comparison of the gene expression levels between cells grown in the absence of oxygen and in the presence or absence of ArcA revealed 2264 genes that exhibited expression levels above the background for all replicates of Experiments 2 and 3 (-O2, +ArcA versus -O2, -ArcA) (Fig. 1). Again, about one-half of the gene expression levels were modulated by this treatment condition. An examination of the distribution of p values suggested that the expression levels of 1243 genes with p values <0.05 were modulated either directly or indirectly by ArcA during growth transition from aerobic to anaerobic conditions. Because the PPDE(<p) value for this group of genes is 0.97, 37 false positives are expected. The individual p values and PPDE values, as well as additional statistical data, for all genes are provided in the Supplemental Material.
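The expected number of false positives follows directly from the PPDE(<p) value of the group. A minimal sketch of that arithmetic, using the counts reported above (the function name is hypothetical):

```python
def expected_false_positives(n_genes, ppde_below_threshold):
    """Genes passing the threshold times the probability that each is a false call."""
    return n_genes * (1.0 - ppde_below_threshold)

n_measured = 2264      # genes above background in all replicates
n_significant = 1243   # genes with p < 0.05
false_positives = expected_false_positives(n_significant, 0.97)
fraction_modulated = n_significant / n_measured
```

Rounded to whole genes, this gives the 37 false positives quoted in the text, and the modulated fraction is about 0.55, i.e. roughly one-half of the measured genes.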
Identification of Differential Gene Expression Patterns Resulting from Two-variable Perturbation Experiments
To identify the global changes and adjustments of gene expression patterns that facilitate a transition from aerobic to anaerobic growth conditions and to determine the effects of genotype on these gene expression patterns, we analyzed E. coli gene expression profiles obtained from cells cultured under aerobic (+O2) or anaerobic (-O2) growth conditions and under anaerobic growth conditions in the presence (-O2, +ArcA) or absence (-O2, -ArcA) of ArcA, the global regulatory protein for anaerobic metabolism. Because ArcA is presumed to be inactive under aerobic conditions (5, 6, 31), we did not perform experiments comparing arcA genotypes under aerobic conditions.
Only two general regulatory patterns can be observed when two experimental conditions are compared, e.g. growth in the presence or absence of oxygen. However, when three conditions are compared, at least eight general regulatory patterns are expected. The data in Fig. 3 diagram the eight basic regulatory patterns that could be observed among three experiments conducted in the presence and absence of oxygen in an arcA+ strain and in the absence of oxygen in an arcA- strain. For simplicity, only three expression levels for each of these three experimental conditions were assumed: low, medium, and high.
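The count of eight patterns can be checked by enumeration: each of the two comparisons (aerobic versus anaerobic, and arcA+ versus arcA- under anaerobic conditions) can show a decrease, an increase, or no change, and the combination with no change in either comparison is not a regulatory pattern. The sketch below is illustrative only; the Roman numerals follow the expression pattern headings used later in this paper, and the variable names are hypothetical.

```python
from itertools import product

CHANGES = ("decreased", "increased", "no change")

def regulatory_patterns():
    """All (anaerobiosis effect, arcA mutant effect) pairs with at least one change."""
    patterns = []
    for oxygen_effect, arca_effect in product(CHANGES, repeat=2):
        if oxygen_effect == "no change" and arca_effect == "no change":
            continue
        patterns.append((oxygen_effect, arca_effect))
    return patterns

# (change during anaerobiosis, change in the arcA mutant strain) -> pattern
PATTERN_NAMES = {
    ("decreased", "increased"): "I",
    ("increased", "decreased"): "II",
    ("decreased", "no change"): "III",
    ("increased", "no change"): "IV",
    ("increased", "increased"): "V",
    ("no change", "increased"): "VI",
    ("decreased", "decreased"): "VII",
    ("no change", "decreased"): "VIII",
}
```

Three outcomes for each of two comparisons give nine combinations, and removing the unchanged case leaves the eight patterns.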
To identify those genes differentially expressed at a high level of confidence under the treatment conditions of Experiments 1 and 2 but expressed at the same or similar levels under the treatment conditions of Experiments 2 and 3 (patterns III and IV) (Fig. 3), the 500 genes of Experiments 1 and 2 with the highest probability for differential expression values were compared with the 500 genes of Experiments 2 and 3 with the lowest probability for differential expression values. This comparison identified 40 genes that were present in both lists, i.e. genes whose regulatory patterns fulfill this criterion. Likewise, to identify those genes differentially expressed under the treatment conditions of Experiments 2 and 3 but expressed at the same or similar levels under the treatment conditions of Experiments 1 and 2 (patterns VI and VIII) (Fig. 3), the 500 genes of Experiments 2 and 3 with the highest probability for differential expression values were compared with the 500 genes of Experiments 1 and 2 with the lowest probability for differential expression values. This comparison identified 35 genes that were present in both lists. Together with the 100 genes differentially expressed in both comparisons (patterns I, II, V, and VII), these gene lists were combined into a single list of 175 genes differentially expressed under at least one treatment condition. All of the differentially expressed genes of this list exhibited p values <0.00013 and a global confidence based on the experiment-wide false positive level of 99% (PPDE(<p) = 0.99). They constitute the "gold standard" gene set for the following analyses.
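The list comparison described above amounts to intersecting the top of one ranking with the bottom of another. A minimal sketch, assuming each comparison is available as a mapping from gene name to its probability of differential expression (the function names and the toy values are hypothetical):

```python
def rank_by_ppde(ppde_by_gene):
    """Gene names ordered from highest to lowest probability of differential expression."""
    return sorted(ppde_by_gene, key=ppde_by_gene.get, reverse=True)

def changed_in_first_only(ppde_first, ppde_second, list_size=500):
    """Genes among the most changed in one comparison and the least changed in the other."""
    top_first = set(rank_by_ppde(ppde_first)[:list_size])
    bottom_second = set(rank_by_ppde(ppde_second)[-list_size:])
    return top_first & bottom_second

# Toy data, for illustration only: comparison of Experiments 1 and 2, then 2 and 3.
ppde_12 = {"a": 0.99, "b": 0.95, "c": 0.10, "d": 0.05}
ppde_23 = {"a": 0.98, "b": 0.02, "c": 0.97, "d": 0.01}
oxygen_only = changed_in_first_only(ppde_12, ppde_23, list_size=2)
arca_only = changed_in_first_only(ppde_23, ppde_12, list_size=2)
```

In the toy data, gene "b" responds to oxygen but not to ArcA (the situation of patterns III and IV), and gene "c" responds to ArcA but not to oxygen (patterns VI and VIII).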
Hierarchical Clustering and Principal Component Analysis
GeneSpring™ software was used to empirically determine parameters for hierarchical clustering of these 175 genes into the eight patterns of Fig. 3 as discussed by Salmon et al. (1) and shown in Fig. 4. Interestingly, 83 of these ArcA-regulated genes are also differentially regulated directly or indirectly by FNR (patterns I, II, and V–VIII) (1). As an independent test to corroborate the accuracy of this supervised hierarchical clustering method, we used principal component analysis to cluster and visualize the same set of 175 genes (14). The principal component analysis clustering results shown in Fig. 5 illustrate that this unsupervised method produced the same results as the supervised hierarchical clustering method.
15 ArcA-regulated promoters, is 5'-WGTTAATTAW-3' (where W is A or T) (31). Liu and De Wulf (22) used a weight matrix and a subset of 10 ArcA-footprinted promoter regions to define a slightly different consensus sequence of 5'-GTTAATTAAATGTTA-3'. This sequence resembles the previous 10-bp consensus sequence; however, it is extended by 5 residues at the 3'-end, and the first nucleotide of the original consensus sequence (5'-(A/T)) turned out to be poorly conserved and is not included in their motif (22). For the analyses reported here, a set of 26 known ArcA-binding sites in E. coli, including the 15 sites reviewed by Lynch and Lin (31) plus three newly footprinted ArcA-binding sequences,2 was compiled (see the Supplemental Material). Analysis of these sequences with MEME Version 3.0 (32, 33) identified a partially degenerate 15-bp motif. A weight matrix was generated from the motif found by MEME. The E. coli K12 genome was then scanned for sequences on either strand that had a weight matrix score exceeding the threshold and that were within 300 bp of an ORF origin, as identified by Regulon_DB (34). A total of 386 such sequences were located.
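The weight matrix scan described above can be illustrated with a short sketch. It is not the procedure used with MEME: the log-odds scoring against a uniform background, the pseudocount, the threshold, and the four example sites (written out from the 10-bp consensus 5'-WGTTAATTAW-3') are all assumptions made for illustration, and the filter for sites within 300 bp of an ORF origin is omitted.

```python
import math

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def build_weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Log-odds weight matrix from equal-length aligned binding sites."""
    total = len(sites) + 4 * pseudocount
    matrix = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        matrix.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return matrix

def score(matrix, window):
    """Sum of the per-position weights for one candidate site."""
    return sum(col[base] for col, base in zip(matrix, window))

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def scan(matrix, sequence, threshold):
    """Report (strand, start, score) for every window scoring above the threshold.

    Start positions on the "-" strand are given in reverse-complement coordinates.
    """
    width = len(matrix)
    hits = []
    for strand, seq in (("+", sequence), ("-", reverse_complement(sequence))):
        for start in range(len(seq) - width + 1):
            s = score(matrix, seq[start:start + width])
            if s > threshold:
                hits.append((strand, start, s))
    return hits

# Hypothetical aligned sites spelled out from the WGTTAATTAW consensus.
example_sites = ["AGTTAATTAA", "TGTTAATTAT", "AGTTAATTAT", "TGTTAATTAA"]
matrix = build_weight_matrix(example_sites)
hits = scan(matrix, "CCCCAGTTAATTAACCCC", threshold=11.0)
```

Because the core of the consensus is nearly palindromic, a lower threshold would also report a weaker match on the opposite strand, which is one reason both strands are scanned and the threshold matters.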
When ArcA acts as an activator of gene expression, it most often binds to upstream sites centered from 60 to 120 base pairs before the transcriptional start site of the affected gene or operon. When it acts as a repressor of gene expression, it binds to other sites often located near the transcriptional start site of the affected gene or operon (31). Of the 42 genes down-regulated in the presence of ArcA (patterns I, V, and VI) (Fig. 3), 12 contain a documented ArcA-binding site or a predicted ArcA-binding site at or near the transcriptional start site using the above MEME/weight matrix method (Tables I, V, and VI). Of the 93 genes up-regulated in the presence of ArcA (patterns II, VII, and VIII) (Figs. 4 and 5), 14 contain an upstream documented or predicted ArcA-binding site (Tables II, VII, and VIII). Because the expression levels of the 40 genes of patterns III and IV were not affected by the presence or absence of ArcA, they are not expected to possess binding sites for this regulatory protein. However, five of these genes are predicted to possess a putative ArcA-binding site (Tables III and IV). Of these, three genes, cydA, nuoG, and nuoF, are known to be ArcA-regulated; however, the expression data are not consistent with previously published data, and this is likely due to paralog issues within the E. coli genome. Thus, the statistical and clustering methods described here produced results consistent with biological expectations.
Functional Classes of Genes Affected by Oxygen Availability and ArcA
The following discussion is limited to the 175 genes (our gold standard set) of regulatory patterns I–VIII (Fig. 4), although ArcA control of many other genes may be deduced from the information supplied in the Supplemental Material. As in our previous study (1), they represent many genes known to be oxygen-controlled and another larger set for which no previous information is available. These genes are listed in Tables I–VIII and represent genes involved in a large number of processes, including small molecule biosynthesis, macromolecular synthesis, and aerobic/anaerobic respiration and fermentation. Regardless of their metabolic role, these genes are discussed below in the context of their expression patterns (Fig. 3).
Expression Pattern I: Decreased Expression during Anaerobiosis and Increased Expression in an ArcA- Strain. Among the 175 genes displayed in the clustering procedures described above, 37 showed decreased expression under anaerobic conditions due to regulation by ArcA (Table I). Of these 37 genes, 10 have been reported to be directly regulated by ArcA (6), and 27 are newly discovered genes that are regulated either directly or indirectly by this global regulatory protein. In addition, 23 of the genes clustered into pattern I were also identified as being down-regulated by the FNR protein in our previous study (1). Previously described ArcA-regulated genes will be discussed first, followed by a discussion of the newly discovered ArcA-down-regulated genes.
Seven genes of the tricarboxylic acid cycle clustered into pattern I: icdA, sdhAB, lpdA, mdh, sucD, and gltA. Each of these seven genes has been shown previously to be anaerobically repressed by the ArcA protein (5, 6, 9, 10, 12, 13, 31, 35, 36). Regulation of lpdA by FNR was also observed in our previous study (1). A search for putative ArcA-binding sites using our customized MEME/weight matrix method (see "Materials and Methods") identified one or more sites upstream of each of these genes (Table I).
The cyoA gene is the first member of the cyoABCDE operon, which encodes all of the subunits of the cytochrome o ubiquinol oxidase. The cyoA gene was expressed 10-fold higher when cells were grown aerobically and 23-fold higher when cells were grown anaerobically in the ArcA-deficient strain (Table I). A previous study by our laboratory using a cyoA::lacZ fusion in the same ArcA+ and ArcA- isogenic strains used in this work showed the same regulatory pattern (16). A site similar to the ArcA consensus sequence has been identified upstream of the cyoA promoter,3 and cyoA was also shown to be subject to regulation by FNR in our previous study (1), but this is likely indirect. Our MEME/weight matrix method identified four putative ArcA-binding sites (Table I) upstream of the cyoA gene.
The nuoB and nuoE genes, which belong to the nuoA–N operon, encode NdhI (NADH dehydrogenase I), a membrane-associated, multisubunit, proton-translocating enzyme similar to complex I of eukaryotic mitochondria (37). Expression of both of these genes was lower under anaerobic conditions and elevated in the arcA mutant (Table I). A previous study using nuo::lacZ fusions established that nuo expression is subject to ArcA-mediated anaerobic repression (38). Two putative ArcA-binding sites were identified 140 and 190 bp upstream of the nuoA gene using our MEME/weight matrix method (Table I). The nuoE gene also appeared to be subject to FNR regulation in our previous work (1), but the effect of FNR may be indirect as a consequence of its role in regulating ArcA expression (39).
The remaining genes in this group have not been shown previously to be subject to ArcA regulation. These newly discovered genes fall into the same functional classes as the genes regulated by the leucine-responsive regulatory protein Lrp under aerobic conditions (25) and FNR under anaerobic conditions (1). These functional classes include genes for small molecule biosynthesis and transport and macromolecule biosynthesis. More interestingly, of the remaining 27 genes of this expression group, 20 were also found to be regulated by FNR under anaerobic conditions (1).
12 genes of this cluster belong to the small molecule metabolism and transport groups. Nine of these genes were also found to be repressed in anaerobiosis due to regulation by FNR (1). These genes are crr (phosphocarrier protein for glucose transport); gpmA (phosphoglyceromutase); gatY (D-tagatose-1,6-bisphosphate aldolase); talA (transaldolase A); trpB (tryptophan synthase); speD and speE (biosynthesis of spermidine); prlA (secY, protein translocator of the secYEG operon); and ompA, which encodes an outer membrane protein. The remaining three genes are rbsC and rbsD (ribose high affinity transport system) and yjcU (alsE, allulose-6-phosphate 3-epimerase). Putative ArcA-binding sites were identified using the MEME/weight matrix for two of these: gatY and trpB.
11 of the remaining genes of this expression group belong to the macromolecule synthesis class. Eight of these were also observed to be regulated by FNR (1). These are rpsA, rpsT, rpsJ, rplS, rplT, and rplM (ribosomal proteins); tufA (elongation factor Tu); and oppA (oligopeptide permease). The remaining three genes are rplX (ribosomal protein), pal (essential lipoprotein), and atpG (ATP synthase). Putative ArcA-binding sites were identified using the MEME/weight matrix for two of these: oppA and atpG.
The functions of the remaining four genes in this list, ycdC, yajG, yceD, and yfiA, remain to be characterized. Three of these four genes, yajG, yceD, and yfiA, were also observed to be regulated by FNR in anaerobiosis (1).
Recently, Liu and De Wulf (22) identified 234 ORFs as being repressed by ArcA under anaerobic conditions in a microarray-based study. In our gold standard set, we identified a total of 42 genes as being up-regulated in an arcA mutant (patterns I, V, and VI) or 37 genes in pattern I. Only three genes, gltA, icd, and mdh, are conserved between the two reported data sets. However, our clustering set of 175 genes is highly restricted, with a strict PPDE(<p) cutoff level of 0.997, and eliminates false positives and other genes for which the data are of lower statistical significance.
Expression Pattern II: Increased Expression during Anaerobiosis and Decreased Expression in an ArcA- Strain. Transcription of the 57 genes of expression pattern II (Table II) was both induced in the absence of oxygen and positively regulated by ArcA. Moreover, of these 57 genes, 34 were also observed to be positively regulated by FNR in anaerobiosis (1). 19 of these genes are members of the small molecule metabolism and transport group. Among the genes for metabolism, eight were also observed to be positively regulated by FNR in anaerobiosis. These are pyrD (dihydro-orotate dehydrogenase), glnD (uridylyltransferase), mobB (molybdenum cofactor biosynthesis), speC (ornithine decarboxylase), narY (cryptic nitrate reductase subunit), glnE (glutamine synthetase/adenylyltransferase), tdh (threonine dehydrogenase), and tynA (tyramine oxidase). One of these genes, glnD, is predicted to have a putative ArcA-binding site (Table II).
The gadA and gadB genes, encoding two highly homologous glutamate decarboxylases, also clustered into this group. In agreement with our previous study (1), lacZ fusion studies have shown that their anaerobic induction is due solely to the presence of the arcA gene product,2 but only the gadA gene has a predicted ArcA-binding site upstream of its start codon. The gadX and gadW genes also clustered into pattern II. These two genes encode transcription factors that control the expression of the gadA and gadBC operons (40–43). A putative ArcA-binding site(s) was identified upstream of each of these two genes (Table II). Two other genes, rhaA (L-rhamnose isomerase) and glgC (glucose-1-phosphate adenylyltransferase), have not been shown previously to be regulated by ArcA.
Six genes of this expression pattern belong to the small molecule transport functional class. Four of these genes were shown previously to be subject to FNR-mediated regulation (1). These genes are yabM (setA, glucose/lactose efflux transporter), yadQ (clcA, mammalian chloride channel protein homolog), nanT (sialic acid transporter), and uraA (transport of uracil). The remaining two genes belonging to this group are nfrA (an outer membrane protein) and pnuC (nicotinamide mononucleotide transporter).
As in our previous study (1), several genes of this expression pattern belonging to the macromolecular synthesis class are for DNA repair: recB and recC (subunits of the RecBCD enzyme complex), dinG (encodes a LexA-regulated DNA repair enzyme), and sbcC (co-suppressor of recBC mutations). Of the remaining five genes belonging to this functional group, only one was also observed to be regulated by FNR: glgA (glycogen synthesis). The other four genes are degQ (hhoA, periplasmic serine endopeptidase); cdh (CDP-diglyceride hydrolase); and two hydrogenase-encoding genes, hycD (hydrogenase-3 subunit) and hyaB (hydrogenase-1 subunit). Putative ArcA-binding sites were identified upstream of recB and hycD (Table II).
Of the remaining genes clustered into this expression pattern, two genes, mrcA (penicillin-binding protein 1A) and rarD (involved in chloramphenicol resistance), were also observed to be regulated by FNR (1). Two other genes, organized in an operon encoding a putative alternative cytochrome oxidase, appCB (cbdAB), were not observed previously to be regulated by FNR (1), and xylR (regulatory gene for the xylose operon) also clustered into this expression pattern. The 23 remaining members of this expression pattern are currently uncharacterized, 12 of which were also previously observed to be regulated by FNR (1). A putative ArcA-binding site was identified upstream of rarD and xylR and upstream of 2 of the 23 previously uncharacterized genes (ydbA and yhjE).
Only one gene in this expression pattern, glgC, was also identified in the study by Liu and De Wulf (22); however, their results indicated that glgC is repressed by ArcA (2.6-fold). Liu and De Wulf identified a total of 138 genes as being activated in the presence of ArcA. Again, in our gold standard set, we identified a total of 93 genes as being down-regulated in an arcA mutant (patterns II, VII, and VIII) or 57 genes in pattern II. However, our clustering set of 175 genes is highly restricted, with a strict PPDE(<p) cutoff level of 0.997.
Expression Pattern III: Decreased Expression during Anaerobiosis and No Change in an ArcA- Strain. 34 genes clustered into expression pattern III. Of these, 23 clustered into the same expression pattern in our previous study (1), indicating that the expression of these genes, although decreased during anaerobiosis, is not regulated by either ArcA or FNR.
Two members of the nuoA–N operon, nuoG and nuoF, which encode NdhI, a membrane-associated, multisubunit, proton-translocating enzyme similar to complex I of eukaryotic mitochondria (37), clustered into pattern III. Expression of the nuoE gene (Table III) was 3.8-fold lower under anaerobic conditions and was elevated 8.8-fold in the ArcA mutant (see Table IX). A previous study using nuo-lacZ fusions established that nuo expression is subject to ArcA-mediated anaerobic repression and NarL nitrate-mediated anaerobic activation (38). Two other members of this operon clustered into pattern I (nuoB and nuoE) (Table I).
The cydA gene (part of the cydAB operon) encodes the high affinity terminal oxidase of the oxygen respiratory chain, cytochrome d oxidase. The data obtained here show that cydA was repressed 2-fold during anaerobic growth, but was unchanged in the ArcA-deficient strain (Table III). Previous studies using cydA::lacZ fusions showed that transcription of the cydAB operon is ArcA-activated when oxygen becomes limiting (16, 44, 45). Subsequent studies have shown that ArcA functions to anti-repress cydAB transcription when oxygen is limiting (46), whereas FNR is required for repression when the oxygen tension is decreased further (14, 17, 45). As our study was carried out in full anaerobiosis, the ArcA effect was not observed, but the FNR effect was observed in our previous study (1). There are three ArcA sites that have been footprinted (17, 31). The study by Liu and De Wulf (22) also identified cydA to be ArcA-controlled; however, their study indicated that it is ArcA-activated (5.2-fold).
The remaining 31 genes of this cluster have not been studied previously for their expression under anaerobic growth conditions; however, one contains a putative ArcA-binding site (ykgI) (Table III). Again, the genes of this cluster are members of the same functional classes of expression patterns I and II. Three genes (fabG, rfbX, and katE) are involved in small molecule metabolism. 17 genes (rplB, rplC, rplO, hflC, rplF, rplQ, rplI, rpsE, rho, prfB, rplD, rpsH, tsf, rplE, nfi, tig, and lysS) are involved in macromolecule synthesis or degradation. 10 genes of this cluster are of unclassified function, seven of which were also identified in our FNR study (1). The remaining gene, eaeH (homologous to attachment and effacement proteins), also clustered into this expression pattern.
Expression Pattern IV: Increased Expression during Anaerobiosis and No Change in an ArcA- Strain. The six genes of this cluster (Table IV) showed elevated expression under anaerobic growth conditions, but were not affected by deletion of the arcA allele. Two genes of unknown function clustered into this group (ybeD and ygjD) and also clustered into the same group in our FNR study (1). The remaining members of this cluster include htpG (a heat shock protein), mrr (involved in the restriction of methylated adenine residues; also clustered into this group in Ref. 1), cysK (cysteine synthase), and cof (complementation of fur). A search of the promoter regions of these six genes identified a putative ArcA-binding site upstream of one of these genes: ybeD. None of these genes were identified by Liu and De Wulf (22).
Expression Pattern V: Increased Expression during Anaerobiosis and Increased Expression in an ArcA StrainThis cluster contains only a single gene of unknown function: ybjX (Table V). A similar pattern of expression was also observed previously (1).
Expression Pattern VI: No Change during Anaerobiosis and Increased Expression in an ArcA StrainOf the four genes of this cluster, three are involved in small molecule metabolism and transport: gapA (structural gene for glyceraldehyde-3-phosphate dehydrogenase A, essential for glycolysis), potF (member of the potFGHI operon involved in the transport of putrescine), and hisJ (member of the hisTJQMP operon encoding a histidine-binding protein that is part of the periplasmic permeases for the high affinity uptake of histidine). The final member, ydcF, is currently uncharacterized. All four members of this expression pattern clustered into the same group in our FNR study (1).
Expression Pattern VII: Decreased Expression during Anaerobiosis and Decreased Expression in an ArcA Strain
The same five genes observed in this expression pattern were also observed in our study with FNR (1). Two of the genes, frdA and nirB, have been shown previously to be FNR-regulated (47–49). As we discussed previously (1), the discrepancy in these data is likely due to paralogs of these two genes in the genome (sdhA for frdA; nirD, cysI, and cysJ for nirB). The remaining genes include rpmC (ribosomal protein) and two uncharacterized genes, ybdE (cusB) and ylcD (cusA).
Expression Pattern VIII: No Change during Anaerobiosis and Decreased Expression in an ArcA Strain
This cluster contains 31 genes, 20 of which are of unknown function. (12 were also identified in our previous FNR study (1).) Of the 11 genes of known function (Table VIII), two are known to be regulated by oxygen and/or ArcA under anaerobic growth conditions, and four contain putative ArcA-binding sites.
The two genes reported to be regulated by oxygen and/or ArcA are fumB and lysU. The anaerobic fumarase, encoded by fumB, is known to be activated during anaerobic fermentative growth (50, 51), and Tseng (51) showed that both ArcA and FNR are responsible for this anaerobic activation. As stated in our previous work (1), although our microarray data indicate that fumB is not regulated with respect to oxygen, its presence in this expression pattern is probably a result of the high sequence identity (80%) between fumB and the aerobically expressed fumarase, fumA. The lysU gene encodes one of the two lysyl-tRNA synthetases (the other being lysS, with which it shares 79% sequence identity (52)) and was reported previously to be induced under anaerobic conditions (53).
Eight members of this expression pattern are involved in macromolecular metabolism: cvpA (colicin V production), aceK (isocitrate dehydrogenase kinase/phosphatase), ftsY (cell division), dnaX (subunit of DNA polymerase III), umuC (involved in mutation induction), menD (o-succinylbenzoate synthase I), degS (periplasmic serine endopeptidase), and tyrB (tyrosine aminotransferase). The final member, fhuC (ferric hydroxamate-dependent iron uptake), is involved in small molecule transport. The remaining 20 genes of this expression pattern have not been characterized. Putative ArcA-binding sites were identified upstream of ycdM and yjhH (Table VIII).
The functional class distribution of the 175 genes of regulatory patterns I–VIII is shown in Fig. 6. Roughly 37.7% are hypothetical or unclassified, whereas another 23.4% are involved in small molecule metabolism. Most of the previously documented oxygen-controlled genes fall into the category of carbon and energy metabolism (5%). The study by Liu and De Wulf (22) identified 58 new genes/operons that are implicated in energy metabolism, transport, survival, catabolism, and transcriptional regulation.
Venn Diagram
To better visualize the interaction between the oxygen, ArcA and FNR regulons, Venn diagrams were created (Fig. 7). The top 500 genes (sorted by p value) from each data set were used as in the construction of the 175-gene list described above and the 205-gene list from our previous study (1). Interestingly, 303 genes were found to be regulated by both ArcA and FNR, and 74 of these genes showed additional regulation by oxygen (Fig. 7A). This is in contrast to the 16 genes reported previously to be co-regulated (5, 6).
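The counts shown in a Venn diagram such as Fig. 7 reduce to set intersections over ranked gene lists. The following is a minimal Python sketch of that bookkeeping, not the authors' pipeline: the function names, the placeholder gene identifiers, the invented p values, and the cutoff of two genes per list are all hypothetical, whereas the study itself takes the top 500 genes by p value from each data set.

```python
def top_genes(p_values, n):
    """Return the set of the n genes with the smallest p values."""
    ranked = sorted(p_values, key=p_values.get)
    return set(ranked[:n])


def venn_counts(a, b, c):
    """Count genes shared by each pair of sets and by all three sets."""
    return {
        "a_and_b": len(a & b),
        "a_and_c": len(a & c),
        "b_and_c": len(b & c),
        "all_three": len(a & b & c),
    }


# Invented p values for placeholder genes g1..g4 (illustration only).
arca = {"g1": 0.001, "g2": 0.002, "g3": 0.3, "g4": 0.04}
fnr = {"g1": 0.003, "g5": 0.0001, "g3": 0.5, "g4": 0.2}
oxygen = {"g1": 0.0005, "g2": 0.01, "g5": 0.02, "g3": 0.9}

counts = venn_counts(top_genes(arca, 2), top_genes(fnr, 2), top_genes(oxygen, 2))
print(counts)
```

With real data, the three dictionaries would map every gene on the array to its p value, and `n` would be 500.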
Comparison with Other Studies
When different array formats are used, the magnitudes and sources of experimental errors are surely different. This raises the question of whether or not results obtained from experiments performed with different DNA array formats such as pre-synthesized filter arrays and in situ synthesized Affymetrix GeneChips can be compared with one another. We have previously addressed this question. Hung et al. (25) compared the results of 4-fold replicated gene expression profiles of otherwise wild-type and lrp isogenic E. coli strains performed with these two DNA microarray formats. To emphasize variance due to format differences, the same RNA samples were used for target preparation for both formats, and the data were analyzed with Cyber-T software as described here. When the top 100 genes with the lowest p values obtained with each format were compared, a highly significant number of genes, 29, were in common.
Liu and De Wulf (22) have reported the transcriptional profiles of arcA+ and arcA- E. coli cells grown under anaerobic conditions and generously provided us with their raw data. A comparison of these Affymetrix GeneChip data with our filter array data, both analyzed with Cyber-T software, does not show significant agreement. Of the top 100 genes with the lowest p values (<0.018) obtained with each format, only three genes were in common. Because Liu and De Wulf used a different data analysis software package (Spotfire) and defined differentially expressed genes as those with an expression level coefficient of variance <0.8 and a mutant to wild-type signal ratio of >2 with p < 0.05, it is not possible to directly compare their results with the results presented here. In addition, Liu and De Wulf used a different carbon source (xylose rather than glucose). We can, however, compare conclusions. They reported 58 differentially expressed genes or operons under the direct control of ArcA as evidenced by the presence of a documented or putative DNA-binding site. In our data set, these genes exhibit p values ranging from 3.8 × 10⁻⁶ to 0.9 and PPDE(p) values ranging from 1.0 to 3.2 × 10⁻⁸. This suggests many false negatives and false positives in the data set of Liu and De Wulf.
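One way to gauge whether 29 or 3 shared genes between two top-100 lists is more than chance would give is a hypergeometric tail probability. The sketch below is an illustration under stated assumptions, not the test used in the paper (which does not say how significance of the overlap was judged): it assumes both lists are drawn from a common universe of 4290 genes, the E. coli gene count quoted elsewhere in this article, and the function name is hypothetical.

```python
from math import comb


def overlap_p_value(universe, size_a, size_b, shared):
    """P(overlap >= shared) when size_b genes are drawn at random, without
    replacement, from a universe that contains size_a marked genes."""
    total = comb(universe, size_b)
    upper = min(size_a, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(shared, upper + 1)
    )
    return tail / total


UNIVERSE = 4290  # assumed number of genes common to both array formats
TOP = 100        # size of each ranked list compared in the text

expected = TOP * TOP / UNIVERSE  # mean overlap expected by chance
p_same_rna = overlap_p_value(UNIVERSE, TOP, TOP, 29)   # Hung et al. comparison
p_cross_lab = overlap_p_value(UNIVERSE, TOP, TOP, 3)   # Liu and De Wulf comparison

print(f"expected overlap by chance: {expected:.2f}")
print(f"P(overlap >= 29): {p_same_rna:.3g}")
print(f"P(overlap >= 3):  {p_cross_lab:.3g}")
```

Under these assumptions the chance expectation is a little over two shared genes, so an overlap of 29 lies far in the tail while an overlap of three is close to what random lists would produce. If the two array formats cover different numbers of genes, the universe size would need to be adjusted.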
Implications for Genome-wide Control by ArcA and FNR
In this study, we employed statistical methods (1, 25) for the identification of differentially expressed genes based on experiment-wide false positive and false negative measurement levels. These methods previously allowed us to infer differential expression for more than one-third of the 4290 genes of E. coli during growth in the presence or absence of oxygen (1445 genes) (1). This study has allowed us to determine that 1243 of these changes in expression level are mediated either directly or indirectly by ArcA (Fig. 2B). These results further support our previous conclusions (1) that the network of genes required for the transition of cells from aerobic to anaerobic growth conditions is as much as 10 times larger than previously suspected. A comparison of the ArcA and FNR gold standard sets showed that 303 genes were regulated by both proteins (Fig. 7A), 74 of which were also affected by oxygen. Previous to this study, only 16 genes had been reported to be co-regulated (5, 6). Therefore, as suggested previously by us (1) and Liu and De Wulf (22), the total number of genes directly activated or repressed by ArcA and FNR is likely to be much higher than documented previously.
Rationale of Regulatory Patterns
Regulatory pattern I (anaerobic repressed gene expression, i.e. decreased expression in the presence of ArcA) (Table I) and pattern II (anaerobic activated gene expression, i.e. increased expression in the presence of ArcA) (Table II) are most easily reconciled with previous reports. Of the 94 genes of these patterns, 24 contain known or putative ArcA-binding site motifs. These results suggest that we might expect the total number of genes directly activated or repressed by ArcA to be in the range of 290 genes. Liu and De Wulf (22) estimated 372 genes.
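The text does not spell out how the figure of about 290 genes is reached. One plausible reading, offered here as an assumption rather than as the authors' stated method, is that the fraction of pattern I and II genes carrying a known or putative ArcA-binding site (24 of 94) is extrapolated to the 1139 genes that the Conclusion infers to be affected by ArcA. The variable names below are illustrative.

```python
genes_in_patterns = 94   # genes in regulatory patterns I and II
with_binding_site = 24   # of those, genes with a known or putative ArcA site
arca_affected = 1139     # genes inferred to be ArcA-affected (see Conclusion)

# Fraction of pattern I/II genes that carry a binding-site motif.
site_fraction = with_binding_site / genes_in_patterns

# Assumed extrapolation: apply that fraction to all ArcA-affected genes.
estimated_direct = site_fraction * arca_affected

print(f"fraction with a site: {site_fraction:.3f}")
print(f"estimated direct targets: {estimated_direct:.0f}")
```

The extrapolation presumes that binding-site frequency among pattern I and II genes is representative of the whole ArcA-affected set, which the data here cannot confirm.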
Regulatory pattern III (anaerobically repressed, but not affected by ArcA) and pattern IV (anaerobically activated, but not affected by ArcA) are most easily explained as genes controlled by the FNR protein or by an as yet unidentified global regulator such as Lrp, IHF, FIS, or H-NS. Only two of these genes, nuoG and nuoF, are known members of the ArcA regulon.
As in our previous work (1), it is more difficult to understand the physiological roles that the genes of regulatory patterns V–VIII might play in anaerobic metabolism. However, these genes are still members of the same functional classes regulated by FNR (1) and Lrp (25). To illustrate the overlap between genes regulated by ArcA, FNR, and Lrp, a Venn diagram was constructed (Fig. 7B). Three gene sets were compared: (i) the 500 genes with the highest PPDE(p) values (>0.996232) and the lowest p values (<5.26E-04) obtained from the array experiments reported here comparing arcA isogenic strains under anaerobic growth conditions; (ii) the 500 genes with the highest PPDE(p) values (>0.991) and the lowest p values (<0.0014) obtained from the array experiments reported here comparing fnr isogenic strains under anaerobic growth conditions; and (iii) the genes with the highest PPDE(p) values (>0.80) and the lowest p values (<0.027) obtained from the Lrp array experiments comparing lrp isogenic strains under aerobic growth conditions (25). Among these three gene sets, 26 genes are present in all three, and 43 genes overlap between the ArcA and Lrp regulons. This further supports our previous suggestion (1) that the FNR, Lrp, and now ArcA regulons reveal overlapping functions under aerobic and anaerobic conditions.
Conclusion
In this, our fourth study of global gene expression profiling in E. coli K12, we have again employed rigorous statistical treatment of the data to infer differential expression for 1139 genes in the presence and absence of the ArcA regulatory protein. In agreement with our previous study on the FNR protein (1) and the study of Liu and De Wulf (22), these results demonstrate that the network of genes required for the transition of cells from aerobic to anaerobic growth conditions is much larger than previously suspected (8–10-fold).
A total of 30 genes had been documented previously as members of the ArcA regulon (5, 6). The study by Liu and De Wulf (22) suggested that 372 genes (or ~9% of the E. coli genome) are potential members of the ArcA regulon. The results presented here identify 135 of 175 genes with p values <0.000174 and PPDE(<p) values >0.9994 whose expression is affected by ArcA. However, if we include all genes expressed at a level above the background and examine the PPDE versus p value plots, we have a 63% confidence level that any gene in our oxygen-regulated set is differentially expressed (1), i.e. 63% of the 2820 genes or ~1700 genes. In the same manner, using the same PPDE versus p value plots, 67% of these 1700 genes or 1139 genes are either directly or indirectly regulated by ArcA. Thus, these results greatly expand our knowledge of genes that compose the ArcA regulatory network.
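The arithmetic in this paragraph can be retraced in a few lines. The sketch below uses only the percentages and gene counts quoted above; the variable names are illustrative. Note that 63% of 2820 comes out somewhat above 1700, so the quoted figure appears to be a rounded-down approximation, and the final count of 1139 is reproduced only when the second step starts from 1700.

```python
expressed_genes = 2820    # genes expressed above background
oxygen_confidence = 0.63  # PPDE-based confidence for the oxygen-regulated set
arca_confidence = 0.67    # PPDE-based confidence for the ArcA-regulated subset

# Step 1: genes inferred to be oxygen regulated.
oxygen_regulated = oxygen_confidence * expressed_genes

# Step 2: the text rounds step 1 to 1700 before applying the ArcA fraction.
rounded_oxygen_regulated = 1700
arca_regulated = arca_confidence * rounded_oxygen_regulated

print(f"63% of 2820 = {oxygen_regulated:.0f}")
print(f"67% of 1700 = {arca_regulated:.0f}")
```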
| FOOTNOTES |
|---|
The on-line version of this article (available at http://www.jbc.org) contains a supplemental table.
b Both authors contributed equally to this work.
c Present address: Dept. of Microbiology and Molecular Genetics, University of California, Irvine, CA 92697.
f Recipient of a postdoctoral fellowship from the University of California Biotechnology Research and Education Program.
h Recipient of Biomedical Informatics Training Program Postdoctoral Fellowship T15 LM-07443 from the National Institutes of Health-National Library of Medicine.
i Supported by a traineeship from the UCLA-Integrative Graduate Education and Research Traineeship Bioinformatics Program funded by National Science Foundation Grant DGE-9987641.
l To whom correspondence may be addressed: Dept. of Microbiology and Molecular Genetics, University of California, Medical Science I, Campus Dr., Irvine, CA 92697. Tel.: 949-824-5344; Fax: 949-824-8595; E-mail: gwhatfie{at}uci.edu.
n To whom correspondence may be addressed: Dept. of Microbiology, Immunology, and Molecular Genetics, UCLA, 609 Charles Young Dr. East, 1602A MSB, Los Angeles, CA 90095. Tel.: 310-206-8201; Fax: 310-206-5231; E-mail: robg{at}microbio.ucla.edu.
1 The abbreviations used are: MOPS, 4-morpholinepropanesulfonic acid; ORF, open reading frame; PPDE, posterior probability of differential expression.
2 R. P. Gunsalus, unpublished data.
3 J. A. Albrecht, unpublished data.