Originally published In Press as doi:10.1074/jbc.M414030200 on February 7, 2005

J. Biol. Chem., Vol. 280, Issue 15, 15084-15096, April 15, 2005

Global Gene Expression Profiling in Escherichia coli K12

EFFECTS OF OXYGEN AVAILABILITY AND ArcA*

Kirsty A. Salmon,abc She-pin Hung,bdef Nicholas R. Steffen,gh Rebecca Krupp,ai Pierre Baldi,egj G. Wesley Hatfield,dekl and Robert P. Gunsalusamn

From the aDepartment of Microbiology, Immunology, and Molecular Genetics and the mMolecular Biology Institute, University of California, Los Angeles, California 90095-1489 and the Departments of dMicrobiology and Molecular Genetics, gInformation and Computer Science, jBiological Chemistry, and kChemical Engineering and Material Science and the eInstitute for Genomics and Bioinformatics, University of California, Irvine, California 92697

Received for publication, December 14, 2004 , and in revised form, January 18, 2005.


    ABSTRACT
 
The ArcAB two-component system of Escherichia coli regulates the aerobic/anaerobic expression of genes that encode respiratory proteins whose synthesis is coordinated during aerobic/anaerobic cell growth. A genomic study of E. coli was undertaken to identify other potential targets of oxygen and ArcA regulation. A group of 175 genes generated from this study and our previous study on oxygen regulation (Salmon, K., Hung, S. P., Mekjian, K., Baldi, P., Hatfield, G. W., and Gunsalus, R. P. (2003) J. Biol. Chem. 278, 29837–29855), called our gold standard gene set, have p values <0.00013 and a posterior probability of differential expression value of 0.99. These 175 genes clustered into eight expression patterns and represent genes involved in a large number of cell processes, including small molecule biosynthesis, macromolecular synthesis, and aerobic/anaerobic respiration and fermentation. In addition, 119 of these 175 genes were also identified in our previous study of the fnr allele. A MEME/weight matrix method was used to identify a new putative ArcA-binding site for all genes of the E. coli genome. 16 new sites were identified upstream of genes in our gold standard set. The strict statistical analyses that we have performed on our data allow us to predict that 1139 genes in the E. coli genome are regulated either directly or indirectly by the ArcA protein with a 99% confidence level.


    INTRODUCTION
 
Escherichia coli thrives in the gastrointestinal tract of many warm-blooded animals as a commensal or as a pathogen depending on a strain-dependent complement of genes (2). These enteric bacteria have the ability to switch between aerobic and anaerobic growth if oxygen is limiting. In response to microenvironments in the host, each individual cell adjusts its metabolic pathways to optimize energy generation via aerobic and/or anaerobic respiration or by fermentation of simple sugars (3). Many other cellular functions also are adjusted in response to oxygen availability, such as alterations in gene expression levels of membrane-associated nutrient uptake and/or excretion systems, biosynthetic pathways, and macromolecular synthesis (3).

Expression of E. coli genes involved in oxygen utilization is down-regulated as oxygen is depleted, and in a reciprocal fashion, expression of genes encoding alternative anaerobic electron transport pathways or genes needed for fermentation is switched on. Many of these metabolic transitions are controlled at the transcriptional level by the activities of the fumarate nitrate reductase global regulatory protein FNR and/or the two-component ArcAB regulatory system (4, 5). The role of the FNR protein in the global control of E. coli gene expression has been profiled in response to anaerobiosis (1). Based on this analysis of whole genome transcription data, it was estimated that the expression of over one-third of the genes expressed during growth under aerobic conditions is altered when E. coli cells transition to an anaerobic growth state and that the expression of half of these genes is modulated either directly or indirectly by FNR. Thus, the fnr gene family was estimated to be ~10-fold larger than the 70 members previously recognized as members of the fnr gene regulatory network (6, 7).

The ArcAB (aerobic respiratory control) two-component regulatory system is recognized as a second global regulator of anaerobic gene regulation (3, 6, 8). The ArcAB system is composed of a classical OmpR-like receiver regulator, ArcA, and a membrane-associated sensor transmitter protein, ArcB (6). Together, these components have been shown to regulate expression of oxygen-requiring pathways, including the tricarboxylic acid cycle (e.g. sdhCDAB, icd, fumA, mdh, gltA, acnA, and acnB), and the aerobic cytochrome oxidase complexes (9–18). ArcAB is also known to be required for proper expression of certain catabolic genes for pyruvate utilization and sugar fermentation (19–21).

In this genome-based study, we have identified additional E. coli genes under oxygen control that are differentially expressed in response to the ArcA global regulatory protein. This was accomplished by the use of DNA microarrays to analyze gene expression profiles in E. coli cells cultured at steady-state growth rates under aerobic (+O2) or anaerobic (-O2) growth conditions and in cells cultured under anaerobic growth conditions in the presence (-O2, +ArcA) or absence (-O2, -ArcA) of the ArcA protein, i.e. in otherwise isogenic arcA+ and arcA- strains. These experiments show that about one-half of the genes whose expression levels are affected by aerobic to anaerobic transitions are also affected by the ArcA protein. Thus, the number of E. coli genes differentially regulated by the ArcA protein is much larger than the 30 (5) or 100 (22) genes/operons previously recognized. The results of the gene expression profiling experiments further show that as many as two-thirds of the genes whose expression levels are affected by the ArcA protein are also affected by the FNR protein (1).


    MATERIALS AND METHODS
 
Chemicals and Reagents—Avian myeloblastosis virus reverse transcriptase and Sephadex G-25 Quickspin columns were obtained from Roche Applied Science. Phenol and the DNA-free kit were purchased from Ambion Inc. Ribonuclease inhibitor III was purchased from Pan-Vera/Takara. Ultrapure deoxynucleoside triphosphates were purchased from Amersham Biosciences. Random hexamer oligonucleotides and T4 polynucleotide kinase were obtained from New England Biolabs Inc., and [α-33P]dCTP (2–3000 Ci/mmol) was obtained from PerkinElmer Life Sciences. DNA filter arrays (Panorama E. coli gene arrays) were obtained from Sigma. SYBR Gold was purchased from Molecular Probes, Inc. All other chemicals were obtained from Sigma. All reagents and baked glassware used in RNA manipulations were treated with diethyl pyrocarbonate prior to their use.

Bacterial Strains and Growth Conditions—E. coli strains MC4100 (F- araD139 Δ(argF-lac)U169 rpsL150 relA1 flb-5301 deoC1 ptsF25 rbsR) (23) and PC35 (MC4100 ΔarcA::kan) (15) were used in this study. Cells were grown in MOPS1 medium (24) containing 40 mM glucose. Aerobic cultures were grown as described previously (1) in 125-ml Erlenmeyer flasks with constant aeration. Anaerobic cultures were grown in 15-ml anaerobic tubes fitted with butyl rubber stoppers (15). The same medium was made anaerobic by flushing with O2-free N2 gas for 20 min and then dispensed anaerobically into N2-flushed tubes. Cultures of the indicated strain were inoculated from overnight cultures grown under identical conditions (15). Cells were grown to A600 = 0.5–0.6 (mid-exponential growth phase) and harvested as described previously (1, 25).

Total RNA Isolation, cDNA Synthesis, and Target Labeling Conditions—Total RNA was isolated from 10-ml cultures; cDNA was synthesized and labeled with [α-33P]dCTP; and filters were hybridized exactly as described by Hung et al. (25). Stripping and reusing filters four times as described here results in a <3% increase in variance (26).

Data Acquisition—The commercial software package DNA Array-Vision obtained from Research Imaging Inc. was used to grid the 16-bit image file obtained from a PhosphorImager, to record the pixel density of each of the 18,432 addresses on each filter, and to perform the background subtractions. 8580 of the addresses on each filter were spotted with duplicate copies of each of the 4290 E. coli open reading frames (ORFs). The remaining 9852 empty addresses were used for background measurements. Because the backgrounds were constant, a global average background measurement was subtracted from each experimental measurement, although local background calculations are possible.
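The address bookkeeping and the global background correction described above can be sketched as follows. This is a minimal illustration, not the DNA Array-Vision procedure itself: the function name and the example pixel densities are hypothetical, and only the address counts come from the text.

```python
from statistics import mean

def subtract_global_background(spot_signals, empty_signals):
    """Subtract the mean signal of the empty addresses from every spotted signal."""
    background = mean(empty_signals)
    return [signal - background for signal in spot_signals]

# Address counts reported in the text.
total_addresses = 18_432
orfs = 4_290
spotted = 2 * orfs                   # each ORF is spotted in duplicate
empty = total_addresses - spotted    # addresses left for background measurement

# Hypothetical pixel densities, for illustration only.
corrected = subtract_global_background([120.0, 95.0, 300.0], [10.0, 12.0, 8.0])
print(spotted, empty, corrected)
```

A local background correction would instead use the empty addresses nearest each spot; the text notes that this was unnecessary here because the backgrounds were constant.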

Experimental Design—The experiments described here (Fig. 1) were performed at the same time as our previously reported experiments profiling gene expression levels in the presence or absence of oxygen and FNR (1). The data for strain MC4100 (ArcA+) grown aerobically (Experiment 1, Filters 1 and 2) and anaerobically (Experiment 2, Filters 3 and 4) have been reported by Salmon et al. (1). For Experiment 3, Filters 5 and 6 were hybridized with random hexamer-generated 33P-labeled cDNA fragments complementary to each of three independently prepared RNA preparations (RNA 25–27) obtained from three individual cultures of strain PC35 (arcA-) grown under anaerobic conditions. These three 33P-labeled cDNA target preparations were pooled prior to hybridization to the full-length ORF probes on the filters (Experiment 3, Replicate 1, Filters 5 and 6). Following PhosphorImager analysis, these filters were stripped and again hybridized with pooled 33P-labeled cDNA target fragments complementary to each of another three independently prepared RNA preparations (RNA 28–30) from the same strain (PC35; Experiment 3, Replicate 2). This procedure was repeated one more time with Filters 5 and 6 with yet another independently prepared pool of cDNA targets (Experiment 3, Replicate 3; RNA 31–33). The data for the fourth replicate of this experiment were lost.



 
FIG. 1.
Experimental design. See "Materials and Methods" for details.

 
This experimental design produced duplicate filter data for four replicates performed with cDNA targets complementary to four independent sets of pooled RNA preparations for Experiments 1 and 2. Thus, because each filter contained duplicate spots for each ORF and duplicate filters were used for each experiment, a total of 16 measurements were obtained, four measurements for each ORF from each of four replicates. Duplicate filter data were obtained for three replicates performed with cDNA targets complementary to three independent sets of pooled RNA preparations for Experiment 3. Thus, because each filter contained duplicate spots for each ORF and duplicate filters were used for each experiment, a total of 12 measurements were obtained, four measurements for each ORF from each of three replicates.

Statistical Analyses—Data processing and statistical methods implemented in the Cyber-T software used for the analysis and interpretation of the data obtained from the DNA microarray experiments described in this study were the same as those described previously by Salmon et al. (1). For each target signal, a background subtracted estimate of the expression level was obtained and scaled to total counts on the membrane by dividing each individual gene expression value by the total of all target signals on the membrane. Thus, each normalized gene level is expressed as a fraction of the total mRNA hybridized to each DNA array. For any given measurement, a value greater than zero (indicating an expression level) or a zero (indicating an expression level lower than background) was obtained. Only those genes exhibiting an expression level greater than zero in all replicates were used for statistical analysis. These gene expression level measurements were analyzed by a regularized t test based on a Bayesian statistical framework (25–29). For analysis of the data reported here, we ranked the mean gene expression levels of the replicate experiments in ascending order, used a sliding window of 101 genes, and assigned the average S.D. of the 50 genes ranked below and above each gene as the Bayesian S.D. for that gene. The p values for each gene measurement based on a regularized t test with a confidence value of 10 are reported in the Supplemental Material. A comprehensive discussion of the use of a regularized t test and the modifications applicable to the analysis of DNA microarray data of the type presented here is described in detail elsewhere (26).
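The normalization and sliding-window steps described above can be sketched as follows. This is an illustrative reading of the text, not the Cyber-T source code. All function names are hypothetical. The choice to exclude the gene itself from its own window is an assumption. The regularized variance formula is likewise an assumption, written in the form commonly given for this kind of Bayesian regularization, with the confidence value acting as the prior weight.

```python
from statistics import mean

def normalize_to_total(signals):
    """Express each background-subtracted signal as a fraction of the membrane total."""
    total = sum(signals)
    return [s / total for s in signals]

def background_sd(sds_ranked_by_mean, index, half_window=50):
    """Average S.D. of the genes ranked just below and just above gene `index`.

    `sds_ranked_by_mean` holds per-gene S.D. values, ordered by ascending mean
    expression. The gene itself is excluded (an assumption); windows are
    truncated at the ends of the ranking.
    """
    lo = max(0, index - half_window)
    hi = min(len(sds_ranked_by_mean), index + half_window + 1)
    neighbors = sds_ranked_by_mean[lo:index] + sds_ranked_by_mean[index + 1:hi]
    return mean(neighbors)

def regularized_variance(sample_var, n, prior_sd, confidence=10):
    """Assumed form: (v0 * sigma0^2 + (n - 1) * s^2) / (v0 + n - 2)."""
    return (confidence * prior_sd ** 2 + (n - 1) * sample_var) / (confidence + n - 2)

fractions = normalize_to_total([1.0, 3.0])
window_sd = background_sd([1.0, 2.0, 3.0, 4.0, 5.0], index=2, half_window=1)
reg_var = regularized_variance(sample_var=4.0, n=4, prior_sd=2.0)
print(fractions, window_sd, reg_var)
```

With few replicates per gene, the window-based S.D. stabilizes the variance estimate, so a gene with an accidentally small sample variance is less likely to produce an inflated t statistic.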

Gene measurements containing zero expression values in one or more replicates were set aside. Among this set of genes, those with zero expression values for all replicates in one experiment and all values greater than zero for all measurements of another experiment were identified. Because these gene measurements could not be analyzed with a t test, the significance of these results was evaluated by ranking these genes in ascending order according to the coefficients of variation of the four greater than zero measurements of each experiment.
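The ranking described above can be sketched as follows, taking the coefficient of variation as the sample S.D. divided by the mean. The gene names and measurement values are hypothetical and serve only to show the ordering.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)

def rank_by_cv(measurements):
    """Return gene names sorted by ascending coefficient of variation."""
    return sorted(measurements, key=lambda gene: coefficient_of_variation(measurements[gene]))

# Hypothetical genes, each with four greater-than-zero measurements.
example = {
    "geneB": [5.0, 15.0, 5.0, 15.0],   # noisy measurements
    "geneA": [10.0, 10.0, 10.0, 10.0], # perfectly reproducible measurements
}
ranking = rank_by_cv(example)
print(ranking)
```

Genes at the top of such a ranking have the most reproducible non-zero signal and are therefore the most credible cases of expression in one experiment and none in the other.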

Cyber-T employs a mixture model-based method described by Allison et al. (30) for the computation of the global false positive and false negative levels inherent in a DNA microarray experiment (25, 26). With this method, described by Hung et al. (25), we estimated the rates of false positives and false negatives as well as true positives and true negatives at any given p value threshold. In other words, we obtained a posterior probability of differential expression PPDE(p) value for each gene measurement and a PPDE(<p) value at any given p value threshold based on the experiment-wide global false positive level and the p value exhibited by that gene (25, 26). In most instances, PPDE(<p) values are reported below and in Tables I–VIII. However, both PPDE(p) and PPDE(<p) values are given for each gene in the Supplemental Material.


 
TABLE I
Regulatory pattern I: genes that exhibit decreased levels during anaerobic growth and increased levels in an ArcA-deficient strain

 


 
TABLE II
Regulatory pattern II: genes that exhibit increased levels during anaerobic growth and decreased levels in an ArcA-deficient strain

 


 
TABLE III
Regulatory pattern III: genes that exhibit decreased levels during anaerobic growth that are unaffected in an ArcA-deficient strain

 


 
TABLE IV
Regulatory pattern IV: genes that exhibit increased levels during anaerobic growth that are unaffected in an ArcA-deficient strain

 


 
TABLE V
Regulatory pattern V: genes that exhibit increased levels during anaerobic growth and further increased levels in an ArcA-deficient strain

 


 
TABLE VI
Regulatory pattern VI: genes that exhibit similar levels during aerobic and anaerobic growth but increased levels in an ArcA-deficient strain

 


 
TABLE VII
Regulatory pattern VII: genes that exhibit decreased levels during anaerobic growth and further decreased levels in an ArcA-deficient strain

 


 
TABLE VIII
Regulatory pattern VIII: genes that exhibit similar levels during aerobic and anaerobic growth but decreased levels in an ArcA-deficient strain

 
It is expected that for each p value threshold, there is a tradeoff between the rates of true and false positives. A low conservative p value threshold leads to few false positives, but may also reduce the true positive rate. A large p value threshold ultimately allows one to recover all the true positives, but at the cost of increasing the false positive rate. This fundamental tradeoff is usually captured in statistics using a receiver operating characteristic curve obtained by plotting the true positive rate (or sensitivity) defined by true positive/(true positive + false negative) versus the false positive rate defined by false positive/(false positive + true negative) (87). For instance, at a 77% true positive rate, we expect a 5% false positive rate when Experiment 1 (+O2, +ArcA) is compared with Experiment 2 (-O2, +ArcA) (Fig. 2A), and at an 80% true positive rate, we expect a 5% false positive rate when Experiment 2 (-O2, +ArcA) is compared with Experiment 3 (-O2, -ArcA) (Fig. 2B).
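The two rates plotted in a receiver operating characteristic curve, as defined in the legend to Fig. 2, can be computed as in the sketch below. The counts are hypothetical and were chosen only so that the result reproduces the 77% and 5% operating point quoted for Fig. 2A; they are not taken from the data.

```python
def true_positive_rate(tp, fn):
    """Sensitivity: TP / (TP + FN)."""
    return tp / (tp + fn)

def false_positive_rate(fp, tn):
    """FP / (FP + TN)."""
    return fp / (fp + tn)

# Hypothetical counts at one p value threshold.
tpr = true_positive_rate(tp=77, fn=23)
fpr = false_positive_rate(fp=5, tn=95)
print(tpr, fpr)
```

Sweeping the p value threshold from strict to permissive and recording one (false positive rate, true positive rate) pair per threshold traces out the full curve.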



 
FIG. 2.
Receiver Operating Characteristic curve. These plots correlate the fraction of correctly identified differentially expressed genes (y axis) with the fraction of falsely identified differentially expressed genes (x axis). Panel A: +O2, +ArcA versus -O2, +ArcA. Panel B: -O2, +ArcA versus -O2, -ArcA. The false positive rate is [FP/(FP+TN)]. The true positive rate is [TP/(TP+FN)], where FP is the false positive, TN is the true negative, TP is the true positive and FN is the false negative.

 
The Cyber-T software package is available at the web site for the Institute for Genomics and Bioinformatics at the University of California (Irvine, CA). The clustering methods used to determine the regulatory patterns reported below are those implemented in the Gene-SpringTM software package (Silicon Genetics, Redwood City, CA).

Data Accession—All raw and processed data for the experimental results reported here are provided in tabular format as Excel files in the Supplemental Material.


    RESULTS AND DISCUSSION
 
Differential Gene Expression in the Presence or Absence of Oxygen
In the following discussions, we often refer to the -fold change for differentially expressed genes. However, simple -fold changes are incomplete and can be misleading (26). For this reason, the mean expression levels, S.D. values, p values, PPDE(<p) values, and PPDE(p) values for all differentially expressed E. coli genes are included in the Supplemental Material. In Tables I–IX, we report only p values, PPDE(<p) values, and -fold changes.


 
TABLE IX
Genes not expressed in at least one experiment

 
A comparison of the wild-type E. coli gene expression levels between cells grown in the presence and absence of oxygen revealed 2820 genes that exhibited expression levels above the background for all replicates of Experiments 1 and 2 (+O2, +ArcA versus -O2, +ArcA) (Fig. 1) (1). The statistical analysis of these data revealed that approximately one-half of the genes expressed during aerobic growth (1445 genes) were differentially expressed following a transition from aerobic to anaerobic growth with a p value of 0.05 and a PPDE(<p) value of 0.96. Therefore, 58 of these 1445 differentially expressed genes are expected to be false positives (25).

Differential Gene Expression in the Absence of Oxygen and in the Presence and Absence of the ArcA Global Regulatory Protein
A comparison of the gene expression levels between cells grown in the absence of oxygen and in the presence or absence of ArcA revealed 2264 genes that exhibited expression levels above the background for all replicates of Experiments 2 and 3 (-O2, +ArcA versus -O2, -ArcA) (Fig. 1). Again, about one-half of the gene expression levels were modulated by this treatment condition. An examination of the distribution of p values suggested that the expression levels of 1243 genes with p values <0.05 were modulated either directly or indirectly by ArcA during growth transition from aerobic to anaerobic conditions. Because the PPDE(<p) value for this group of genes is 0.97, 37 false positives are expected. The individual p values and PPDE values, as well as additional statistical data, for all genes are provided in the Supplemental Material.
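The expected numbers of false positives quoted in this section follow from the PPDE(<p) values: if PPDE(<p) is the global probability that a gene below the threshold is truly differentially expressed, the expected false positive count is the number of selected genes multiplied by 1 - PPDE(<p). The short sketch below reproduces this arithmetic; the function name is hypothetical.

```python
def expected_false_positives(n_genes, ppde_below_p):
    """Expected false positives among n_genes selected at a given threshold."""
    return n_genes * (1.0 - ppde_below_p)

# Experiments 1 versus 2 (oxygen): 1445 genes, PPDE(<p) = 0.96.
oxygen_fp = expected_false_positives(1445, 0.96)
# Experiments 2 versus 3 (ArcA): 1243 genes, PPDE(<p) = 0.97.
arca_fp = expected_false_positives(1243, 0.97)

print(round(oxygen_fp), round(arca_fp))
```

Rounded to whole genes, these values match the 58 and 37 false positives stated in the text.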

Identification of Differential Gene Expression Patterns Resulting from Two-variable Perturbation Experiments
To identify the global changes and adjustments of gene expression patterns that facilitate a transition from aerobic to anaerobic growth conditions and to determine the effects of genotype on these gene expression patterns, we analyzed E. coli gene expression profiles obtained from cells cultured under aerobic (+O2) or anaerobic (-O2) growth conditions and under anaerobic growth conditions in the presence (-O2, +ArcA) or absence (-O2, -ArcA) of ArcA, the global regulatory protein for anaerobic metabolism. Because ArcA is presumed to be inactive under aerobic conditions (5, 6, 31), we did not perform experiments comparing arcA genotypes under aerobic conditions.

Only two general regulatory patterns can be observed when two experimental conditions are compared, e.g. growth in the presence or absence of oxygen. However, when three conditions are compared, at least eight general regulatory patterns are expected. The data in Fig. 3 diagram the eight basic regulatory patterns that could be observed among three experiments conducted in the presence and absence of oxygen in an arcA+ strain and in the absence of oxygen in an arcA- strain. For simplicity, only three expression levels for each of these three experimental conditions were assumed: low, medium, and high.
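The count of eight patterns can be checked by enumeration. Each gene has one of three outcomes (down, same, up) for the change from Experiment 1 to Experiment 2 and one of three for the change from Experiment 2 to Experiment 3; dropping the combination in which nothing changes leaves eight. In the sketch below, the assignment of Roman numerals is read from the titles of Tables I–VIII, and the variable names are illustrative.

```python
from itertools import product

# (change from Exp. 1 to Exp. 2, change from Exp. 2 to Exp. 3) -> pattern number
PATTERNS = {
    ("down", "up"): "I",
    ("up", "down"): "II",
    ("down", "same"): "III",
    ("up", "same"): "IV",
    ("up", "up"): "V",
    ("same", "up"): "VI",
    ("down", "down"): "VII",
    ("same", "down"): "VIII",
}

changes = ("down", "same", "up")
# Every pair of changes except the one in which expression never changes.
combos = [pair for pair in product(changes, repeat=2) if pair != ("same", "same")]

for pair in combos:
    print(PATTERNS[pair], pair)
```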



 
FIG. 3.
Gene expression regulatory patterns expected from the comparison of DNA array experiments with one control and two treatment conditions. Experiment 1 (control) indicates gene expression levels during growth under aerobic conditions in an ArcA+ E. coli strain. Experiment 2 indicates gene expression levels during growth under anaerobic conditions in an ArcA+ E. coli strain. Experiment 3 indicates gene expression levels during growth under anaerobic conditions in an ArcA-deficient E. coli strain. Regulatory patterns I–VIII are indicated.

 
To identify genes differentially expressed at a high confidence level that correspond to each of the patterns (I, II, III, IV, V, VI, VII, VIII) diagrammed in Fig. 3, the genes differentially expressed due to the treatment conditions of Experiments 1 and 2 were sorted in ascending order according to their p values based on the regularized t test as described under "Materials and Methods." Next, the genes differentially expressed due to the treatment conditions of Experiments 2 and 3 were sorted in ascending order according to their p values. 100 genes with the lowest p values present in both lists were selected. These genes exhibited either an increased or decreased expression level between both treatment conditions (i.e. between Experiments 1 and 2 and between Experiments 2 and 3) (Fig. 3).

To identify those genes differentially expressed at a high level of confidence under the treatment conditions of Experiments 1 and 2 but expressed at the same or similar levels under the treatment conditions of Experiments 2 and 3 (patterns III and IV) (Fig. 3), the 500 genes of Experiments 1 and 2 with the highest probability for differential expression values were compared with the 500 genes of Experiments 2 and 3 with the lowest probability for differential expression values. This comparison identified 40 genes that were present in both lists, i.e. genes whose regulatory patterns fulfill this criterion. Likewise, to identify those genes differentially expressed under the treatment conditions of Experiments 2 and 3 but expressed at the same or similar levels under the treatment conditions of Experiments 1 and 2 (patterns VI and VIII) (Fig. 3), the 500 genes of Experiments 2 and 3 with the highest probability for differential expression values were compared with the 500 genes of Experiments 1 and 2 with the lowest probability for differential expression values. This comparison identified 35 genes that were present in both lists. These gene lists were combined into a single list of 175 genes differentially expressed under at least one treatment condition. All of the differentially expressed genes of this list exhibited p values <0.00013 and a global confidence based on the experiment-wide false positive level of 99% (PPDE(<p) = 0.99). They constitute the "gold standard" gene set for the following analyses.
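The list comparison used for patterns III, IV, VI, and VIII amounts to a set intersection between the genes with the highest PPDE values in one comparison and those with the lowest PPDE values in the other. The sketch below illustrates the idea on four hypothetical genes with made-up PPDE values and a list size of 2 instead of 500; the function name is also hypothetical.

```python
def shared_extremes(ppde_changed, ppde_unchanged, n=500):
    """Genes among the n highest PPDE values in one comparison
    and the n lowest PPDE values in the other."""
    high = set(sorted(ppde_changed, key=ppde_changed.get, reverse=True)[:n])
    low = set(sorted(ppde_unchanged, key=ppde_unchanged.get)[:n])
    return high & low

# Hypothetical PPDE values for Experiments 1 vs 2 and Experiments 2 vs 3.
ppde_12 = {"g1": 0.99, "g2": 0.98, "g3": 0.10, "g4": 0.05}
ppde_23 = {"g1": 0.02, "g2": 0.97, "g3": 0.99, "g4": 0.01}

# Changed by oxygen but not by ArcA (patterns III and IV).
oxygen_only = shared_extremes(ppde_12, ppde_23, n=2)

# Sizes of the three lists reported in the text.
gold_standard_size = 100 + 40 + 35
print(oxygen_only, gold_standard_size)
```

Swapping the two arguments gives the genes changed by ArcA but not by oxygen (patterns VI and VIII).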

Hierarchical Clustering and Principal Component Analysis
GeneSpringTM software was used to empirically determine parameters for hierarchical clustering of these 175 genes into the eight patterns of Fig. 3 as discussed by Salmon et al. (1) and shown in Fig. 4. Interestingly, 83 of these ArcA-regulated genes are also differentially regulated directly or indirectly by FNR (patterns I, II, and V–VIII) (1). As an independent test to corroborate the accuracy of this supervised hierarchical clustering method, we used principal component analysis to cluster and visualize the same set of 175 genes (14). The principal component analysis clustering results shown in Fig. 5 illustrate that this unsupervised method produced the same results as the supervised hierarchical clustering method.



 
FIG. 4.
Hierarchical clustering of differentially expressed gene regulatory patterns. The experimental cell growth conditions were as follows: wild-type E. coli K12 strain (ArcA+) cultured under aerobic conditions (+O2 +Arc), wild-type E. coli K12 strain (ArcA+) cultured under anaerobic conditions (-O2 +Arc), and isogenic E. coli K12 strain lacking the arcA gene cultured under anaerobic conditions (-O2 -Arc). Each regulatory pattern is identified by different colors on the dendrogram and by numbers that correspond to the regulatory patterns defined in the legend to Fig. 3. The trust parameter is directly related to the mean divided by the S.D. for each gene measurement. Red indicates high expression, yellow indicates medium expression, and green indicates low expression.

 



 
FIG. 5.
Principal component analysis clustering of differentially expressed gene regulatory patterns. Shown is a two-dimensional projection onto a plane spanned by the second and third principal components. Each cluster is enclosed. The clusters are numbered according to the regulatory patterns indicated in the legends to Figs. 3 and 4. PCA, principal component analysis.

 
Interpretation of Clustering Results
Although some of the genes or operons differentially expressed in the presence or absence of ArcA are expected to be affected only indirectly, others whose expression is directly regulated by ArcA should possess a DNA-binding site(s) upstream of their transcriptional start site(s). ArcA is a 28-kDa protein that contains a winged helix-turn-helix motif that interacts with a poorly conserved consensus DNA sequence (31). This ArcA-P consensus sequence, obtained from DNA footprinting experiments performed with ~15 ArcA-regulated promoters, is 5'-WGTTAATTAW-3' (where W is A or T) (31).
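As an illustration of how a degenerate consensus such as 5'-WGTTAATTAW-3' can be matched against a sequence, the sketch below translates the IUPAC code W into the character class [AT] and reports every exact match position. The test sequences are hypothetical, and exact consensus matching is shown only for clarity; it is not the weight matrix method used in this study.

```python
import re

# IUPAC codes needed for this consensus; W stands for A or T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]"}

def consensus_to_regex(consensus):
    """Compile a consensus string into a regex; the lookahead allows overlapping matches."""
    body = "".join(IUPAC[base] for base in consensus)
    return re.compile("(?=" + body + ")")

ARCA_CONSENSUS = "WGTTAATTAW"
PATTERN = consensus_to_regex(ARCA_CONSENSUS)

def find_sites(sequence):
    """Return the 0-based start positions of all consensus matches."""
    return [match.start() for match in PATTERN.finditer(sequence)]

print(find_sites("CCAGTTAATTATCC"))
```

Because the consensus is described as poorly conserved, exact matching of this kind would miss many genuine sites, which is one motivation for the scored weight matrix approach.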

Liu and De Wulf (22) used a weight matrix and a subset of 10 ArcA-footprinted promoter regions to define a slightly different consensus sequence of 5'-GTTAATTAAATGTTA-3'. This sequence resembles the previous 10-bp consensus sequence; however, it is extended by 5 residues at the 3'-end, and the first nucleotide of the original consensus sequence (5'-(A/T)) turned out to be poorly conserved and is not included in their motif (22). For the analyses reported here, a set of 26 known ArcA-binding sites in E. coli, including the 15 sites reviewed by Lynch and Lin (31) plus three newly footprinted ArcA-binding sequences,2 was compiled (see the Supplemental Material). Analysis of these sequences with MEME Version 3.0 (32, 33) identified a partially degenerate 15-bp motif. A weight matrix was generated from the motif found by MEME. The E. coli K12 genome was then scanned for sequences on either strand that had a weight matrix score exceeding the threshold and that were within 300 bp of an ORF origin, as identified by Regulon_DB (34). A total of 386 such sequences were located.
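The weight matrix scan described above can be sketched in general terms as follows. This is not the authors' implementation: the aligned sites, the pseudocount, the uniform background frequency, and the score threshold are all hypothetical choices made for illustration, and the 300-bp distance filter against ORF origins is omitted.

```python
import math

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def build_weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Log-odds weight matrix (one dict per position) from equal-length aligned sites."""
    total = len(sites) + 4 * pseudocount
    matrix = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        matrix.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return matrix

def score(matrix, window):
    """Sum of the per-position log-odds weights for one window."""
    return sum(column[base] for column, base in zip(matrix, window))

def reverse_complement(sequence):
    return sequence.translate(COMPLEMENT)[::-1]

def scan(matrix, sequence, threshold):
    """Return (strand, start, score) for every window scoring above the threshold."""
    width = len(matrix)
    hits = []
    for strand, seq in (("+", sequence), ("-", reverse_complement(sequence))):
        for start in range(len(seq) - width + 1):
            value = score(matrix, seq[start:start + width])
            if value > threshold:
                hits.append((strand, start, round(value, 2)))
    return hits

# Hypothetical 6-bp aligned sites; the motif in the text is 15 bp.
matrix = build_weight_matrix(["GTTAAT", "GTTAAT", "GTTAAC"])
hits = scan(matrix, "CCGTTAATCC", threshold=5.0)
print(hits)
```

Start positions on the minus strand are reported relative to the reverse-complemented sequence; converting them back to plus-strand coordinates would be needed before applying a distance filter to annotated ORF origins.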

When ArcA acts as an activator of gene expression, it most often binds to upstream sites centered from 60 to 120 base pairs before the transcriptional start site of the affected gene or operon. When it acts as a repressor of gene expression, it binds to other sites often located near the transcriptional start site of the affected gene or operon (31). Of the 42 genes down-regulated in the presence of ArcA (patterns I, V, and VI) (Fig. 3), 12 contain a documented ArcA-binding site or a predicted ArcA-binding site at or near the transcriptional start site using the above MEME/weight matrix method (Tables I, V, and VI). Of the 93 genes up-regulated in the presence of ArcA (patterns II, VII, and VIII) (Figs. 4 and 5), 14 contain an upstream documented or predicted ArcA-binding site (Tables II, VII, and VIII). Because the expression levels of the 40 genes of patterns III and IV were not affected by the presence or absence of ArcA, they are not expected to possess binding sites for this regulatory protein. However, five of these genes are predicted to possess a putative ArcA-binding site (Tables III and IV). Of these, three genes, cydA, nuoG, and nuoF, are known to be ArcA-regulated; however, the expression data are not consistent with previously published data, and this is likely due to paralog issues within the E. coli genome. Thus, the statistical and clustering methods described here produced results consistent with biological expectations.

Functional Classes of Genes Affected by Oxygen Availability and ArcA
The following discussion is limited to the 175 genes (our gold standard set) of regulatory patterns I–VIII (Fig. 4), although ArcA control of many other genes may be deduced from the information supplied in the Supplemental Material. As in our previous study (1), they represent many genes known to be oxygen-controlled and another larger set for which no previous information is available. These genes are listed in Tables I, II, III, IV, V, VI, VII, VIII and represent genes involved in a large number of processes, including small molecule biosynthesis, macromolecular synthesis, and aerobic/anaerobic respiration and fermentation. Regardless of their metabolic role, these genes are discussed below in the context of their expression patterns (Fig. 3).

Expression Pattern I: Decreased Expression during Anaerobiosis and Increased Expression in an ArcA Strain—Among the 175 genes displayed in the clustering procedures described above, 37 showed decreased expression under anaerobic conditions due to regulation by ArcA (Table I). Of these 37 genes, 10 have been reported to be directly regulated by ArcA (6), and 27 are newly discovered genes that are regulated either directly or indirectly by this global regulatory protein. In addition, 23 of the genes clustered into pattern I were also identified as being down-regulated by the FNR protein in our previous study (1). Previously described ArcA-regulated genes will be discussed first, followed by a discussion of the newly discovered ArcA-down-regulated genes.

Seven genes of the tricarboxylic acid cycle clustered into pattern I: icdA, sdhAB, lpdA, mdh, sucD, and gltA. Each of these seven genes has been shown previously to be anaerobically repressed by the ArcA protein (5, 6, 9, 10, 12, 13, 31, 35, 36). Regulation of lpdA by FNR was also observed in our previous study (1). A search for putative ArcA-binding sites using our customized MEME/weight matrix method (see "Materials and Methods") identified one or more sites upstream of each of these genes (Table I).

The cyoA gene is the first member of the cyoABCDE operon, which encodes all of the subunits of the cytochrome o ubiquinol oxidase. The cyoA gene was expressed 10-fold higher when cells were grown aerobically and 23-fold higher when cells were grown anaerobically in the ArcA-deficient strain (Table I). A previous study by our laboratory using a cyoA::lacZ fusion in the same ArcA+ and ArcA- isogenic strains used in this work showed the same regulatory pattern (16). A site similar to the ArcA consensus sequence has been identified upstream of the cyoA promoter (see Footnote 3). The cyoA gene was also shown to be subject to regulation by FNR in our previous study (1), but this is likely indirect. Our MEME/weight matrix method identified four putative ArcA-binding sites (Table I) upstream of the cyoA gene.

The nuoB and nuoE genes, which belong to the nuoA–N operon, encode NdhI (NADH dehydrogenase I), a membrane-associated, multisubunit, proton-translocating enzyme similar to complex I of eukaryotic mitochondria (37). Expression of both of these genes was lower under anaerobic conditions and elevated in the arcA mutant (Table I). A previous study using nuo::lacZ fusions established that nuo expression is subject to ArcA-mediated anaerobic repression (38). Two putative ArcA-binding sites were identified ~140 and 190 bp upstream of the nuoA gene using our MEME/weight matrix method (Table I). The nuoE gene also appeared to be subject to FNR regulation in our previous work (1), but the effect of FNR may be indirect as a consequence of its role in regulating ArcA expression (39).

The remaining genes in this group have not been shown previously to be subject to ArcA regulation. These newly discovered genes fall into the same functional classes as the genes regulated by the leucine-responsive regulatory protein Lrp under aerobic conditions (25) and FNR under anaerobic conditions (1). These functional classes include genes for small molecule biosynthesis and transport and macromolecule biosynthesis. More interestingly, of the remaining 27 genes of this expression group, 20 were also found to be regulated by FNR under anaerobic conditions (1).

12 genes of this cluster belong to the small molecule metabolism and transport groups. Nine of these genes were also found to be repressed in anaerobiosis due to regulation by FNR (1). These genes are crr (phosphocarrier protein for glucose transport); gpmA (phosphoglyceromutase); gatY (D-tagatose-1,6-bisphosphate aldolase); talA (transaldolase A); trpB (tryptophan synthase); speD and speE (biosynthesis of spermidine); prlA (secY, protein translocator of the secYEG operon); and ompA, which encodes an outer membrane protein. The remaining three genes are rbsC and rbsD (ribose high affinity transport system) and yjcU (alsE, allulose-6-phosphate 3-epimerase). Putative ArcA-binding sites were identified using the MEME/weight matrix method for two of these: gatY and trpB.

11 of the remaining genes of this expression group belong to the macromolecule synthesis class. Eight of these were also observed to be regulated by FNR (1). These are rpsA, rpsT, rpsJ, rplS, rplT, and rplM (ribosomal proteins); tufA (elongation factor Tu); and oppA (oligopeptide permease). The remaining three genes are rplX (ribosomal protein), pal (essential lipoprotein), and atpG (ATP synthase). Putative ArcA-binding sites were identified using the MEME/weight matrix for two of these: oppA and atpG.

The functions of the remaining four genes in this list, ycdC, yajG, yceD, and yfiA, remain to be characterized. Three of these four genes, yajG, yceD, and yfiA, were also observed to be regulated by FNR in anaerobiosis (1).

Recently, Liu and De Wulf (22) identified 234 ORFs as being repressed by ArcA under anaerobic conditions in a microarray-based study. In our gold standard set, we identified a total of 42 genes as being up-regulated in an arcA mutant (patterns I, V, and VI) or 37 genes in pattern I. Only three genes, gltA, icd, and mdh, are conserved between the two reported data sets. However, our clustering set of 175 genes is highly restricted, with a strict PPDE(<p) cutoff level of 0.997, and eliminates false positives and other genes for which the data are of lower statistical significance.

Expression Pattern II: Increased Expression during Anaerobiosis and Decreased Expression in an ArcA Strain—Transcription of the 57 genes of expression pattern II (Table II) was both induced in the absence of oxygen and positively regulated by ArcA. Moreover, of these 57 genes, 34 were also observed to be positively regulated by FNR in anaerobiosis (1). 19 of these genes are members of the small molecule metabolism and transport group. Among the genes for metabolism, eight were also observed to be positively regulated by FNR in anaerobiosis. These are pyrD (dihydro-orotate dehydrogenase), glnD (uridylyltransferase), mobB (molybdenum cofactor biosynthesis), speC (ornithine decarboxylase), narY (cryptic nitrate reductase subunit), glnE (glutamine synthetase/adenylyltransferase), tdh (threonine dehydrogenase), and tynA (tyramine oxidase). One of these genes, glnD, is predicted to have a putative ArcA-binding site (Table II).

The gadA and gadB genes, encoding two highly homologous glutamate decarboxylases, also clustered into this group. In agreement with our previous study (1), lacZ fusion studies have shown that their anaerobic induction is due solely to the presence of the arcA gene product (see Footnote 2), but only the gadA gene has a predicted ArcA-binding site upstream of its start codon. The gadX and gadW genes also clustered into pattern II. These two genes encode transcription factors that control the expression of the gadA and gadBC operons (40–43). A putative ArcA-binding site(s) was identified upstream of each of these two genes (Table II). Two other genes, rhaA (L-rhamnose isomerase) and glgC (glucose-1-phosphate adenylyltransferase), have not been shown previously to be regulated by ArcA.

Six genes of this expression pattern belong to the small molecule transport functional class. Four of these genes were shown previously to be subject to FNR-mediated regulation (1). These genes are yabM (setA, glucose/lactose efflux transporter), yadQ (clcA, mammalian chloride channel protein homolog), nanT (sialic acid transporter), and uraA (transport of uracil). The remaining two genes belonging to this group are nfrA (an outer membrane protein) and pnuC (nicotinamide mononucleotide transporter).

As in our previous study (1), several genes of this expression pattern belonging to the macromolecular synthesis class are for DNA repair: recB and recC (subunits of the RecBCD enzyme complex), dinG (encodes a LexA-regulated DNA repair enzyme), and sbcC (co-suppressor of recBC mutations). Of the remaining five genes belonging to this functional group, only one was also observed to be regulated by FNR: glgA (glycogen synthesis). The other four genes are degQ (hhoA, periplasmic serine endopeptidase); cdh (CDP-diglyceride hydrolase); and two hydrogenase-encoding genes, hycD (hydrogenase-3 subunit) and hyaB (hydrogenase-1 subunit). Putative ArcA-binding sites were identified upstream of recB and hycD (Table II).

Of the remaining genes clustered into this expression pattern, two genes, mrcA (penicillin-binding protein 1A) and rarD (involved in chloramphenicol resistance), were also observed to be regulated by FNR (1). Two other genes, organized in an operon encoding a putative alternative cytochrome oxidase, appCB (cbdAB), were not observed previously to be regulated by FNR (1), and xylR (regulatory gene for the xylose operon) also clustered into this expression pattern. The 23 remaining members of this expression pattern are currently uncharacterized, 12 of which were also previously observed to be regulated by FNR (1). A putative ArcA-binding site was identified upstream of rarD and xylR and upstream of 2 of the 23 previously uncharacterized genes (ydbA and yhjE).

Only one gene in this expression pattern, glcC, was also identified in the study by Liu and De Wulf (22); however, their results indicated that glcG is repressed by ArcA (2.6-fold). Liu and De Wulf identified a total of 138 genes as being activated in the presence of ArcA. Again, in our gold standard set, we identified a total of 93 genes as being down-regulated in an arcA mutant (patterns II, VII, and VIII) or 57 genes in pattern II. However, our clustering set of 175 genes is highly restricted, with a strict PPDE(<p) cutoff level of 0.997.

Expression Pattern III: Decreased Expression during Anaerobiosis and No Change in an ArcA Strain—34 genes clustered into expression pattern III. Of these, 23 clustered into the same expression pattern in our previous study (1), indicating that the expression of these genes, although decreased during anaerobiosis, is not regulated by either ArcA or FNR.

Two members of the nuoA–N operon, nuoG and nuoF, which encode NdhI, a membrane-associated, multisubunit, proton-translocating enzyme similar to complex I of eukaryotic mitochondria (37), clustered into pattern III. Expression of the nuoE gene (Table III) was 3.8-fold lower under anaerobic conditions and was elevated 8.8-fold in the ArcA mutant (see Table IX). A previous study using nuo-lacZ fusions established that nuo expression is subject to ArcA-mediated anaerobic repression and NarL nitrate-mediated anaerobic activation (38). Two other members of this operon clustered into pattern I (nuoB and nuoE) (Table I).

The cydA gene (part of the cydAB operon) encodes the high affinity terminal oxidase of the oxygen respiratory chain, cytochrome d oxidase. The data obtained here show that cydA was repressed ~2-fold during anaerobic growth, but was unchanged in the ArcA-deficient strain (Table III). In agreement with these findings, previous studies using cydA::lacZ fusions showed that transcription of the cydAB operon is ArcA-repressed when oxygen becomes limiting (16, 44, 45). Subsequent studies have shown that ArcA functions to anti-repress cydAB transcription when oxygen is limiting (46), whereas FNR is required for repression when the oxygen tension is decreased further (14, 17, 45). As our study was carried out in full anaerobiosis, the ArcA effect was not observed, but the FNR effect was observed in our previous study (1). There are three ArcA sites that have been footprinted (17, 31). The study by Liu and De Wulf (22) also identified cydA to be ArcA-controlled; however, their study indicated that it is ArcA-activated (5.2-fold).

The remaining 31 genes of this cluster have not been studied previously for their expression under anaerobic growth conditions; however, one contains a putative ArcA-binding site (ykgI) (Table III). Again, the genes of this cluster are members of the same functional classes of expression patterns I and II. Three genes (fabG, rfbX, and katE) are involved in small molecule metabolism. 17 genes (rplB, rplC, rplO, hflC, rplF, rplQ, rplI, rpsE, rho, prfB, rplD, rpsH, tsf, rplE, nfi, tig, and lysS) are involved in macromolecule synthesis or degradation. 10 genes of this cluster are of unclassified function, seven of which were also identified in our FNR study (1). The remaining gene, eaeH (homologous to attachment and effacement proteins), also clustered into this expression pattern.

Expression Pattern IV: Increased Expression during Anaerobiosis and No Change in an ArcA Strain—The six genes of this cluster (Table IV) showed elevated expression under anaerobic growth conditions, but were not affected by deletion of the arcA allele. Two genes of unknown function clustered into this group (ybeD and ygjD) and also clustered into the same group in our FNR study (1). The remaining members of this cluster include htpG (a heat shock protein), mrr (involved in the restriction of methylated adenine residues; also clustered into this group in Ref. 1), cysK (cysteine synthase), and cof (complementation of fur). A search of the promoter regions of these six genes identified a putative ArcA-binding site upstream of one of these genes: ybeD. None of these genes were identified by Liu and De Wulf (22).

Expression Pattern V: Increased Expression during Anaerobiosis and Increased Expression in an ArcA Strain—This cluster contains only a single gene of unknown function: ybjX (Table V). A similar pattern of expression was also observed previously (1).

Expression Pattern VI: No Change during Anaerobiosis and Increased Expression in an ArcA Strain—Of the four genes of this cluster, three are involved in small molecule metabolism and transport: gapA (structural gene for glyceraldehyde-3-phosphate dehydrogenase A, essential for glycolysis), potF (member of the potFGHI operon involved in the transport of putrescine), and hisJ (member of the hisTJQMP operon encoding a histidine-binding protein that is part of the periplasmic permeases for the high affinity uptake of histidine). The final member, ydcF, is currently uncharacterized. All four members of this expression pattern clustered into the same group in our FNR study (1).

Expression Pattern VII: Decreased Expression during Anaerobiosis and Decreased Expression in an ArcA Strain—The same five genes observed in this expression pattern were also observed in our study with FNR (1). Two of the genes, frdA and nirB, have been shown previously to be FNR-regulated (47–49). As we discussed previously (1), the discrepancy in these data is likely due to paralogs in the genome with these two genes (sdhA to frdA and nirD, cysI and cysJ to nirB). The remaining genes include rpmC (ribosomal protein) and two uncharacterized genes, ybdE (cusB) and ylcD (cusA).

Expression Pattern VIII: No Change during Anaerobiosis and Decreased Expression in an ArcA Strain—This cluster contains 31 genes, 20 of which are of unknown function. (12 were also identified in our previous FNR study (1).) Of the 11 genes of known function (Table VIII), two are known to be regulated by oxygen and/or ArcA under anaerobic growth conditions, and four contain putative ArcA-binding sites.

The two genes reported to be regulated by oxygen and/or ArcA are fumB and lysU. The anaerobic fumarase, encoded by fumB, is known to be activated during anaerobic fermentative growth (50, 51), and Tseng (51) showed that both ArcA and FNR are responsible for this anaerobic activation. As stated in our previous work (1), although our microarray data indicate that fumB is not regulated with respect to oxygen, its presence in this expression pattern is probably a result of the high sequence identity (80%) between fumB and the aerobically expressed fumarase, fumA. The lysU gene encodes one of the two lysyl-tRNA synthetases (the other being lysS, with which it shares 79% sequence identity (52)) and was reported previously to be induced under anaerobic conditions (53).

Eight members of this expression pattern are involved in macromolecular metabolism: cvpA (colicin V production), aceK (isocitrate dehydrogenase kinase/phosphatase), ftsY (cell division), dnaX (subunit of DNA polymerase III), umuC (involved in mutation induction), menD (o-succinylbenzoate synthase I), degS (periplasmic serine endopeptidase), and tyrB (tyrosine aminotransferase). The final member, fhuC (ferric hydroxamate-dependent iron uptake), is involved in small molecule transport. The remaining 20 genes of this expression pattern have not been characterized. Putative ArcA-binding sites were identified upstream of ycdM and yjhH (Table VIII).

The functional class distribution of the 175 genes of regulatory patterns I–VIII is shown in Fig. 6. Roughly 37.7% are hypothetical or unclassified, whereas another 23.4% are involved in small molecule metabolism. Most of the previously documented oxygen-controlled genes fall into the category of carbon and energy metabolism (8%). The study by Liu and De Wulf (22) identified 58 new genes/operons that are implicated in energy metabolism, transport, survival, catabolism, and transcriptional regulation.



FIG. 6.
Distribution of functions for genes affected by oxygen availability and ArcA. The distribution of the 175 genes with PPDE(<p) values >0.99 and p values <0.0013 is as follows: small molecule biosynthesis and transport, 41; carbon and energy metabolism, 14; macromolecular biosynthesis, 48; regulation, three; cell structure, three; and hypothetical or unclassified, 66.

 
Genes Not Expressed in at Least One Experiment
Only those genes exhibiting an expression level greater than zero in all experiments were used for statistical analysis. To identify differentially expressed genes that were not expressed under one condition but turned on under another treatment condition (or vice versa), gene measurements containing zero expression values were set aside and are listed in Table IX. This set contains only eight genes with expression values of at least 1 x 10^-4 of total mRNA for all measurements in at least one experiment with a coefficient of variance <0.2 (Table IX). Seven are members of pattern II (increased expression during anaerobiosis and decreased expression in an ArcA strain): yddS, ftsW, hyaD, ldcC, ybdA, yhgE, and yrbF. The remaining gene, frdB, is a member of pattern VII (decreased expression during anaerobiosis and decreased expression in an ArcA strain). Two of these genes contain putative ArcA-binding sites: yddS and yhgE (Table IX).
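The selection rule described in this paragraph can be sketched as a two-step filter: keep genes that have at least one zero measurement, then require at least one experiment in which every replicate reaches the stated expression threshold with a low coefficient of variance. The sketch below is a minimal illustration under assumptions: the data layout, the gene names, the use of the sample standard deviation, and all function names are hypothetical and are not taken from the original analysis.

```python
from statistics import mean, stdev

MIN_FRACTION = 1e-4  # threshold quoted in the text (fraction of total mRNA)
MAX_CV = 0.2         # coefficient-of-variance cutoff quoted in the text


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    m = mean(values)
    return stdev(values) / m if m > 0 else float("inf")


def has_zero(measurements):
    """True if any replicate in any experiment is exactly zero."""
    return any(v == 0 for reps in measurements.values() for v in reps)


def reliable_in_some_experiment(measurements):
    """True if one experiment has all replicates above threshold and low CV."""
    return any(
        len(reps) > 1
        and all(v >= MIN_FRACTION for v in reps)
        and coefficient_of_variation(reps) < MAX_CV
        for reps in measurements.values()
    )


def on_off_candidates(data):
    """Genes set aside for zeros that are still reliably expressed somewhere."""
    return sorted(
        gene for gene, m in data.items()
        if has_zero(m) and reliable_in_some_experiment(m)
    )


# Hypothetical replicate measurements (fraction of total mRNA).
data = {
    "geneA": {"aerobic": [0.0, 0.0, 0.0, 0.0],
              "anaerobic": [2.0e-4, 2.2e-4, 1.9e-4, 2.1e-4]},
    "geneB": {"aerobic": [0.0, 1.0e-5, 0.0, 0.0],
              "anaerobic": [3.0e-5, 4.0e-5, 2.0e-5, 5.0e-5]},
    "geneC": {"aerobic": [1.5e-4, 1.6e-4, 1.4e-4, 1.5e-4],
              "anaerobic": [3.0e-4, 3.1e-4, 2.9e-4, 3.0e-4]},
}
print(on_off_candidates(data))
```

In this toy input only geneA passes: geneB never reaches the threshold, and geneC has no zero measurement, so it would have stayed in the main statistical analysis.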

Venn Diagram
To better visualize the interaction between the oxygen, ArcA and FNR regulons, Venn diagrams were created (Fig. 7). The top 500 genes (sorted by p value) from each data set were used as in the construction of the 175-gene list described above and the 205-gene list from our previous study (1). Interestingly, 303 genes were found to be regulated by both ArcA and FNR, and 74 of these genes showed additional regulation by oxygen (Fig. 7A). This is in contrast to the 16 genes reported previously to be co-regulated (5, 6).



FIG. 7.
Venn diagrams. A, Venn diagram for oxygen-, ArcA-, and FNR-regulated genes. The top 500 genes for ArcA (upper left circle), FNR (upper right circle), and oxygen (lower circle) are shown. B, Venn diagram for ArcA-, FNR-, and Lrp-regulated genes. The top 500 genes for ArcA (upper left circle), FNR (upper right circle), and Lrp (lower circle) are shown.

 
In looking at the top 500 genes from each group, 48 genes were identified as being subject solely to ArcA regulation and 57 solely to FNR regulation under anaerobic conditions. The remaining 321 genes pose an interesting question as to whether or not another global oxygen regulator that has yet to be identified exists within the E. coli genome. Moreover, the 378 genes in the ArcA grouping and the 369 genes in the FNR grouping that do not show regulation by oxygen, but that are regulated by each of these proteins (or co-regulated) under anaerobic conditions, suggest that these two proteins may also be important for adaptation to the anaerobic environment. It is also important to note that a large proportion of the 515 genes in this latter group are currently of unknown function. In addition to the comparisons above, a second comparison between the ArcA, FNR, and Lrp (25) regulons was also done (Fig. 7B), as we had indicated previously an overlap between the FNR and Lrp data sets (1). This diagram reveals 48 genes overlapping between the Lrp and FNR regulons, 43 genes overlapping between the ArcA and Lrp regulons, and 26 genes overlapping between all three (data not shown). These comparisons strongly suggest that regulatory networks are more complex than described previously.
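The region counts discussed above are the result of partitioning three gene lists into the seven regions of a three-set Venn diagram. A minimal sketch of that bookkeeping is given here. The gene identifiers are hypothetical placeholders, not data from this study, and the function name is an assumption made for illustration.

```python
def venn_regions(a, b, c):
    """Partition the union of three sets into the seven Venn regions."""
    a, b, c = set(a), set(b), set(c)
    return {
        "a_only": a - b - c,
        "b_only": b - a - c,
        "c_only": c - a - b,
        "a_b_only": (a & b) - c,
        "a_c_only": (a & c) - b,
        "b_c_only": (b & c) - a,
        "a_b_c": a & b & c,
    }


# Hypothetical toy lists standing in for the top-ranked genes of each data set.
arca = {"g1", "g2", "g3", "g4"}
fnr = {"g2", "g3", "g5"}
oxygen = {"g3", "g4", "g6"}

regions = venn_regions(arca, fnr, oxygen)
counts = {name: len(genes) for name, genes in regions.items()}
print(counts)
```

Because the seven regions are disjoint, their sizes always add up to the size of the union of the three lists, which is a convenient consistency check when reading counts off a diagram such as Fig. 7.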

Comparison with Other Studies
When different array formats are used, the magnitudes and sources of experimental errors are surely different. This raises the question of whether or not results obtained from experiments performed with different DNA array formats such as pre-synthesized filter arrays and in situ synthesized Affymetrix GeneChips can be compared with one another. We have previously addressed this question. Hung et al. (25) compared the results of 4-fold replicated gene expression profiles of otherwise wild-type and lrp isogenic E. coli strains performed with these two DNA microarray formats. To emphasize variance due to format differences, the same RNA samples were used for target preparation for both formats, and the data were analyzed with Cyber-T software as described here. When the top 100 genes with the lowest p values obtained with each format were compared, a highly significant number of genes, 29, were in common.

Liu and De Wulf (22) have reported the transcriptional profiles of arcA+ and arcA- E. coli cells grown under anaerobic conditions and generously provided us with their raw data. A comparison of this Affymetrix GeneChip data with our filter array data, both analyzed with Cyber-T software, does not show significant agreement. Of the top 100 genes with the lowest p values (<0.018) obtained with each format, only three genes were in common. Because Liu and De Wulf used a different data analysis software package (Spotfire) and defined differentially expressed genes as those with an expression level coefficient of variance <0.8 and a mutant to wild-type signal ratio of >2 with p < 0.05, it is not possible to directly compare their results with the results presented here. In addition, Liu and De Wulf also used a different carbon source (xylose rather than glucose). We can, however, compare conclusions. They reported 58 differentially expressed genes or operons under the direct control of ArcA as evidenced by the presence of a documented or putative DNA-binding site. In our data set, these genes exhibit p values ranging from 3.8 x 10^-6 to 0.9 and PPDE(p) values ranging from 1.0 to 3.2 x 10^-8. This suggests many false negatives and false positives in the data set of Liu and De Wulf.
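The two top-100 comparisons above (29 genes in common in the Lrp format comparison, three genes in common with the data of Liu and De Wulf) can be put on a common scale by asking how large an overlap would be expected by chance. The sketch below uses a hypergeometric tail probability for this purpose. It is an illustration only, not the test used by the authors, and it assumes that both lists are drawn from the same universe of 4290 E. coli genes mentioned elsewhere in this article; the function name is hypothetical.

```python
from math import comb


def overlap_p_value(universe, list_a, list_b, shared):
    """P(overlap >= shared) if list_b were a random draw from the universe.

    universe: total number of genes; list_a, list_b: sizes of the two lists;
    shared: observed number of genes present in both lists.
    """
    total = comb(universe, list_b)
    upper = min(list_a, list_b)
    tail = sum(
        comb(list_a, k) * comb(universe - list_a, list_b - k)
        for k in range(shared, upper + 1)
    )
    return tail / total


GENES = 4290  # assumed universe: number of E. coli genes quoted in the text

expected = 100 * 100 / GENES          # mean overlap expected by chance
p_29 = overlap_p_value(GENES, 100, 100, 29)
p_3 = overlap_p_value(GENES, 100, 100, 3)
print(f"expected overlap by chance: {expected:.2f}")
print(f"P(overlap >= 29) = {p_29:.3g}")
print(f"P(overlap >= 3)  = {p_3:.3g}")
```

Under these assumptions the chance expectation is between two and three shared genes, so an overlap of 29 is far outside what random lists would give, whereas an overlap of three is about what random lists would give. The exact probabilities depend on the assumed universe size and on both platforms measuring the same gene set.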

Implications for Genome-wide Control by ArcA and FNR
In this study, we employed statistical methods (1, 25) for the identification of differentially expressed genes based on experiment-wide false positive and false negative measurement levels. These methods previously allowed us to infer differential expression for more than one-third of the 4290 genes of E. coli during growth in the presence or absence of oxygen (1445 genes) (1). This study has allowed us to determine that ~1243 of these changes in expression level are mediated either directly or indirectly by ArcA (Fig. 2B). These results further support our previous conclusions (1) that the network of genes required for the transition of cells from aerobic to anaerobic growth conditions is as much as 10 times larger than previously suspected. A comparison of the ArcA and FNR gold standard sets showed that 303 genes were regulated by both proteins (Fig. 7A), 74 of which were also affected by oxygen. Previous to this study, only 16 genes had been reported to be co-regulated (5, 6). Therefore, as suggested previously by us (1) and Liu and De Wulf (22), the total number of genes directly activated or repressed by ArcA and FNR is likely to be much higher than documented previously.

Rationale of Regulatory Patterns
Regulatory pattern I (anaerobic repressed gene expression, i.e. decreased expression in the presence of ArcA) (Table I) and pattern II (anaerobic activated gene expression, i.e. increased expression in the presence of ArcA) (Table II) are most easily reconciled with previous reports. Of the 94 genes of these patterns, 24 contain known or putative ArcA-binding site motifs. These results suggest that we might expect the total number of genes directly activated or repressed by ArcA to be in the range of 290 genes. Liu and De Wulf (22) estimated 372 genes.
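The text does not spell out how the estimate of about 290 directly regulated genes was obtained. One reading that reproduces it is to apply the fraction of pattern I and II genes carrying a binding-site motif (24 of 94) to the 1139 genes that this article infers to be differentially expressed in response to ArcA. The sketch below shows that arithmetic; it is an assumption about the derivation, and the function and parameter names are hypothetical.

```python
def extrapolate_direct_targets(with_site, examined, regulated_total):
    """Scale the observed binding-site fraction up to the full regulated set."""
    fraction = with_site / examined
    return fraction * regulated_total


# 24 of the 94 genes in patterns I and II carry a known or putative site;
# 1139 is the number of ArcA-affected genes inferred in this article.
estimate = extrapolate_direct_targets(24, 94, 1139)
print(round(estimate))
```

This gives roughly 291 genes, in line with the range of 290 quoted in the text. The extrapolation assumes that the binding-site fraction seen in the strictly filtered patterns I and II also holds for the much larger set of genes with weaker statistical support.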

Regulatory pattern III (anaerobically repressed, but not affected by ArcA) and pattern IV (anaerobically activated, but not affected by ArcA) are most easily explained as genes controlled by the FNR protein or by an as yet unidentified global regulator such as Lrp, IHF, FIS, or H-NS. Only two of these genes, nuoG and nuoF, are known members of the ArcA regulon.

As in our previous work (1), it is more difficult to understand the physiological roles that the genes of regulatory patterns V–VIII might play in anaerobic metabolism. However, these genes are still members of the same functional classes regulated by FNR (1) and Lrp (25). To illustrate the overlap between genes regulated by ArcA, FNR, and Lrp, a Venn diagram was constructed (Fig. 7B). Three gene sets were compared: the 500 genes with the highest PPDE(p) values (>0.996232) and the lowest p values (<5.26E-04) obtained from the array experiments reported here comparing arcA isogenic strains under anaerobic growth conditions; the 500 genes with the highest PPDE(p) values (>0.991) and the lowest p values (<0.0014) obtained from the array experiments comparing fnr isogenic strains under anaerobic growth conditions; and the genes with the highest PPDE(p) values (>0.80) and the lowest p values (<0.027) obtained from the Lrp array experiments comparing lrp isogenic strains under aerobic growth conditions (25). Among these three gene sets, 26 genes are present in all three, and 43 genes overlap between the ArcA and Lrp regulons. This further supports our previous suggestion (1) that the FNR, Lrp, and now ArcA regulons reveal overlapping functions under aerobic and anaerobic conditions.

Conclusion
In this, our fourth study of global gene expression profiling in E. coli K12, we have again employed rigorous statistical treatment of the data to infer differential expression for 1139 genes in the presence and absence of the ArcA regulatory protein. In agreement with our previous study on the FNR protein (1) and the study of Liu and De Wulf (22), these results demonstrate that the network of genes required for the transition of cells from aerobic to anaerobic growth conditions is much larger than previously suspected (~8–10-fold).

A total of 30 genes had been documented previously as members of the ArcA regulon (5, 6). The study by Liu and De Wulf (22) suggested that 372 genes (or ~9% of the E. coli genome) are potential members of the ArcA regulon. The results presented here identify 135 of 175 genes with p values <0.000174 and PPDE(<p) values >0.9994 whose expression is affected by ArcA. However, if we include all genes expressed at a level above the background and examine the PPDE versus p value plots, we have a 63% confidence level that any gene in our oxygen-regulated set is differentially expressed (1), i.e. 63% of the 2820 genes or ~1700 genes. In the same manner, using the same PPDE versus p value plots, 67% of these 1700 genes or 1139 genes are either directly or indirectly regulated by ArcA. Thus, these results greatly expand our knowledge of genes that compose the ArcA regulatory network.
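The chain of estimates in this paragraph can be retraced with a few lines of arithmetic. The sketch uses only numbers quoted in this article (2820 expressed genes, the 63% and 67% confidence levels, the rounded figure of about 1700 genes, the 372 genes of Liu and De Wulf, and the 4290-gene genome size); the variable names are illustrative.

```python
EXPRESSED_GENES = 2820      # genes expressed above background
GENOME_GENES = 4290         # number of E. coli genes quoted in this article

# 63% confidence applied to all expressed genes; the text rounds this to ~1700.
oxygen_regulated = 0.63 * EXPRESSED_GENES

# 67% confidence applied to the rounded figure of 1700 genes.
arca_regulated = 0.67 * 1700

# Share of the genome represented by the 372 genes of Liu and De Wulf.
liu_share_percent = 100 * 372 / GENOME_GENES

print(round(oxygen_regulated))       # before rounding down to ~1700
print(round(arca_regulated))         # genes affected directly or indirectly by ArcA
print(round(liu_share_percent, 1))   # percent of the genome
```

The second step reproduces the 1139 genes reported in the Conclusion only when the rounded value of 1700 is used as its input. Starting from the unrounded product of the first step would give a somewhat larger number.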


    FOOTNOTES
 
* This work was supported in part by National Institutes of Health Grants GM49694 and AI21678 (to R. P. G.) and Grant GM68903 (to G. W. H.) and by the University of California Institute for Genomics and Bioinformatics (Irvine, CA). The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

{boxs} The on-line version of this article (available at http://www.jbc.org) contains a supplemental table.

b Both authors contributed equally to this work.

c Present address: Dept. of Microbiology and Molecular Genetics, University of California, Irvine, CA 92697.

f Recipient of a postdoctoral fellowship from the University of California Biotechnology Research and Education Program.

h Recipient of Biomedical Informatics Training Program Postdoctoral Fellowship T15 LM-07443 from the National Institutes of Health-National Library of Medicine.

i Supported by a traineeship from the UCLA-Integrative Graduate Education and Research Traineeship Bioinformatics Program funded by National Science Foundation Grant DGE-9987641.

l To whom correspondence may be addressed: Dept. of Microbiology and Molecular Genetics, University of California, Medical Science I, Campus Dr., Irvine, CA 92697. Tel.: 949-824-5344; Fax: 949-824-8595; E-mail: gwhatfie{at}uci.edu.

n To whom correspondence may be addressed: Dept. of Microbiology, Immunology, and Molecular Genetics, UCLA, 609 Charles Young Dr. East, 1602A MSB, Los Angeles, CA 90095. Tel.: 310-206-8201; Fax: 310-206-5231; E-mail: robg{at}microbio.ucla.edu.

1 The abbreviations used are: MOPS, 4-morpholinepropanesulfonic acid; ORF, open reading frame; PPDE, posterior probability of differential expression.

2 R. P. Gunsalus, unpublished data.

3 J. A. Albrecht, unpublished data.



    REFERENCES

  1. Salmon, K., Hung, S. P., Mekjian, K., Baldi, P., Hatfield, G. W., and Gunsalus, R. P. (2003) J. Biol. Chem. 278, 29837-29855[Abstract/Free Full Text]
  2. Blattner, F. R., Plunkett, G., III, Bloch, C. A., Perna, N. T., Burland, V., Riley, M., Collado-Vides, J., Glasner, J. D., Rode, C. K., Mayhew, G. F., Gregor, J., Davis, N. W., Kirkpatrick, H. A., Goeden, M. A., Rose, D. J., Mau, B., and Shao, Y. (1997) Science 277, 1453-1474[Abstract/Free Full Text]
  3. Gunsalus, R. P., and Park, S. J. (1994) Res. Microbiol. 145, 437-450[Medline] [Order article via Infotrieve]
  4. Guest, J. R., Attwood, M. M., Machado, R. S., Matqi, K. Y., Shaw, J. E., and Turner, S. L. (1997) Microbiology 143, 457-466[Abstract/Free Full Text]
  5. Lynch, A. S., and Lin, E. C. C. (1996) in Regulation of Gene Expression in E. coli (Lin, E. C. C., and Lynch, A. S., eds) pp. 362-381, R. G. Landes Co., Austin, TX
  6. Lynch, A. S., and Lin, E. C. C. (1996) in Escherichia coli and Salmonella: Cellular and Molecular Biology (Neidhart, F. C., ed) Vol. 1, pp. 1526-1549, ASM Press, Washington, D. C.
  7. Guest, J. R., Green, J., Irvine, A., and Spiro, S. (1996) in Regulation of Gene Expression in Escherichia coli (Lin, E. C. C., and Lynch, A. S., eds) pp. 317-342, R. G. Landes Co., Austin, TX
  8. Bauer, C. E., Elsen, S., and Bird, T. H. (1999) Annu. Rev. Microbiol. 53, 495-523
  9. Park, S. J., Chao, G., and Gunsalus, R. P. (1997) J. Bacteriol. 179, 4138-4142
  10. Park, S. J., Cotter, P. A., and Gunsalus, R. P. (1995) J. Bacteriol. 177, 6652-6656
  11. Park, S. J., and Gunsalus, R. P. (1995) J. Bacteriol. 177, 6255-6262
  12. Park, S. J., McCabe, J., Turna, J., and Gunsalus, R. P. (1994) J. Bacteriol. 176, 5086-5092
  13. Park, S. J., Tseng, C. P., and Gunsalus, R. P. (1995) Mol. Microbiol. 15, 473-482
  14. Cotter, P. A., Chepuri, V., Gennis, R. B., and Gunsalus, R. P. (1990) J. Bacteriol. 172, 6333-6338
  15. Cotter, P. A., and Gunsalus, R. P. (1989) J. Bacteriol. 171, 3817-3823
  16. Cotter, P. A., and Gunsalus, R. P. (1992) FEMS Microbiol. Lett. 70, 31-36
  17. Cotter, P. A., Melville, S. B., Albrecht, J. A., and Gunsalus, R. P. (1997) Mol. Microbiol. 25, 605-615
  18. Govantes, F., Albrecht, J. A., and Gunsalus, R. P. (2000) Mol. Microbiol. 37, 1456-1469
  19. Drapal, N., and Sawers, G. (1995) Mol. Microbiol. 16, 597-607
  20. Jeong, J. Y., Kim, Y. J., Cho, N., Shin, D., Nam, T. W., Ryu, S., and Seok, Y. J. (2004) J. Biol. Chem. 279, 38513-38518
  21. Sawers, G., and Suppmann, B. (1992) J. Bacteriol. 174, 3474-3478
  22. Liu, X., and De Wulf, P. (2004) J. Biol. Chem. 279, 12588-12597
  23. Silhavy, T. J., Berman, M. L., and Enquist, L. W. (1984) Experiments with Gene Fusions, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
  24. Neidhardt, F. C., Bloch, P. L., and Smith, D. F. (1974) J. Bacteriol. 119, 736-747
  25. Hung, S. P., Baldi, P., and Hatfield, G. W. (2002) J. Biol. Chem. 277, 40309-40323
  26. Baldi, P., and Hatfield, G. W. (2002) DNA Microarrays and Gene Expression: From Experiments to Data Analysis and Modeling, Cambridge University Press, Cambridge, UK
  27. Hatfield, G. W., Hung, S. P., and Baldi, P. (2003) Mol. Microbiol. 47, 871-877
  28. Long, A. D., Mangalam, H. J., Chan, B. Y., Tolleri, L., Hatfield, G. W., and Baldi, P. (2001) J. Biol. Chem. 276, 19937-19944
  29. Baldi, P., and Long, A. D. (2001) Bioinformatics 17, 509-519
  30. Allison, D. B., Gadbury, G. L., Heo, M., Fernández, J. R., Lee, C. K., Prolla, T. A., and Weindruch, R. (2002) Comput. Stat. Data Anal. 39, 1-20
  31. Lynch, A. S., and Lin, E. C. C. (1996) J. Bacteriol. 178, 6238-6249
  32. Bailey, T. L., and Elkan, C. (1994) in Proceedings of the Second International Conference on Intelligent Systems for Molecular Biology (Altman, R., Brutlag, D., Karp, P., Lathrop, R., and Searls, D., eds) pp. 28-36, AAAI Press, Menlo Park, CA
  33. Bailey, T. L., and Gribskov, M. (1998) Bioinformatics 14, 48-54
  34. Salgado, H., Gama-Castro, S., Martinez-Antonio, A., Diaz-Peredo, E., Sanchez-Solano, F., Peralta-Gil, M., Garcia-Alonso, D., Jimenez-Jacinto, V., Santos-Zavaleta, A., Bonavides-Martinez, C., and Collado-Vides, J. (2004) Nucleic Acids Res. 32, D303-D306
  35. Chao, G., Shen, J., Tseng, C. P., Park, S. J., and Gunsalus, R. P. (1997) J. Bacteriol. 179, 4299-4304
  36. Shen, J., and Gunsalus, R. P. (1997) Mol. Microbiol. 26, 223-236
  37. Weidner, U., Geier, S., Ptock, A., Friedrich, T., Leif, H., and Weiss, H. (1993) J. Mol. Biol. 233, 109-122
  38. Bongaerts, J., Zoske, S., Weidner, U., and Unden, G. (1995) Mol. Microbiol. 16, 521-534
  39. Compan, I., and Touati, D. (1994) Mol. Microbiol. 11, 955-964
  40. Ma, Z., Richard, H., Tucker, D. L., Conway, T., and Foster, J. W. (2002) J. Bacteriol. 184, 7001-7012
  41. Ma, Z., Richard, H., and Foster, J. W. (2003) J. Bacteriol. 185, 6852-6859
  42. Masuda, N., and Church, G. M. (2003) Mol. Microbiol. 48, 699-712
  43. Tramonti, A., Visca, P., De Canio, M., Falconi, M., and De Biase, D. (2002) J. Bacteriol. 184, 2603-2613
  44. Iuchi, S., Chepuri, V., Fu, H. A., Gennis, R. B., and Lin, E. C. C. (1990) J. Bacteriol. 172, 6020-6025
  45. Tseng, C. P., Albrecht, J., and Gunsalus, R. P. (1996) J. Bacteriol. 178, 1094-1098
  46. Govantes, F., Orjalo, A. V., and Gunsalus, R. P. (2000) Mol. Microbiol. 38, 1061-1073
  47. Jones, H. M., and Gunsalus, R. P. (1987) J. Bacteriol. 169, 3340-3349
  48. Bell, A. I., Cole, J. A., and Busby, S. J. (1990) Mol. Microbiol. 4, 1753-1763
  49. Jayaraman, P. S., Cole, J. A., and Busby, S. J. (1989) Nucleic Acids Res. 17, 135-145
  50. Tseng, C. P., Yu, C. C., Lin, H. H., Chang, C. Y., and Kuo, J. T. (2001) J. Bacteriol. 183, 461-467
  51. Tseng, C. P. (1997) FEMS Microbiol. Lett. 157, 67-72
  52. Hirshfield, I. N., Tenreiro, R., Vanbogelen, R. A., and Neidhardt, F. C. (1984) J. Bacteriol. 158, 615-620
  53. Leveque, F., Gazeau, M., Fromant, M., Blanquet, S., and Plateau, P. (1991) J. Bacteriol. 173, 7903-7910




Copyright © 2005 by the American Society for Biochemistry and Molecular Biology.