The Role of the embA and embB Gene Products in the Biosynthesis of the Terminal Hexaarabinofuranosyl Motif of Mycobacterium smegmatis Arabinogalactan*

The emb genes are conserved among different mycobacteria. In Mycobacterium smegmatis and Mycobacterium tuberculosis, they belong to an operon comprising three genes, embC, embA, and embB. The EmbB protein has been proposed to be the target of ethambutol, a drug that is known to inhibit the synthesis of the arabinan portion of the mycobacterial cell wall arabinogalactan (AG). To further define the role of the EmbB protein in arabinan biosynthesis, the embA, -B, and -C genes were inactivated individually by homologous recombination in M. smegmatis. All three mutants were viable, and among the three, the slowest growing embB− mutant underwent profound morphological changes and exhibited a higher sensitivity to hydrophobic drugs and detergents, presumably due to an increase in cell wall permeability. Furthermore, chemical analyses showed that there was a diminution in the arabinose content of arabinogalactan from the mutants.

The cell wall of mycobacteria consists of a highly impermeable layer of unique 70-90-carbon lipids, the mycolic acids, covalently linked to the polysaccharide arabinogalactan (AG). The central role of arabinan in the cell wall proper appears to be in maintaining its integrity by tethering the parallel-packed outer mycolic acid lipid barrier perpendicular to the underlying peptidoglycan layer through the flexible glycosyl linkages of AG to form the mycolylarabinogalactan-peptidoglycan (mAGP) complex. The primary structure of the arabinan portion of AG has been elucidated (1-3). Unlike the comparable arabinan structure in lipoarabinomannan (LAM) (4), the arabinan of AG terminates only in a well defined hexaarabinofuranosyl (Ara6) motif (1), where both the terminal β-Araf and the penultimate 2-α-Araf serve as the anchoring points for the mycolic acids. In Mycobacterium tuberculosis, two such Ara6 units were found to be further assembled into a unique Ara22-mer, the largest structurally defined arabinan unit to date (5). Despite these advances, the precise structural details with respect to the number and further conjugation of the Ara22-mers to the -(β-D-Galf-(1→5)-β-D-Galf-(1→6)-)n scaffold remain unclear. Nor is it known whether the Ara22-mer architecture also applies to the AG of all other Mycobacterium spp., although compositional analysis and NMR fingerprinting have implicated a similar terminal Ara6 motif as a well conserved feature among those examined to date (28), including the AG of Mycobacterium avium and Mycobacterium smegmatis.
One of the major antituberculous drugs, ethambutol (EMB), acts by inhibiting the biogenesis of the cell wall, apparently by direct action on arabinan biosynthesis (6-8). A two-gene locus (embAB) responsible for EMB resistance through target overexpression has been identified in M. avium, and an increase in the minimal inhibitory concentration of EMB was demonstrated (9). Furthermore, these two genes were found to confer low levels of resistance to EMB when overexpressed in an otherwise susceptible M. smegmatis host (9). Three contiguous genes encoding putative target(s) for EMB in M. smegmatis and M. tuberculosis were subsequently cloned, sequenced, and characterized (10). Two of these genes were similar to the embA and embB genes described in M. avium, and the third one was termed embC. These genes are likely to be organized as an operon in the order embC, embA, and embB. It is reasonable to expect that these genes are transcribed as a single polycistronic mRNA from a unique promoter, but this has yet to be shown. However, a previous study (11) strongly implies that at least the embB gene can be expressed from its own individual promoter, the exact location of which remains to be characterized. The embCAB gene cluster was initially identified in an EMB-resistant strain of M. smegmatis (10,11) and was subsequently characterized in both M. tuberculosis (10) and Mycobacterium leprae (12). The majority of M. tuberculosis clinical isolates resistant to EMB have mutations in embB (10,13,14), and thus the EmbB protein has been proposed as the main target of EMB.
The Emb proteins are predicted to be integral membrane proteins with 11-13 transmembrane domains and a large carboxyl-terminal globular region located externally (10,14). Furthermore, M. smegmatis strains overexpressing embA, embB, and embC had increased arabinosyltransferase activities, implying that embA, -B, and -C could encode arabinosyltransferases (9). However, heterologous expression of functional Emb proteins for direct biochemical assays or mechanistic studies has not yet succeeded, so enzymatic activities cannot be attributed unequivocally. As an alternative, we took a reverse genetic approach, inactivating the embA, -B, and -C genes individually, and report here the resulting phenotypes and the structural alterations in the arabinan component of the cell wall AG. Interruption of embA and -B led to a marked decrease in the total arabinose content. In particular, the terminal disaccharide β-D-Araf-(1→2)-α-D-Araf unit normally found on the 3-arm of the characteristic Ara6 is largely missing in both mutants. This is the first report of AG-deficient mutants of mycobacteria with a specific lesion in the terminal elaboration of arabinan that could be restored by complementation with the respective functional genes.
Plasmids and DNA Manipulations: pBluescript II KS+ was purchased from Stratagene. pOLYG and pPR23 were kindly supplied by Dr. Ó Gaora and Dr. B. Gicquel, respectively (18,19). All the restriction and nucleic acid-modifying enzymes were purchased from New England Biolabs. Taq DNA polymerase was purchased from Promega. Plasmid DNAs were purified on Qiagen columns according to the recommendations of the manufacturer (Qiagen, GmbH).
Insertional Mutagenesis: DNA fragments carrying internal sequences of embA and embC from the M. smegmatis chromosome were amplified by PCR using the following primers: for embA, positions 1128-1257 and positions 3228-3257, according to the coding sequence, and for embC, positions 44-73 and positions 2083-2112 (GenBank™ accession number U68480). Amplified DNA fragments were cloned into pBluescript, and a cassette conferring resistance to kanamycin (aphA-3) (20) was inserted at the MscI site for embA and the NcoI site for embC. The resulting DNA fragments were finally subcloned into pPR23 (19), a suicide shuttle vector carrying the counterselection marker sacB, to give pMAL-A and pMAL-C, respectively (Fig. 1A). For the embB gene, a 2930-bp NruI fragment carrying the embB gene lacking its 5′-end was cloned into pBluescript. A 765-bp MscI-MluI intragenic fragment was deleted and replaced with a promoterless cassette conferring resistance to kanamycin (aphA-3 ΔP). The resulting fragment was then subcloned into pPR23 to give pMAL-B (Fig. 1A). pMAL-A, -B, and -C were introduced into M. smegmatis by electroporation, and transformants were selected at 30°C on Middlebrook 7H10 medium containing 5 μg/ml gentamicin. Transformants were grown in 7H9 liquid medium without selection and then plated at 42°C on 7H10 supplemented with kanamycin (20 μg/ml) and sucrose (10%). Kanamycin-resistant and gentamicin-sensitive clones were analyzed by both PCR amplification with appropriate primers and Southern blot to confirm the double crossover events and replacement of wild-type sequences.
Electron Microscopy: For negative staining, samples of exponentially growing and stationary phase cells were fixed at room temperature for 10 min with formaldehyde at a final concentration of 1% and centrifuged, and the pellet was suspended in 0.5% phosphotungstic acid, pH 7 (21). Formvar/carbon-coated grids were covered with one drop of bacterial suspension, air dried, and directly examined under the electron microscope (Jeol JEM-100CXII). For thin sectioning, samples were processed exactly as described previously (22). Intracellular polysaccharides were stained by the method of Thiery (23) as subsequently modified (22).
Drug Resistance Assays: Minimal inhibitory concentrations were determined by plating cells on solid Middlebrook 7H10 medium supplemented with various concentrations of the tested drug. The minimal inhibitory concentration was defined as the lowest concentration of drug that inhibited 100% of growth after 5 days of incubation (9).
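The definition above can be expressed as a small lookup over a dilution series. The following is only a minimal sketch: the function name, the readout format (a mapping from concentration to a growth/no-growth flag), and the example plate are all hypothetical and are not data from this study.

```python
def minimal_inhibitory_concentration(growth_by_conc):
    """Return the lowest tested concentration with no growth, or None.

    growth_by_conc maps a drug concentration (ug/ml) to True when any
    growth was seen after incubation (hypothetical readout format).
    """
    for conc in sorted(growth_by_conc):
        if not growth_by_conc[conc]:
            return conc
    return None  # growth at every tested concentration


# Hypothetical plate readout, not data from this study.
example = {0.5: True, 1.0: True, 2.0: True, 4.0: False, 8.0: False}
print(minimal_inhibitory_concentration(example))
```

If growth is observed at every tested concentration, the sketch returns None rather than guessing a value beyond the tested range.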
Preparations of AG: Cells (2 g of wet weight) were delipidated with hot absolute ethanol followed by a 2:1 chloroform:methanol extraction as described previously (24). The dried residual biomass was rehydrated with phosphate-buffered saline and disrupted mechanically with the Soniprep 150 (Sanyo Gallenkamp PLC), using six cycles of 60 s of disruption followed by 90 s of rest. The suspension after cell breakage was extracted three times with 50% ethanol, centrifuging each time at 27,000 × g. The supernatants were concentrated and partitioned between water and phenol to remove proteins (25). The aqueous layer containing the majority of cellular LAM, lipomannan, and phosphatidylinositol mannosides was freeze-dried, reconstituted in the same volume of water, and used for SDS-PAGE. The core cell wall mycolylarabinogalactan-peptidoglycan complex was obtained by extraction of the residual pellet with phosphate-buffered saline containing 2% SDS followed by extensive washing of the insoluble residue in the same buffer. The SDS was removed by sequential washing with phosphate-buffered saline, water, and finally acetone. The insoluble residue (mAGP complex) was then solubilized by alkaline treatment with 1 M sodium hydroxide (1) for 16 h at 80°C followed by neutralization and dialysis. The freeze-dried material was applied to a 26-mm × 1-m column of Sephacryl S-300 (Amersham Biosciences, Inc.) and eluted with Milli-Q water at ~20 ml/h. Fractions were pooled on the basis of neutral sugar content and dried. Pools were analyzed by converting aliquots into alditol acetates after hydrolysis.
GC and GC/MS Analysis of Glycosyl Composition: Alditol acetates and partially methylated alditol acetates for GC/MS analysis were prepared according to the procedure described previously (22). Briefly, AG samples or permethylated derivatives were hydrolyzed with 2 M trifluoroacetic acid for 2 h at 121°C, reduced with 10 mg/ml sodium borodeuteride (NaBD4), 2 M NH4OH at room temperature for 2 h, and then acetylated with acetic anhydride at 100°C for 1 h. GC was carried out on an SP 2380 (30-m × 0.25-mm internal diameter, Supelco) column at an initial temperature of 50°C held for 1 min. The temperature was then raised to 170°C at 30°C/min before increasing to 260°C at 5°C/min. GC/MS was carried out using a ThermoQuest Trace Gas Chromatograph 2000 (ThermoQuest, Austin, TX) connected to a GCQ/Polaris MS mass detector (ThermoQuest). The partially methylated alditol acetates were dissolved in chloroform prior to injection on a DB-5 column (10-m × 0.18-mm internal diameter, J&W Scientific, Folsom, CA) at an initial temperature of 50°C held for 1 min. The temperature was raised to 150°C at 30°C/min before increasing to 260°C at 5°C/min.
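As a quick arithmetic check, the two oven programs can be converted into approximate run times. This sketch assumes that the programs end when 260°C is reached, since no final hold time is stated; the helper name `program_minutes` is illustrative only.

```python
def program_minutes(start_c, hold_min, ramps):
    """Total run time (min) for an oven temperature program.

    ramps is a list of (target_c, rate_c_per_min) steps applied in order.
    Assumes no final hold, because none is stated in the text.
    """
    total = hold_min
    current = start_c
    for target_c, rate in ramps:
        total += (target_c - current) / rate
        current = target_c
    return total


# SP 2380 column: 50 C for 1 min, 30 C/min to 170 C, 5 C/min to 260 C
sp2380 = program_minutes(50, 1, [(170, 30), (260, 5)])
# DB-5 column: 50 C for 1 min, 30 C/min to 150 C, 5 C/min to 260 C
db5 = program_minutes(50, 1, [(150, 30), (260, 5)])
print(f"SP 2380: {sp2380:.1f} min, DB-5: {db5:.1f} min")
# prints "SP 2380: 23.0 min, DB-5: 26.3 min"
```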
Digestion with Endoarabinanase: Selective growth of a soil microorganism, Cellulomonas gelida, from which the endoarabinanase was purified, has been described previously (26). To ensure a consistent result, 10 μg of each AG sample was treated with 5 μl of the enzyme preparation for 8 h at 37°C. Aliquots from the digestion product mixtures that contained the released oligoarabinosides and oligogalactosides were analyzed directly by high pH anion exchange chromatography (HPAEC). Analytical HPAEC was performed on a Dionex LC system fitted with a Dionex Carbopac PA-1 column, and the oligoglycosides were detected with a pulse-amperometric detector (PAD-II).
Electrospray Mass Spectrometry Analyses: Samples were permethylated using the NaOH/dimethyl sulfoxide slurry method or were perdeuteroacetylated with the pyridine:d6-acetic anhydride method as described by Dell et al. (27). Electrospray MS analyses were performed on an Autospec orthogonal acceleration-time of flight mass spectrometer (Micromass, Manchester, United Kingdom) fitted with an electrospray source assembly and operated at 4 kV accelerating voltage. Samples were dissolved in methanol, and 10-μl aliquots were injected through a Rheodyne loop into the mobile phase (methanol:water:acetic acid, 50:50:1, v/v/v), delivered at a flow rate of 5 μl/min into the electrospray source by a syringe pump.
NMR Analyses of Intact AG: One-dimensional 1H and 13C DEPT (distortionless enhancement by polarization transfer) experiments were performed on a Varian Inova AM-400, and 1H-13C two-dimensional HSQC (heteronuclear single quantum correlation spectroscopy) NMR spectra were acquired on a Varian Inova 500 MHz NMR spectrometer (NIH-SIG RR11981) using the supplied Varian pulse sequences. Spectra were acquired after several lyophilizations with D2O on a solution of 10 mg/0.6 ml in 100% D2O. The HSQC data were acquired with a 7-kHz window for proton in F2 and a 15-kHz window for carbon in F1. The total recycle time was 1.65 s between transients. Adiabatic decoupling was applied to carbon during proton acquisition. Pulsed field gradients were used throughout for artifact suppression but were not used for coherence selection. The data set consisted of 1K complex points in t2 by 256 complex points in t1 using States-TPPI. Forward linear prediction was used for resolution enhancement to expand t1 to 512 complex points. A cosine-squared weighting function and zero filling were applied to both t1 and t2 prior to the Fourier transform. The final resolution was 3.5 Hz/point in F2 and 15 Hz/point in F1.
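The quoted digital resolution can be related to the spectral windows and point counts with a short calculation. The extent of zero filling is not stated in the text, so the factor of 2 used here is an assumption, as is the function name; under that assumption the computed values come out close to the reported 3.5 and 15 Hz/point.

```python
def hz_per_point(spectral_width_hz, complex_points, zero_fill_factor=2):
    """Digital resolution after zero filling.

    zero_fill_factor=2 (one doubling) is an assumption; the text only
    says that zero filling was applied.
    """
    return spectral_width_hz / (complex_points * zero_fill_factor)


f2 = hz_per_point(7000, 1024)   # 7-kHz proton window, 1K complex points in t2
f1 = hz_per_point(15000, 512)   # 15-kHz carbon window, 256 points predicted to 512
print(f"F2: {f2:.2f} Hz/point, F1: {f1:.2f} Hz/point")
# prints "F2: 3.42 Hz/point, F1: 14.65 Hz/point"
```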

Gene Exchange Mutagenesis of the emb Genes of M. smegmatis: The three mutants embA−, embB−, and embC− were generated by homologous recombination using the thermosensitive vector pPR23 containing embA, embB, or embC intragenic fragments disrupted by the aphA-3 cassette conferring resistance to kanamycin (20) (Fig. 1A). To confirm the allelic exchange, chromosomal DNA was prepared from the mutants, and PCR amplifications were performed using primers that yield amplification products only after the double recombination event has occurred. Amplification products of the expected size were obtained with the three emb− mutants (Fig. 1B); no amplification was obtained with the wild-type (WT) strain. Southern blot analysis was also performed on the chromosomal DNA from the mutants and confirmed the results obtained with PCR. Therefore, both PCR and Southern blot experiments demonstrated that the three emb− mutants were the result of allelic exchange involving replacement of the emb genes with the disrupted emb constructs and loss of the vector.
Inactivation of emb Genes Induces Morphological Alterations in M. smegmatis: WT, embA−, embB−, and embC− strains were examined during the exponential phase of growth for morphological changes. Thin sections of WT showed typical morphology with long rods, regular septation, and bacterial sizes of ~5.5 μm (Fig. 2a). In contrast, the embB− mutant showed drastically altered morphology with size shortening (~2.8 μm), swelling, and distortion (Fig. 2c). Accumulation of translucent cytoplasmic inclusions could also be observed (Fig. 2c). The embA− mutant was also altered in its morphology with size shortening (~2.9 μm), slight distortion, and swelling but to a lesser extent than the embB− mutant (Fig. 2b). The embC− mutant exhibited a shorter size (~4.9 μm) but an unaltered shape (Fig. 2d). In this study, it was also found that interruption of the emb genes resulted in a severe alteration in the physiological behavior of the bacterium in several respects: for instance, complete loss of acid fastness for both embB− and embA− strains and only partial acid fastness in the case of the embC− mutant. In addition, susceptibility of these mutants to the hydrophobic antibiotics rifampicin and novobiocin was enhanced (minimal inhibitory concentrations of 125 and 60 μg/ml for the WT in comparison to minimal inhibitory concentrations of 25 and 4 μg/ml for the embA−, 1.5 and 0.25 μg/ml for the embB−, and 40 and 25 μg/ml for the embC−), indicating altered permeability of the mutant strains. These results pointed to the probability that the emb− mutants had an altered cell wall.
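To make the size of the permeability effect easier to compare across strains, the minimal inhibitory concentrations quoted above can be converted into fold decreases relative to WT. This is a simple illustrative calculation on the reported values; the dictionary layout and function name are not from the original work.

```python
# Minimal inhibitory concentrations in ug/ml as (rifampicin, novobiocin),
# taken from the values reported in the text.
mic = {
    "WT": (125, 60),
    "embA-": (25, 4),
    "embB-": (1.5, 0.25),
    "embC-": (40, 25),
}


def fold_decrease(strain):
    """Return (rifampicin, novobiocin) fold decrease in MIC versus WT."""
    wt_rif, wt_nov = mic["WT"]
    rif, nov = mic[strain]
    return wt_rif / rif, wt_nov / nov


for strain in ("embA-", "embB-", "embC-"):
    rif_fold, nov_fold = fold_decrease(strain)
    print(f"{strain}: rifampicin {rif_fold:.1f}x, novobiocin {nov_fold:.1f}x")
```

On these numbers the embB− mutant shows by far the largest shift (roughly 83-fold for rifampicin and 240-fold for novobiocin), in line with its more severe morphological phenotype.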
Selective Effects of emb Gene Inactivation on the Arabinan Synthesis of AG: The peculiar morphological variations observed among the emb− mutants and the evidence that the emb genes encode key proteins in AG biosynthesis (9) led us to analyze the structure of the arabinan of AG from these mutants. Cell wall core (mAGP) was prepared from WT and the embA−, -B−, and -C− mutants as described previously (24), and the ratio of Ara to Gal in mAGP was determined by gas chromatography of the alditol acetates prepared after hydrolysis (28). The glycosyl composition was calculated based on a single rhamnosyl (Rha) residue per AG chain (Table I) (2). The AG preparation from the WT mAGP was found to contain ~100 glycosyl residues with a relative molar ratio of Rha:Gal:Ara of 1:29:64 and an Ara:Gal ratio of 2.2:1, in accord with previous data (1). All three emb− mutants yielded AG with a significant reduction in Ara content concomitant with an increase in the amount of Gal (Table I), although the effect was less dramatic with embC−. In accord with the extent of morphological alteration, the AG from the embB− mutant also gave the lowest Ara:Gal ratio (0.88:1) followed by embA− (1.01:1) and embC− (1.4:1). When complemented in trans with the respective WT emb genes, the embA− and embB− mutants were able to synthesize AG with a glycosyl composition identical to that of the WT AG. This full restoration of normal AG synthesis by a WT copy of the gene corroborates that the defect in AG observed in the embA− and embB− mutants was a direct consequence of the interruption of these genes. Restoration of AG synthesis by embC complementation, however, was less complete, suggesting that the effect of embC inactivation on AG expression could be partly due to a polar effect on the expression of embA and/or embB located directly downstream. However, this remains to be directly demonstrated, and the exact cause of the AG defect in the embC− mutant is under further investigation.
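The compositional figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the text (Table I itself is not reproduced here), and the variable names are illustrative.

```python
# WT molar ratio Rha:Gal:Ara as reported in the text.
wt_residues = {"Rha": 1, "Gal": 29, "Ara": 64}
wt_ratio = wt_residues["Ara"] / wt_residues["Gal"]

# Ara:Gal ratios reported for the mutants.
reported_ratio = {"embA-": 1.01, "embB-": 0.88, "embC-": 1.4}

print(f"WT Ara:Gal = {wt_ratio:.1f}:1")  # prints "WT Ara:Gal = 2.2:1"
for strain, ratio in reported_ratio.items():
    print(f"{strain}: {100 * ratio / wt_ratio:.0f}% of the WT Ara:Gal ratio")
```

Expressed this way, the embA− and embB− ratios fall to less than half of the WT value, while the embC− ratio remains at roughly two-thirds of it.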
Based on the complementation experiments, the presence of the individual embB promoter is confirmed, and in addition, the existence of an individual promoter for embA is also strongly suggested. Thus, the embA and embB genes can be expressed from their own individual promoters, but the genetic organization of the emb genes also indicates that a polycistronic mRNA encoding the three Emb proteins could be synthesized as well. Further analysis of the LAM preparation from embA− and embB− revealed no difference in composition from that of LAM from the WT, indicating that the embAB gene products are specifically involved in the biosynthesis of AG (data not shown). In contrast, full-size LAM could not be obtained from the embC− mutant, suggesting that the embC gene product is involved in the synthesis of LAM. Further detailed analysis of the latter aspects of embC gene inactivation will be reported elsewhere (D. Chatterjee and V. E. Escuyer, manuscript in preparation).

Comparison of the Structures of AG from WT and embB− Using NMR: To provide further insights into the chemical basis of the alteration in arabinan content, the base-solubilized AGs were subjected to NMR analyses. Since compositional analyses had indicated the presence of a substantial amount of Glc in the samples, the crude preparations of AG were applied to a Sephacryl S-300 size fractionation column to remove the majority of the contaminating glucan, which eluted at the void volume. A quantitative 1H NMR analysis of the AGs from WT and embB− clearly showed, as expected, that all the anomeric resonances corresponding to Araf residues were reduced in intensity with respect to the anomeric Galf resonances. This observation was corroborated by a 13C DEPT experiment (Fig. 3) for which the methods used to assign the C-1 resonances of terminal β-Araf, 2-linked α-Araf, and the C-2 of 2-linked α-Araf have been described previously (28). Further confirmation was sought using a two-dimensional 1H-13C HSQC experiment to obtain well resolved spectra (Fig. 4) from which resonances in the highly overlapping anomeric regions of the one-dimensional 1H NMR spectrum could be assigned. Thus, for the WT AG, the C-1 resonance at δ 109 ppm correlated with anomeric protons at δ 5.07 ppm and was assigned to the 6-linked β-Galf. Two overlapping spin systems centered at δ 108.6, 108.7 ppm correlating with protons at ~δ 5.16 ppm were attributed to the 5-linked α-Araf and the 3,5-linked α-Araf. The C-1 resonance of 5,6-linked β-Galf at δ 108.2 ppm (1H at δ 5.21 ppm) was resolved from that of 5-linked β-Galf at δ 108.1 ppm, which correlated with protons at δ 5.28 ppm. The two sets of well separated spin systems were easily identified as the carbon-protons related to the C-1 resonances of 2-linked α-Araf, which otherwise overlapped in the one-dimensional 1H NMR spectrum but separated into two singlets in the 13C DEPT experiment (Fig. 3A). The C-1 resonances of these carbons were at δ 106.8 (1H at δ 5.22 ppm) and δ 106.6 with 1H at δ 5.29 ppm. The farthest upfield signals were the two sets of spin systems belonging to the terminal β-Araf resonating at δ 101.8 ppm (1H at δ 5.19 ppm) and δ 100.8 ppm, respectively, correlating with protons centered at δ 5.44 ppm. The two clear singlets in the 13C DEPT spectrum, which correspond to C-2 of 2-linked α-Araf (Fig. 3A), appeared to be completely overlapping in the HSQC spectrum. However, they could be resolved into two broad peaks in the slice spectrum of this region (see Fig. 4A, inset), indicative of a terminal arabinan structural motif in the WT AG similar to that reported for the AG of M. tuberculosis (1).

There were noteworthy differences in the NMR spectra of the AG of the embB− strain in comparison with those of WT. From the 13C DEPT spectra (Fig. 3), it is clear that one of the signals from each of the three resonance doublets attributed to C-1 of 2-linked α-Araf, C-2 of 2-linked α-Araf, and C-1 of terminal β-Araf is missing in the spectra of the AG of the embB− mutant (Fig. 3B). Likewise in the 1H-13C HSQC spectrum (Fig. 4B), the cross-peaks corresponding to terminal β-Araf and 2-linked Araf extending from the 3-arm of 3,5-linked Araf in the WT are reduced to trace amounts. The slice spectrum in the HSQC (shown as an inset in Fig. 4B) further showed that the two broad singlets corresponding to C-2 of 2-linked Araf have collapsed into one singlet. A reduction in the volume/intensity of the resonance corresponding to 3,5-α-Araf from spin systems at δ 108.7 ppm/δ 5.14 ppm was also evident. Together these data are indicative of the preferential, albeit incomplete, loss of a β-D-Araf-(1→2)-α-D-Araf terminal disaccharide attached to the 3-OH of the otherwise branched 3,5-α-Araf residue against the backdrop of an overall reduction in the arabinan content relative to the galactan.
Interestingly, the relative peak volume of the anomeric resonances as determined from the 1H-13C HSQC experiment (Table II) indicated that the amount of 5,6-linked Galf relative to 5-linked or 6-linked Galf was also reduced in the embB− mutant AG. Since branching in the (5-β-Galf-1→6-β-Galf-)n galactan chain has been attributed to arabinan attachment, the lower degree of branching is consistent with a reduction in the attached arabinan chains and/or an extended chain length of galactan. In comparison, the AG from the embA− mutant exhibited a similar lesion in the arabinan, with a drastic reduction in the 2-linked α-Araf→3, but maintained the same degree of branching in the galactan as in WT AG. Thus, both embA and embB gene products may act together and specifically contribute to the proper synthesis of the terminal arabinan motif of AG. The embB gene may additionally be involved in the assembly of properly synthesized arabinan onto the galactan scaffold.
HPAEC and MS Mapping of Endoarabinanase Digestion Products: It was reasoned that structural alteration within the terminal elaboration of the arabinan would primarily affect the terminal hexaarabinofuranosyl (Ara6) structural motif, which would otherwise be released intact from AG by the crude endoarabinanase/galactanase preparation from C. gelida (26). Specifically, if a terminal β-D-Araf-(1→2)-α-D-Araf unit was indeed missing as suggested by NMR analyses, a terminal tetraarabinofuranoside (Ara4) would be produced instead of Ara6. On the other hand, the other major products, including the diarabinoside α-Araf-(1→5)-Araf (Ara2) from internal arabinan and the cyclic Gal oligomer fragments deriving from the galactans, should remain largely unchanged. The AGs from the WT and emb− strains were therefore digested by the enzyme preparation, and the resulting oligosaccharides were rapidly mapped by two complementary methods, HPAEC (Fig. 5) and MS (Fig. 6) analyses, as described previously (26).
As expected, HPAEC analysis showed that the digested WT AG (Fig. 5A) yielded Ara2 (peak 1) and Ara6 (peak 3) as the major oligoarabinosides together with a series of cyclic Galn, the most prominent of which corresponds to cyclic Gal4 (peak 4). The chemical identities of the HPAEC peaks have been defined previously by running them against authentic standards (29), and their molecular compositions were further corroborated here by MS analysis of the per-O-methyl derivatives (Fig. 6A). Ara6 afforded an [M + Na]+ molecular ion at m/z 1029, while the cyclic Gal oligomers gave [M + Na]+ signals at m/z 839, 1043, 1247, 1451, and 1655 corresponding to cyclic Gal4 to cyclic Gal8. Larger oligomers were also detectable up to cyclic Gal12, whereas no other higher Ara oligomers were observed. The HPAEC peak eluting just before peak 1 could be assigned as deriving from contaminating glucan polymers. If the base-solubilized AG was first subjected to a Sephacryl S-300 column as performed for the samples used for NMR analyses, this peak was found to be largely abolished (data not shown). Furthermore, the relative intensity of this peak in the HPAEC profile correlates with the amount of Glc in composition analysis, terminal Glc and 4-linked Glc in linkage analysis (data not shown), and the molecular ion signal at m/z 681 ([M + Na]+ of Hex3) in the mass spectra (Fig. 6). The significance of yielding a major Glc3 product from the glucan polymer by the enzyme treatment is unclear at present but is considered unrelated to the emb mutations under investigation and hence was not pursued further here.
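The m/z values quoted for the per-O-methyl derivatives follow from residue composition alone. The sketch below is an illustrative calculation, not the authors' procedure: it uses standard monoisotopic atomic masses and the usual increments for permethylated pentose (C7H12O4) and hexose (C9H16O5) residues, adds the two capping groups (CH3 plus OCH3, C2H6O) for non-cyclic oligomers, and reports the integer part of the sodiated mass. The function and constant names are assumptions made for this example.

```python
# Standard monoisotopic atomic masses.
MASS = {"C": 12.0, "H": 1.007825, "O": 15.994915, "Na": 22.989770}


def formula_mass(c, h, o):
    return c * MASS["C"] + h * MASS["H"] + o * MASS["O"]


PENTOSE = formula_mass(7, 12, 4)    # per-O-methylated pentose residue
HEXOSE = formula_mass(9, 16, 5)     # per-O-methylated hexose residue
END_GROUPS = formula_mass(2, 6, 1)  # CH3 + OCH3 capping a non-cyclic oligomer


def sodiated_mz(n_pentose=0, n_hexose=0, cyclic=False):
    """Monoisotopic [M + Na]+ of a permethylated oligosaccharide."""
    mass = n_pentose * PENTOSE + n_hexose * HEXOSE + MASS["Na"]
    if not cyclic:
        mass += END_GROUPS  # cyclic oligomers have no free ends to cap
    return mass


print("Ara6", int(sodiated_mz(n_pentose=6)))   # 1029
print("Ara4", int(sodiated_mz(n_pentose=4)))   # 709
print("Hex3", int(sodiated_mz(n_hexose=3)))    # 681
for n in range(4, 9):
    print(f"cyclic Gal{n}", int(sodiated_mz(n_hexose=n, cyclic=True)))
# cyclic Gal4 to Gal8: 839, 1043, 1247, 1451, 1655
```

Because the calculation depends only on composition, it gives the same value for branched and linear isomers, which is why the linkage information has to come from methylation analysis and NMR rather than from these molecular ions.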
It is apparent that both HPAEC (Fig. 5D) and MS (Fig. 6D) profiles of the digestion products of embC− AG are fairly similar to those of WT, supporting the conclusion that mutation in embC did not affect the synthesis of AG. In contrast, both embA− and embB− yielded AG significantly altered in its terminal arabinan motif. The HPAEC profiles (Fig. 5, B and C) indicated that the amount of Ara6 relative to Ara2 and cyclic Gal4 was drastically reduced, concomitant with the appearance of a peak attributable to Ara4 based on its retention time (peak 2). From MS analyses, it is clear that the signal corresponding to Ara6 (m/z 1029) was reduced in intensity relative to the two flanking cyclic Gal signals, i.e. cyclic Gal4 and Gal5 at m/z 839 and 1043, respectively. Both embA− (Fig. 6B) and embB− (Fig. 6C) were similarly affected, but the effect was more pronounced in embB−. Importantly, the relative intensity of the signal corresponding to Ara4 at m/z 709 was enhanced, as would be expected (see also the insets in Fig. 6 for a clearer visual representation). These altered profiles reverted to normal, namely resembling those of WT AG, when the corresponding digestion products from complemented embAB strains were similarly analyzed (insets in Fig. 5, B and C; Fig. 6E for complemented embB). Thus, it can be firmly concluded that the change in the terminal arabinan structure, as manifested by conversion of a significant proportion of the Ara6 motif into Ara4, was mutation-specific and -dependent. In addition, since HPLC-isolated Ara4 from the digestion mixtures gave terminal Araf, 2-Araf, and 5-Araf in a ~1:1:2 ratio by methylation analysis, the defect can be unequivocally attributed to the 3-arm branching off the otherwise 3,5-branched Araf residue in Ara6. It should be noted, however, that the inhibition was not complete and that a residual amount of Ara6 could still be detected in both embA− and embB−.
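The ~1:1:2 linkage ratio can be rationalized by modeling Ara6 as a small graph and removing the 3-arm disaccharide. The sketch below assumes the commonly described Ara6 arrangement, in which two β-Araf-(1→2)-α-Araf disaccharides sit on O-3 and O-5 of a 3,5-branched α-Araf that is itself 5-linked to the reducing-end residue; the residue labels (A, B, C3, C5, T3, T5) and function name are hypothetical and chosen only for this illustration.

```python
from collections import Counter


def linkage_counts(residues, bonds):
    """Count residues by the positions at which they carry a substituent.

    bonds is a list of (donor, acceptor, position_on_acceptor) tuples.
    """
    substituted = {r: [] for r in residues}
    for donor, acceptor, position in bonds:
        substituted[acceptor].append(position)
    labels = []
    for r in residues:
        positions = sorted(substituted[r])
        if positions:
            labels.append(",".join(map(str, positions)) + "-Araf")
        else:
            labels.append("t-Araf")  # unsubstituted, i.e. terminal
    return Counter(labels)


# A: reducing-end residue; B: 3,5-branched residue; C3/C5: 2-linked residues
# on the 3- and 5-arms; T3/T5: terminal residues (assumed Ara6 layout).
ARA6_RESIDUES = ["A", "B", "C3", "C5", "T3", "T5"]
ARA6_BONDS = [("B", "A", 5), ("C3", "B", 3), ("C5", "B", 5),
              ("T3", "C3", 2), ("T5", "C5", 2)]

# Remove the 3-arm disaccharide to model the Ara4 product.
lost = {"C3", "T3"}
ARA4_RESIDUES = [r for r in ARA6_RESIDUES if r not in lost]
ARA4_BONDS = [b for b in ARA6_BONDS if b[0] not in lost and b[1] not in lost]

print(dict(linkage_counts(ARA6_RESIDUES, ARA6_BONDS)))
print(dict(linkage_counts(ARA4_RESIDUES, ARA4_BONDS)))
```

Under this model, Ara6 contains two terminal, two 2-linked, one 3,5-linked, and one 5-linked residue, whereas the truncated product contains one terminal, one 2-linked, and two 5-linked residues, matching the ratio obtained by methylation analysis. Loss of the 5-arm disaccharide would instead leave a 3-linked residue, which was not reported.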

DISCUSSION
AG is the major constituent of the mycobacterial cell wall. Although the primary structure of this molecule has been established (30), very little is known about its biosynthetic pathway. One of the strategies to identify the genes involved in AG synthesis is based on the hypothesis that some of these genes will be similar to the genes involved in polysaccharide biosynthetic pathways in other bacteria. This approach proved to be highly successful for identifying genes responsible for the early steps of AG biosynthesis (31). However, because D-Araf residues are rarely found in nature and the search for homologous enzymes is therefore problematic, no arabinosyltransferases involved in the polymerization of mycobacterial AG arabinan have thus far been identified. An alternative approach to finding enzymes involved in AG biosynthesis is to use the antimycobacterial drug EMB, which inhibits the biosynthesis of the arabinan moiety of both AG and LAM (8,24). It had been previously demonstrated that β-D-arabinofuranosyl-P-decaprenol is the precursor of the Araf residues of arabinan and accumulates when mycobacterial cells are treated with EMB (32). Further, the transfer of Araf from β-D-arabinofuranosyl-P-decaprenol to arabinan was inhibited by EMB, although only by about 50% in the context of a crude cell-free assay system (33). Based on this evidence, the hypothesis arose that the target(s) for EMB action were arabinosyltransferases. Previous studies had also implicated the Emb proteins, directly or indirectly, in arabinan biosynthesis (9,10). It was demonstrated, by chemical analysis and radiolabeling experiments, that M. smegmatis cells transformed with embAB on a multicopy plasmid and treated with EMB produce normal AG, whereas similarly treated cells containing only the cloning vector showed a marked diminution in the arabinan content of AG (9,24).
Taken together, these data suggested that EmbAB proteins either are themselves arabinosyltransferases or are implicated in arabinan synthesis, for example, in the export of the synthesized oligosaccharide precursors.
[Table footnote fragment] [2]. For the second column, the values were normalized to 6-β-Galf. b Two sets of terminal and 2-linked α-Araf resonances were detected, denoted respectively as 1 and 2. The chemical shifts of the first set correspond to β-Araf-(1→2)-α-Araf-(1→ on the 3-arm, whereas the second were assigned to those on the 5-arm of the branched 3,5-α-Araf residue. Only about half of the terminal β-Araf detected was accounted for by direct conjugation to the 2-linked Araf in the WT AG based on this method of quantification.
In this study, we found that inactivation of the embA and embB genes resulted in a severe alteration in the physiological behavior of the bacterium. Further, we showed that inactivation of these genes resulted in a marked reduction of the total arabinose content of AG compensated by a partial increase in galactose content. More specifically, upon detailed structural analyses by HPAEC, MS, and NMR, the amount of the nonreducing terminal disaccharide β-D-Araf-(1→2)-α-D-Araf that could be derived from AG was shown to be dramatically diminished in both embA⁻ and embB⁻. Since a substantial amount of the otherwise predominant Ara₆ branched structural unit was found to be converted to linear Ara₄, it can be further concluded that a primary lesion in the embA⁻ and embB⁻ mutants resides in the 3-arm branching off the α-5-arabinan chain proximal to the nonreducing end and the attachment site for the mycolates. It has been established that only two-thirds of the mycolylation sites of AG are substituted with mycolates either at the terminal β-D-Araf or the penultimate 2-α-D-Araf (3). The β-D-Araf-(1→2)-α-D-Araf-deficient polymer formed as a consequence of embA or embB gene interruption is expected to contain fewer mycolates, resulting in an increase of the cell wall permeability. The amounts of mycolic acids were indeed reduced in both embA⁻ and embB⁻ mutants (data not shown), and there was an increased permeability to hydrophobic antibiotics.
However, the ability of these mutants to grow, albeit more slowly than the WT, demonstrates that the emb genes are nonessential. As for the increase in galactose content in AG from the emb⁻ mutants, we hypothesize that it is due to an abnormal synthesis of the galactan when restraints imposed by arabinose addition are lessened.
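The linkage argument behind the Ara₆ to Ara₄ conversion can be made explicit by counting substituted positions on each residue. The following is a minimal sketch, assuming the commonly described hexaarabinofuranosyl layout (a reducing-end 5-linked Araf, a 3,5-branched Araf, and two β-Araf-(1→2)-α-Araf arms); the residue names and the `linkage_counts` helper are hypothetical and serve only as illustration.

```python
from collections import Counter


def linkage_counts(residues):
    """Count linkage types as a methylation analysis would report them.

    `residues` maps a residue name to the set of ring positions that
    carry another sugar. An empty set means a terminal residue.
    """
    labels = Counter()
    for positions in residues.values():
        if not positions:
            labels["t-Araf"] += 1
        else:
            labels[",".join(str(p) for p in sorted(positions)) + "-Araf"] += 1
    return dict(labels)


# ASSUMED layout of the branched Ara6 motif (illustrative residue names).
ARA6 = {
    "reducing_end": {5},     # carries the branch-point residue at O-5
    "branch_point": {3, 5},  # carries the 3-arm and the 5-arm
    "arm3_alpha": {2},       # alpha-Araf of the 3-arm, substituted at O-2
    "arm3_beta": set(),      # terminal beta-Araf of the 3-arm
    "arm5_alpha": {2},       # alpha-Araf of the 5-arm, substituted at O-2
    "arm5_beta": set(),      # terminal beta-Araf of the 5-arm
}

# Linear Ara4: drop the two 3-arm residues and free O-3 of the branch point.
ARA4 = {
    name: positions - {3}
    for name, positions in ARA6.items()
    if not name.startswith("arm3")
}

print("Ara6:", linkage_counts(ARA6))
print("Ara4:", linkage_counts(ARA4))
```

With the 3-arm removed, the former branch point is counted as a second 5-linked residue, giving terminal:2-linked:5-linked = 1:1:2, which matches the ratio reported for the HPLC-isolated Ara₄.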
The total number of arabinosyltransferases that are required for arabinan biosynthesis is a matter of speculation, depending on how the arabinan chains are assembled. Based on the structural information available, one can predict that as many as five to six enzymes are needed for the formation of the different linkages, and these could very well vary for LAM and AG. The presence of the common terminal hexaarabinofuranosyl motif in AG and LAM suggests that similar sets of arabinosyltransferases are involved in their assembly, but they must be sufficiently distinct to allow divergent processing. For instance, the embA⁻ and embB⁻ mutants have no discernible effect on arabinosylation of LAM even though they both dramatically alter the arabinosylation of AG. In contrast, preliminary data indicate that in the embC⁻ mutant, the arabinosylation of LAM is abolished (data not shown). Therefore, although very similar in sequence, the Emb proteins are able to discriminate between the arabinan synthesis of AG and LAM. However, the structural basis for this specificity remains to be elucidated.
Based on our results, it is now clear that the Emb proteins are involved in the proper branching of the terminal Ara₆ motif. The fact that the inactivation of embA or embB resulted in a very similar defect strongly suggests that EmbA and EmbB proteins share a very similar if not identical function. We propose that these proteins could both act as α-(1→3) arabinosyltransferases. However, several questions remain unanswered. For example, β-D-arabinofuranosyl-P-decaprenol is the only precursor/donor identified in arabinogalactan/arabinomannan synthesis (32), and because lipid-linked sugar donors are conventionally transported across the membranes (34), one can postulate that some intermediates need to be translocated from the inner leaflet to the outer side of the plasma membrane in order for complex arabinosylation to occur. The 11-13 transmembrane domains present in the three Emb proteins (10,14) suggest that they may also operate, at least in part, as proteins involved in transporting LAM and AG arabinan precursors across the plasma membrane. In this case, the Emb proteins would work in close relation with the true arabinosyltransferases so far unidentified or might be bifunctional proteins with both transferase and transport activities.
FIG. 5. Dionex-HPAEC profiles of the endoarabinanase digestion products of AGs. AGs were prepared from M. smegmatis WT (A), embA⁻ mutant (B), embB⁻ mutant (C), and embC⁻ mutant (D). The digestion products were lyophilized and redissolved in water for HPAEC analysis without further purification: peak 1, Ara₂; peak 2, Ara₄; peak 3, Ara₆; peak 4, cyclic Gal₄. Peak marked "x" corresponds to the Glc contaminant (see text). The profiles of digests from embA- and embB-complemented strains are shown as insets in panels B and C. The major peaks are all confirmed using mass spectral analysis.
Until recently the Emb proteins were known exclusively in mycobacteria. However, progress in the sequencing of the genome of Corynebacterium diphtheriae revealed the presence of a gene having 40% identity with the mycobacterial emb genes (Sanger Center, Cambridge, UK). C. diphtheriae cell wall structure presents many similarities with that of mycobacteria, including the presence of AG (35,36). However, the glycosyl linkage composition of AG from C. diphtheriae differs from published mycobacterial AG composition by the absence of 3,5-linked Araf (37). This observation strongly suggests that the Emb proteins are members of a family of proteins involved in the formation of various linkages in the arabinan molecule. The fact that they are not confined to mycobacteria offers the possibility of comparative studies among the "mycolata," bacteria sharing similar cell wall structure (Mycobacterium, Nocardia, Rhodococcus, and Corynebacterium). The fine specificities and hence functional roles of each of the homologous Emb proteins are likely to be encoded by differences in a few key amino acid residues within the catalytic or active sites. This and additional emb gene sequence information derived from various EMB-resistant mutants will aid in the final delineation of the precise role of the Emb proteins in arabinan biosynthesis as well as the mechanism of EMB drug action and resistance.