Characterization of Three Glycosyltransferases Involved in the Biosynthesis of the Phenolic Glycolipid Antigens from the Mycobacterium tuberculosis Complex*

Mycobacterium tuberculosis and Mycobacterium leprae, the two main mycobacterial pathogens in humans, produce highly specific long chain β-diols, the dimycocerosates of phthiocerol, and structurally related phenolic glycolipid (PGL) antigens, which are important virulence factors. In addition, M. tuberculosis also secretes glycosylated p-hydroxybenzoic acid methyl esters (p-HBAD) that contain the same carbohydrate moiety as the species-specific PGL of M. tuberculosis (PGL-tb). The genes involved in the biosynthesis of these compounds in M. tuberculosis are grouped on a 70-kilobase chromosomal fragment containing three genes encoding putative glycosyltransferases: Rv2957, Rv2958c, and Rv2962c. To determine the functions of these genes, three recombinant M. tuberculosis strains, in which these genes were individually inactivated, were constructed and biochemically characterized. Our results demonstrated that (i) the biosynthesis of PGL-tb and p-HBAD involves common enzymatic steps, (ii) the Rv2957, Rv2958c, and Rv2962c genes are involved in the formation of the glycosyl moiety of the two classes of molecules, and (iii) the product of Rv2962c catalyzes the transfer of a rhamnosyl residue onto p-hydroxybenzoic acid ethyl ester or phenolphthiocerol dimycocerosates, whereas the products of Rv2958c and Rv2957 add a second rhamnosyl unit and a fucosyl residue to form the species-specific triglycosyl appendage of PGL-tb and p-HBAD. The recombinant strains produced provide the tools to study the role of the carbohydrate domain of PGL-tb and p-HBAD in M. tuberculosis pathogenesis.

Mycobacteria cause extremely serious diseases, such as tuberculosis and leprosy, and are major causes of mortality and morbidity worldwide. Although little is known about the molecular mechanisms of mycobacterial pathogenesis, the data accumulated during the last decades revealed that the mycobacterial envelope plays a fundamental role in this process. This envelope is a complex structure that differs from those of other bacteria by both its molecular composition and the architectural arrangement of its constituents. It is characterized by its high lipid content, up to 60% of the dry weight of the bacteria, and by the variety of lipid compounds with unusual structures, many of which are unique to mycobacteria (1). Consistently, genome sequence data for several mycobacterial species revealed that a high proportion of genes may be involved in lipid metabolism (~250 genes in M. tuberculosis) (2,3).
Two structurally related families of lipids, phthiocerol diesters and phenolic glycolipids, have attracted special attention. These mycobacterial cell envelope constituents are produced by a small number of slow-growing species, in particular the pathogenic species Mycobacterium leprae, Mycobacterium ulcerans, Mycobacterium marinum, and members of the Mycobacterium tuberculosis complex (4). Phthiocerol diesters are composed of a mixture of long chain β-diols that are esterified by multimethyl-branched fatty acids. Depending on the asymmetric centers bearing the methyl branches (D or L series), the fatty acids are called mycocerosic or phthioceranic acids, respectively, and the corresponding complex lipids are named dimycocerosates of phthiocerol (DIMs) or diphthioceranates of phthiocerol (DIPs) (4). The phenolic glycolipids (PGLs) consist of a lipid core very similar to those of DIMs or DIPs but ω-terminated by an aromatic nucleus that is glycosylated by a type- or species-specific mono-, tri-, or tetrasaccharide (Fig. 1) (5,6). Several groups have clearly shown that DIMs are involved in mycobacterial pathogenicity by demonstrating that M. tuberculosis mutants deficient in the biosynthesis or transport of these molecules display attenuated multiplication in guinea pigs and mice (7-9). Results suggesting that PGLs are involved in the pathogenesis of mycobacterial infections are less clear-cut. These glycolipids exhibit biological activities in vitro that may be relevant to the pathogenesis of mycobacterial infections. For instance, the major PGLs from M. leprae (PGL-1) and from other mycobacterial species inhibit the proliferation of T lymphocytes after stimulation with concanavalin A (10,11). Furthermore, PGL-1 seems to be associated with resistance to intracellular killing by macrophages (12) and promotes phagocytosis of M. leprae by macrophages and Schwann cells through binding to complement component C3 and laminin-2, respectively (13,14). The molecular basis of the interaction between M. leprae and Schwann cells was recently elucidated; the saccharidyl moiety of PGL-1 binds to the α2LG1, α2LG4, and α2LG5 modules of the peripheral nerve laminin α2 chain (14). Similarly, an M. tuberculosis mutant deficient in the biosynthesis of PGL-tb was recently shown to be attenuated in the mouse model in comparison with the parental (PGL-tb-producing) strain (15). However, most clinical isolates of M. tuberculosis do not synthesize PGL-tb (16-19). We recently identified a third family of molecules related to PGL in the culture supernatant of all M. tuberculosis and Mycobacterium bovis BCG strains tested (20). These compounds, called p-hydroxybenzoic acid derivatives (p-HBAD), are small glycoconjugates that contain the same glycosylated phenolic moiety as PGL-tb (20). Because of the high structural similarity between p-HBAD and the key structural determinant for the interaction of PGL with the host cellular ligand, i.e. the saccharide moiety of PGL, the role of p-HBAD in mycobacterial pathogenesis needs to be addressed.
The biosynthesis of DIM and PGL has just begun to be elucidated and has been shown to involve a common synthetic pathway in which two different precursors, either C22-C24 fatty acid or p-hydroxyphenylalkanoic acid, are elongated by three malonyl-CoA and two methylmalonyl-CoA units to yield a common lipid core ( Fig. 1) (21). These sequential elongation steps are believed to be catalyzed by a group of five polyketide synthases named PpsA-E. Mycocerosic acids, the class of multimethyl-branched fatty acids that esterify phthiocerol and phenolphthiocerol in members of the M. tuberculosis complex and M. leprae (4), are synthesized by another polyketide synthase, Mas (22,23). More recently, we have shown that another polyketide synthase, Pks15/1, catalyzes the elongation of p-hydroxybenzoic acids to form p-hydroxyphenylalkanoates, the substrate of PpsA-E in the biosynthesis of PGL (20). The mas, ppsA-E and pks15/1 genes are clustered in a region of the M. tuberculosis chromosome that contains six other genes encoding acyl-adenylate synthase (fadD26 and fadD28) and transporters (drrA, drrB, drrC, and mmpL7) important for DIM biosynthesis and translocation (9,24). A genomic region that is well conserved between M. tuberculosis and M. leprae has been found downstream of the mas and pks15/1 genes. This region contains genes encoding proteins with similarities to methyltransferases and glycosyltransferases, both enzymes that are required for the transfer and modification of the saccharide moiety of PGL (20).
In this study, we examined the formation of the carbohydrate domains of PGL-tb and p-HBAD. We provide evidence that the Rv2962c, Rv2958c, and Rv2957 genes are involved in the sequential transfer of the three monosaccharides found in PGL-tb and p-HBAD of M. tuberculosis.
Computer Analysis-The DNA sequences of M. tuberculosis strains CDC1551 and 210 were obtained from the Institute for Genomic Research website (www.tigr.org) and the M. bovis strain AF2122/97 DNA sequence (spoligotype 9) from the Sanger Institute website (www.sanger.ac.uk). Multiple sequence alignments were performed using ClustalW (26) at the Pasteur Institute website (www.pasteur.fr). The sequences of the pks15/1 regions of M. tuberculosis and M. leprae were compared using the Blast 2 program (27) at the Pasteur Institute website (www.pasteur.fr).
Construction of M. tuberculosis H37Rv Mutants-Mycobacterial DNA was extracted from 5 ml of saturated cultures as described previously (28). DNA pellets were resuspended in 100 µl of 10 mM Tris (pH 8) buffer. PCRs were performed using M. bovis BCG genomic DNA (Table I). The amplification program consisted of 1 cycle at 95°C for 10 min followed by 35 cycles of 30 s at 95°C, 30 s at 57°C, and 3 min at 72°C. A final extension at 72°C for 10 min was then performed. The PCR products were analyzed by electrophoresis in 0.8% agarose gels. The various fragments, 2647 bp for Rv2962c, 3345 bp for Rv2957, and 3222 bp for Rv2958c, were gel-purified using the Qiaquick gel extraction purification kit (Qiagen, Courtaboeuf, France) and digested with NotI and XbaI. To remove the BamHI and ApaI restriction sites from the multicloning site, because these sites were to be used later, the pBlueScript vector was digested with SpeI and KpnI and religated to give pPET2. The different PCR fragments were inserted between the NotI and XbaI restriction sites of pPET2 to give pPET6 (Rv2962c), pPET3 (Rv2957), and pPET4 (Rv2958c) (Table II). A kanamycin resistance cassette formed by the Ωkm cassette from pHP45Ωkm (29) flanked by two res sites from transposon γδ (30) was inserted between the BamHI and NruI sites of Rv2962c, generating a 344-bp deletion, between the two NruI sites of Rv2957, generating a 310-bp deletion, or into the ApaI site of Rv2958c. The resulting plasmids were named pPET15, pPET10, and pPET12, respectively (Table II). The PmeI fragments from the various plasmids, containing the disrupted gene constructs, were inserted at the XbaI site of pPR27 (31), generating pPET42 (ΔRv2962c::km), pPET36 (ΔRv2957::km), and pPET38 (Rv2958c::km) (Table II). M. tuberculosis H37Rv was electrotransformed as previously described, and transformants were selected on 7H11 + OADC + kanamycin at 32°C (32).
Two transformants obtained with each of the plasmids were grown in 5 ml of 7H9 + ADC + kanamycin + Tween at 32°C until saturation. Dilutions of these cultures were plated on 7H11 + OADC + kanamycin + sucrose and incubated at 39°C for 4 weeks. Colonies of M. tuberculosis H37Rv containing pPET42, pPET36, or pPET38 were screened by PCR using primers C, D, E, res1, or res2 (Table I), corresponding to the gene to be disrupted. The amplification program consisted of 1 cycle of 10 min at 95°C followed by 35 cycles of 30 s at 95°C, 30 s at 55°C, and 3 min at 72°C. A final extension of 10 min at 72°C was then performed. For each construct, one clone giving the pattern corresponding to allelic exchange was retained for further analysis. These strains were renamed PMM22 (ΔRv2962c::km), PMM24 (ΔRv2957::km), and PMM19 (Rv2958c::km) (Table II).
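The two amplification programs described above differ only in annealing temperature (57°C for cloning, 55°C for screening). As a minimal sketch, they can be written down as data and their programmed hold time summed. The class and variable names below are hypothetical, and the total ignores instrument ramp times between temperatures, which the text does not report.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Step:
    """One hold step of a thermal cycling program."""
    temp_c: int
    seconds: int


@dataclass(frozen=True)
class PcrProgram:
    """Initial denaturation, a repeated cycle, and a final extension."""
    initial: Step
    cycle: Tuple[Step, ...]
    n_cycles: int
    final: Step

    def total_seconds(self) -> int:
        # Sum of programmed hold times only (ramp times are not included).
        per_cycle = sum(step.seconds for step in self.cycle)
        return self.initial.seconds + self.n_cycles * per_cycle + self.final.seconds


# Cloning PCR: 95°C 10 min; 35 x (95°C 30 s, 57°C 30 s, 72°C 3 min); 72°C 10 min.
cloning = PcrProgram(
    initial=Step(95, 600),
    cycle=(Step(95, 30), Step(57, 30), Step(72, 180)),
    n_cycles=35,
    final=Step(72, 600),
)

# Screening PCR: identical except for annealing at 55°C.
screening = PcrProgram(
    initial=Step(95, 600),
    cycle=(Step(95, 30), Step(55, 30), Step(72, 180)),
    n_cycles=35,
    final=Step(72, 600),
)

print(cloning.total_seconds() / 60)    # 160.0 (minutes of programmed holds)
print(screening.total_seconds() / 60)  # 160.0
```

Both programs therefore amount to 160 min of programmed holds; the 3-min extension at 72°C dominates each cycle, in keeping with the 2.6-3.3 kb amplicons.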
Complementation of M. bovis BCG-Plasmid pMV361H is an Escherichia coli-mycobacteria shuttle vector derived from pMV361 by insertion of a hygromycin resistance gene (Table II). The vector pMV361 contains the integration system from mycobacteriophage L5 and a mycobacterial promoter, phsp60, upstream from a multicloning site (33). The full-length Rv2958c gene from M. tuberculosis H37Rv was amplified by PCR using primers Rv2958A and Rv2958B. The PCR was performed in a final volume of 50 µl containing 2.5 units of Pfu DNA polymerase (Promega, Lyon, France), 10% Me2SO, and 1 mM each primer. The amplification program consisted of 1 cycle of 10 min at 95°C followed by 35 cycles of 1 min at 95°C, 30 s at 57°C, and 3 min at 72°C. A final extension of 10 min at 72°C was then applied. The PCR products were analyzed by electrophoresis in a 0.8% agarose gel. The 3.2-kilobase fragment was gel-purified using the Qiaquick gel extraction purification kit (Qiagen). It was digested with BglII and BclI and inserted into the BamHI site of pMV361H to give pPET52. In this construct, the Rv2958c gene was placed under the control of the mycobacterial promoter phsp60. The plasmid pPET52 was used for electrotransformation of M. bovis BCG and the PMM19 mutant strain. Transformants were selected on hygromycin-containing plates.
Extraction and Purification of Glycolipids-Mycobacteria were separated from the culture media, and both were kept. The media were sterilized by filtration through 0.2-µm pore size membranes and concentrated to 1/10 of the initial volumes. Lipids from culture media were obtained by adding 2 volumes of CH3OH and 1 volume of CHCl3 to 0.8 volume of concentrated culture medium to yield a homogeneous one-phase mixture (34). After a 2-h period, the mixture was partitioned into two phases by the addition of 1 volume of H2O/CHCl3 (1:1, v/v). The organic phases were recovered, washed twice with water (50 ml), and dried.
The cells were left in CHCl3/CH3OH (1:2, v/v) for 2 days at room temperature to kill bacteria. Lipids were then extracted once with CHCl3/CH3OH (1:1, v/v) for 24 h and twice with CHCl3/CH3OH (2:1, v/v) for 24 h each, washed twice with water (50 ml), and dried.
The production of glycolipids by the various strains was examined by thin-layer chromatography (TLC). Briefly, the various extracts were dissolved in CHCl3 to give a final lipid concentration of 20 mg/ml. Equivalent volumes of each extract were deposited on silica gel G 60 plates (0.3 mm, 20 × 20 cm, Merck), which were run in CHCl3/CH3OH (95:5, v/v) or CHCl3/CH3OH (97:3, v/v). Glycolipids were visualized by spraying the plates with 0.2% anthrone (w/v) in concentrated H2SO4 followed by heating. Glycolipids were purified as previously described (16). Crude lipid extracts from cells or culture media were subjected to chromatography on a Florisil (60-100 mesh) column and eluted with a series of concentrations of CH3OH (0, 1, 2, 3, 4, 5, 10, 50, and 100%) in CHCl3. Each fraction was analyzed by TLC on silica gel G 60 using CHCl3/CH3OH (95:5, v/v) as the solvent system. Glycolipids were visualized as described above. When necessary, glycolipids were additionally purified by preparative chromatography on silica gel G 60 plates (0.3 mm, 20 × 20 cm, Merck) using CHCl3/CH3OH (95:5, v/v) as the developing solvent.
Apolar lipids, such as DIMs, were analyzed by TLC on silica gel G 60 plates using petroleum ether/diethyl ether (9:1, v/v) as eluant. Lipid compounds were visualized by spraying the plates with 10% phosphomolybdic acid in ethanol followed by heating.
Nuclear Magnetic Resonance (NMR) Spectroscopy-The 1H NMR spectra were recorded at 295 K on a Bruker DMX 500 apparatus using a 5-mm BBI probe. Samples were dissolved in 100% CDCl3. The chemical shifts obtained by NMR spectroscopy were assigned using chloroform as the reference for protons (7.23 ppm).
Matrix-assisted Laser Desorption-ionization Time-of-flight (MALDI-TOF) Mass Spectrometry-MALDI-TOF mass spectrometry was performed using a Voyager DE-STR MALDI-TOF instrument (PerSeptive Biosystems) equipped with a pulsed nitrogen laser emitting at 337 nm as previously described (35). Samples were analyzed in the reflector mode using an extraction delay time of 100 ns and an accelerating voltage of 20 kV in positive ion mode. The mass spectra were mass assigned by external calibration. Samples (1 µl of a 1 mg/ml solution in CHCl3) were directly applied onto the sample plate. The matrix solution (0.5 µl of 2,5-dihydroxybenzoic acid at 10 mg/ml in CHCl3/CH3OH (1:1, v/v)) was added. The samples were then allowed to crystallize at room temperature.

The underlined and bold characters indicate the restriction sites introduced in the primers and used for the plasmid constructions.

Gene        Oligonucleotide    Sequence
Rv2962c     Rv2962A            5′-GCTCTAGATCTCATAAAAATGTATCCTAAATCAAATATC-3′
Chemical Analysis-Lipid samples were methanolyzed using CH3OH/HCl prepared by reacting acetyl chloride with methanol under anhydrous conditions. Samples were dissolved in 500 µl of CH3OH/HCl (1 N) and incubated overnight at 80°C under a nitrogen atmosphere. The solvent was evaporated under nitrogen and then co-evaporated three times with anhydrous methanol.
For trimethylsilyl derivatization, samples were dissolved in 200 µl of anhydrous pyridine. We then added 100 µl of hexamethyldisilazane and 50 µl of trimethylchlorosilane (36). The reaction mixture was incubated at room temperature for 30 min. The mixture was dried under nitrogen, and the trimethylsilyl derivatives were dissolved in diethyl ether for gas chromatography (GC) and GC-mass spectrometry (GC-MS).
GC and GC-MS Analysis-GC analyses were performed using a Girdel series 30 instrument equipped with an OV1 capillary column (0.30 mm × 25 m) using helium gas (0.7 bar) and a flame ionization detector at 310°C. The temperature program was from 60 to 100°C at 20°C/min and then from 100 to 310°C at 5°C/min. GC-MS analyses were performed on a Hewlett-Packard 5889 X mass spectrometer (electron energy, 70 eV) working in both electron impact and chemical ionization modes (using NH3 as the reagent gas) coupled with a Hewlett-Packard 5890 series II gas chromatograph fitted with a similar OV1 column (0.30 mm × 12 m).
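The two-segment oven program given above fixes the length of the temperature gradient. A minimal sketch of that arithmetic follows; the function and variable names are hypothetical, and any isothermal hold before, between, or after the ramps is assumed to be zero because none is stated in the text.

```python
def ramp_minutes(start_c: float, end_c: float, rate_c_per_min: float) -> float:
    """Time needed for a linear oven ramp from start_c to end_c."""
    return (end_c - start_c) / rate_c_per_min


# (start °C, end °C, rate °C/min) for each ramp of the GC program.
segments = [
    (60, 100, 20),   # fast initial ramp
    (100, 310, 5),   # slow analytical ramp
]

segment_minutes = [ramp_minutes(*segment) for segment in segments]
total_minutes = sum(segment_minutes)

print(segment_minutes)  # [2.0, 42.0]
print(total_minutes)    # 44.0
```

Under these assumptions the gradient lasts 44 min, almost all of it spent on the 5°C/min ramp.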

RESULTS
Disruption of the Rv2962c, Rv2957, and Rv2958c Genes in M. tuberculosis H37Rv by Allelic Exchange-The genome sequence of M. tuberculosis revealed that the genes known to be involved in the biosynthesis of DIMs and PGLs, such as fadD26, ppsA-E, mas, fadD28, and pks15/1, are clustered on the chromosome (2). A region containing three putative genes encoding proteins with similarities to glycosyltransferases was found downstream of pks15/1. This region is conserved in M. leprae but contains only two genes encoding putative glycosyltransferases. We hypothesized that these genes are involved in the formation of the carbohydrate moiety of PGLs (20). To determine the biosynthetic roles of these putative glycosyltransferases, we constructed three M. tuberculosis knock-out mutants with insertions in Rv2957, Rv2958c, or Rv2962c. Three DNA fragments of 3.3, 3.2, and 2.6 kilobases overlapping the genes Rv2957, Rv2958c, and Rv2962c, respectively, were amplified by PCR and cloned. The three cloned genes were disrupted by insertion of a kanamycin cassette (Fig. 2). The DNA fragments carrying the mutated alleles of the three putative glycosyltransferase genes were then inserted into the vector pPR27 (31) to give plasmids pPET42 (Rv2962c), pPET36 (Rv2957), and pPET38 (Rv2958c). These constructs were independently transferred into M. tuberculosis H37Rv by electroporation, and the allelic replacement of the wild-type copy of these genes by the mutated allele was selected as described by Pelicic et al. (31). Several clones obtained with each of the three plasmids were screened by PCR using primers C+D, C+res2, or E+res1 to test whether the wild-type allele of the different genes had been replaced by the disrupted allele (Fig. 2). Three clones that gave an amplification pattern consistent with the insertion of the km cassette within genes Rv2957, Rv2958c, or Rv2962c (Fig. 2) were retained for further analysis and named PMM22 (ΔRv2962c::km), PMM24 (ΔRv2957::km), and PMM19 (Rv2958c::km).
Biochemical Characterization of the Putative Glycosyltransferase Mutants-To determine the effects of the mutations in genes Rv2957, Rv2958c, and Rv2962c, we first examined p-HBAD production. These compounds are mostly found in the culture supernatant of M. tuberculosis. Therefore, the various mutants and the wild-type strain, H37Rv, were grown in liquid culture, and lipids were extracted from the culture supernatant. We have previously shown that M. tuberculosis strains produce two glycoconjugates, namely p-HBAD I and p-HBAD II (Fig. 3A), the structures of which have been shown to correspond to 2-O-methyl-rhamnosyl-α- and 2,3,4-tri-O-methyl-fucosyl-α-(1→3)-rhamnosyl-α-(1→3)-2-O-methyl-rhamnosyl-α-p-hydroxybenzoic acid methyl esters, respectively (20). TLC analysis of the organic solvent extracts of the various mutants revealed different patterns of glycoconjugate production. In the mutant strain PMM22 (ΔRv2962c::km), no glycoconjugate was detected at all. In contrast, the organic solvent extracts from PMM19 (ΔRv2958c::km) and PMM24 (ΔRv2957::km) revealed a single spot with a mobility similar to that of p-HBAD I (Fig. 3A). None of the mutant strains produced p-HBAD II, which was the most abundant product in the control H37Rv strain. Therefore, these data indicated that the three putative glycosyltransferase genes, Rv2957, Rv2958c, and Rv2962c, play a role in the biosynthesis of p-HBAD. We next addressed the question of their roles in the biosynthesis of PGL-tb. Although all strains of M. tuberculosis examined to date synthesize p-HBAD (20), only a few produce the structurally related PGL-tb (16-19). Recently, we demonstrated that the polyketide synthase gene pks15/1 is involved in the elongation of p-hydroxybenzoic acid derivatives to form p-hydroxyphenylalkanoates, which are in turn converted, presumably by the PpsA-E synthase, to phenolphthiocerol derivatives (20). M. tuberculosis isolates devoid of PGL-tb, such as H37Rv, are natural mutants with a frameshift mutation in pks15/1 (20). The production of PGL-tb in these strains can be restored by transferring the functional pks15/1 gene of M. bovis BCG into them (20). Accordingly, to determine the role of the putative Rv2957, Rv2958c, and Rv2962c genes in the biosynthesis of PGL-tb in M. tuberculosis, we transferred a functional pks15/1 gene, carried by plasmid pPET1, into the three mutant strains PMM19 (ΔRv2958c::km), PMM22 (ΔRv2962c::km), and PMM24 (ΔRv2957::km) and the wild-type parental H37Rv strain. We then examined the production of PGL-tb in the transformants. In contrast to p-HBAD, found mostly in the culture supernatants of M. tuberculosis strains, PGL-tb remains associated with the bacterial cells. TLC analyses showed that the production of PGL-tb by the transformants differed from that of H37Rv:pPET1 (Fig. 3B), as observed for the p-HBAD. The glycoconjugates from PMM19:pPET1 and PMM24:pPET1 exhibited similar TLC mobility to 2-O-methyl-α-rhamnosyl-phenolphthiocerol dimycocerosates, the major PGL from M. bovis BCG, formerly called mycoside B, which is also present in PGL-tb-containing M. tuberculosis as a minor product (37). In the case of PMM22:pPET1, no glycoconjugate was detected at all. Thus, these preliminary analyses demonstrated that mutations in Rv2957, Rv2958c, and Rv2962c simultaneously affect the biosynthesis of p-HBAD and PGL-tb in M. tuberculosis, suggesting that the encoded enzymes are involved in both biosynthesis pathways.
Structural Analysis of the Compounds Accumulated in the M. tuberculosis H37Rv ΔRv2962c::km Mutant-We first focused on the role of Rv2962c in PGL-tb biosynthesis. We speculated that the Rv2962c gene, mutated in strain PMM22:pPET1, is involved in a very early glycosylation step of PGL-tb biosynthesis and that a non-glycosylated intermediate of PGL biosynthesis may accumulate. Therefore, we looked for this type of intermediate by analyzing the lipid extracts by TLC using a less polar solvent system. As expected, the lipid profile of the PMM22:pPET1 (ΔRv2962c::km) mutant strain contained additional spots not found in extracts from the control H37Rv:pPET1 strain in the petroleum ether/diethyl ether (9:1, v/v) solvent system (Fig. 3C). The two compounds, named A and B, that accumulated in the H37Rv ΔRv2962c::km mutant (Fig. 3C) were purified by chromatography on a Florisil column and analyzed by MALDI-TOF. The MALDI-TOF mass spectrum of product A showed a series of pseudomolecular ion (M+Na)+ peaks at 1370, 1384, 1398, 1412, 1426, 1440, 1454, 1468, 1482, and 1496 m/z (Fig. 4B). These mass values were 494 mass units lower than those observed with PGL-tb from H37Rv (Fig. 4A). This difference corresponded to the mass of the tri-O-methyl-fucosyl-(1→3)-rhamnosyl-(1→3)-2-O-methyl-rhamnosyl moiety found in PGL-tb, suggesting that this accumulated putative biosynthetic intermediate corresponds to phenolphthiocerol dimycocerosate. The MALDI-TOF mass spectrum of compound B showed pseudomolecular ion peaks at the same m/z values as those of compound A, but the relative intensities of the peaks were different (Fig. 4C). To find out why compounds A and B had similar masses but different mobilities on TLC, we further analyzed the two purified compounds by UV spectroscopy (Fig. 5). The spectra of both products A and B showed three absorption peaks, a major peak at 220 nm and two peaks with absorptions of comparable molecular extinction (ε) values. The maxima of these latter peaks were seen at 277 and 283 nm for product B, corresponding to the expected absorption spectrum of a phenol ring with a free hydroxyl group (Fig. 5) (38). For product A, the absorption peaks were larger and less well defined (Fig. 5), a feature observed for methylated phenolic groups (38). This observation is consistent with the higher mobility of product A on TLC. To firmly establish the structures of the two putative precursors of PGL-tb, purified compounds A and B were analyzed by 1H NMR. The resulting spectra were very similar, and most of the proton resonances (Table III) were common to the two spectra and to that obtained with PGL-tb (16,20). The main exception was the absence of sugar proton resonances in the spectra of compounds A and B (Table III). In addition to the broad signal resonances attributable to the multimethylenic (CH2) protons (at 1.25 ppm), several terminal methyl (CH3) proton resonances were observed at 0.8-1.0 ppm (Table III), consistent with the presence of multimethyl-branched fatty acyl residues in the two molecules. The presence of a doublet at 1.15 ppm indicated the occurrence of a methyl group in the α-position of the multimethyl-branched fatty acyl residues (Fig. 1 and Table III, signal f); the methine proton of the α-carbon was detected at 2.55 ppm (Fig. 1 and Table III, signal d). As expected from the esterification of both hydroxyl groups of the β-glycol, proton resonances were observed at 4.83 ppm (Fig. 1 and Table III, signal a). The proton resonances of a methoxyl (OCH3) group and of the carbon bearing this methoxyl group were observed, respectively, at 3.32 ppm (Fig. 1 and Table III, signal b) and 2.85 ppm (Fig. 1 and Table III, signal c). These resonances corresponded to those of the methoxyl group found in phthiocerol and phenolphthiocerol (16,20). Two deshielded doublets at 6.80 (Fig. 1 and Table III, signal g) and 7.08 ppm (Fig. 1 and Table III, signal h) for compound A and at 6.72 (Fig. 1 and Table III, signal g) and 7.01 ppm (Fig. 1 and Table III, signal h) for compound B were assigned to phenolic proton resonances, indicating a more polar environment for the phenolic protons in compound B than in compound A. Importantly, an additional signal resonance was observed at 3.65 ppm in the spectrum of the compound with the highest mobility on TLC, i.e. compound A (Fig. 3C), and corresponded to the proton resonances of a methoxyl group located on the phenol ring (Fig. 1 and Table III, signal k).
Altogether, data from the NMR, UV, and MALDI-TOF spectra clearly show that compounds A and B are structurally related and that compound B corresponds to phenolphthiocerol dimycocerosate, whereas compound A is a methylated form of compound B. These findings strongly support the hypothesis that the product of Rv2962c is involved in the first glycosylation step of phenolphthiocerol. Furthermore, the production of non-glycosylated phenolphthiocerol dimycocerosates by the M. tuberculosis PMM22:pPET1 (ΔRv2962c::km) mutant proves that the glycosylation of the phenolic compounds is not required for the synthesis of the phenolphthiocerol dimycocerosate moiety of phenolic glycolipids.
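The mass bookkeeping behind the assignment of product A can be checked with a short calculation. The sketch below is illustrative only: the residue compositions are the standard anhydro-sugar formulas for a deoxyhexose and its O-methyl ethers (an assumption, not taken from the text), nominal integer masses are used, and all names are hypothetical.

```python
# Nominal (integer) atomic masses, sufficient for unit-resolution comparisons.
NOMINAL_MASS = {"C": 12, "H": 1, "O": 16}


def nominal_mass(formula: dict) -> int:
    """Nominal mass of an elemental composition such as {"C": 6, "H": 10, "O": 4}."""
    return sum(NOMINAL_MASS[element] * count for element, count in formula.items())


# Assumed compositions of the glycosyl residues (sugar minus H2O; +CH2 per O-methyl).
GLYCOSYL_RESIDUES = {
    "rhamnosyl": {"C": 6, "H": 10, "O": 4},
    "2-O-methyl-rhamnosyl": {"C": 7, "H": 12, "O": 4},
    "2,3,4-tri-O-methyl-fucosyl": {"C": 9, "H": 16, "O": 4},
}

trisaccharide_mass = sum(nominal_mass(f) for f in GLYCOSYL_RESIDUES.values())

# (M+Na)+ series reported for product A.
series_a = [1370, 1384, 1398, 1412, 1426, 1440, 1454, 1468, 1482, 1496]

# Spacing between neighbouring peaks; 14 units corresponds to one CH2 group.
spacings = {later - earlier for earlier, later in zip(series_a, series_a[1:])}

# m/z values at which the fully glycosylated homologs would be expected.
expected_pgl_tb = [mz + trisaccharide_mass for mz in series_a]

print(trisaccharide_mass)  # 494
print(spacings)            # {14}
print(expected_pgl_tb[0])  # 1864
```

The three residues sum to 494 mass units, matching the reported difference. Because neighbouring homologs are separated by 14 units, which is also the nominal mass change on O-methylation, the methylated compound A and the free phenol B can show peaks at the same m/z values, which is why UV and NMR data were needed to tell them apart.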
Structural Characterization of the Biosynthetic Intermediates Accumulated in the ΔRv2957::km and Rv2958c::km Mutant Strains-To gain insight into the biosynthetic steps involving the products of genes Rv2957 and Rv2958c, the compounds, products C and D, that accumulated in the ΔRv2957::km and Rv2958c::km mutants (Fig. 3) were purified by chromatography on a Florisil column and structurally characterized.
First, the 1H NMR spectra of the glycoconjugates, products C, with similar mobilities that accumulated in the culture supernatants of strains PMM24 (ΔRv2957::km) and PMM19 (Rv2958c::km) (Fig. 3A) were superimposable and showed two deshielded doublets at 7.10 and 8.00 ppm assigned to phenolic proton resonances (Fig. 1 and Table III, signals g and h). Compounds C contain one sugar residue, the anomeric proton resonance of which was seen at 5.65 ppm (Fig. 1 and Table III, signal i). Two methoxyl group proton resonances (3H each) were also observed at 3.55 and 3.90 ppm (Fig. 1 and Table III, signals j and j′). These features indicated that both PMM24 and PMM19 produced a mono-O-methyl-glycosyl-p-hydroxybenzoate methyl ester with similar mobility on TLC to that of the minor glycoconjugate, p-HBAD I, produced by the wild-type M. tuberculosis strain, suggesting that these compounds have the same structure. GC analysis of the trimethylsilyl derivative of native compound C purified from PMM24 and PMM19 culture supernatants gave a peak superimposable on that of the trimethylsilyl derivative of the purified p-HBAD I (data not shown) (20). Therefore, these results demonstrate that both PMM24 and PMM19 produce p-HBAD I but are unable to synthesize the major glycosylated p-HBAD, p-HBAD II, usually found in M. tuberculosis (20).
To evaluate the involvement of the enzymes encoded by the Rv2957 and Rv2958c genes in PGL-tb biosynthesis, we analyzed the structures of the major glycolipids found associated with the cells of PMM24 (ΔRv2957::km) and PMM19 (Rv2958c::km) transformed with pPET1. These compounds exhibited similar mobilities on TLC and, thus, probably share the same structure. Accordingly, structural analyses were performed only on compound D from PMM24:pPET1. The MALDI-TOF mass spectrum of glycoconjugate D showed a series of pseudomolecular ion (M+Na)+ peaks at 1516, 1530, 1544, 1558, 1572, 1586, 1600, 1614, 1628, 1642, 1656, 1670, and 1684 m/z (Fig. 4D). A similar series of peaks was observed in the mass spectrum of the PGL-tb from the M. tuberculosis Canetti strain or H37Rv:pPET1, but the mass values of the major pseudomolecular ion peaks were 334 mass units lower for the glycolipid from the mutant PMM24:pPET1 (Fig. 4D). This difference corresponds to the calculated mass of tri-O-methyl-fucosyl-α-(1→3)-rhamnose, which is the terminal disaccharide of the sugar moiety of PGL-tb. Acid hydrolysis of the glycolipids from PMM24:pPET1, followed by trimethylsilylation and GC and GC-MS analysis, revealed that the only sugar constituent of compound D is 2-O-methylrhamnose; i.e. the GC-MS spectrum obtained in chemical ionization mode showed a pseudomolecular ion [M+NH4]+ peak at 412 m/z, whereas fragmentation peaks were observed at 191, 146, and 73 m/z in the electron impact spectrum.
These results demonstrate that mutations in genes Rv2957 and Rv2958c have similar effects on p-HBAD and PGL-tb biosynthesis, blocking the production of PGL-tb and p-HBAD II and leading to the accumulation of truncated forms containing only the first sugar, the 2-O-methyl rhamnosyl residue of the saccharide moiety. This finding supports a model in which the glycosyltransferases encoded by Rv2957 and Rv2958c catalyze the transfer of the terminal two sugars onto p-HBAD I and 2-O-methyl-rhamnosyl-phenolphthiocerol dimycocerosate to form p-HBAD II and PGL-tb. Although our data clearly show that the two glycosyltransferases encoded by Rv2957 and Rv2958c are involved in the transfer of sugar residues onto the monoglycosyl acceptors, they did not allow us to define their respective roles.
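The two mass values that support the identification of compound D (the 334-unit shift and the 412 m/z ammonium adduct) can be reproduced from elemental compositions. This is a hedged sketch: the formulas, the count of three free hydroxyl groups available for silylation on 2-O-methylrhamnose, and all identifiers are assumptions introduced for illustration, and nominal integer masses are used.

```python
# Nominal (integer) atomic masses.
NOMINAL_MASS = {"C": 12, "H": 1, "N": 14, "O": 16, "Si": 28}


def nominal_mass(formula: dict) -> int:
    """Nominal mass of an elemental composition given as {element: count}."""
    return sum(NOMINAL_MASS[element] * count for element, count in formula.items())


# Residues missing from compound D relative to PGL-tb (assumed anhydro formulas).
TRI_O_METHYL_FUCOSYL = {"C": 9, "H": 16, "O": 4}
RHAMNOSYL = {"C": 6, "H": 10, "O": 4}
disaccharide_shift = nominal_mass(TRI_O_METHYL_FUCOSYL) + nominal_mass(RHAMNOSYL)

# Free 2-O-methylrhamnose released by acid hydrolysis (assumed C7H14O5).
FREE_2_O_METHYLRHAMNOSE = {"C": 7, "H": 14, "O": 5}
# Net gain per hydroxyl on silylation: H is replaced by Si(CH3)3.
TMS_GAIN_PER_OH = {"Si": 1, "C": 3, "H": 8}
N_FREE_HYDROXYLS = 3  # assumed: positions 1, 3, and 4

tms_derivative = (
    nominal_mass(FREE_2_O_METHYLRHAMNOSE)
    + N_FREE_HYDROXYLS * nominal_mass(TMS_GAIN_PER_OH)
)
ammonium_adduct = tms_derivative + nominal_mass({"N": 1, "H": 4})

# The trimethylsilyl cation, a common electron impact fragment of TMS derivatives.
tms_cation = nominal_mass({"Si": 1, "C": 3, "H": 9})

print(disaccharide_shift)  # 334
print(tms_derivative)      # 394
print(ammonium_adduct)     # 412
print(tms_cation)          # 73
```

Under these assumptions the calculation returns 334 for the missing disaccharide, 412 for the [M+NH4]+ ion of the tris-trimethylsilyl derivative, and 73 for the trimethylsilyl fragment, in line with the values reported above.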
Complementation of M. bovis BCG with Rv2958c from M. tuberculosis-To determine the role of each of the glycosyltransferases encoded by Rv2957 and Rv2958c as well as the sequence of transfer, i.e. a monosaccharide versus a disaccharide, we took advantage of comparative genomics. Unlike M. tuberculosis, M. bovis and M. bovis BCG mostly produce 2-O-methyl-rhamnosyl-phenolphthiocerol dimycocerosate (mycoside B) and p-HBAD I (1,4,20). A close examination of the sequences corresponding to Rv2957 and Rv2958c in M. bovis BCG and M. bovis AF2122/97 revealed that the proteins encoded by Rv2957 and its orthologs in the two M. bovis strains are identical. In contrast, a single base insertion at nucleotide position 867 of the ortholog of Rv2958c leads to a 186-bp shorter open reading frame and a completely different amino acid sequence after amino acid 289 in both M. bovis BCG and M. bovis AF2122/97 in comparison with M. tuberculosis. We speculated that this insertion may have inactivated the Rv2958c ortholog in M. bovis and the vaccine BCG strains. To test this hypothesis, we tried to complement M. bovis BCG for the production of PGL-tb with the Rv2958c gene from M. tuberculosis. The full-length Rv2958c gene from M. tuberculosis H37Rv was placed under the control of the mycobacterial promoter phsp60 in the shuttle plasmid pMV361H. The resulting plasmid, pPET52, was first transferred into M. tuberculosis ΔRv2958c::km (PMM19) to test the functionality of the construct. TLC analysis of the p-HBAD secreted into the culture supernatant of PMM19:pPET52 revealed that both p-HBAD I and p-HBAD II were now synthesized by the transformant, as in the wild-type H37Rv strain (Fig. 6A). Thus, expression of Rv2958c from pPET52 complements the chromosomal mutation in the PMM19 strain for the production of p-HBAD II. The same plasmid, pPET52, was then transferred into M. bovis BCG. Transformants were grown in liquid culture, and their lipid contents were analyzed by TLC (Fig. 6B).
Unlike the parental strain, M. bovis BCG transformed with pPET52 produced two major glycoconjugates, one corresponding to mycoside B and the other (product E) with a lower mobility (Fig. 6B). Based on its mobility on TLC, compound E did not correspond to PGL-tb. Therefore the mutation in the M. bovis ortholog of Rv2958c is not the only reason for the deficiency in PGL-tb and p-HBAD II synthesis in M. bovis strains. To determine whether the formation of compound E requires the presence of Rv2957, we constructed an M. bovis BCG strain in which the Rv2957 gene was inactivated by allelic exchange (data not shown). The M. bovis BCG ΔRv2957::km strain (named PMM32) synthesizes mycoside B (Fig. 6C), as do the parental strain M. bovis BCG and the M. tuberculosis ΔRv2957::km strain transformed with plasmid pPET1. Strain PMM32 was transformed with pPET52, allowing the expression of a functional Rv2958c gene. Several PMM32:pPET52 transformants were grown in liquid broth, and their lipid contents were examined by TLC, showing the production of both mycoside B and compound E. This demonstrates that synthesis of the additional glycoconjugate E does not require a functional Rv2957 gene.

Table III. ¹H NMR analysis of the PGLs and p-HBAD produced by the various recombinant strains
The columns correspond to the various proton resonances (in ppm). Attribution of the various resonance signals of PGL-tb, p-HBAD I, and p-HBAD II was described previously (20). The letters (see Fig. 1

The additional glycolipid E was purified from the transformant PMM32:pPET52 by chromatography on a Florisil column and analyzed by ¹H NMR spectroscopy and MALDI-TOF mass spectrometry. Although most of the proton resonances observed in the ¹H NMR spectrum of the purified compound E were also found in the spectrum of PGL-tb, only two anomeric proton resonances, at 5.15 and 5.50 ppm (1H each), and two methoxyl group proton resonances (3H each), at 3.32 and 3.49 ppm, were detected in the spectrum of the new glycolipid (Table III). The small coupling constant ³J1,2 values (1-2 Hz) indicated the manno configuration of the two sugar residues. The signals observed for methoxyl proton resonances at 3.32 and 3.49 ppm were attributed, respectively, to the methoxyl groups linked to the lipid (signal b) and sugar (signal j) by comparison of their chemical shift values with those of related products (16,20,37). These structural features showed that M. bovis BCG ΔRv2957::km:pPET52 produced a mono-O-methyl-diglycosylphenolphthiocerol dimycocerosate. This result was supported by the MALDI-TOF mass spectrum of this purified compound, which showed a series of pseudomolecular ion (M+Na)+ peaks at m/z 1592, 1606, 1620, 1634, 1648, 1662, 1676, 1690, 1704, and 1718, consistent with the calculated molecular masses of dimycocerosates of mono-O-methyl-diglycosyl-phenolphthiocerol.
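The (M+Na)+ peaks reported for compound E form a regular series, which can be checked with a short script. The sketch below uses only the peak values quoted in the text and nominal (integer) masses. Reading the constant spacing of 14 mass units as CH2 homologs of the lipid moiety is an assumption made here for illustration, as are the function names and the nominal sodium mass of 23.

```python
# (M+Na)+ peaks quoted in the text for purified compound E.
PEAKS = [1592, 1606, 1620, 1634, 1648, 1662, 1676, 1690, 1704, 1718]

CH2_NOMINAL_MASS = 14  # nominal mass of one methylene unit (assumed interpretation)
NA_NOMINAL_MASS = 23   # nominal mass of the sodium adduct


def spacings(peaks):
    """Differences between consecutive peaks."""
    return [b - a for a, b in zip(peaks, peaks[1:])]


def is_homologous_series(peaks, step=CH2_NOMINAL_MASS):
    """True when every consecutive pair of peaks differs by exactly `step`."""
    return all(delta == step for delta in spacings(peaks))


def neutral_nominal_masses(peaks, adduct=NA_NOMINAL_MASS):
    """Subtract the adduct mass to obtain nominal masses of the neutral molecules."""
    return [peak - adduct for peak in peaks]


if __name__ == "__main__":
    print("spacings:", spacings(PEAKS))
    print("regular 14-unit series:", is_homologous_series(PEAKS))
    print("neutral nominal masses:", neutral_nominal_masses(PEAKS))
```

All nine spacings equal 14, so the ten peaks span 126 mass units, as expected for a family of molecules that share one glycosyl headgroup and differ only in the number of methylene units.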
Altogether, these data indicate that Rv2958c is mutated in M. bovis BCG because the complementation of M. bovis BCG and M. bovis BCG ΔRv2957::km with the Rv2958c gene from M. tuberculosis resulted in the synthesis of diglycosylated PGLs. Furthermore, these results demonstrate that the Rv2958c gene product is able to transfer an additional sugar residue onto the mono-rhamnosylated phenolphthiocerol dimycocerosates in the absence of the protein encoded by Rv2957. However, these data also demonstrate that the failure to produce PGL-tb in M. bovis BCG and, by extension, in the other M. bovis strains is not solely due to the insertion of one nucleotide in the Rv2958c ortholog of M. bovis.

DISCUSSION
The main objectives of the present study were to decipher the biosynthesis of PGLs and p-HBAD in mycobacteria and to determine the roles of three genes, located close to pks15/1 and encoding putative glycosyltransferases, in the formation of the carbohydrate domain of these glycoconjugates. We constructed a series of M. tuberculosis mutants with single mutations in these three genes. We then introduced into these mutants a complementation plasmid carrying a functional pks15/1 gene, which was previously shown to endow M. tuberculosis H37Rv with the ability to produce PGL-tb (20). Biochemical and structural analyses of the lipids produced by these recombinant strains revealed that these mutations affected the biosynthesis of PGL-tb and p-HBAD in a similar way. Our results demonstrated that the glycosyltransferases encoded by genes Rv2957, Rv2958c, and Rv2962c are all involved in the biosynthesis of both p-HBAD and PGL-tb, where they catalyze the transfer of three species-specific sugar residues to form the saccharide domain of these molecules. The observation that mutations in Rv2957, Rv2958c, or Rv2962c led to the accumulation of truncated forms of p-HBAD and PGL-tb demonstrated that the glycosylation of p-hydroxybenzoic acid is not required before the elongation by the Pks15/1 polyketide synthase during the formation of PGLs. Thurman et al. (39) previously showed that aglycosyl compounds, i.e. dimycocerosates of phenolphthiocerol and relatives, recovered from the lipid extracts of Mycobacterium microti, a member of the M. tuberculosis complex, could be glycosylated by acellular extracts of the bacterium. Therefore, these results strongly suggest that the enzymes encoded by Rv2957, Rv2958c, and Rv2962c can transfer sugar residues both onto p-hydroxybenzoic acid and onto dimycocerosates of phenolphthiocerol and structurally related compounds.
We propose that the product of gene Rv2962c catalyzes the transfer of a rhamnosyl residue onto p-hydroxybenzoic acid or phenolphthiocerol derivatives to form, after O-methylation at position 2 of the sugar unit, p-HBAD I and mycoside B. This proposal is based on the observation that the mutation in Rv2962c abolished the synthesis of p-HBAD and led to the accumulation of aglycosyl derivatives of PGL in PMM22:pPET1. The finding that PMM22:pPET1 (ΔRv2962c::km) produces the biosynthetic intermediate of PGLs, i.e. p-hydroxyphenolphthiocerol dimycocerosate, was expected, but the identification of an additional methylated form of this intermediate was surprising. This observation raises the question of whether this methylated compound is also a biosynthetic intermediate or, more likely, represents a byproduct produced to detoxify the putative highly reactive free phenolic group that could be toxic for the bacterial cell. Besides, this methylated phenolphthiocerol dimycocerosate has been previously isolated from some M. tuberculosis clinical isolates by Goren et al. (7), who tentatively associated its production with attenuated virulence.
Concerning the precise biosynthetic steps involving the products of Rv2958c and Rv2957 in M. tuberculosis, our results established their functions as glycosyltransferases involved in glycosylation steps downstream of p-HBAD I and mycoside B in the biosynthetic pathway. However, the data did not provide an unambiguous answer because mutations in either Rv2957 or Rv2958c led to the accumulation of the same products, p-HBAD I and mycoside B, in which the saccharide moiety is a 2-O-methyl rhamnosyl residue. The simplest explanation for this observation is that one of the two enzymes first catalyzes the formation of a disaccharide that is subsequently transferred by the other glycosyltransferase onto p-HBAD I or 2-O-methyl-rhamnosyl-phenolphthiocerol dimycocerosate to form p-HBAD II and PGL-tb, respectively. Alternatively, the two enzymes may sequentially catalyze the transfer of the rhamnosyl and tri-O-methyl fucosyl residues, but a regulation loop would prevent the accumulation of biosynthetic intermediates possessing two sugar residues. Both models are consistent with the previous identification of only tri- and mono-glycosylated phenolphthiocerol dimycocerosates in M. tuberculosis Canetti but never diglycosylated compounds (37,16).
The observation that M. bovis strains contain a single nucleotide insertion within Rv2958c led us to speculate that this mutation could explain why these strains synthesize mycoside B and p-HBAD I instead of the fully glycosylated PGL-tb and p-HBAD II found in M. tuberculosis. According to this hypothesis, complementation of M. bovis BCG with a functional Rv2958c should lead to the synthesis of PGL-tb. However, this was not the case. Instead, a new PGL containing two sugar residues was produced in M. bovis BCG:pPET52. The observation that this new glycolipid is produced in M. bovis BCG:pPET52 but not found in the M. tuberculosis Rv2958c::km mutant strain is surprising. One possibility is that Rv2958c, which is under the control of the hsp60 promoter in pPET52, is expressed at a higher level in M. bovis BCG:pPET52 or PMM32:pPET52 than in PMM24. In this "artificial" genetic context, an intermediate may accumulate that is not detectable when Rv2958c is under the control of its own promoter. In any case, this finding shows that the protein encoded by Rv2958c is able to transfer sugar residues onto the mono-rhamnosylated phenolphthiocerol dimycocerosates even when Rv2957 is inactivated.
We found that the Rv2958c ortholog of M. bovis BCG is mutated because the complementation with the Rv2958c gene from M. tuberculosis resulted in the production of a diglycosylated phenolphthiocerol dimycocerosate. However, M. bovis BCG:pPET52 was unable to synthesize the full-length PGL-tb, demonstrating that the mutation found in Rv2958c is not the sole explanation for the glycosylation defect in M. bovis derivatives. Because the amino acid sequences of the Rv2957-encoded orthologs from M. bovis BCG, M. tuberculosis H37Rv, and M. bovis AF2122/97 are identical, a mutation in this glycosyltransferase gene cannot explain the failure of M. bovis derivatives to produce PGL-tb. One possible explanation is that the tri-O-methyl-fucosylrhamnosyl disaccharide is not produced in M. bovis because of a methylation defect. This hypothesis will be tested when the methyltransferase involved in the methylation of the fucosyl residue in M. tuberculosis has been identified.
Numerous roles have been attributed to PGLs during the infectious process, notably for PGL-1 from M. leprae. In this mycobacterial species, PGL-1 has been implicated in the tropism of the leprosy bacillus for Schwann cells, in the interaction of M. leprae with phagocytes, in the resistance of the leprosy bacillus to intracellular killing, and in immunomodulatory activities (10,14,40). The role of these molecules in other PGL-producing mycobacterial pathogens, such as M. tuberculosis, is less documented. Nevertheless, despite the fact that many clinical isolates of M. tuberculosis do not produce PGL, epidemiological data and laboratory work have shown that the different clinical isolates of M. tuberculosis do not all exhibit the same virulence or dissemination patterns (41,42). It was recently suggested that PGL-tb production is associated with the hypervirulent phenotype encountered in several clinical strains of M. tuberculosis (15). Therefore, the contribution of PGL-tb to the pathogenicity of M. tuberculosis remains to be established. The proper evaluation of the role of PGL-tb in M. tuberculosis pathogenesis is made difficult by the recent finding that all the strains examined, even those that did not synthesize PGL-tb, produced p-HBAD in the culture supernatant (20). These p-HBADs correspond to truncated forms of PGL-tb in which the oligosaccharide moiety of PGL-tb, essential for the various activities of the PGL of M. leprae, is retained (10,14,40). Moreover, given that they are secreted, these compounds are ideally located to interact with host cells. Therefore, the role of p-HBAD in M. tuberculosis pathogenicity also needs to be addressed. Identification of the gene, Rv2962c, involved in the first glycosylation step during the biosynthesis of p-HBAD I and p-HBAD II provided us with a tool to investigate the contribution of these molecules to tuberculosis pathogenesis. Indeed, an H37Rv derivative containing a mutation in Rv2962c no longer synthesized p-HBAD.
Accordingly, we now have a set of three isogenic strains, H37Rv (PGL-tb−, p-HBAD+), H37Rv:pPET1 (PGL-tb+, p-HBAD+), and H37Rv ΔRv2962c::km (PGL-tb−, p-HBAD−). Comparison of their virulence and the histological features of animals infected with these strains should clarify the contribution of PGL-tb and p-HBAD to the pathogenesis of M. tuberculosis. In addition, during this study and the one presented in the accompanying paper, we have produced a collection of related M. tuberculosis strains producing various forms of PGL-tb and p-HBAD. If PGL-tb, p-HBAD, or both have a role in pathogenesis, this collection may provide a powerful tool to identify the structural determinants of these molecules important for their functions. The availability of both the purified molecules and the M. tuberculosis strains producing them will allow us to address these questions both in vitro and in the context of the live bacteria.
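The logic of the three-strain comparison can be laid out as a small lookup: a pair of strains is informative for one molecule when the two strains differ in that molecule alone. The sketch below encodes only the phenotypes listed in the text; the data structure and the function name are hypothetical and serve only to illustrate the reasoning.

```python
from itertools import combinations

# Phenotypes of the three isogenic strains as listed in the text
# (True = the molecule is produced, False = it is not).
STRAINS = {
    "H37Rv": {"PGL-tb": False, "p-HBAD": True},
    "H37Rv:pPET1": {"PGL-tb": True, "p-HBAD": True},
    "H37Rv ΔRv2962c::km": {"PGL-tb": False, "p-HBAD": False},
}


def informative_pairs(strains):
    """Return (strain_a, strain_b, molecule) for pairs differing in exactly one molecule."""
    pairs = []
    for (name_a, pheno_a), (name_b, pheno_b) in combinations(strains.items(), 2):
        differing = [m for m in pheno_a if pheno_a[m] != pheno_b[m]]
        if len(differing) == 1:
            pairs.append((name_a, name_b, differing[0]))
    return pairs


if __name__ == "__main__":
    for strain_a, strain_b, molecule in informative_pairs(STRAINS):
        print(f"{strain_a} vs {strain_b}: differ only in {molecule}")
```

With these phenotypes, H37Rv versus H37Rv:pPET1 isolates the contribution of PGL-tb, and H37Rv versus H37Rv ΔRv2962c::km isolates that of p-HBAD. The remaining pair differs in both molecules and is therefore not informative on its own.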