Gene Knockout Reveals a Novel Gene Cluster for the Synthesis of a Class of Cell Wall Lipids Unique to Pathogenic Mycobacteria*

Surface-exposed unusual lipids containing phthiocerol and phenolphthiocerol are found only in the cell wall of slow-growing pathogenic mycobacteria and are thought to play important roles in host-pathogen interaction. The enzymology and molecular genetics of biosynthesis of phthiocerol and phenolphthiocerol are unknown. We postulate the domain organization of a set of multifunctional enzymes and a cluster of genes (pps) that would encode these enzymes for the biosynthesis of phthiocerol and phenolphthiocerol. A cosmid containing the postulated pps gene cluster was identified by screening a genomic library of Mycobacterium bovis BCG with the postulated homologous domains from mycocerosic acid synthase and fatty acid synthase genes as probes. Homologous cosmids were also identified in the genomic libraries of Mycobacterium tuberculosis and Mycobacterium leprae. M. bovis BCG was transformed with a pps disruption construct, made from the BCG cosmid by introducing the hygromycin resistance gene as the positive-selectable marker and the sacB gene as the counter-selectable marker. Gene disruption by homologous recombination with double crossover was confirmed by polymerase chain reaction and Southern hybridization. Chromatographic analysis showed that the phenolphthiocerol derivative, mycoside B, and phthiocerol dimycocerosates were not produced by the gene knockout mutants. This result confirms the identity of the pps genes. With the identification of the pps gene clusters in both M. tuberculosis and M. leprae, it should be possible to test the postulated roles of these unique lipids in tuberculosis and leprosy.

Tuberculosis, caused by Mycobacterium tuberculosis, claims twice as many human lives as AIDS and more lives than any other infectious disease (1). The rapid spread of tuberculosis among the AIDS patients and the emergence of multidrug-resistant strains of M. tuberculosis demand the development of new drugs (2,3). Unfortunately, no new antimycobacterial target for drug development has been characterized in the past 25 years. Since the unique cell wall lipids of mycobacteria play a major role in survival of the pathogen in the hostile environment within the host (4-6), inhibition of synthesis of wall components can be an effective antimycobacterial therapy (7). Surface-exposed lipids such as phthiocerol esters and mycosides have been found to be unique to pathogenic mycobacteria (8-10) and play important functions in mycobacterial interaction with the host (11,12). These lipids contain two diols, phthiocerol and phenolphthiocerol, respectively, esterified with multiple methyl-branched long chain fatty acids called mycocerosic acids (8,12-14). Thus, biosynthesis of mycocerosic acids, phthiocerol, or phenolphthiocerol may provide new antimycobacterial drug targets and clues to pathogenesis of tuberculosis. We previously purified the mycocerosic acid synthase (MAS)¹ (15) and cloned and sequenced the gene encoding MAS (mas) (16).
Labeling studies using 14C-labeled propionate and acetate indicated that both are incorporated into the long chain diols (17-19). Therefore, it seems likely that the aliphatic chains are derived from both malonyl-CoA and methylmalonyl-CoA. However, nothing is known about the enzymes involved in their biosynthesis or the genes that encode them. Based on current biochemical concepts, here we postulate a unique biosynthetic pathway for phthiocerol and phenolphthiocerol and a genetic organization for a cluster of genes (pps) that would encode a set of enzymes to catalyze the synthesis of phthiocerol from docosanoyl and phenolphthiocerol from p-hydroxyphenylheptadecanoyl primers. We report the identification of the postulated pps gene cluster in Mycobacterium bovis BCG by demonstrating that disruption of this gene cluster abolishes the production of both phthiocerol and phenolphthiocerol derivatives. We also report that the same genes are present in M. tuberculosis and Mycobacterium leprae, the two most pathogenic species of slow-growing mycobacteria.

EXPERIMENTAL PROCEDURES
Screening of M. bovis BCG Genomic DNA Library-A cosmid DNA library of the BCG genome was screened by colony-blot hybridization at low stringency with active site motifs of the acyl transferase (AT; nucleotides 3259-3753) and ketoacyl synthase (KS; nucleotides 2061-2370) domains of the mas gene (16) as probes. The positive clones were rescreened with the AT (nucleotides 103-354) and KS (nucleotides 7534-7842) domains of the fatty acid synthase gene (fas) (20). DNAs from the cosmids found positive with both sets of probes were analyzed by Southern hybridization with the AT, KS, ketoreductase (KR; nucleotides 6847-7164), enoyl reductase (ER; nucleotides 6040-6622), and acyl carrier protein (ACP; nucleotides 7789-8079) domains of mas, and the AT and KS domains of fas. Two clones containing the original mas were discarded. The positive fragments from another clone were subcloned into pUC19 and sequenced. Sequences were searched for homology using the BLAST server. A total of 8.2 kb was sequenced from 12 segments of BCG cosmid 49, and on the basis of the homology search, this was selected as a candidate cosmid for the pps gene cluster and found to be homologous to M. leprae cosmid L518 (GenBank™ accession number U00023).
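The two-step probe screen described above amounts to a set intersection followed by an exclusion. The following is only an illustrative sketch in Python, not part of the original procedure: the function name is hypothetical, clone names D, G, and 49 are those given in the Results, and clones "X1" and "X2" are invented to stand for clones positive with only one probe set.

```python
def select_candidate_cosmids(mas_positive, fas_positive, contains_mas):
    """Keep clones positive with both probe sets, then drop clones carrying mas itself."""
    positive_with_both = set(mas_positive) & set(fas_positive)
    return positive_with_both - set(contains_mas)


# Hypothetical hybridization calls; "X1" and "X2" are invented clone names.
mas_hits = {"D", "G", "49", "X1"}   # positive with AT/KS probes from mas
fas_hits = {"D", "G", "49", "X2"}   # positive with AT/KS probes from fas
mas_carriers = {"D", "G"}           # clones found to contain the original mas gene

candidates = select_candidate_cosmids(mas_hits, fas_hits, mas_carriers)
print(sorted(candidates))
```

With these inputs only cosmid 49 remains, which mirrors the outcome reported for the BCG library.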
Construction of a Plasmid for pps Gene Disruption-An 8.2-kb BamHI fragment was isolated from the BCG cosmid 49 by hybridization with a 6.6-kb coding segment, obtained by PCR from M. leprae cosmid L518. The 8.2-kb fragment was cloned into pUC19 to produce plasmid 49-8B. An internal 1.9-kb fragment was deleted from 49-8B with NruI and replaced with a 1.5-kb Klenow-filled BglII-SmaI fragment, containing the hygromycin resistance gene (hyg), from plasmid pIJ963 (21) to generate pAKD9. A plasmid, pAZ1.1, was created by cloning a 1.95-kb PCR fragment from pCVD442 (22), containing the sacB gene of Bacillus subtilis (23), into the SmaI site of pUC19. A 7.9-kb BamHI fragment was isolated from pAKD9 and ligated to the unique BamHI site of pAZ1.1 to construct pAKD10 (Fig. 2D), which thus contains sacB and hyg, the latter flanked by 3.8- and 2.5-kb mycobacterial DNA, in pUC19, a vector non-replicative in mycobacteria.
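As a consistency check on the fragment sizes quoted above, the arithmetic can be sketched in a few lines of Python. This is an illustrative calculation only: the constant and function names are hypothetical, and the sizes are the rounded kb values from the text converted to base pairs.

```python
# Sizes in base pairs, converted from the rounded kb values quoted in the text.
BAMHI_FRAGMENT = 8200   # BamHI fragment from cosmid 49
NRUI_DELETION = 1900    # internal fragment removed with NruI
HYG_CASSETTE = 1500     # BglII-SmaI fragment carrying hyg
FLANKS = (3800, 2500)   # mycobacterial DNA on either side of hyg


def disrupted_fragment_size(fragment, deletion, cassette):
    """Size of the fragment after replacing the deleted segment with the cassette."""
    return fragment - deletion + cassette


expected = disrupted_fragment_size(BAMHI_FRAGMENT, NRUI_DELETION, HYG_CASSETTE)
flanks_match = sum(FLANKS) == BAMHI_FRAGMENT - NRUI_DELETION

print(expected)      # 7800
print(flanks_match)  # True
```

The computed size of about 7.8 kb is close to the 7.9-kb BamHI fragment reported for pAKD9; the small difference presumably reflects rounding of the estimated sizes, which is an assumption here rather than something stated in the text.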
Transformation of BCG with pAKD10 and Screening of Transformants-Construct pAKD10 was linearized with XbaI and electroporated into BCG competent cells as described previously (24). Transformant colonies grown on 7H11 agar (Difco) medium containing hygromycin B (50 µg/ml; Calbiochem) were individually transferred to 3 ml of M-ADC-TW broth (25) in 15-ml tubes and grown with shaking in a rotator (Thomas Scientific, Swedesboro, NJ). Culture from each tube was then streaked onto 7H11 agar containing hygromycin B and 10% sucrose. BCG transformants grown on sucrose were regrown in 100 ml of broth in the presence of hygromycin B for extraction of genomic DNA for use in Southern hybridization and PCR.
Genomic DNA Preparation and Southern Hybridization-Genomic DNA from BCG cultures was extracted as described (25) after disrupting the cells with 425-600-µm glass beads (Sigma) in a Mini-Beadbeater cell disruptor (Biospec Products, Bartlesville, OK). Southern hybridization of cosmid or genomic DNA was performed according to standard procedures as described previously (24).
Radiolabeling Studies and Biochemical Analysis of Cell Wall Lipids from BCG Wild Type and pps Mutant Strains-To 100 ml of a 12-day-old culture of each of BCG wild type and its pps-disrupted mutant, 42 µCi of Na [1-14C]propionate (specific activity, 55 Ci/mol; American Radiolabeled Chemicals) was added, and incubation of the cultures was continued in roller bottles for a further 16 h. The cells were collected by centrifugation and extracted with an excess of a 2:1 mixture of chloroform and methanol. The total lipids were recovered by the Folch method (26), and aliquots were assayed for 14C. The total lipid was subjected to thin-layer chromatography (TLC) on 20 × 20 cm × 0.5-mm layers of silica gel G with a 19:1 (v/v) mixture of chloroform and methanol as the solvent. Chromatograms were scanned for 14C with a Berthold Tracemaster 20 TLC-linear analyzer. The silica gel from the labeled area, representing phthiocerol and other wax esters, was scraped off, and lipids were recovered by elution with ethyl ether. This lipid fraction was applied to a TLC plate and developed in a 9:1 (v/v) mixture of n-hexane and diethyl ether as the solvent. Lipids from non-radioactive cells from 12-day-old wild type and gene-disrupted mutant cultures were also analyzed using the same methods. For detection of lipids, chromatograms were sprayed with 5% K2Cr2O7 in 50% H2SO4 and heated for 15 min at 180°C. Identification of mycoside B and dimycocerosyl phthiocerol was done by comparison of their RF values with those of authentic standards obtained from Professor P. J. Brennan, Colorado State University, and by hydrolysis followed by radio-gas chromatography as described previously (24).
Extraction of Total Proteins and Electrophoresis-Cells from 200 ml of culture of each of BCG wild type and pps mutant were harvested by centrifugation, disrupted in a French press, and the total proteins were extracted as described previously (15). Proteins were analyzed by SDS-polyacrylamide gel electrophoresis with a 3% stacking gel and a 5% running gel. For detection of protein bands, the gel was silver-stained by using the Bio-Rad silver stain kit (Bio-Rad) according to the manufacturer's instructions.
MAS Activity Determination-The cell-free extract was fractionated by DEAE-Sephadex chromatography, and MAS activity was measured with labeled methylmalonyl-CoA as described (15).

RESULTS
Identification of the Cosmids Containing the Postulated pps Gene Cluster in Pathogenic Mycobacteria-The structures of the closely related diols, phthiocerol and phenolphthiocerol, suggest that they can be produced from two different starter molecules using identical series of biochemical reactions as indicated in Fig. 1. Based on the current knowledge of fatty acid synthase (FAS), MAS, and polyketide synthase, the enzyme domains required to catalyze each biosynthetic step can be postulated as shown in Fig. 1. The open reading frames (ORFs) required to encode these domains could be organized in the mycobacterial genome in a cluster, designated pps1, pps2, pps3, pps4, and pps5 (Fig. 1).
The biosynthesis of both diols involves synthesis of a straight chain segment followed by synthesis of methyl-branched segments. FAS-like domains might be involved in the synthesis of the straight chain segment by incorporating malonyl-CoA, whereas methylmalonyl-CoA-specific MAS-like domains might synthesize the branched segments. The AT and KS domains of MAS show a high degree of homology to the corresponding domains of erythronolide synthase (eryA) that uses methylmalonyl-CoA as substrate (27,28), whereas the other domains of MAS showed equal homology to the corresponding domains of both FAS and eryA (16). Therefore the AT and KS domains of MAS were suggested to be selective for methylmalonyl-CoA, and this suggestion was recently experimentally verified with expressed AT and KS domains of MAS.² Thus, AT and KS domains of FAS and MAS might selectively recognize the pps domains that encode enzymes involved in the synthesis of n- and methyl-branched segments of the diols, respectively. Therefore, a cosmid library of the M. bovis BCG genomic DNA was screened with AT and KS domains of mas, and the positive clones were rescreened with AT and KS domains of fas to select clones that hybridize with both sets of domains. This approach yielded three cosmid clones (D, G, and 49), which were further analyzed by Southern hybridization with AT, KS, KR, ER, and ACP domains of mas and AT and KS domains of fas. DNA sequence analysis of segments of these cosmids revealed that the overlapping cosmids D and G contain the original mas gene. The nucleotide sequences of the 12 segments of cosmid 49, indicated in Fig. 2, showed 85% identity with sequences in M. leprae cosmid L518 (GenBank™ accession number U00023). Analysis of the L518 sequence showed that nucleotides 56-25,563 of this cosmid contain domains organized as postulated for pps genes and are depicted in Fig. 2A. The DNA sequence from nucleotides 25,005-31,307 of the M. tuberculosis cosmid MTCY338 (GenBank™ accession number Z74697) contained pps1 and part of pps2, whereas nucleotides 1-15,409 of cosmid MTCY19H9 (GenBank™ accession number Z83857) contained part of pps3 and the entire pps4 and pps5 (Fig. 2B). The nucleotide sequence of the 12 different segments totalling 8.
Disruption of the pps Genes by Allelic Exchange-To test whether the postulated pps gene cluster actually encodes the proteins responsible for phthiocerol and/or phenolphthiocerol synthesis, the effect of disrupting the pps gene cluster in M. bovis BCG was investigated. To achieve allelic exchange, a construct (pAKD10) was made that contained a segment of the pps gene cluster, a small portion of the 3′ region of pps1, the entire pps2, and the 5′-half portion of pps3 (Fig. 2, C and D). In this gene segment, parts of pps2 and pps3 were deleted and replaced with hyg. As a counter-selectable marker, the sacB gene (23) was introduced into the vector adjacent to the cloned pps gene for positive selection of the allelic exchange mutants. Allelic exchange is only possible by homologous recombination with double crossover, which would insert the hyg gene but not the sacB gene into the mycobacterial genome. Since the expression of the intact sacB gene would not allow growth on sucrose (29), transformants that have undergone allelic exchange would be hygromycin-resistant and would grow on sucrose. On the other hand, nonhomologous or single-crossover recombinants should be hygromycin-resistant but unable to grow on sucrose. Twenty-five hygromycin-resistant BCG transformants generated by electroporation of the construct were screened for growth on sucrose. Five transformants grew on sucrose, and they were tested for homologous recombination using PCR to amplify both sides of the inserted hyg gene. With two sets of primers (H1-D1 and S2-D2; Fig. 2F), each representing a hyg primer and a primer in the mycobacterial genome outside the pps gene used for making the disruption construct, three transformants (BPS-1, BPS-2, and BPS-3) yielded a 2.6-kb and a 3.9-kb PCR product as expected from homologous recombination with double crossover (Fig. 3A; shown for BPS-1 and BPS-2). When the deleted fragment (P in Fig. 2E), the sacB gene, and the hyg gene (Fig. 2D) were used as probes in Southern hybridization of the genomic DNA from two of these transformants, they showed the absence of the genomic segment deleted in making the disruption construct, the absence of the sacB gene, and the presence of the hyg gene (Fig. 3B). These results confirmed that the pps gene cluster was disrupted by allelic exchange.
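The hyg/sacB selection logic described above can be written out as a small decision rule. The Python below is a minimal sketch under stated assumptions, not a model of the actual experiment: the class, field, and function names are hypothetical, and the example transformants are invented to illustrate the three outcomes. Growth on sucrose marks only a candidate, because a mutated sacB could also permit growth, so PCR and Southern confirmation are still needed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transformant:
    name: str
    hyg_in_genome: bool          # hyg marker integrated into the genome
    intact_sacB_in_genome: bool  # functional sacB integrated into the genome


def grows_on_hygromycin(t: Transformant) -> bool:
    return t.hyg_in_genome


def grows_on_sucrose(t: Transformant) -> bool:
    # Expression of intact sacB prevents growth on sucrose.
    return not t.intact_sacB_in_genome


def classify(t: Transformant) -> str:
    if not grows_on_hygromycin(t):
        return "no integration"
    if grows_on_sucrose(t):
        return "candidate double crossover (confirm by PCR/Southern)"
    return "single crossover or nonhomologous integration"


# Invented examples covering the three outcomes.
examples = [
    Transformant("double", hyg_in_genome=True, intact_sacB_in_genome=False),
    Transformant("single", hyg_in_genome=True, intact_sacB_in_genome=True),
    Transformant("none", hyg_in_genome=False, intact_sacB_in_genome=False),
]
calls = {t.name: classify(t) for t in examples}
for name, call in calls.items():
    print(name, "->", call)
```

In the screen reported here, 5 of 25 hygromycin-resistant transformants grew on sucrose, and 3 of those 5 gave the PCR products expected for a double crossover.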
Biochemical Analysis of Lipids from the pps Mutant-If the postulated gene cluster is responsible for the production of phthiocerol and/or phenolphthiocerol, the mutants should be unable to produce their naturally occurring derivatives, phthiocerol dimycocerosates and/or mycoside B, respectively. Thin-layer chromatographic analysis of the lipids from the mutant showed that mycoside B was absent in the mutant (Fig. 4A). To test for the production of phthiocerol dimycocerosates, the wax fraction recovered from the above chromatographic step was subjected to TLC in a less polar solvent system. The results showed that the wild type contained a component that had an RF identical to that of phthiocerol dimycocerosates, whereas the mutant did not contain this lipid (Fig. 4B). Incubation of the cultures with [1-14C]propionate and analysis of the lipids also confirmed that the mutant was incapable of synthesizing mycoside B and phthiocerol dimycocerosates (data not shown). Radio gas-liquid chromatographic analysis of the fatty acids showed that the mutant did not incorporate labeled propionate into the long chain mycocerosic acids found in the wild type M. bovis BCG (data not shown). To test whether the inability of the pps gene-disrupted mutant to produce mycocerosic acids is caused by some indirect effect resulting in interference with the mas gene functions, we examined the MAS level. SDS-polyacrylamide gel electrophoresis analysis showed that the mutant produced as much MAS as the wild type (data not shown). DEAE-Sephadex fractionation (15) and enzymatic activity measurements showed that the pps gene-disrupted mutant contained as much MAS activity as the wild type.
² N. Fernandes and P. E. Kolattukudy, manuscript in preparation.

DISCUSSION
Molecular strategies used for the synthesis of fatty acids and polyketides allow us to postulate reasonable mechanisms for the biosynthesis of the two diols, phthiocerol and phenolphthiocerol, found uniquely in pathogenic mycobacteria.
Both diols have a segment that should be derived from malonyl-CoA and a segment that should be derived from methylmalonyl-CoA. The position of the two hydroxyl groups in both suggests that they are retained during the synthesis of the aliphatic chain rather than introduced into preformed chains. Thus, an enzyme module that synthesizes this segment would have AT, KS, and KR domains but no DH or ER domains. A single enzyme module (as a dimer) is used only when the synthesis of the entire product involves repetition of all of the steps, as in the case of FAS or MAS using malonyl-CoA and methylmalonyl-CoA, respectively. When the synthesis involves multiple dissimilar modules, each enzyme usually catalyzes only one cycle and does not repeat the process. We therefore postulate that two modules, each with only AT, KS, and KR, sequentially catalyze condensation with malonyl-CoA, each time followed by reduction, to generate the 1,3-diol function. Since the next step would involve extension by a fully reduced two-carbon unit, we postulate the need for a full FAS-like module that would use malonyl-CoA as the substrate. The next step should be chain extension by a fully reduced three-carbon unit with a methyl branch by a complete MAS-like module. The final process could be condensation with methylmalonyl-CoA, requiring a module containing KS and AT domains resembling the corresponding domains of MAS. The resulting 3-keto acid could undergo decarboxylation, yielding a keto derivative. Reduction of the carbonyl and methylation of the resulting hydroxyl group would complete the process. The postulated keto intermediates in the synthesis of both phthiocerol and phenolphthiocerol occur naturally in mycobacteria (8,14,30).
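The relationship between the domain content of each postulated module and the state of the carbon it leaves behind can be sketched as a lookup. The Python below is a hedged illustration, not the authors' model: the module labels are hypothetical and are deliberately not mapped onto pps1 to pps5, since that assignment is given only in Fig. 1. The rule itself is the standard one for fatty acid and polyketide synthases: no KR leaves a keto group, KR alone leaves a hydroxyl, KR plus DH leaves a double bond, and KR, DH, and ER together give a fully reduced unit.

```python
from typing import NamedTuple


class Module(NamedTuple):
    label: str          # hypothetical label, not a gene name
    domains: frozenset  # catalytic domains present in the module
    extender: str       # extender unit used for condensation


def beta_carbon_state(domains) -> str:
    """State of the beta-carbon left by one extension cycle."""
    if "KR" not in domains:
        return "keto"
    if "DH" not in domains:
        return "hydroxyl"
    if "ER" not in domains:
        return "enoyl"
    return "fully reduced"


FULL = frozenset({"AT", "KS", "KR", "DH", "ER"})

# Order of modules as postulated in the paragraph above.
MODULES = [
    Module("module_1", frozenset({"AT", "KS", "KR"}), "malonyl-CoA"),
    Module("module_2", frozenset({"AT", "KS", "KR"}), "malonyl-CoA"),
    Module("module_3", FULL, "malonyl-CoA"),        # FAS-like
    Module("module_4", FULL, "methylmalonyl-CoA"),  # MAS-like
    Module("module_5", frozenset({"AT", "KS"}), "methylmalonyl-CoA"),
]

states = [beta_carbon_state(m.domains) for m in MODULES]
for m, state in zip(MODULES, states):
    print(f"{m.label}: {m.extender} -> {state}")
```

The first two modules each leave a hydroxyl group, giving the 1,3-diol; the next two add fully reduced units; and the last leaves the keto function that is postulated to be decarboxylated, reduced, and methylated.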
Finding of the postulated specific and unique combination of domains organized in a modular manner in the mycobacterial genome supports our hypothesis that this cluster represents phthiocerol and/or phenolphthiocerol synthase genes. The postulated need for mas-like and fas-like AT and KS domains to synthesize the straight chain and methyl-branched segments prompted us to search for mycobacterial genomic segments that would hybridize with both types of domains. This approach led us to the BCG cosmid 49 as the one containing the postulated pps genes. The sequence of cosmid 49 that we had generated helped us to recognize the pps gene cluster when M. tuberculosis and M. leprae genomic sequences became available as part of the mycobacterial genome project.
Allelic exchange or gene knockout by homologous recombination is an efficient tool for generating defined mutants to study gene function(s). Gene knockout proved difficult in slow-growing mycobacteria such as M. tuberculosis and M. bovis BCG (31). So far, only three reports on gene disruption by allelic exchange in this group of mycobacteria are available (24,32,33). One of these reports involved disruption of the ureC gene with pH indicator screening for urease production (32). Another involved the leuD gene, and disruption resulted in leucine auxotrophic mutants (33). Since no general convenient method is available for screening the gene-disrupted mutants, we used PCR to identify mas gene-disrupted mutants (24). In the present work, we used the sacB gene from B. subtilis (23) as a sucrose sensitivity counter-selectable marker to avoid the need for extensive screening to isolate the allelic exchange mutants. Although some of the mutants that grow on sucrose may carry a mutated sacB gene and not be the desired knockout mutants, this approach should be useful as a general method for screening for homologous recombinants with double crossover in the slow-growing mycobacteria, because the sucrose-grown transformants are highly enriched in such mutants.
Disruption of the gene encoding some of the modules involved in the biosynthesis of phthiocerol and/or phenolphthiocerol should prevent the synthesis of the diol(s) if another gene cluster in the genome cannot substitute for the disrupted one. Even though pks-like genes appear to be widely distributed in mycobacterial genomes, we could not find another cluster in our search by hybridization with fas and mas domains. Our disruption construct was designed in such a way that pps2 and pps3 would be disrupted. The gene-disrupted mutant was incapable of producing mycoside B and phthiocerol dimycocerosates, the known end products of the diol synthase, showing that the disrupted genes encode enzymes involved in the biosynthesis of both diols. This cluster of genes is therefore designated phthiocerol/phenolphthiocerol synthase genes, pps. These results show that the same set of enzymes synthesizes both diols using two different starting acyl moieties as shown in Fig. 1. The relative amounts of the diols are probably determined by the availability of the starting acyl moieties.
Radio gas-liquid chromatographic analysis of fatty acids generated from [1-14C]propionic acid by the pps-disrupted mutant showed no mycocerosic acids, whereas such acids are the major labeled fatty acids in the wild type (24). That the lack of such acids was not due to any indirect interference on MAS synthesis was shown by the fact that the mutant had normal levels of enzymatically active MAS. Mycocerosic acids occur mainly as esters of phthiocerol and phenolphthiocerol (8,12,14). In the absence of these diols, mycocerosic acids are either not synthesized or are degraded, as they cannot be esterified to the diols.
The pps-disrupted mutant should be useful in testing the postulated roles for mycoside B and phthiocerol dimycocerosates in mycobacterial interaction with the host. Comparison of this mutant, which lacks both mycoside B and phthiocerol dimycocerosates, with the mas-disrupted mutant, which only lacks the former, should provide information on the function of phthiocerol dimycocerosates. The rapidly expanding information on the genomic sequence of pathogenic mycobacteria can be used if the functional genes can be identified. The present results constitute an example of how the sequence emerging from the genome project can be used to identify functional genes. The gene-disrupted mutants should be useful in testing the postulated roles of the phenolic glycolipids such as inhibiting the production of reactive oxygen species by mononuclear cells (34) and promoting mycobacterial multiplication inside the host (35). Since the pps genes are unique to pathogenic mycobacteria, they may be used as targets in high throughput screening for antimycobacterial drugs desperately needed to combat a major threat to human health.