A Major Cell Wall Lipopeptide of Mycobacterium avium subspecies paratuberculosis*

Mycobacterium avium subspecies paratuberculosis (MAP), the causative agent of Johne disease in cattle and other ruminants, is proposed to be at least one of the causes of Crohn disease in humans. MAP and Mycobacterium avium subspecies avium, a closely related opportunistic environmental bacterium, share 95% of their genes and exhibit homologies of more than 99% between these genes. The identification of molecules specific for MAP is essential for understanding its pathogenicity and for development of useful diagnostic tools. The application of gas chromatography, mass spectrometry, and nuclear magnetic resonance led to the structural identification of a major cell wall lipopeptide of MAP, termed Para-LP-01, defined as C20 fatty acyl-D-Phe-N-Me-L-Val-L-Ile-L-Phe-L-Ala methyl ester. Variations of this lipopeptide with different fatty acyl moieties (C16 fatty acyl through C17, C18, C19, C21 to C22) were also identified. Besides the specificity of this lipopeptide for MAP, the presence of an N-Me-L-valine represents the first reported N-methylated amino acid within an immunogenic lipopeptide of mycobacteria. Sera from animals with Johne disease, but not sera from uninfected cattle, reacted with this lipopeptide, indicating potential biological importance.

Mycobacterium avium subspecies paratuberculosis (MAP) and Mycobacterium avium subspecies avium (MAA) are closely related subspecies. They share 95% of their genes and exhibit homologies of more than 99% between these genes (1). MAP can be distinguished from MAA by the presence of IS900; however, there is a very similar repetitive element (IS901) in MAA. MAP also differs from MAA by its growth characteristics: 1) MAP grows more slowly, with a generation time of 22-26 h compared with 10-12 h for MAA, and 2) MAP requires the siderophore Mycobactin J for growth (MAA can synthesize this siderophore) (2,3). MAP also lacks the antigenic glycopeptidolipids that serve as the basis for serotyping the M. avium complex, although rough colony variants of MAA are also devoid of these molecules. However, none of these differences helps to explain why MAP occupies a specific biological niche distinct from MAA. MAP is the causative agent of Johne disease in cattle and other ruminants and is proposed to be at least one of the causes of Crohn disease in humans (2). Both diseases present as a chronic inflammation of the bowel. MAA is an environmental bacterium found in soil and water that causes opportunistic infections in humans, including lymphadenitis in children, "hot tub lung" in the general population, and systemic infection in immunocompromised individuals (4). MAP-specific traits have recently been defined by sequencing the genomes of strains of each subspecies (1) and by identifying subspecies-specific genes (5,6), but more investigations are necessary, especially in the fields of proteomics, lipidomics, glycomics, and gene regulation. Lipidomics is a promising area of study because members of the family Mycobacteriaceae contain large numbers of complex lipids in their cell wall, and we recently identified several cell envelope and culture filtrate lipids present in MAP strain K-10 but absent from MAA strain 2151. Here we report the chemical structure and seroreactivity of Para-LP-01, a major cell wall-associated MAP-specific lipopeptide.

EXPERIMENTAL PROCEDURES
Chemical Reagents-All chemical reagents were of the highest grade from Sigma unless otherwise specified.
Bacterial Growth-MAP strain K-10 is a bovine isolate from Nebraska that was provided by V. Kapur (University of Minnesota). MAA strain 2151 is a human sputum isolate (7). They were grown on Middlebrook 7H11 agar plates supplemented with 10% OADC (oleic acid, albumin, dextrose, catalase); for MAP, Mycobactin J (2 μg/ml) (Allied Monitor, Fayette, MO) was also added. After 4 weeks (MAA) or 12 weeks (MAP) of growth at 37°C, cells were harvested by scraping colonies off the plates into phosphate-buffered saline, pH 7.0, and centrifuging at 3,500 rpm for 30 min. Cell pellets were then lyophilized.
Extraction of Total Lipids and Isolation of Lipopeptide Para-LP-01-Total lipids were extracted from lyophilized cells with chloroform/methanol (2:1) (30 ml/g dried cells) at 55°C for 3 h. The extracts were dried under nitrogen and Folch-washed with 6 ml of chloroform/methanol (2:1) and 1 ml of water. The organic layer was transferred to a new tube and dried.
For visualization, total lipids were resuspended in chloroform/methanol (2:1) at a concentration of 10 mg/ml and then separated by two-dimensional thin layer chromatography (TLC) on aluminum-backed silica gel 60 F254 plates (EMD Chemicals, Gibbstown, NJ) using chloroform/methanol (96:4) in the first dimension and toluene/acetone (80:20) in the second dimension. Plates were sprayed with 10% copper sulfate in 8% phosphoric acid and the lipids visualized by heating. Total lipids and lipid Para-LP-01 were further analyzed by differential spraying of the plates to detect carbohydrates (using α-naphthol), free amino groups (using ninhydrin), and phosphates (using ceric ammonium molybdate) after heating.
The lipopeptide Para-LP-01 was purified by preparative TLC. It was first scraped from plates run in chloroform/methanol 96:4, and then the lipopeptide was subjected to a second preparative TLC purification using the solvent system of toluene/acetone 80:20. For extraction from the silica, chloroform/methanol (2:1) was used in an incubation at 4°C overnight. Extracts were dried under nitrogen and purified by Folch wash, and the organic layer was transferred to a new tube. The dried organic layer was suspended in chloroform/methanol (2:1) at a concentration of 10 mg/ml.
Detection of Amino Acids-The EZ:faast amino acid analysis kit (Phenomenex Inc., Torrance, CA) was used to identify amino acids following the hydrolysis of 100 μg of lipid Para-LP-01 with 6 N HCl overnight at 110°C. Derivatized amino acids were analyzed by gas chromatography coupled to mass spectrometry (GC/MS). Specifically, the analyte was applied to a DB-5 column (10 m × 0.18-mm internal diameter, 0.18-μm film thickness) (J&W Scientific, Folsom, CA) on a ThermoQuest Trace gas chromatograph 2000 (ThermoQuest, Austin, TX) connected to a GCQ/Polaris MS mass detector (ThermoQuest) at an initial temperature of 60°C for 1 min, increasing to 130°C at 30°C/min, and finally to 280°C at 5°C/min.
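As a quick check on the oven program described above, the length of one GC run can be worked out from the stated hold time and ramp rates. The sketch below is illustrative only: the function name is hypothetical, and it assumes there is no final hold at 280°C, which the text does not specify.

```python
def gc_run_time(initial_hold_min, start_temp_c, ramps):
    """Return total run time (min) for an initial hold followed by linear ramps.

    ramps is a list of (target_temp_c, rate_c_per_min) tuples, applied in order.
    Assumes no hold at the end of any ramp (an assumption, not from the text).
    """
    total = initial_hold_min
    temp = start_temp_c
    for target, rate in ramps:
        total += (target - temp) / rate  # time needed for this linear ramp
        temp = target
    return total


# Program from the text: 60°C for 1 min, to 130°C at 30°C/min, to 280°C at 5°C/min
run_time = gc_run_time(1.0, 60.0, [(130.0, 30.0), (280.0, 5.0)])
print(f"{run_time:.2f} min")
```

Under these assumptions the program lasts a little over 33 min, so the retention times reported in the Results (under 8 min) fall early in the second ramp.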
Analysis of Fatty Acids-Lipid Para-LP-01 was first hydrolyzed with 6 N HCl overnight at 110°C followed by 3 N HCl in methanol at 85°C for 16 h. The sample was treated with TRI-SIL reagent (Pierce Biotechnology, Rockford, IL) for 20 min at 70°C. Analysis of the trimethylsilylated compounds was performed by GC/MS using the same column and temperature program described for amino acids.
D and L Determination of Amino Acids-To determine the enantiomeric form of the amino acids, derivatizations with (R)-(−)-2-butanol and (S)-(+)-2-butanol were performed as described by Husek and Macek (8). The O-butyl, N-heptafluorobutyryl amino acid butyl derivatives were analyzed by GC/MS as described above.
Partial Hydrolysis of Lipopeptide Para-LP-01-Hydrolysis of the lipid was performed with 6 N HCl at 110°C for 15, 40, 90, and 180 min.
Permethylation of Lipid Para-LP-01-The deuteromethylated derivative of lipopeptide Para-LP-01 was generated by permethylation as described by Ciucanu and Kerek (9).
ES/MS-Electrospray/mass spectrometry (LC/MS-ES/MS) was performed with an LCQ electrospray ion trap mass spectrometer (ThermoFinnigan, San Jose, CA). Lipid Para-LP-01 and its deuteropermethylated derivative were suspended in methanol, 0.1% trifluoroacetic acid at a concentration of 1 mg/ml. Ionization was achieved with spray voltages of 3 and 4.5 kV for the deuteromethylated and native lipid, respectively. Fragmentation of the major positive ions was performed using 30-60% normalized collision energy.
NMR Spectroscopy-One-dimensional ¹H and ¹³C nuclear magnetic resonance (NMR) and ¹H-¹H shift-correlated spectroscopy (COSY) NMR were performed on a Varian Inova 400 MHz spectrometer, and two-dimensional ¹H-¹³C heteronuclear single quantum correlation spectroscopy (HSQC) NMR spectra were acquired on a Varian Inova 500 MHz NMR spectrometer using the supplied Varian pulse sequences. NMR spectroscopy was performed in the Central Instrument Facility, Dept. of Chemistry, Colorado State University.
ELISA-Enzyme-linked immunosorbent assay (ELISA) was performed at room temperature in a 96-well microtiter plate with bovine sera (generous gift from Mycos, Fort Collins, CO) tested by IDEXX HerdChek (Mycobacterium paratuberculosis test kit for Johne disease) (IDEXX, Westbrook, ME) at the Diagnostic Laboratory, James L. Voss Veterinary Teaching Hospital, Colorado State University. Strong IDEXX-positive (IDEXX values 1.33-2.60) and IDEXX-negative (IDEXX values 0-0.1) sera were used to make serial 2-fold dilutions from 1:1 to 1:512. All sera and antibody dilutions were made with 10% fetal bovine serum in phosphate-buffered saline (pH 7.4). Lipid Para-LP-01 was suspended in hexane, sonicated for 3 min, and 100 μg was loaded into each well and air dried. Blocking was performed with 200 μl of blocking buffer (3% bovine serum albumin in phosphate-buffered saline, pH 7.4) for 1 h. After removing the blocking solution, serial dilutions of the bovine sera (200 μl) were added to duplicate wells and incubated for 2 h. Wells were washed five times with 200 μl of blocking buffer, and then 100 μl of the secondary antibody (sheep anti-bovine IgG coupled to horseradish peroxidase (Bethyl Laboratories, Montgomery, TX)) diluted 1:2000 was added and incubated for 2 h. Finally, the wells were washed five times with 200 μl of phosphate-buffered saline (pH 7.4) before 100 μl of 3,3′,5,5′-tetramethylbenzidine was added. After 5 min the reaction was stopped with 100 μl of 2 N sulfuric acid and the A450 was determined with a plate reader model 680 (Bio-Rad).
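The serial 2-fold dilution scheme described above can be enumerated to see how many wells each serum occupies. This is a minimal sketch; the function name is hypothetical, and the well count simply combines the stated dilution range with the stated use of duplicate wells.

```python
def twofold_series(start=1, end=512):
    """Return the dilution factors of a serial 2-fold series from 1:start to 1:end."""
    dilutions = []
    factor = start
    while factor <= end:
        dilutions.append(factor)
        factor *= 2  # each step halves the serum concentration
    return dilutions


series = twofold_series()
labels = [f"1:{d}" for d in series]
wells_per_serum = len(series) * 2  # duplicate wells per dilution, as in the text
print(labels)
print(wells_per_serum)
```

The range 1:1 to 1:512 gives ten dilution steps, so each serum takes 20 wells of a 96-well plate when run in duplicate.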

RESULTS
A lipid designated Para-LP-01 was recently identified in a lipidomic analysis of MAP strain K-10 in comparison to MAA strain 2151. It was selected for further characterization because it is a major cell wall-associated lipid, was well isolated when the total lipids were separated by two-dimensional TLC using an apolar solvent system, and was present only in MAP (Fig. 1). Initial analysis by two-dimensional TLC followed by differential spraying suggested that Para-LP-01 lacked carbohydrates, free amino groups, and phosphate groups. GC/MS analysis of the derivatized amino acids using the EZ:faast protocol yielded three major components with retention times of 4.75, 5.47, and 7.97 min, corresponding to standards for alanine, isoleucine, and phenylalanine, respectively. A minor component with a retention time of 4.78 min that correlated with a valine standard was also identified.
The enantiomeric forms of the amino acids were determined by analyzing the O-butyl, N-heptafluorobutyryl amino acid butyl derivatives. GC analysis demonstrated that the amino acids alanine, valine, and isoleucine were present only in the L configuration, whereas phenylalanine was detected in both the L and D configurations. Thus, these analyses indicated that lipid Para-LP-01 is composed of a saturated C20 fatty acid (as the major fatty acid moiety) linked to a peptide, possibly a pentapeptide, containing D- and L-phenylalanine, L-alanine, L-valine, and L-isoleucine. These findings were consistent with the initial two-dimensional TLC analysis that identified Para-LP-01 as a nonpolar lipid.
FEBRUARY 24, 2006 • VOLUME 281 • NUMBER 8
Structural Analysis of Lipopeptide Para-LP-01 by FAB-MS and MALDI-TOF Analyses-Positive ion FAB-MS analysis of lipopeptide Para-LP-01 resulted in several ions ranging from m/z 500 to 1100 (Fig. 2). The major ion of the molecule was m/z 918.6, and ions found at m/z 815.6, 668.5, and 555.4 were identified as putative fragmentation ions. Associated with each of these ions were minor ions differing by ± 28 atomic mass units representing the different fatty acid chain lengths. The differences between the major and fragmentation ions were 103, 147, and 113 atomic mass units (m/z 918.6 to 815.6, 815.6 to 668.5, and 668.5 to 555.4, respectively). These differences corresponded to the masses of alanine (methyl ester), phenylalanine, and isoleucine, respectively. In addition, minor ions were obtained ± 14 atomic mass units and −42 and −60 atomic mass units apart from the major ion of m/z 918.6, suggesting that seven different fatty acyl chains were linked to a pentapeptide core (Fig. 2B).
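The assignment of the FAB-MS fragment ladder can be illustrated with a short calculation: take the successive differences between the reported ions and compare them with residue masses. The sketch below uses standard monoisotopic residue masses (textbook values, not taken from this study); the candidate list and the 0.2-unit tolerance are assumptions chosen for illustration.

```python
# Ions reported in the text (m/z), from the major ion down the fragment ladder
ions = [918.6, 815.6, 668.5, 555.4]

# Standard monoisotopic masses (Da); illustrative candidates only
ALA, VAL, ILE, PHE = 71.037, 99.068, 113.084, 147.068
METHOXY, HYDROGEN = 31.018, 1.008
candidates = {
    "Ala methyl ester (neutral H-Ala-OCH3)": ALA + METHOXY + HYDROGEN,
    "Ala": ALA,
    "Val": VAL,
    "Ile (or an isomer)": ILE,
    "Phe": PHE,
}


def assign(delta, tol=0.2):
    """Return the candidate names whose mass lies within tol of delta."""
    return [name for name, mass in candidates.items() if abs(mass - delta) <= tol]


steps = []
for upper, lower in zip(ions, ions[1:]):
    delta = round(upper - lower, 1)
    steps.append((delta, assign(delta)))

for delta, names in steps:
    print(delta, names)
```

Each difference matches exactly one candidate at this tolerance, which is consistent with the C-terminal order Ile-Phe-Ala methyl ester read from the ladder. A mass of 113 alone cannot distinguish isoleucine from its isomers.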

The finding of seven different fatty acyl chains was supported by MALDI-TOF analysis (Fig. 3). This analysis showed seven peaks that differed by 14 atomic mass units in their masses for native Para-LP-01, and all seven peaks shifted by +71 atomic mass units in the deuteromethylated lipid. This indicated that the peptide core for the seven molecules was the same but that these molecules differed in their fatty acyl chains. Thus, the presence of seven different saturated fatty acyl chains linked to the peptide core could be demonstrated: hexadecanoic acid (C16:0), heptadecanoic acid (C17:0), octadecanoic acid (C18:0), nonadecanoic acid (C19:0), eicosanoic acid (C20:0), heneicosanoic acid (C21:0), and docosanoic acid (C22:0).
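The 14-unit spacing between the seven peaks is what a homologous series of saturated acyl chains would give, since each added methylene (CH2) contributes about 14.016 mass units. The following sketch predicts where the seven homologs would fall, assuming (as the text does) that the major ion at m/z 918.6 carries the C20:0 chain. The predicted values are illustrative calculations, not measurements reported in the study.

```python
CH2 = 14.016        # monoisotopic mass of one methylene unit (Da)
MAJOR_MZ = 918.6    # major ion from the text, assumed here to be the C20:0 species
MAJOR_CHAIN = 20

# Predicted m/z for each saturated acyl chain length from C16 to C22
series = {
    n: round(MAJOR_MZ + (n - MAJOR_CHAIN) * CH2, 1)
    for n in range(16, 23)
}

for n, mz in series.items():
    print(f"C{n}:0  m/z {mz}")
```

The predicted series spans roughly 84 mass units from C16:0 to C22:0 in seven evenly spaced steps, matching the pattern of seven peaks described above.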
The above analyses indicated that Para-LP-01 is actually a lipopeptide complex or family in which a pentapeptide core is attached to a series of saturated fatty acids dominated by C20. From component analysis, the calculated mass (890.6) differed from the major ion identified by FAB-MS (918.6) by 28 atomic mass units. The nature of this difference was examined by NMR.
Structural Features of the Lipopeptide by NMR Analyses-¹H NMR analysis (Fig. 4) revealed 10 protons derived from aromatic rings. It also showed that there were, as predicted, five α-C protons within this structure corresponding to the five amino acids. The exact assignment and coupling of the other protons to these α-C protons was defined through their connectivity within the two-dimensional COSY ¹H-NMR (Table 1). The following assignments were made: α-C protons for two phenylalanines at δ 5.13 and δ 4.63 ppm, one alanine at δ 4.35 ppm, one valine at δ 4.13 ppm, and one isoleucine at δ 4.41 ppm. A large peak at δ 1.25 ppm represented protons within the fatty acyl chain.
Furthermore, eight methyl groups were identified in the ¹H-¹³C NMR (Fig. 5). Six of them exhibited signals with chemical shifts up to 1.5 ppm. Of these, two were associated with isoleucine, two with valine, one with alanine, and one with the fatty acyl chain. However, two signals with chemical shifts of 3.05 and 3.71 ppm suggested methyl groups linked to nitrogen and oxygen, respectively. Thus, ¹H NMR and ¹H-¹³C NMR demonstrated the presence of two phenylalanines, one alanine, one isoleucine, and one valine and also provided strong evidence for a saturated fatty acid. In addition, the calculated difference in the mass of the lipopeptide of 28 atomic mass units could be accounted for by N- and O-linked methyl groups. The location of these methyl groups was examined by MALDI-TOF.
Further Structural Analysis by MALDI-TOF-MALDI-TOF analysis of the native lipopeptide and its deuteromethylated derivative showed … (Fig. 6, A and B). Note that typically in peptides, "a" ions are less abundant than "b" ions when analyzed by collision-induced MS. However, here and in our previous analyses of lipopeptides (7), the reverse has been observed, i.e. that "a" ions are more abundant than "b" ions. Thus, the data in panels A and B showed that the positions of the two phenylalanines and the single alanine within Para-LP-01 could be determined. However, a mass difference of m/z 113 was detected twice. This suggested that one m/z 113 was isoleucine and the other was an N-methylated valine. Panels C and D of Fig. 6 show the MS/MS/MS analyses of the major ions representing the tetrapeptide of the C terminus. The newly generated ions demonstrated clearly that the peptide sequence is composed of N-Me-valine-isoleucine-phenylalanine-alanine methyl ester. The presence of N-methylated valine also explained why GC/MS analysis of the derivatized amino acids indicated that valine was a minor component: the free valine had to be generated by further hydrolysis of N-Me-Val. Thus, the final structure of the lipopeptide consists of pentapeptide Phe-N-Me-L-Val-L-Ile-Phe-L-Ala methyl ester N-linked to a C20 fatty acid that was identified as the major fatty acid.
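The reason a mass difference of 113 appears twice can be made explicit: an N-methylated valine residue has the same elemental composition as an isoleucine residue, so the two cannot be told apart by mass alone. The sketch below checks this from residue formulas using standard monoisotopic atomic masses; the variable names are illustrative.

```python
from collections import Counter

# Standard monoisotopic atomic masses (Da)
MONO = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915}


def residue_mass(formula):
    """Monoisotopic mass of a residue given as an element -> count mapping."""
    return sum(MONO[element] * count for element, count in formula.items())


VAL = Counter(C=5, H=9, N=1, O=1)     # valine residue, C5H9NO
ILE = Counter(C=6, H=11, N=1, O=1)    # isoleucine residue, C6H11NO
# N-methylation replaces the amide N-H with N-CH3, a net gain of CH2
N_ME_VAL = VAL + Counter(C=1, H=2)

print(N_ME_VAL == ILE)
print(round(residue_mass(ILE), 3), round(residue_mass(N_ME_VAL), 3))
```

Because the two residues are isomers, their order had to be settled by the further fragmentation shown in panels C and D of Fig. 6 rather than by the mass differences themselves.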
In Silico Analysis to Determine the Location of the Enantiomeric Forms of Phenylalanine-As noted above, amino acid analysis revealed the presence of both the D and L isomers of phenylalanine. Several attempts at partial hydrolysis failed to generate truncated lipopeptides that could be used to define the location of the D- and L-phenylalanine in the pentapeptide. However, in silico analyses of the MAP genome revealed two single genes and three gene clusters encoding putative peptide synthetases. The one most likely to be involved in the biosynthesis of the lipopeptide Para-LP-01, MAP1420, consists of five modules, one for each of the five amino acids (Fig. 7). The first module contains the motif for the incorporation of an epimerized amino acid as the first amino acid. The second module contains the motif for N-methylation of a non-epimerized amino acid, and the final three modules would also direct the incorporation of L amino acids. In addition, the first module exhibits a high degree of homology (63% identity; 75% similarity) to the first module of the pstA gene product in MAA that is responsible for the incorporation of a D-phenylalanine into the lipopeptide core of the highly immunogenic glycopeptidolipids (10). Thus, the final structure of Para-LP-01 was determined to be C20:0 fatty acyl D-Phe-N-Me-L-Val-L-Ile-L-Phe-L-Ala methyl ester.
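As a consistency check, the proposed structure can be summed from its parts and compared with the ions reported earlier. The sketch below uses standard monoisotopic atomic masses. It assumes the major ion is the protonated molecule and that the FAB-MS fragments are acyl-containing N-terminal (b-type) ions; both are interpretive assumptions, not statements from the text.

```python
# Standard monoisotopic atomic masses (Da) and the proton mass
MONO = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915}
PROTON = 1.007276


def mass(formula):
    """Monoisotopic mass of a fragment given as an element -> count mapping."""
    return sum(MONO[element] * count for element, count in formula.items())


ACYL_C20 = {"C": 20, "H": 39, "O": 1}          # eicosanoyl group
PHE = {"C": 9, "H": 9, "N": 1, "O": 1}
N_ME_VAL = {"C": 6, "H": 11, "N": 1, "O": 1}
ILE = {"C": 6, "H": 11, "N": 1, "O": 1}
ALA = {"C": 3, "H": 5, "N": 1, "O": 1}
METHOXY = {"C": 1, "H": 3, "O": 1}             # C-terminal methyl ester

# C20:0 acyl-Phe-N-Me-Val-Ile-Phe-Ala-OCH3, N terminus to C terminus
parts = [ACYL_C20, PHE, N_ME_VAL, ILE, PHE, ALA, METHOXY]
neutral = sum(mass(p) for p in parts)
mh = neutral + PROTON                          # protonated molecule

# Same molecule without the N- and O-methyl groups (two CH2 units fewer)
unmethylated_mh = mh - 2 * mass({"C": 1, "H": 2})

# Acyl-containing fragments after 2, 3, and 4 residues
# (electron mass neglected; it is far below the reported precision)
b_ions = [sum(mass(p) for p in parts[:k]) for k in (3, 4, 5)]

print(round(mh, 2), round(unmethylated_mh, 2))
print([round(b, 2) for b in b_ions])
```

Under these assumptions the methylated structure lands near m/z 918.7 and the unmethylated one near 890.6, in line with the major ion and the calculated mass quoted above. The three fragment values also fall within 0.1 unit of the ions reported at m/z 555.4, 668.5, and 815.6.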
Seroreactivity of Lipopeptide Para-LP-01 Using Bovine Sera-ELISAs with the lipopeptide Para-LP-01 were performed with bovine sera previously screened using the IDEXX HerdChek Mycobacterium paratuberculosis test kit for Johne disease. IDEXX-negative sera at a titer of 1:4 showed no detectable O.D. for the lipopeptide. ELISAs performed with IDEXX-positive sera resulted in high O.D. readings for the lipopeptide Para-LP-01. There was a significant difference between the O.D. values for the lipopeptide with IDEXX-positive sera versus those obtained with IDEXX-negative sera. Two exceptions were found: 1) repeated assays gave no positive O.D. reading for the IDEXX-positive serum 9687, and 2) a low O.D. was measured for the IDEXX-negative serum 2727, which was still significantly lower than those obtained for IDEXX-positive sera (Fig. 8).

DISCUSSION
A major nonpolar lipid, termed Para-LP-01, was identified by comparative lipidomics to be present only in MAP, but not in MAA, and it was purified and subjected to structural analyses. The fatty acids associated with this lipid were saturated and ranged from C16 to C22. This was a surprise because the lipopeptides and glycopeptidolipids (GPLs) that together represent 3-4% of the total cell weight of MAA are composed of a tetrapeptide that is N-linked to mono- or even diunsaturated 3-OH fatty acids ranging from C32 to C36 (7), and we expected the MAP lipids to be similar.
Para-LP-01 is a major lipid in the cell wall and is likely a major component of the outer part of the cell envelope, just as the GPLs of MAA are surface exposed (10). Structurally, the GPLs consist of a lipopeptide core composed of a tetrapeptide (three amino acids, one amino alcohol) that is N-linked to mono- or diunsaturated long chain fatty acids. O-linked to this core molecule are mono- and oligosaccharides, which can be further modified (e.g. by methylation or acylation). These sugar moieties form the basis for the 28 different serovars of the M. avium complex (10). Although MAP is technically a member of the M. avium complex, it lacks GPLs and is missing some of the genes responsible for their biosynthesis. In comparison, Para-LP-01 consists of a peptide core with five amino acids that are distinct from those found in the GPL core. Furthermore, an additional modification of an amino acid (N-methylation) was identified within Para-LP-01 that is not found in the GPLs.
Finding an N-methylated valine in the lipopeptide structure of Para-LP-01 was unexpected. To our knowledge, there are only two reports of N-methylated amino acids within a lipopeptide in mycobacteria: 1) a "unique phenylalanine-containing lipopeptide" of a rough colony variant … (12). This type of modification is not unknown in other prokaryotic nonribosomal peptide syntheses. The best example is cyclosporine with seven N-methylated amino acids within the undecapeptide. This high degree of N-methylation protects the molecule from proteolytic digestion (13,14). Other nonribosomal peptides with N-methylated amino acids are enniatin (15,16) (N-Me-L-amino acids), actinomycin (17) (N-Me-Gly, N-Me-Val), and pristinamycin and virginiamycin (18) (N-Me-Phe, further modified to 4-dimethylparaamino-N-Me-Phe). Better protection from proteolytic cleavage has also been proposed for peptides containing D amino acids (19). Almost every nonribosomally synthesized peptide is composed in part of D-configured amino acids (19-21). The specific location of these amino acids within the peptide is also essential for further modifications such as cyclization and release from the synthesizing enzymes (19).
In addition, in silico analysis revealed an open reading frame (MAP1420) that represents a putative peptide synthetase composed of five modules for the incorporation of amino acids into a peptide and that is the most likely gene behind the biosynthesis of the lipopeptide Para-LP-01. Several other putative peptide synthetases were found in the genome of MAP, but these contained fewer modules (MAP1242 (3 modules), MAP1871c/1870c (2 modules), MAP3740/3741/3742 (2 modules), MAP2178 to MAP2171c (2 complete and 2 incomplete modules)), had additional modification motifs (e.g. reduction, oxidation, cyclization), or did not exhibit epimerization motifs reflecting the presence of D amino acids.
Many lipid components of the cell envelope of mycobacteria demonstrate seroreactivity. The best examples are the highly immunogenic GPLs of the M. avium complex (11,22,23) that are localized to the outer part of the cell envelope (11). Indeed, most of the antigenic lipids of Mycobacterium subspecies contain sugar moieties. Molecules such as 2,3-diacyl trehalose, phenolic glycolipids, lipooligosaccharides, sulfolipids, and trehalose dimycolate were successfully used to demonstrate high seroreactivity for patients with active tuberculosis (24-27).
Although many studies have demonstrated the seroreactivities of different lipid molecules, none has identified a lipopeptide as the target molecule. Thus, the findings in our studies would make this MAP-specific molecule the first described mycobacterial lipopeptide exhibiting biological activity through its seroreactivity with sera from cattle with Johne disease.
Recently, Moody et al. (28) described T-cell activation through CD1a presentation of a lipopeptide that most likely functions as an intermediate in mycobactin biosynthesis. The most potent T-cell response was achieved with the monounsaturated C20:1 fatty acid, whereas molecules with shorter or saturated fatty acids were substantially less stimulatory. Based on similarity within the structures of lipopeptides in general and of Para-LP-01 in particular, it would be interesting to analyze the importance of the structural findings of this lipopeptide for T-cell activation through CD1 in humans with Crohn disease and in cattle with Johne disease.