Novel Bacterial Lipoprotein Structures Conserved in Low-GC Content Gram-positive Bacteria Are Recognized by Toll-like Receptor 2*

Background: The lipid-modified structures of bacterial lipoproteins in low-GC Gram-positive bacteria remain elusive.
Results: Three novel structures of bacterial lipoproteins were determined and functioned as TLR2 ligands.
Conclusion: Identified novel TLR2-stimulating lipoprotein structures are conserved in low-GC Gram-positive bacteria.
Significance: Results open further fields of research concerning functions and biosynthesis of bacterial lipoproteins.

Bacterial lipoproteins/lipopeptides inducing host innate immune responses are sensed by mammalian Toll-like receptor 2 (TLR2). These bacterial lipoproteins are structurally divided into two groups, diacylated or triacylated lipoproteins, by the absence or presence of an amide-linked fatty acid. The presence of diacylated lipoproteins has been predicted in low-GC content Gram-positive bacteria and mycoplasmas based on the absence of one modification enzyme in their genomes; however, we recently determined triacylated structures in low-GC Gram-positive Staphylococcus aureus, raising questions about the actual lipoprotein structure in other low-GC content Gram-positive bacteria. Here, through intensive MS analyses, we identified a novel and unique bacterial lipoprotein structure containing an N-acyl-S-monoacyl-glyceryl-cysteine (named the lyso structure) from low-GC Gram-positive Enterococcus faecalis, Bacillus cereus, Streptococcus sanguinis, and Lactobacillus bulgaricus. Two of the purified native lyso-form lipoproteins induced proinflammatory cytokine production from mouse macrophages in a TLR2-dependent and TLR1-independent manner but with a different dependence on TLR6. Additionally, two other new lipoprotein structures were identified. One is the "N-acetyl" lipoprotein structure containing N-acetyl-S-diacyl-glyceryl-cysteine, which was found in five Gram-positive bacteria, including Bacillus subtilis. The N-acetyl lipoproteins induced the proinflammatory cytokines through the TLR2/6 heterodimer. The other was identified in a mycoplasma strain and is an unusual diacyl lipoprotein structure containing two amino acids before the lipid-modified cysteine residue. Taken together, our results suggest the existence of novel TLR2-stimulating lyso and N-acetyl forms of lipoproteins that are conserved in low-GC content Gram-positive bacteria and provide clear evidence for the presence of yet to be identified key enzymes involved in bacterial lipoprotein biosynthesis.

In innate immunity, mammalian Toll-like receptors (TLRs) play an important role in the recognition of invading microorganisms and in the induction of first-line host defense responses (1). To date, 11 human TLRs and 13 mouse TLRs have been identified, and each TLR appears to sense specific pathogen-associated molecular patterns derived from various microorganisms, including bacteria, viruses, protozoa, and fungi (2). Among these, TLR2 is unique in its cooperation with TLR1 or TLR6, and the TLR2/1 or TLR2/6 heterodimer recognizes bacterial lipoproteins and lipopeptides as a pathogen-associated molecular pattern. The bacterial lipoproteins/lipopeptides as well as a number of TLR ligand molecules are known to induce inflammatory cytokine secretion and to be involved in the establishment of adaptive immunity (3). Because bacterial lipoproteins/lipopeptides function as TLR2 ligands, they are also potential target molecules for the development of vaccines and adjuvants (4,5). The identification and biochemical characterization of novel native bacterial lipoproteins as TLR2 ligands are important for elucidating the molecular basis of host-microbe interactions.
* This work was supported by Bio-Program Grant 2008-2004086 and the BK21 programs supported by the National Research Foundation of Korea (to B. L. L.).
This article contains supplemental text, Figs. S1-S7, and Table S1.

Bacterial lipoproteins are structurally divided into two groups: diacylated and triacylated forms (4). These two groups are distinguished by the absence or presence of a third enzyme involved in the maturation of bacterial lipoproteins. During the maturation process, the first enzyme Lgt, prolipoprotein diacylglycerol transferase, catalyzes the transfer of diacylglycerol to the sulfhydryl moiety of a cysteine residue conserved in the signal peptide of bacterial lipoprotein precursors (6). The second enzyme Lsp, prolipoprotein signal peptidase, subsequently cleaves the signal peptide from the lipoprotein precursors and leaves diacylated lipoproteins (7,8), which contain S-diacylglyceryl-cysteine residues at their N termini. The third enzyme Lnt, apolipoprotein N-acyltransferase, transfers an additional acyl group to the amino group of the diacylated cysteine residue (9,10), yielding a triacylated lipoprotein, which contains an N-acyl-S-diacylated cysteine residue. The conversion of the diacylated form to the triacylated form is generally considered to alter TLR binding specificity from the TLR2/6 to the TLR2/1 heterodimer (11,12). Interestingly, diderm Gram-negative bacteria and high-GC content Gram-positive bacteria are reported to have Escherichia coli-type Lnt homolog(s) in their genomes, suggesting that these bacteria can produce triacylated lipoproteins (13). However, monoderm low-GC content Gram-positive bacteria, such as Bacillus, Enterococcus, Lactobacillus, Listeria, Staphylococcus, and Streptococcus, and cell wall-less mycoplasmas are assumed to produce diacylated lipoproteins because they do not have the E. coli-type Lnt in their genomes.
However, our recent studies provided clear biochemical evidence that staphylococcal lipoproteins are triacylated (14,15). In addition, Serebryakova et al. (16) reported that lipoproteins from Acholeplasma laidlawii, a mycoplasma strain, are in the triacyl form. These studies indicate that staphylococci and mycoplasma strains must have another type of Lnt whose structure is distinct from the E. coli-type Lnt protein. However, the exact N-terminal lipopeptide structures in other low-GC content Gram-positive bacteria have not yet been determined. TLR2-mediated bacterial lipoprotein recognition is essential for the activation of the innate immune response against infection by low-GC content Gram-positive pathogens as well as mycoplasmas (17-20). Therefore, determination of N-terminal lipid structures of lipoproteins from other low-GC content monoderm bacteria and characterization of their abilities to stimulate TLR2 for the induction of the innate immune responses are essential to understanding host-microbe interactions.
Here, through intensive MS analyses, we unexpectedly identified a novel lipoprotein structure containing N-acyl-S-monoacyl-glyceryl-cysteine, which we named the lyso structure, in Enterococcus faecalis, Bacillus cereus, Lactobacillus bulgaricus, and Streptococcus sanguinis. When the production of the proinflammatory cytokines was examined with purified native lyso-type lipoproteins using mouse peritoneal macrophages, a B. cereus lyso-form lipoprotein induced tumor necrosis factor (TNF)-α and interleukin-6 (IL-6) in a TLR2-dependent and TLR1- and TLR6-independent manner, similarly to the triacylated Staphylococcus aureus SitC lipoprotein (14). By contrast, another lyso-form lipoprotein, from E. faecalis, induced cytokine secretion by TLR2- and TLR6-dependent and TLR1-independent pathways, similarly to the diacylated lipopeptide. Additionally, we identified two other new lipoprotein structures. One is an "N-acetyl" lipoprotein structure containing N-acetyl-S-diacyl-glyceryl-cysteine, which was identified from five bacterial species, including food-associated Bacillus subtilis and Bacillus licheniformis. The other is a "peptidyl" form, an unusual diacyl lipoprotein structure purified from Mycoplasma fermentans and containing two additional amino acids before the lipid-modified cysteine residue. Two lipoproteins with the N-acetyl structures induced the release of TNF-α and IL-6 in a TLR2- and TLR6-dependent and TLR1-independent manner, supporting the view that lyso- and N-acetyl-form lipoproteins in low-GC content Gram-positive bacteria function as TLR2 ligand molecules. These results also provide strong evidence for the presence of yet to be identified key enzymes involved in bacterial lipoprotein biosynthesis.
Purification of Lipoproteins by Triton X-114 Phase Partitioning-Native bacterial lipoproteins were obtained using the Triton X-114 phase partitioning method (15). Briefly, the harvested bacterial cells were disrupted with glass beads and centrifuged at a low speed, and the supernatant was further centrifuged at high speed. The obtained supernatant was supplemented with Triton X-114 to a final concentration of 2% and was then incubated at 4°C. The mixture was subsequently incubated at 37°C and centrifuged for phase separation. The Triton X-114 phase was washed repeatedly and precipitated with ethanol. The precipitates were used for subsequent experiments as a Triton X-114 phase. Triton X-114 fractions of mycoplasmas were prepared as described previously (23,24).
In-gel Digestion-Proteins in the Triton X-114 phase were separated by SDS-PAGE and stained with Coomassie Brilliant Blue R-250. The protein bands excised from the gel were destained with 50% (v/v) methanol and dried by vacuum centrifugation. The dried gel pieces were rehydrated with 2 µl of a 10 ng/µl trypsin solution (Promega) and then incubated in ~20 µl of 50 mM Tris-HCl (pH 8.5) containing 0.1% (w/v) n-decyl-β-D-glucopyranoside (Sigma) at 37°C for 18 h. The resulting digests were analyzed by MS directly or after chloroform/methanol extraction. The sample tubes used for in-gel digestion and subsequent analytical procedures were hydrophilic polypropylene tubes (Proteosave SS; Sumitomo Bakelite) to decrease the loss of lipoproteins or lipopeptides by nonspecific adsorption to the tubes.
Purification of Lipopeptides by Chloroform/Methanol Extraction-Lipopeptides from the purified native lipoproteins were extracted according to previous methods (15). Briefly, the peptide solution (10-20 µl) resulting from in-gel digestion was acidified with 1 µl of 70% formic acid, mixed with the same volume of chloroform/methanol (2/1, v/v) by vortexing, and then centrifuged. The organic phase contained the lipopeptides. The acidification step is necessary because the formation of a methyl ester at the C-terminal carboxyl group of lipopeptides is often observed in the organic phase in the presence of tryptic activity.
MALDI-TOF MS and MS/MS-MALDI-TOF MS was conducted using an Ultraflex (Bruker Daltonics) MALDI-TOF mass spectrometer in positive reflectron mode. Saturated α-cyano-4-hydroxycinnamic acid solution in chloroform/methanol (2/1, v/v) was used as the matrix. A thin layer of α-cyano-4-hydroxycinnamic acid matrix was prepared, and the samples were deposited on the matrix. The resolution was about 10,000. The mass spectra were calibrated externally so that the typical mass error was about 50 ppm. The MS/MS spectra were acquired using a MALDI-TOF/TOF instrument (Ultraflex; Bruker Daltonics) using α-cyano-4-hydroxycinnamic acid as matrix. The MS/MS isolation width was ±2 Da. Fragments of tryptic lipopeptides observed using MALDI MS/MS are mainly C terminus-containing sequence ions (y-type) because the positive charge tends to localize at the C-terminal lysine or arginine of the peptides.
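The instrument figures quoted above (resolution of about 10,000, mass error of about 50 ppm, isolation width of ±2 Da) can be translated into absolute m/z units with simple arithmetic. The following Python sketch is illustrative only; the function names are hypothetical and are not taken from any instrument software, and resolution is assumed to be defined as m/z divided by peak width (FWHM).

```python
def ppm_to_da(mz: float, ppm: float) -> float:
    """Convert a relative mass error in ppm to an absolute error in Da at a given m/z."""
    return mz * ppm * 1e-6


def peak_width_from_resolution(mz: float, resolution: float) -> float:
    """Approximate peak width (FWHM) at a given m/z, assuming resolution = m/z / FWHM."""
    return mz / resolution


def in_isolation_window(precursor_mz: float, mz: float, half_width: float = 2.0) -> bool:
    """True if an ion falls inside the precursor isolation window (default +/- 2 Da)."""
    return abs(mz - precursor_mz) <= half_width


if __name__ == "__main__":
    # At m/z 1000, 50 ppm corresponds to about 0.05 Da,
    # and a resolution of 10,000 to a peak width of about 0.1.
    print(ppm_to_da(1000.0, 50.0))
    print(peak_width_from_resolution(1000.0, 10_000.0))
```

Under these assumptions, ions separated by a full 14-Da methylene interval fall outside a ±2-Da isolation window, so each member of a 14-Da ladder is fragmented separately.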
On-target Oxidation of Lipopeptides-To facilitate the neutral loss of the acylglyceride moiety from lipopeptides by MS/MS, the thioether sulfur at the N-terminal S-acyl-glycerylcysteine of a lipopeptide was oxidized to sulfoxide as described previously (15). One microliter of hydrogen peroxide (30% aqueous solution) was spotted onto the sample matrix co-crystal of a MALDI sample target, and the target was then completely dried at room temperature.
Lipoprotein Lipase Treatment-The organic phase containing the lipopeptides was dried under vacuum and redissolved in 10-20 µl of water by sonication for 5 min. The solution was heated at 100°C for 1.5 h to inactivate trypsin and then incubated at 37°C with 80 ng/µl of lipoprotein lipase from Pseudomonas sp. (Sigma). The resulting digests were directly analyzed by MALDI MS and MS/MS.
Protein Identification by LC-MS/MS-In-gel digests were analyzed by a nano-LC-MS/MS (1100 series; Agilent Technologies) coupled with a hybrid quadrupole-TOF instrument (Q-Tof2; Waters) as described previously (15). The mobile phases of a homemade capillary column packed with Inertsil ODS3 (GL Science) were: solvent A (0.075% (v/v) formic acid in water) and solvent B (0.075% (v/v) formic acid and 80% (v/v) acetonitrile in water). The resulting MS/MS data were applied to Mascot (Matrix Science) for protein identification.
Protein Sequence and Amino Acid Composition Analyses-After SDS-PAGE, the lipoproteins were electroblotted onto a polyvinylidene difluoride (PVDF) membrane and then stained with Coomassie Brilliant Blue R-250. The bands excised from the membrane were subjected to Edman degradation using a Procise 494HT protein sequencing system (Applied Biosystems). For identification of S-glycerylcysteine, the excised protein and the synthetic model peptide were individually dried in clean 6-mm × 32-mm glass tubes containing 50 pmol of norvaline as an internal standard. The tubes were placed in a glass vial that contained 200 µl of constant-boiling HCl and a piece of phenol crystal. The vial was evacuated for a few minutes and then sealed using a Mininert valve (Pierce). The samples were hydrolyzed for 20 h at 110°C and then derivatized in situ with 6-aminoquinolyl-N-hydroxysuccinimidyl carbamate (AQC) for fluorophore detection. The AQC-amino acids were separated by ion-pair chromatography (25). The AQC-S-glycerylcysteine was eluted between AQC-glycine and AQC-threonine.

Identification of a Novel Lyso Structure from Gram-positive Gut Microbes-To determine the native lipoprotein structures of low-GC content Gram-positive bacteria, we first selected E. faecalis, a well known intestinal bacterium. Inappropriate activation of the innate immune system by gut microbes causes chronic inflammation, resulting in human inflammatory bowel diseases, such as Crohn disease and ulcerative colitis (26). The lipoprotein-enriched Triton X-114 phase fraction was prepared and analyzed on SDS-PAGE (Fig. 1A). The 46-kDa purine nucleoside receptor PnrA (EF0177 in strain V583) (27) was in-gel digested with trypsin, extracted with chloroform/methanol, and analyzed using MALDI-TOF MS. The MS spectrum showed a peak at m/z 997.7 (Fig. 1B, inset), which corresponded to the N-terminal PnrA lipopeptide containing lipidated cysteine with two fatty acids of 34:1 in total (34 and 1 refer to the numbers of total carbons and double bonds, respectively), but showed no peaks corresponding to the conventional triacylated form. Although the MS spectra of triacylated lipoproteins from staphylococci contain distinctive ions at 14-Da intervals due to various numbers of methylene (-CH2-) groups in fatty acids (14,15), the E. faecalis PnrA lipopeptide peak did not exhibit such accompanying ions (Fig. 1B, inset). When the N-terminal PnrA lipopeptide was incubated with lipoprotein lipase, which can hydrolyze O-esterified fatty acids, we unexpectedly observed that only one 18:1 fatty acid was specifically removed (Fig. 1, B and C). Further MS/MS analysis of the peak at m/z 997.7 provided C terminus-containing y-series ions corresponding to the amino acid sequence (CGGK) of PnrA (Fig. 1D). Ions at m/z 715.5, 733.5, and 759.6 correspond to the product ions that have lost an 18:1 fatty acid (C17H33COOH), 18:1 ketene (C16H31CH=C=O), and 16:0 ketene (C14H29CH=C=O), respectively, and a characteristic fragment ion at m/z 625.4 corresponds to the N-acyl(16:0)-dehydroalanyl peptide, which is generated by the neutral loss of monoacyl(18:1)-thioglycerol.
In addition, Edman sequencing of the N-terminal PnrA lipopeptide was blocked. Taken together, these results indicate that the N-terminal PnrA lipopeptide contains an N-acyl(16:0)-O-acyl(18:1)-S-glyceryl-cysteine residue (Fig. 1C), and we named this lipoprotein structure the "lyso" form. Although we could not determine the exact O-acylation position, it is generally assumed that the sn-1 (R3 position)-acylated lyso form is the major species because the acyl group of lysophospholipids is known to migrate between the sn-1 and sn-2 positions and the ratio of sn-1 to sn-2 acylation is about 9:1 (28). We also found the same lysolipidation in two other E. faecalis lipoproteins, EF2256 and EF3256 (Table 1). The N-terminal EF3256 lipopeptide peak and its deacylated peak were observed at m/z 1069.7 and 805.5, respectively (Fig. 1, B and C).
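The neutral-loss assignments for the PnrA lipopeptide can be checked with a short calculation. The Python sketch below is a minimal illustration, not the authors' analysis procedure: it assumes singly charged ions (so that product m/z equals precursor m/z minus the neutral mass), uses standard monoisotopic atomic masses, and introduces hypothetical helper names (`fatty_acid_mass`, `ketene_mass`, `product_ions`).

```python
# Standard monoisotopic atomic masses (Da).
C, H, O = 12.0, 1.00782503, 15.99491462
WATER = 2 * H + O


def fatty_acid_mass(carbons: int, double_bonds: int) -> float:
    """Monoisotopic mass of a free straight-chain fatty acid, C(n)H(2n-2d)O2."""
    hydrogens = 2 * carbons - 2 * double_bonds
    return carbons * C + hydrogens * H + 2 * O


def ketene_mass(carbons: int, double_bonds: int) -> float:
    """Mass of the corresponding ketene (fatty acid minus one water)."""
    return fatty_acid_mass(carbons, double_bonds) - WATER


def product_ions(precursor_mz: float) -> dict:
    """Predicted product ions for the losses discussed for the m/z 997.7 precursor.

    Assumes a singly charged precursor.
    """
    return {
        "loss of 18:1 fatty acid": precursor_mz - fatty_acid_mass(18, 1),
        "loss of 18:1 ketene": precursor_mz - ketene_mass(18, 1),
        "loss of 16:0 ketene": precursor_mz - ketene_mass(16, 0),
    }


if __name__ == "__main__":
    for label, mz in product_ions(997.7).items():
        print(f"{label}: m/z {mz:.1f}")
```

With these assumptions the predicted values lie within about 0.15 of the reported ions at m/z 715.5, 733.5, and 759.6, consistent with one ester-linked 18:1 chain and one amide-linked 16:0 chain.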
We also determined the structure of lyso form lipoproteins from B. cereus, one of the components of the microbiota in the human gut and one of the causative agents of food poisoning. The same method as described above for the E. faecalis PnrA lipoprotein was applied. The lipoprotein-enriched Triton X-114 phase fraction was separated using SDS-PAGE, and the 33-kDa band containing peptidylprolyl isomerase PrsA (BC1043 of B. cereus strain ATCC14579) (29) was analyzed. The MS spectrum showed a main ion at m/z 1334.8 (Fig. 2A), which corresponds to the N-terminal CGTSSSDK lipopeptide of PrsA containing lipid-modified cysteine with two saturated fatty acids, 32:0 in total. In contrast to the lyso form lipoproteins from E. faecalis, the B. cereus N-terminal lipopeptide ion was accompanied by distinctive ions at 14-Da intervals due to different lengths of saturated and monounsaturated fatty acids (Fig. 2A, inset). Ions corresponding to the triacyl form were not observed (Fig. 2A). To characterize the esterified fatty acid(s), the lipopeptide was treated with lipoprotein lipase. Lipoprotein lipase removed only one esterified fatty acid (17:0 as the major fatty acid) from the lipopeptide (Fig. 2, B and C). The MS/MS spectrum of the ion at m/z 1334.8 provided y-series ions that confirmed the GTSSSDK peptide sequence of B. cereus PrsA (Fig. 2, D and E). The m/z 1064.8 peak corresponded to a product fragment ion that lost a 17:0 fatty acid from the precursor ion, and a charac- (Fig. 2E), in which we assumed that the sn-1 (R3 position) lyso form is the major component for the same reason as described above. Additionally, the major forms of oligopeptide-binding OppA (BC3586) and BC0200 were also identified to be the same lyso form of N-acyl(15:0)-O-acyl(17:0)-S-glyceryl-cysteine (Table 1).

APRIL 13, 2012 • VOLUME 287 • NUMBER 16

Novel Lipoprotein Structures in Gram-positive Bacteria
In addition to E. faecalis and B. cereus, we also found lyso-form lipoproteins in Lactobacillus delbrueckii ssp. bulgaricus, a probiotic strain originating from Bulgarian yogurt, and S. sanguinis, a member of the human indigenous oral microflora (Table 1, supplemental text, and Figs. S1 and S2). These results suggest that the lyso form is a widely distributed lipoprotein structure in low-GC content Gram-positive bacteria and that at least several Gram-positive bacterial species among gut microbes, as well as a probiotic strain, produce lyso-form lipoproteins.
TLR2 Stimulation by Lyso form Lipoproteins-Recently published elegant crystal structures of TLR2/1 and TLR2/6 heterodimers complexed with a synthetic lipopeptide demonstrated that TLR2 holds both esterified fatty acids linked to the S-glyceryl group of lipopeptides (30,31). Because lyso form lipoproteins lack one of the two esterified fatty acids linked to the S-glyceryl group, we wondered whether lipoproteins harboring the novel lyso-type structure can stimulate host immune cells via the TLR2/1 or TLR2/6 heterodimer. Peritoneal macrophages from parental or TLR-deficient mice were stimulated with purified lyso form lipoproteins. Surprisingly, the lyso form B. cereus OppA lipoprotein induced proinflammatory cytokine production in a TLR2-dependent and TLR1- and TLR6-independent manner (Fig. 3, A and B), similarly to the triacylated S. aureus SitC lipoprotein (14). Induction of the lyso form-mediated early cellular signaling response in mouse macrophages was also examined. The lyso form OppA lipoprotein of B. cereus induced degradation of IκBα and phosphorylation of p38 and ERK1/2 MAPKs in a TLR2-dependent manner (supplemental Fig. S3). TLR1 or TLR6 enhanced these early responses but was not essential. These results suggest that the lyso form OppA lipoprotein of B. cereus stimulates immune cells through both the TLR2/1 and TLR2/6 heterodimers. By contrast, another lyso-form lipoprotein, E. faecalis PnrA, induced cytokine secretion in a TLR2- and TLR6-dependent and TLR1-independent manner (Fig. 3, A and B), indicating that E. faecalis PnrA mediates the immune signal through the TLR2/6 heterodimer. Therefore, the TLR2 heterodimer selectivity of lyso-form lipoproteins is not solely determined by the position of the acyl chains on the lipid-modified cysteine residue but is also affected by the protein sequence.
These results suggest that the newly identified lyso-form lipoproteins can function as TLR2 ligands and that one of the two acyl chains of the S-diacylglyceryl group is dispensable for the interaction between triacyl lipoproteins/lipopeptides and TLR2.
N-Acetyl Structure from Food-associated Gram-positive Bacillus-Next, we selected B. licheniformis, a soil bacterium capable of causing food poisoning. The MS analysis of the in-gel-digested 33-kDa band (Fig. 4A) showed a main peak at m/z 1016.7 (Fig. 4, B, inset, and C), which corresponds to the N-terminal lipopeptide of manganese-binding MntA (BLi03547 of strain ATCC14580) (32) possibly modified with two fatty acids of 35:0 in total. The MS/MS spectrum showed the y-ions corresponding to the N-terminal amino acid sequence (SSK) of MntA (Fig. 4B). However, product ions at m/z 774.4, 746.4, and 504.2 suggest the loss of 15:0, 17:0, and both of the two fatty acids (32:0), respectively. These results indicate acyl modifications of 32:0 in total, which is smaller than the estimated 35:0. In addition, treatment with lipoprotein lipase also removed two O-esterified fatty acids of 15:0 and 17:0 (Fig. 4, C and D). Additional experimental data showed that this apparent discrepancy is explained by acetylation of the α-amino group. A fragment ion at m/z 432.2 corresponds to the 42-Da-modified dehydroalanyl-SSK peptide, which is generated by the neutral loss of diacyl(17:0/15:0)-thioglycerol (Fig. 4B). As expected, Edman degradation did not provide an N-terminal sequence. Moreover, acid hydrolysis of B. licheniformis MntA generated S-glyceryl-cysteine, as did the same treatment of a synthetic model lipopeptide, N-acetyl-S-dipalmitoyl-glyceryl-cysteinyl. These results corroborate our prediction of N-acetylation at the α-amino group and indicate that no acid-resistant modifications, such as trimethylation, occurred. Taken together, we conclude that the N terminus of B. licheniformis MntA has the N-acetyl-S-diacyl(15:0/17:0)-glyceryl-cysteine structure (Fig. 4E), which we named the N-acetyl form. Generally, modification on the sn-1 position of the glyceryl group of a phospholipid is more stable than the sn-2 modification (28).
The peak at m/z 774.4 (17:0-containing) was higher than the peak at m/z 746.4 (15:0-containing) (Fig. 4B), suggesting that the sn-2 and sn-1 (R2 and R3, respectively) positions of the S-glyceryl group were modified by 15:0 and 17:0 fatty acids, respectively (Fig. 4E). These fatty acid positions of the S-glyceryl group are consistent with literature on Bacillus phospholipids (33), the substrates of Lgt. We also identified the same N-acetyl form in another B. licheniformis lipoprotein, OppA (BLi01232). Lipoproteins from neutrophilic B. subtilis, the best characterized low-GC content Gram-positive bacterium, which is used in the production of Natto, a traditional Japanese dish of fermented soybeans, also contained the N-acetyl form (Table 1, supplemental text, and Fig. S4, A-F).
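The apparent 35:0 versus 32:0 discrepancy for MntA can be rationalized with a short calculation: three extra methylene groups and one N-acetyl group both add a nominal 42 Da. The Python sketch below is illustrative only, assumes standard monoisotopic atomic masses, and uses hypothetical names; it is not the authors' calculation.

```python
# Standard monoisotopic atomic masses (Da).
C, H, O = 12.0, 1.00782503, 15.99491462

# Mass added by one methylene unit (CH2) in a fatty acyl chain.
CH2 = C + 2 * H

# Net mass added when an amine hydrogen is replaced by an acetyl group (C2H2O).
ACETYL_SHIFT = 2 * C + 2 * H + O


def ppm(delta_da: float, mz: float) -> float:
    """Express an absolute mass difference as ppm at a given m/z."""
    return delta_da / mz * 1e6


# "35:0 in total, free amine" versus "32:0 in total plus N-acetyl".
delta = 3 * CH2 - ACETYL_SHIFT

if __name__ == "__main__":
    print(f"3 x CH2      = {3 * CH2:.4f} Da")
    print(f"acetyl shift = {ACETYL_SHIFT:.4f} Da")
    print(f"difference   = {delta:.4f} Da ({ppm(delta, 1016.7):.0f} ppm at m/z 1016.7)")
```

Under these assumptions the two candidate compositions differ by only about 0.036 Da, roughly 36 ppm at m/z 1016.7, which is below the typical mass error of about 50 ppm stated for the externally calibrated spectra. The precursor mass alone therefore cannot separate them, whereas the MS/MS losses, lipase treatment, and acid hydrolysis can.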
To examine whether the N-acetyl form is ubiquitous in Bacillus-related strains, we determined the N-terminal structures of lipoproteins from three other strains living in extremophilic conditions. One of the three strains was Oceanobacillus iheyensis strain HTE831, which was collected at a depth of 1,050 meters on the Iheya Ridge deep-sea sediment and is known to have extremely halotolerant and facultative alkaliphilic properties (34). O. iheyensis cells were grown aerobically at pH 9.5 until late-log phase. MS analysis revealed a main ion at m/z 899.6 and accompanying 14-Da interval ions (Fig. 5A). The main ion corresponded to the N-terminal CGR lipopeptide of cytochrome c oxidase subunit II CtaC (OB1437 of strain HTE831) (34) with two possible saturated fatty acids, 33:0 in total. The MS/MS spectrum of the ion at m/z 899.6 provided y- and y*-series ions describing the GR peptide sequence of O. iheyensis CtaC (Fig. 5B). The ion at m/z 657.3 corresponded to a product ion that had lost a 15:0 fatty acid. The MS/MS spectrum also provided a characteristic fragment ion at m/z 343.0 that corresponded to the N-acetyl-dehydroalanyl peptide generated by the neutral loss of diacyl(15:0/15:0)-thioglycerol. These results indicate that CtaC in O. iheyensis is the N-acetyl form: an N-acetyl-O-diacyl(15:0/15:0)-S-glyceryl-cysteine structure (Fig. 5C).
Next, we determined the N-terminal structures of lipoproteins of thermophilic Geobacillus kaustophilus strain HTA426, which was isolated at a depth of 10,897 meters in the Mariana Trench and can grow up to 74°C (35). G. kaustophilus cells were grown aerobically at 60°C until late-log phase. MALDI-TOF MS analysis revealed a main ion at m/z 1868.0 that was accompanied by 14-Da interval ions (Fig. 5D). The main ion corresponded to the N-terminal CGQAGNNAGGGNQK lipopeptide of an ABC transporter substrate-binding lipoprotein (GK1283 of strain HTA426) (35) with two possible saturated fatty acids, 35:0 in total. Lipopeptide ions having a monounsaturated fatty acid were also observed (Fig. 5D, inset, filled square). The MS/MS spectrum of the ion at m/z 1868.0 provided y-series ions describing the GQAGNNAGGGNQK peptide sequence of G. kaustophilus GK1283 (Fig. 5E). The ion at m/z 1626.7 corresponded to a product ion that had lost a 15:0 fatty acid. The MS/MS spectrum also provided a characteristic fragment ion at m/z 1283.6 that corresponded to the N-acetyl-dehydroalanyl peptide generated by the neutral loss of diacyl(17:0/15:0)-thioglycerol. These results indicate that GK1283 in G. kaustophilus is the N-acetyl form: an N-acetyl-O-diacyl(17:0/15:0)-S-glyceryl-cysteine structure (Fig. 5F). Regarding the fatty acid modification of the R3 and R2 positions of the S-glyceryl group, only the ion at m/z 1626.7 (17:0-containing), and not the 15:0-containing ion, was represented in the MS/MS spectrum (Fig. 5E), suggesting that the R2 and R3 positions of the S-glyceryl group could be modified by the 15:0 and 17:0 fatty acids, respectively (Fig. 5F). Additionally, GK0969 was identified to take the N-acetyl form, an N-acetyl-O-diacyl(32:0 in total)-S-glyceryl-cysteine, as its major form (Table 1). We also found N-acetyl form lipoproteins in another extremophilic Bacillus-related strain, alkaliphilic Bacillus halodurans grown at pH 9.5 (Table 1, supplemental text, and Fig. S5).
These results suggest that the N-acetyl form is also a widely distributed structure of lipoproteins in low-GC content Gram-positive Bacillus-related strains. As described above, B. cereus lipoproteins were determined to be the lyso form, indicating that the lipoprotein structure is not limited to the N-acetyl form in the Bacillaceae family. These N-acetyl form lipoprotein-producing bacteria, which grow in the environment or in food sources, can be exposed to cutaneous immunity or to the mucosal immune systems of the host gut after ingestion. We therefore examined whether N-acetyl form lipoproteins can stimulate host immune cells through TLR2 and found that N-acetyl form lipoproteins/lipopeptides stimulate immune cells through TLR2/6 (Fig. 3, C and D).
Peptidyl Structure from Mycoplasma-Next, we determined the structures of multiple mycoplasma lipoproteins. Consistent with our previous predictions (23,24), lipoproteins from M. genitalium and M. pneumoniae adopted a conventional triacyl structure (Table 1, supplemental text, and Fig. S6), suggesting the presence of a new type of Lnt enzyme in mycoplasmas, in accordance with a recent A. laidlawii study (16). In addition, macrophage-activating lipopeptide-2 precursor (MBIO_0763 of strain PG18) and MBIO_0869 from M. fermentans were determined to adopt the S-diacyl(34:0 in total)-glyceryl structure (Table 1, supplemental text, and Fig. S7, A-C), consistent with a previous report (36). However, we identified a novel and unusual diacyl lipoprotein structure in M. fermentans. The N-terminal lipopeptide from a 104-kDa MBIO_0319 (Fig. 6A) produced a peak at m/z 1071.7 (Fig. 6B, inset). Intriguingly, the MS/MS analysis indicated the presence of additional alanyl-serine residues in front of the lipidated cysteine (Fig. 6C). Edman degradation confirmed the N-terminal amino acid sequence as ASXGR. Similarly, the N terminus of MBIO_0661 from M. fermentans had alanyl-glycine residues in front of the lipidated cysteine (Table 1). We named this new N-terminal lipoprotein structure the peptidyl form. We could not detect the conventional diacyl form of either MBIO_0319 or MBIO_0661 by MS analysis. Therefore, each lipoprotein in M. fermentans takes only one of the two structures, either the conventional diacylated form or the peptidyl form, suggesting that lipoprotein structures vary among mycoplasma species and among individual proteins of M. fermentans.
We examined the cleavage sites of the four M. fermentans lipoproteins listed above and found them to be located at the C terminus of the serine residue in front of the lipidated cysteine (Fig. 6D). Twenty-seven predicted lipoproteins in the M. fermentans JER genome (37) have a serine residue(s) within three amino acids in front of the predicted lipid-modified cysteine (supplemental Fig. S7D). Therefore, we speculate that M. fermentans Lsp may have a unique specificity for a cleavage site compared with other bacteria. Unfortunately, because we were unable to obtain sufficient amounts of homogeneous M. fermentans native peptidyl form lipoproteins, we could not examine their TLR2 stimulation profiles.
Finally, we present the lipoprotein structures of L. monocytogenes, a food poisoning bacterium capable of living as an intracellular parasite. Studies using an lgt-deleted mutant showed critical roles of lipoproteins for virulence and TLR2-mediated immune activation in L. monocytogenes (19). The MS and MS/MS spectra showed that N-terminal lipopeptides of Lmo2196 and two other lipoproteins (protein number of strain EGD-e) (38) have the S-diacyl(17:0/15:0)-glyceryl-cysteine structure (Table 1, supplemental text, and Fig. S7, E-G). These results suggest that the conventional diacyl form is the TLR2-stimulating lipoprotein structure of L. monocytogenes, that N-acylation is not essential for Listeria cell growth, and that the N-acyl state is not obligatory for all Firmicutes lipoproteins.

DISCUSSION
Determination of the exact lipoprotein structures from low-GC content Gram-positive bacteria is hampered by difficulties in the purification of native bacterial lipoproteins and in their molecular structural analyses. Here, we first determined the native bacterial lipoprotein structures from 13 different bacterial species by MALDI-TOF-MS-based strategies. We found a novel and unique lyso form lipoprotein structure that functions as a TLR2 ligand. In addition, N-acetyl and peptidyl forms were also identified. Both the lyso and N-acetyl forms are conserved in low-GC content Gram-positive bacteria. Until now, bacterial lipoproteins were mainly classified into two groups, diacylated and triacylated forms; this study identifies three additional new structures and expands the possible structural variants of bacterial lipoproteins. These structural variants of lipoproteins are sensed by TLR2 heterodimers. The lyso form lipoproteins are specifically recognized by TLR2/1 and/or TLR2/6 (Fig. 3 and supplemental Fig. S3), whereas the N-acetyl form lipoproteins are recognized by TLR2/6 (Fig. 3), showing that the cooperation of TLR2 with TLR1 and TLR6 allows the TLR2-mediated recognition of the structural variations of bacterial lipoproteins.
The lyso form is also an exceptionally unusual structure from a bacterial lipoprotein biosynthesis viewpoint. To predict how the newly identified bacterial lipoproteins are maturated, we summarized the N-terminal lipoprotein structures into three major classes (A to C) based on published studies and our current study (Table 1) and made a schematic of possible bacterial lipoprotein biosynthetic pathways (Fig. 7). Class C has S-diacyl-glyceryl-cysteine structures, which are maturated by two enzymes, Lgt (Fig. 7, step 1) and Lsp (steps 2 or 6). Class B contains N-acyl-S-diacyl-glyceryl-cysteine structures, which are produced by a three-step modification by Lgt, Lsp, and Lnt, the last of which catalyzes N-acylation (steps 3 or 7). Because the low-GC content monoderm bacteria lack E. coli-type Lnt homologues, Lnt in these bacteria must have a new primary structure (14,16) and may have distinct enzymatic properties. Finally, class A has an N-acyl-S-monoacyl-glyceryl-cysteine lyso structure whose maturation may require either a putative lipoprotein O-deacylase acting on the conventional triacyl form (step 4) or a putative lipoprotein transacylase acting on the conventional diacyl form and transferring a fatty acid from the S-glyceryl group to the α-amino group of the lipidated cysteine (step 5). Possible biochemical characteristics of the unidentified lipoprotein biosynthesis enzymes are discussed in the supplemental text and Table S1. The existence of the N-acetyl form lipoproteins indicates that, unlike E. coli-type Lnt, the yet to be discovered Lnt enzyme might not use membrane phospholipids as substrates. Taken together, the identification of novel lipoprotein structures supports the presence of novel modes of lipid modifications in bacterial lipoprotein biosynthesis.
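The structural forms named in this study differ in only three features: the group on the α-amino nitrogen, the number of ester-linked fatty acids on the S-glyceryl group, and the number of residues preceding the lipidated cysteine. The Python sketch below encodes that distinction as a small classifier. It is a hypothetical illustration: the `NTerminus` type, its field names, and the `classify` function are assumptions introduced here, and the assignment of forms to classes A to C beyond what the text states is deliberately not attempted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NTerminus:
    """Simplified description of a mature lipoprotein N terminus (hypothetical model)."""
    n_acyl_carbons: Optional[int]   # None = free alpha-amino group; 2 = acetyl; >2 = long chain
    o_acyl_count: int               # ester-linked fatty acids on the S-glyceryl group
    residues_before_cys: int = 0    # extra amino acids in front of the lipidated cysteine


def classify(nt: NTerminus) -> str:
    """Name the structural form using the terminology of the text."""
    if nt.residues_before_cys > 0:
        # Peptidyl form: diacyl structure with extra residues before the cysteine.
        if nt.n_acyl_carbons is None and nt.o_acyl_count == 2:
            return "peptidyl"
        return "unclassified"
    if nt.n_acyl_carbons is None:
        return "diacyl" if nt.o_acyl_count == 2 else "unclassified"
    if nt.n_acyl_carbons == 2:
        return "N-acetyl" if nt.o_acyl_count == 2 else "unclassified"
    # Long-chain N-acylation: two esters give triacyl, one ester gives lyso.
    return {2: "triacyl", 1: "lyso"}.get(nt.o_acyl_count, "unclassified")


if __name__ == "__main__":
    print(classify(NTerminus(16, 1)))        # E. faecalis PnrA: N-acyl(16:0), one O-acyl
    print(classify(NTerminus(2, 2)))         # B. licheniformis MntA: N-acetyl, two O-acyl
    print(classify(NTerminus(None, 2)))      # L. monocytogenes Lmo2196: S-diacyl only
    print(classify(NTerminus(None, 2, 2)))   # M. fermentans MBIO_0319: two residues before Cys
```

In this model the lyso form (class A) and the conventional triacyl form (class B) differ only in `o_acyl_count`, which mirrors the proposal that a putative O-deacylase or transacylase interconverts the conventional forms and the lyso form.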
Because the anchoring of lipoproteins onto the bacterial membrane appears to be sufficiently achieved using two O-esterified fatty acids, the additional N-acylation may have a role other than simple membrane association. N-Acylation in E. coli is required for sorting lipoproteins from the inner membrane to the outer membrane by the Lol system (39-41), but this role cannot apply to these monoderm bacteria, which do not have outer membranes. Possible roles of N-acylation may be analogous to the functions of N-myristoylation in eukaryotic proteins, such as protein localization and protein-protein interactions (42). Intriguingly, the N-acetyl form seems to be distributed among environmental bacteria isolated from soil and deep-sea environments, whereas lipoproteins of normal inhabitants of humans tend to be N-acylated with long chains (Table 1). Therefore, these structures may have been established during adaptation to environmental conditions, including host immune systems.
In conclusion, we determined the molecular structures of novel TLR2-stimulating lyso, N-acetyl, and peptidyl form lipoproteins and also provide clear biochemical evidence for the presence of yet to be identified key enzymes in bacterial lipoprotein biosynthesis. Our discovery of the new lipid-modified lipoprotein structures will help to elucidate the biological functions of bacterial lipoproteins during microbe-host interactions.