Structural basis for lipid binding and mechanism of the Mycobacterium tuberculosis Rv3802 phospholipase

The Mycobacterium tuberculosis rv3802c gene encodes an essential enzyme with thioesterase and phospholipase A activity. Overexpression of Rv3802 orthologs in Mycobacterium smegmatis and Corynebacterium glutamicum increases mycolate content and decreases glycerophospholipids. Although a role in modulating the lipid composition of the unique mycomembrane has been proposed, the true biological function of Rv3802 remains uncertain. In this study, we present the first M. tuberculosis Rv3802 X-ray crystal structure, solved to 1.7 Å resolution. On the basis of the binding of PEG molecules to Rv3802, we identified its lipid-binding site and the structural basis for phosphatidyl-based substrate binding and phospholipase A activity. We found that movement of the α8-helix affords lipid binding and is required for catalytic turnover through covalent tethering. We gained insights into the mechanism of acyl hydrolysis by observing differing arrangements of PEG and water molecules within the active site. This study provides structural insights into biological function and facilitates future structure-based drug design toward Rv3802.

In 2015, 10.4 million incident cases of tuberculosis were reported with 1.4 million people succumbing to the disease (1). Lack of antibiotic adherence has led to an increase in multiple and totally drug-resistant strains of Mycobacterium tuberculosis, the causative agent of tuberculosis (2). As such, improved antibiotics and novel targets for antibiotic development are of high importance. Isoniazid and ethambutol are two first-line antibiotics used to treat tuberculosis; both drugs inhibit cell wall biosynthesis (3). The M. tuberculosis cell wall remains an appealing target for antibiotic development (3,4).
The M. tuberculosis cell envelope consists of four major components: an inner cytoplasmic membrane, a periplasmic region composed of the peptidoglycan and arabinogalactan, the outer membrane (mycomembrane), and the extracellular capsule (5). The hallmark fatty acids of mycobacteria are the mycolic acids (MA), which account for the majority of the mycomembrane (5). Secondary to MA, but of equal importance, are the essential phosphatidylinositol (PI)-based glycerophospholipids (6,7). The major derivatives of PI are phosphatidylinositol mannoside (PIM) and the further glycosylated forms of PIM, lipoarabinomannan (LM) and mannose lipoarabinomannan (ManLM) (5). All three can be found within both the cytoplasmic membrane and mycomembrane, with the glycans of LM and ManLM extending into the extracellular capsule (5). Mycobacterial lipids and the subsequent modulation of the lipid composition have been associated with cell division, virulence, pathogenesis, and macrophage survival (5).
Residing within the gene cluster that encodes proteins responsible for MA and arabinogalactan biosynthesis is the rv3802c gene (8,9). The biological function of rv3802c remains in question; however, the gene has been determined to be essential for M. tuberculosis viability (10-12). In addition to being required for general viability, the Mycobacterium avium ortholog of M. tuberculosis rv3802c has been shown to be associated with intestinal epithelium invasion (13). The rv3802c gene encodes an N-terminal translocation signal sequence that includes a predicted transmembrane region (14). Rv3802 was found only in cell wall extracts upon cellular lysis and maintained a molecular mass indicative of an uncleaved protein, suggesting that the enzyme remains anchored in the cytoplasmic membrane (14). Due to sequence similarities with cutinase-like enzymes, the rv3802c gene product was named CULP6 (cutinase-like protein 6), despite lacking cutinase activity (14). However, Rv3802 has been shown to possess both thioesterase and phospholipase A (PLA) activity utilizing residues Ser175-His299-Asp268 as the catalytic triad (8,14). The structure of the Mycobacterium smegmatis ortholog, MSMEG_6394, has been solved, revealing that the enzyme possesses an α/β-hydrolase fold (12). The apo structure presented two intact disulfides and a small cavity leading to the identified active-site residues (12). M. tuberculosis Rv3802 shares a high level of sequence identity with a variety of Mycobacterium homologs (68.8% sequence identity between encoded M. tuberculosis Rv3802 and MSMEG_6394) in addition to the Corynebacterium glutamicum ortholog NCgl2775 (11). Upon heat-induced cell stress, NCgl2775 was shown to play an essential role in the modulation of the mycomembrane composition (11). As a result, the total mycolate content increased, whereas glycerophospholipid content decreased (11). This elevated mycolate/glycerophospholipid ratio was also observed when both MSMEG_6394 and NCgl2775 were overexpressed under physiological conditions (11). Meniche et al. (11) therefore proposed naming this class of enzymes the "envelope lipids regulation factor," or the ElrF family. Possessing lipase activity, Rv3802 was shown to be susceptible to in vitro inhibition by tetrahydrolipstatin (THL), a Food and Drug Administration-approved covalent lipase inhibitor (8). THL is reported to have an apparent inhibition constant (Ki,app) in the high nanomolar range for Rv3802 (12). As such, a series of THL derivatives have been synthesized and tested, displaying improved in vitro and correlated in vivo activity against M. tuberculosis (15). It has been shown that up to 14 lipases undergo covalent modification by THL in Mycobacterium bovis (16). Although the M. bovis ortholog of Rv3802 was identified as being modified by THL, it was not indicated as one of the 14 statistically validated targets of the study (16).
This work was supported by National Institutes of Health Grant AI105084. The authors declare that they have no conflicts of interest with the contents of this article. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Due to the essentiality and potential role of Rv3802 in cell wall modulation, we sought to further characterize the function and mechanism. We present the X-ray crystal structure of M. tuberculosis Rv3802 to 1.7 Å resolution. Due to binding of PEG molecules to both protein molecules within the asymmetric unit, we were able to characterize the lipid-binding site. Based on the location of the identified lipid-binding site and previous studies, further insights into a biological function for Rv3802 are gained as well as a basis for future structure-based drug design efforts.

M. tuberculosis Rv3802-PEG structure
The M. tuberculosis Rv3802 crystal diffracted to 1.7 Å in space group P2₁2₁2₁, with two molecules in the asymmetric unit that superimpose with a Cα root mean square deviation of 0.14 Å (X-ray diffraction and model refinement statistics are given in Table 1). Molecule A was used for structural analyses if not otherwise noted. As predicted by homology to the M. smegmatis ortholog, M. tuberculosis Rv3802 was found to have an α/β-hydrolase fold (Fig. 1) (12). Rv3802 has 10 helices (nine α-helices and one 3₁₀-helix) and a single, central β-sheet composed of five parallel β-strands (Fig. 1). The nucleophilic Ser175 is located on the aptly named nucleophilic elbow connecting the β3-strand to the α3-helix. The remaining catalytic triad residues, His299 and Asp268, are located on the α9-helix and a connecting loop between the β5-strand and α8-helix, respectively (Fig. 1). Two intact disulfides are present in the Rv3802 structure: Cys72-Cys164 and Cys264-Cys271 (Fig. 1). These conserved disulfides are also present in the MSMEG_6394 structure (12).
During the early stages of model refinement, two regions of linear difference density converging at the active site were observed. This non-protein electron density was interpreted as two molecules of PEG bound to each of the two protein molecules in the asymmetric unit. Difference density maps for the PEG molecules in protein molecule A can be found in Fig. 2A, and the PEG molecules in protein molecule B are shown in Fig. S1. PEG 3350 (average molecular mass of 3350 Da, or ~75 repeating monomer units) was present in the crystallization solution; however, the bound PEG polymers consist of only 5 ethylene glycol monomer units.
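The repeat-unit estimate for PEG 3350 can be reproduced with a short calculation. The sketch below assumes the generic PEG formula H-(O-CH2-CH2)n-OH and rounded monomer and end-group masses; the function name and constants are illustrative and are not taken from the original work.

```python
# Rough estimate of the number of ethylene glycol repeat units in a PEG chain.
# Assumption: PEG is H-(O-CH2-CH2)n-OH, so M = n * 44.05 + 18.02 g/mol.
ETHYLENE_OXIDE_MASS = 44.05  # g/mol, one C2H4O repeat unit (rounded)
END_GROUP_MASS = 18.02       # g/mol, terminal H and OH (rounded)


def peg_repeat_units(average_mass: float) -> float:
    """Return the approximate repeat-unit count for a PEG of the given average mass (g/mol)."""
    return (average_mass - END_GROUP_MASS) / ETHYLENE_OXIDE_MASS


n_units = peg_repeat_units(3350.0)
print(f"PEG 3350 contains roughly {n_units:.1f} repeat units")
```

Under these assumptions the chain in solution is more than ten times longer than the five-unit fragments that could be modeled into the density.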

Identification of lipid tail-binding site
Because PEG molecules are bound within Rv3802 and PEG resembles the fatty acid chains of lipids, the lipid-binding site of Rv3802 and its presumptive orthologs has been structurally identified (Fig. 2). The fatty acid portion of the lipid-binding site begins at the active site, with the terminal oxygen of PEG1 positioned within hydrogen bonding distance to both Ser175 and His299, 3.2 and 3.8 Å, respectively. The terminal oxygen of PEG2 resides 6.1 Å away from the terminal oxygen of PEG1 and is positioned between the side chains of Tyr142 and Ile270. Both PEG molecules extend away from the active site in a linear fashion, flanked on both sides by mostly aliphatic amino acid side chains that are positioned on helices α6 and α8 as well as various loop regions (Fig. 2A). Without ligand present, the aliphatic side chains involved in PEG binding form a hydrophobic core of van der Waals interactions, as seen in the previously solved apo MSMEG_6394 structure (Fig. 2B) (12). Whereas Rv3802 and MSMEG_6394 have a relatively high level of sequence identity for all encoded residues, 68.8%, the residues that make up the identified fatty acid-binding region possess a higher level of conservation, 76.2% between M. tuberculosis and M. smegmatis specifically, as well as a higher level of conservation with other mycobacterial orthologs (Fig. 2C).
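Percent identity figures such as those quoted above are typically computed column by column over an alignment. The following is a minimal sketch of that calculation; the toy sequences are hypothetical and are not the Rv3802 or MSMEG_6394 sequences, and the choice to skip gapped columns is an assumption rather than the authors' stated convention.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment used only to exercise the function.
toy_a = "AVLIG-TW"
toy_b = "AVMIGSTW"
print(f"toy identity: {percent_identity(toy_a, toy_b):.1f}%")

# Arithmetic illustration only: 16 identical positions out of 21 would give 76.2%.
# The source does not state how many residues make up the binding region.
print(f"16 of 21: {100.0 * 16 / 21:.1f}%")
```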

M. tuberculosis Rv3802 structure and lipid binding

Identification of dynamic protein regions
As stated above, Rv3802 and the previously solved apo MSMEG_6394 ortholog share a common core scaffold with two disulfides present. However, a noticeable difference between the M. tuberculosis and M. smegmatis structures exists between residues 279 and 298 (Fig. 3A). To accommodate the two bound PEG molecules, the α8-helix and flanking residues are shifted away from the adjacent loop in the Rv3802 structure.
The N-terminal region of the α8-helix is extended outward 8.8 Å upon PEG binding, when compared with the apo M. smegmatis ortholog (Fig. 3B). The opening of the α8-helix alters a connecting loop to the neighboring 3₁₀ helix 7; however, this does not affect the positioning of helix 7. The loop connecting α8 to the upstream catalytic His299 is unstructured. Due to poor difference density, this initially modeled region was ultimately removed in the deposited structure. Moreover, it is also unresolved in the M. smegmatis ortholog structure. This structurally dynamic region is the hinge point for α8-helix movement, being composed of mostly alanine and glycine residues in both enzymes.
Due to a high sequence similarity in the active site between Rv3802 and its ortholog, a direct active-site comparison between the two conformations is feasible. Throughout the remainder of this work, analogous residue numbers will be given with the Rv3802 number first and the MSMEG_6394 number second (e.g. Thr83/84). The top of the α8-helix abuts the active site; despite significant movement between the open and closed states of the α8-helix upon PEG binding, little movement is observed in the residues of the active site (Fig. 3C). However, the nucleophilic Ser176 of MSMEG_6394 is positioned toward the presumed oxyanion hole consisting of the backbone amides of Thr83/84 and Asn176/177, whereas the nucleophilic Ser175 of Rv3802 is hydrogen-bonded to the catalytically relevant His299. Uncertainty remains as to whether this difference is a function of the modest 2.9 Å resolution of the MSMEG_6394 structure or represents a true structural difference between the open, substrate-bound form and the closed, substrate-free form.

Structural insights into substrate binding and catalysis
When the α8-helix is in the open conformation, a sizable substrate-binding site is apparent (Fig. 4A). However, in the closed state visualized in the M. smegmatis ortholog, only a small cavity leading to the nucleophilic serine is observed (12). Therefore, for a large substrate possessing hydrophobic, aliphatic character to bind, the α8-helix must move to expose the substrate-binding site. It is expected that initial substrate binding positions the scissile ester/thioester bond in proximity to Ser175, affording bifurcation of the active site into two distinct areas that accommodate either the hydrophobic tail or the hydrophilic headgroup. With the reasonable expectation that the PEG molecules bind within the lipid tail-binding site, the lipid head-binding site becomes apparent (Fig. 4B). The headgroup-binding site is unchanged from that observed in the closed state of the M. smegmatis ortholog, being the small cavity described previously (12). This site is composed of the hydrophobic side chains of Trp84, Leu102, Phe174, Ala292, and Ala300; the negatively charged residues or carbonyl backbones of Thr83, Glu85, and Ala300; and the positively charged side chain of Lys100. These residues are fully conserved between Rv3802 and the M. smegmatis ortholog aside from Lys100 and Ala300; these two residues are Leu and Asn in M. smegmatis, respectively. Perpendicular to and intersecting the substrate-binding site is a solvent channel leading to the active site (Fig. 4A). In both protein molecules within the asymmetric unit, ordered water molecules are found alongside His299, opposite the lipid headgroup-binding site. In molecule A, the terminal PEG1 oxygen is 3.2 Å from the Ser175 hydroxyl, whereas the Ser175 hydroxyl is 3.0 Å from the ε nitrogen of His299 (Fig. 5A). Additionally, a water molecule is found 4.2 Å away from His299.
In molecule B, the terminal PEG1 oxygen is pointing toward the plausible oxyanion hole, 3.4 Å equidistant to both backbone amides of Thr83 and Asn176. A water molecule is now found within hydrogen-bonding distance, 2.9 Å, to the ε nitrogen of His299 (Fig. 5B). It should be noted that, although not depicted in the figures, solvent molecules are positioned in a similar fashion in the lipid head-binding sites of both protein molecules A and B.
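The active-site distances reported for molecules A and B can be tabulated against a distance cutoff to make the comparison explicit. The sketch below uses only distances quoted in the text; the 3.5 Å donor-acceptor cutoff and the contact labels are assumptions chosen for illustration, not criteria stated by the authors.

```python
# Distances (in angstroms) quoted in the text for each protein molecule.
active_site_distances = {
    ("A", "PEG1 O to Ser175 OH"): 3.2,
    ("A", "Ser175 OH to His299 N-epsilon"): 3.0,
    ("A", "water to His299"): 4.2,
    ("B", "PEG1 O to Thr83 backbone N"): 3.4,
    ("B", "PEG1 O to Asn176 backbone N"): 3.4,
    ("B", "water to His299 N-epsilon"): 2.9,
}

HBOND_CUTOFF = 3.5  # angstroms; assumed heuristic cutoff, not taken from the source

# Flag which contacts fall within the assumed hydrogen-bonding cutoff.
within_hbond = {
    contact: distance <= HBOND_CUTOFF
    for contact, distance in active_site_distances.items()
}

for (molecule, label), flag in within_hbond.items():
    distance = active_site_distances[(molecule, label)]
    status = "within cutoff" if flag else "beyond cutoff"
    print(f"molecule {molecule}: {label:32s} {distance:.1f} A  {status}")
```

With this cutoff, the water molecule near His299 is beyond hydrogen-bonding range in molecule A but within range in molecule B, which is the distinction the two structures are used to illustrate.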

Influence of α8 movement on hydrolase activity
Figure 2. ... for salmon-colored PEG molecules. Amino acid side chains within 4 Å of the PEG molecules are shown in cyan, and active-site residues are orange, with other non-carbon atoms in CPK color. These residues comprise the lipid tail-binding site. B, without ligand present, the amino acid side chains of the lipid tail-binding site form a wall of hydrophobic interactions, as seen in the apo MSMEG_6394 structure (PDB code 3AJA) (12). C, sequence alignment of the fatty acid-binding region from Mycobacteria. Residues of differing sequence are shaded, and starred residues are shown in A and B, whereas a red underline corresponds to secondary structure location.

To investigate the effects that α8 movement has on catalysis and to validate the structural observations, M. tuberculosis Rv3802 was mutated to limit α8 movement. In the closed state, Asn133 and Glu289 of the M. smegmatis ortholog are ~5 Å apart; therefore, we mutated the corresponding residues in M. tuberculosis Rv3802, Asn132 and Asn288, to cysteine residues to form an engineered disulfide. The resulting N132C/N288C mutant would therefore have limited α8 helical movement due to covalent tethering. The Rv3802 N132C/N288C variant exhibited lower expression levels but was purified in an identical fashion to WT Rv3802. Both WT and N132C/N288C Rv3802 run as single monomeric bands on a non-reducing SDS-polyacrylamide gel, indicating the lack of intermolecular disulfide bonds (Fig. S2A). Additionally, the amount of free thiols present in both WT and N132C/N288C Rv3802 was quantified under non-reducing and reducing conditions using monochlorobimane. Non-reduced samples had levels of fluorescence equal to background, indicating that no free thiols are present in either WT or the N132C/N288C mutant (Fig. S2B). The resulting ratio of quantified cysteines in reduced WT to N132C/N288C Rv3802 was 0.68 ± 0.04, consistent with the 4:6 ratio of cysteines present in WT and the N132C/N288C mutant, suggesting that all cysteine residues were forming disulfides and were indeed intramolecular (Fig. S2C).
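The consistency check between the measured thiol ratio and the expected 4:6 cysteine ratio is simple arithmetic, sketched below. The variable names are illustrative, and treating the reported uncertainty as a plain tolerance window is an assumption, since the source does not say how the error was derived.

```python
# Cysteine counts implied by the text: WT has two native disulfides (4 Cys),
# and the N132C/N288C mutant adds two engineered cysteines (6 Cys).
wt_cysteines = 4
mutant_cysteines = 6

expected_ratio = wt_cysteines / mutant_cysteines  # WT : mutant

# Reported ratio of quantified cysteines in reduced WT to reduced mutant.
measured_ratio = 0.68
measured_error = 0.04

# Assumption: the reported error is used as a simple tolerance around the measurement.
is_consistent = abs(measured_ratio - expected_ratio) <= measured_error

print(f"expected {expected_ratio:.3f}, measured {measured_ratio:.2f} +/- {measured_error:.2f}")
print("consistent with full disulfide formation:", is_consistent)
```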
Initial enzymatic activity was assessed using a fluorescence-based assay that monitored the hydrolysis of 4-methylumbelliferyl heptanoate (4MH). Hydrolysis of 4MH produces the fluorescent molecule 4-methylumbelliferone and heptanoic acid. Michaelis-Menten kinetics were determined for both WT and the N132C/N288C mutant using 4MH (Fig. 6B). The residual hydrolase activity of the N132C/N288C mutant toward 4MH indicated that the mutant retained sufficient dynamics or space near the active site for 4MH to bind, despite the hindered helical movement. Although 4MH has a 7-carbon alkyl chain, the substrate is significantly shorter than phosphatidyl-based substrates that could occupy the entire lipid tail-binding site. Therefore, PLA activity of WT and the N132C/N288C mutant was tested using PI as a substrate. Following 1 h of incubation with PI, hydrolyzed product indicative of PLA activity was observed for WT (Fig. 6C, lane 1). However, no PLA activity was observed for the N132C/N288C mutant (Fig. 6C, lane 2).
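The Figure 6B legend states that both the affinity for 4MH and kcat decrease by half in the disulfide mutant. A minimal sketch of what that implies under the Michaelis-Menten rate law follows. The numerical kcat and Km values are hypothetical placeholders, because the fitted parameters are not given in this text, and reading "affinity decreases by half" as a doubling of Km is an assumption.

```python
def michaelis_menten_rate(substrate: float, kcat: float, km: float, enzyme: float) -> float:
    """Initial rate v = kcat * [E] * [S] / (Km + [S])."""
    return kcat * enzyme * substrate / (km + substrate)


# Hypothetical WT parameters in arbitrary units; NOT values from the study.
wt_kcat, wt_km = 1.0, 10.0

# Assumed reading of the legend: kcat halves and Km doubles in the mutant.
mutant_kcat, mutant_km = wt_kcat / 2.0, wt_km * 2.0

# Catalytic efficiency (kcat / Km) of the mutant relative to WT.
efficiency_ratio = (mutant_kcat / mutant_km) / (wt_kcat / wt_km)

# Rates at a substrate concentration equal to the WT Km, with unit enzyme.
wt_rate = michaelis_menten_rate(wt_km, wt_kcat, wt_km, enzyme=1.0)
mutant_rate = michaelis_menten_rate(wt_km, mutant_kcat, mutant_km, enzyme=1.0)

print(f"relative catalytic efficiency (mutant / WT): {efficiency_ratio:.2f}")
print(f"rate at [S] = WT Km: WT {wt_rate:.3f}, mutant {mutant_rate:.3f}")
```

Under these assumptions the catalytic efficiency of the mutant would be one quarter that of WT, regardless of the absolute parameter values chosen.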

M. tuberculosis Rv3802 structure and PLA classification
Given the structural and enzymatic attributes of Rv3802, the Rv3802 family of enzymes most likely falls within the PLA2 enzyme class (18). Rv3802 was previously determined to have PLA activity toward phosphatidyl species (phosphatidylserine, phosphatidylcholine, and phosphatidylethanolamine) (8). In contrast to thioesters, the ester moieties of phospholipids are generally stable in aqueous solutions, so the enzymatic activity observed in vitro is strongly suggestive of biological activity. The two general classes of PLAs differ with regard to specificity for either the sn-1 acyl chain (PLA1) or sn-2 acyl chain (PLA2) of the glycerol phosphate moiety of phosphatidyl species (17). For unknown reasons, PLA2 enzymes possess numerous disulfides, whereas the PLA1 enzymes typically lack disulfides (18). M. tuberculosis Rv3802 was found to possess two disulfides (Fig. 1) (8). These two disulfides are present in the M. smegmatis ortholog structure and were found to be conserved across all Rv3802 orthologs (12).
The first conserved disulfide, Cys72-Cys164, covalently links the N-terminal peptides to the base of the α2-helix (Fig. 1). This is of potential structural importance, as the 41 residues preceding Val75, Cys72 included, are predicted to be unstructured until the predicted N-terminal transmembrane region (Ile12-Ile34) (secondary structure and transmembrane prediction in Fig. S4). Therefore, this disulfide may help retain the proper fold of the structured catalytic hydrolase domain while being immediately preceded by a large, unstructured domain. The second conserved disulfide, Cys264-Cys271, induces a kink in a loop to appropriately position the catalytically important Asp268 residue. This second disulfide therefore ensures the proper positioning of Asp268 relative to His299, maintaining the catalytic triad despite movement of the nearby α8-helix upon lipid binding.

Mechanism of catalysis
PLA2 enzymes catalyze fatty acid hydrolysis through two general mechanisms: a calcium-dependent His/Asp mechanism or a calcium-independent mechanism (18). Given the identified catalytic triad and the lack of calcium-dependent enzyme activity, Rv3802 is a calcium-independent PLA2 enzyme. Flanked by aromatic side chains (Trp84, Phe174) and displaying little movement of catalytic residues between open and closed enzymatic states, the active site of Rv3802 resembles that of a typical hydrolase enzyme (Fig. 3C) (19). These features observed in the Rv3802 active site are known to help promote substrate hydrolysis over transfer (19). These structural attributes help explain why Rv3802 has no reported transferase activity in the presence of acyl acceptors (8).
Based on the enzymatic activity, structure, and catalytic triad of Rv3802, catalysis most likely proceeds through a Ping-Pong reaction mechanism. The reaction can be broken down into two half-reactions. The first half-reaction proceeds through nucleophilic attack on the carbon of the ester/thioester carbonyl by Ser175, which is deprotonated by His299. The terminal oxygen of PEG1 would therefore mimic the carbon of the carbonyl moiety of substrate before nucleophilic attack, depicted in Fig. 5A. Following attack, the plausible oxyanion hole consisting of backbone amide nitrogen atoms of Thr83 and Asn176 stabilizes the tetrahedral intermediate. Collapse of the tetrahedral intermediate yields the acyl-enzyme intermediate and releases the now deacylated lipid at the end of the first half-reaction. The acyl-enzyme intermediate form of the enzyme is approximated by the terminal oxygen of PEG1 from protein molecule B (Fig. 5B). The second half-reaction would therefore proceed through the activation of a water molecule by His299 and subsequent nucleophilic attack on the acyl-enzyme intermediate. The nucleophilic water molecule is observed within hydrogen bonding distance to His299 in protein molecule B and is depicted in Fig. 5B. Tetrahedral collapse following this second nucleophilic attack results in the release of a free fatty acid and a regenerated enzyme.

Lipid binding
Based on the structural similarity of PEG to the short-chain fatty acids of various lipids, it is logical that a similar repositioning of the α8-helix must occur for binding of natural substrate and subsequent catalysis. The Rv3802 N132C/N288C mutant, with limited α8-helix movement, displayed a lower ability to bind 4MH, resulting in a significant loss in activity compared with WT enzyme (Fig. 6B). However, when the larger PI substrate was used, no PLA activity was observed for the N132C/N288C mutant (Fig. 6C). This loss in activity as a result of hindered helical movement highlights the importance of protein dynamics with respect to substrate binding. Given the large variety of possible structural forms of the α8-helix and adjacent loops, it is difficult to fully assess how this dynamic interplay affects substrate recognition.
Initial substrate recognition may proceed through binding of the lipid head, as this site does not change between open and closed states of the enzyme. The MD simulation of apo Rv3802 suggests that the α8-helix quickly adopts the closed conformation and remains in that state without substrate present when in the aqueous phase (Fig. S3). Therefore, substrate binding may drive helical opening in a solvated environment. However, given the uncertainty of the true biological function of Rv3802, it is difficult to fully assess the mechanism of lipid binding based on the Rv3802-PEG structural observations and mutant kinetics. PLA2 enzymes have been found to bind only the surface-exposed lipid headgroup, to reside partially within the membrane itself, and to bind the entire lipid or extract the lipid from the membrane for hydrolysis (18,20).
When the α8-helix is in the open state, a vast hydrophobic lipid-binding site is apparent (Fig. 2). In agreement with typical lipid-binding sites, the identified Rv3802 lipid-binding site is lined with aliphatic amino acid side chains (Fig. 2) (8, 21). The Rv3802 hydrophobic binding channel is relatively linear, extending away from the active site, resulting in a site capable of binding two alkyl chains in a parallel linear arrangement similar to the arrangement of the two observed PEG molecules. Therefore, the binding of PEG molecules observed within Rv3802 provides a structural basis for the binding of the two acyl chains of the phosphatidyl substrate.

Structure-based insights into biological function
Figure 6. Influence of α8-helix movement on enzymatic hydrolysis. A, in the closed state, the α8-helix blocks the lipid tail (fatty acid) portion of the lipid-binding site. Based on the proximity of Asn133 and Glu289 in the MSMEG_6394 closed state, the corresponding amino acids in Rv3802, Asn132 and Asn288, were both mutated to cysteine to create an engineered disulfide and limit α8-helix movement. A surface rendering and gray schematic show Rv3802 with MSMEG_6394 depicted in purple (MSMEG_6394 PDB code 3AJA) (12). B, Michaelis-Menten kinetics of WT compared with N132C/N288C mutant using 4MH as the substrate. Both the affinity for 4MH and the kcat decrease by half as a result of hindered helical movement. C, TLC plate of reaction products when the much larger PI is used as the substrate. A shift indicative of PLA1 or PLA2 activity is observed with WT Rv3802, reaction 1 (Rf value of 0.19). However, no acyl hydrolysis is observed with the disulfide mutant (reaction 2), or when enzyme is lacking (reaction 3).

Meniche et al. (11) determined that Rv3802 orthologs NCgl2775 and MSMEG_6394 play an essential role in modulating lipid composition. Specifically, they noted the increase of the mycolic acid/glycerophospholipid ratio, suggesting that this family of enzymes plays a direct role in decreasing glycerophospholipid content within the mycomembrane (11). It should be noted that NCgl2775 is not essential under physiological growth, whereas the M. tuberculosis orthologs are essential (11,12). This differing essentiality of Rv3802 orthologs may be attributed to the differing compositions of the outer membranes between mycobacteria and corynebacteria (22). For example, the inner layer of the mycomembrane of corynebacteria has a much higher composition of cardiolipin relative to other glycerophospholipids, when compared with mycobacteria (22). PI species represent a major portion of the glycerophospholipids found within the plasma membrane and mycomembrane layers in both M. tuberculosis and M. smegmatis (23). Previously, Rv3802 was shown in vitro to hydrolyze a fatty acid from PIM2 upon treatment of M. smegmatis cell wall extracts with the enzyme (8). PIMs play an important role in the permeability of the M. tuberculosis cell wall, affect proper cell division, and influence pathogenesis (24-27). Interestingly, when a conditional knockout of MSMEG_6394 was cultured, cell morphology was jagged and elongated compared with WT (12). PIM2 has four acyl substituents, two on the phosphoglycerol, one on the inositol, and the fourth on the mannose moiety. The observed lipid-binding site of Rv3802 is significantly larger than the palmitoyl-binding pocket of other hydrolases known to hydrolyze esters of PIM (28). Specifically, the enzyme PatA resides within the PIM biosynthetic pathway and is responsible for the transfer of a palmitoyl group to the 6-position on the mannose sugar (28).
Based on the larger Rv3802 lipid-binding site compared with PatA, a sensible biological function for Rv3802 is the hydrolysis of one or both of the fatty acids from the phosphoglycerol component of PIM or other PI species. PIM has been shown to be an antigen of CD1d-restricted T cells; however, PIM binding to CD1d is abolished upon the hydrolysis of the diacyl phosphoglycerol moiety following treatment with a PLA2 enzyme (27). Whereas Rv3802 has the enzymatic ability to facilitate this chemistry, further in-depth studies are required to truly validate the biological context of such action.
Using the Rv3802-PEG structure as a basis, PI was modeled into the open form of the enzyme (Fig. 7). This model provides a reasonable structural basis for binding of phosphatidyl-based substrates to Rv3802 (8). Structurally, PI can be easily accommodated within the lipid-binding site of Rv3802 in a manner that promotes catalysis. As modeled, the sn-2 acyl chain would be subject to hydrolysis. Although little positive charge is present at this site to counter the negatively charged phosphodiester moiety, stabilization of interactions with the phosphate moiety may be afforded through solvent or by backbone amide nitrogen atoms of the highly dynamic loop upon restructuring as a function of substrate binding (Ala292-Glu296). Due to the proximity of this loop to the lipid head-binding site, this region may also be relevant to sugar binding of larger PI species, such as PIM.
The essential nature of Rv3802 to M. tuberculosis viability makes it an alluring new target for future drug development.
Derivatives of THL exhibit enhanced in vitro inhibition against Rv3802 and related in vivo bioactivity (15). However, THL has been shown to target numerous M. tuberculosis lipases and directly affect MA content (16). Therefore, inhibition of M. tuberculosis by THL proceeds through a shotgun, polypharmacology strategy that has little specificity toward Rv3802. The identification of the lipid-binding site of Rv3802 highlights an extremely intriguing and potentially promising site for drug development. The presented structure and subsequent analysis can be utilized for designing inhibitors specific to Rv3802.

Molecular cloning
An Escherichia coli codon-optimized, synthetic gene encoding the periplasmic domain of M. tuberculosis Rv3802c was purchased from Integrated DNA Technologies. The bases of 5′-CGCTGTTCCAGGGACCT-3′ and 5′-GCGTCCGGATCCGAA-3′ were added to the 5′- and 3′-ends of the gene, respectively. The resulting gene was PCR-amplified using primers with sequences identical to those added to the gene and inserted into a pET32-derived plasmid linearized with PshAI using a Gibson Assembly™ Master Mix. The protein expression construct produces protein with the following sequence: Met-His6-Ser2-Gly-rhinovirus 3C protease cut site-Rv3802c. The N132C/N288C mutant was cloned in pieces overlapping the sites of mutation using the WT construct as a template. The 5′ WT amplify primer and 5′-TCAGAGGGCAGTGGAACTG-3′ were used to generate the first fragment. 5′-CAGTTCCACTGCCCTCTGA-3′ and 5′-GCAAGGGTGCAAAGAGTCGTA-3′ were used to generate the middle fragment. 5′-TACGACTCTTTGCACCCTTGC-3′ and the 3′ WT amplify primer were used to generate the end fragment. The full gene was constructed by amplifying the front and middle fragments together, followed by the middle and end fragments. The two resulting fragments were then amplified together to produce a full-length mutant gene and inserted into an expression plasmid in an identical fashion to WT. Both WT and mutant plasmids were sequenced to confirm the desired DNA sequence.
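Overlap-extension cloning of this kind relies on the primers at each junction being reverse complements of one another. The sketch below checks that property for the two junction primer pairs given in the text, with the line-break spaces from the extracted sequences removed; the variable names are illustrative labels, not the authors' primer names.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return sequence.upper().translate(_COMPLEMENT)[::-1]


# Junction primers as given in the text (5' to 3').
first_fragment_reverse = "TCAGAGGGCAGTGGAACTG"
middle_fragment_forward = "CAGTTCCACTGCCCTCTGA"
middle_fragment_reverse = "GCAAGGGTGCAAAGAGTCGTA"
end_fragment_forward = "TACGACTCTTTGCACCCTTGC"

junctions = {
    "first/middle junction": (first_fragment_reverse, middle_fragment_forward),
    "middle/end junction": (middle_fragment_reverse, end_fragment_forward),
}

for name, (reverse_primer, forward_primer) in junctions.items():
    overlaps = reverse_complement(reverse_primer) == forward_primer
    print(f"{name}: primers are reverse complements = {overlaps}")
```

Both junction pairs are exact reverse complements, so each pair of adjacent fragments shares a fully overlapping end for the joining amplification steps.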

Protein expression and purification
Protein expression and purification were adapted and modified from a methodology published previously (14). Both WT and N132C/N288C mutant protein were expressed and purified in an identical fashion. Chemically competent T7 Express E. coli cells (New England Biolabs) were transformed with plasmid encoding WT or mutant protein. Subsequent cultures were grown at 37°C in lysogeny broth medium containing carbenicillin. Protein expression was induced for 3 h at 37°C, following the addition of 1 mM isopropyl 1-thio-β-D-galactopyranoside once culture density reached an A600 nm of 0.6. Induced cells were pelleted and resuspended in a 50 mM Tris, pH 7.5, 300 mM NaCl buffer and placed at −80°C for storage.
Induced cells were thawed in a warm bath, followed by the addition of DNase I and lysozyme, and allowed to incubate on ice for half an hour. Resuspended cells were then sonicated, and the crude lysate was pelleted at 18,515 × g for 40 min at 4°C. Supernatant was discarded, and the pellet was resuspended in 50 mM Tris, pH 7.5, 3 M NaCl buffer. The resuspended pellet was again centrifuged at 18,515 × g for 40 min at 4°C, with the supernatant again discarded. The resulting pelleted inclusion bodies containing Rv3802 were solubilized in 50 mM Tris, pH 7.5, and 4 M guanidine HCl (GdnHCl) and centrifuged at 18,515 × g for 40 min at 4°C. The supernatant containing soluble Rv3802 was loaded onto an equilibrated 5-ml HiTrap Talon crude cobalt column (GE Healthcare) and was washed for 15 column volumes with 50 mM Tris, pH 7.5, 4 M GdnHCl. Protein was eluted over a 20-column volume gradient with increasing imidazole concentration. Rv3802 was refolded by removing GdnHCl through extensive dialysis against a 50 mM Tris, pH 7.5, 300 mM NaCl buffer. Protein must be at a low concentration (~0.25 mg/ml) within the dialysis bag for refolding, or significant precipitation is observed. Refolded protein was centrifuged at 18,515 × g for 1 h at 4°C to remove any residual insoluble protein. Soluble, pure protein was dialyzed into either 100 mM HEPES, pH 7.5, for crystallization or 50 mM NaPO4, pH 7.5, for cysteine quantification, 4MH, and PLA assays.

Crystallization and data collection
Purified Rv3802 was concentrated to 4.5 mg/ml for crystallization. Screening of Rv3802 against the Index screen (Hampton Research) using the hanging drop method yielded a single orthorhombic crystal. The crystal formed in a 1:1 ratio of well solution (0.1 M BisTris, pH 6.5, 25% (w/v) PEG 3350) to protein and required 1 month of incubation at 16°C to grow. The resulting crystal was looped and flash-cooled in liquid nitrogen. Despite extensive efforts to crystallize the N132C/N288C mutant, diffraction-quality crystals have not been obtained to date. X-ray diffraction data were collected using synchrotron radiation at the Advanced Photon Source, LS-CAT beamline F, at Argonne National Laboratory.

Structure determination and refinement
Diffraction data were indexed and scaled in the P2₁2₁2₁ space group using HKL2000 (29). Phases were obtained by molecular replacement (Phaser-MR), with two molecules in the asymmetric unit, using a search model generated with SWISS-MODEL from the previously solved MSMEG_6394 structure (PDB entry 3AJA) (30, 31). Atoms lacking 2Fo − Fc density were deleted, and the resulting model was subjected to rigid body and simulated annealing refinements (PHENIX Refine) (30). Deleted atoms were built back into the corresponding Fo − Fc difference and 2Fo − Fc density maps using COOT (32). The progressing model was subjected to rounds of xyz coordinate, real-space, occupancy, and individual B-factor refinements (PHENIX Refine) with manual modeling in COOT (30, 32). The four molecules of PEG were added using LigandFit and subjected to model refinement (33). Model refinement was considered complete once a final Rwork/Rfree of 0.1582/0.1830 was achieved. The 0.4% Ramachandran outlier is due to Val 237 and is warranted, given the strong electron density for all atoms of the residue in both molecules within the asymmetric unit. MolProbity was used to validate the structure (34). Rv3802 model coordinates and structure factors were deposited in the PDB with accession code 5W95.

Cysteine quantification
To ensure proper disulfide formation in the engineered N132C/N288C mutant, thiols present in WT and mutant were quantified and compared using fluorescent labeling with a nonreversible thiol-reactive probe. The respective enzymes were compared in non-reduced and reduced states. Enzyme at 5 µM in 50 mM sodium phosphate buffer, pH 7.5, was incubated with 0.3 mM tris(2-carboxyethyl)phosphine (reduced) or an equivalent volume of buffer (non-reduced) for 30 min at 37°C (200-µl reaction volumes). Samples were titrated with 240 µM monochlorobimane (100 mM stock, DMSO), mixed, and incubated at 37°C for 30 min. A standard curve to quantify free thiols was generated by serial dilution of L-cysteine in the presence of 0.3 mM tris(2-carboxyethyl)phosphine, reduction for 30 min at 37°C, and modification with 240 µM monochlorobimane for 30 min at 37°C (Fig. S2D). Fluorescence was read in triplicate using λex = 390 nm and λem = 490 nm on a Synergy H4 plate reader (Biotek).
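The conversion from bimane fluorescence to free thiols per enzyme molecule can be sketched as follows. This is a minimal illustration and not the analysis used in the study: a linear standard curve is assumed, the function names `fit_line` and `thiols_per_protein` are hypothetical, and every fluorescence reading is an invented value. Only the 5 µM enzyme concentration comes from the procedure above.

```python
# Sketch: free thiols per protein from an L-cysteine standard curve.
# All fluorescence readings are invented illustration values.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def thiols_per_protein(rfu, slope, intercept, protein_um):
    """Read thiol concentration (µM) off the curve, then divide by protein (µM)."""
    thiol_um = (rfu - intercept) / slope
    return thiol_um / protein_um


# Hypothetical standard curve: L-cysteine (µM) versus fluorescence (RFU).
cys_um = [0.0, 2.5, 5.0, 10.0, 20.0]
cys_rfu = [100.0, 350.0, 600.0, 1100.0, 2100.0]
slope, intercept = fit_line(cys_um, cys_rfu)

PROTEIN_UM = 5.0       # enzyme concentration stated in the procedure
sample_rfu = 1100.0    # hypothetical reading for a reduced sample
ratio = thiols_per_protein(sample_rfu, slope, intercept, PROTEIN_UM)
print(f"{ratio:.2f} free thiols per protein")
```

Under this scheme, a correctly formed engineered disulfide would be expected to show fewer free thiols in the non-reduced sample than in the reduced sample.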

In vitro hydrolase assays
A fluorescence-based assay using 4-methylumbelliferyl heptanoate (4MH) (Sigma-Aldrich) was developed for Rv3802. The assay affords a sensitive method for activity determination and is amenable to drug screening, with a Z′ score of 0.84 (Z′ determination and methods shown in Fig. S5). In the presence of enzyme, the heptanoate ester of 4MH is hydrolyzed, producing fluorescent 4-methylumbelliferone and heptanoic acid. All assays were performed in triplicate in 50 mM sodium phosphate buffer, pH 7.5. For Michaelis-Menten kinetics, 50 nM WT Rv3802 and 75 nM N132C/N288C mutant were used with varying concentrations of serially diluted 4MH (20 mM DMSO stock). All fluorescence reads were conducted at 37°C using λex = 360 nm and λem = 450 nm on a Synergy H4 plate reader (Biotek). Background fluorescence arising from 4MH hydrolysis was subtracted, and relative fluorescence units were converted to concentration using a standard curve (Fig. S6). Reaction rates and Michaelis-Menten kinetic parameters were determined using PRISM version 7.
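The two quantities named above, the Z′ score and the Michaelis-Menten parameters, can be illustrated with a short sketch. The Z′ function uses the standard definition, 1 − 3(σpos + σneg)/|μpos − μneg|. The parameter estimate uses a Hanes-Woolf linearization purely for illustration; it is not the nonlinear regression performed in PRISM for the study. All data values and function names are hypothetical.

```python
from statistics import mean, stdev


def z_prime(positive, negative):
    """Z' = 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    spread = stdev(positive) + stdev(negative)
    window = abs(mean(positive) - mean(negative))
    return 1.0 - 3.0 * spread / window


def michaelis_menten(s, vmax, km):
    """Initial rate at substrate concentration s."""
    return vmax * s / (km + s)


def hanes_woolf_fit(substrate, rates):
    """Estimate (Vmax, Km) from the linear form S/v = S/Vmax + Km/Vmax."""
    xs = list(substrate)
    ys = [s / v for s, v in zip(substrate, rates)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                 # equals 1 / Vmax
    intercept = mean_y - slope * mean_x  # equals Km / Vmax
    vmax = 1.0 / slope
    km = intercept * vmax
    return vmax, km


# Hypothetical control wells (RFU): enzyme present versus no enzyme.
pos_wells = [1000.0, 1020.0, 980.0]
neg_wells = [100.0, 105.0, 95.0]
zp = z_prime(pos_wells, neg_wells)

# Noise-free synthetic rates generated from assumed parameters.
TRUE_VMAX, TRUE_KM = 2.0, 25.0    # arbitrary illustration values
s_um = [5.0, 10.0, 20.0, 40.0, 80.0, 160.0]
v_obs = [michaelis_menten(s, TRUE_VMAX, TRUE_KM) for s in s_um]
vmax_fit, km_fit = hanes_woolf_fit(s_um, v_obs)

print(f"Z' = {zp:.3f}, Vmax = {vmax_fit:.3f}, Km = {km_fit:.3f}")
```

With real, noisy rate data, direct nonlinear fitting of the Michaelis-Menten equation is generally preferred, because linearized forms distort the error structure.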

PLA activity
PLA activity was assessed using a modified version of a previously established Rv3802-PIM hydrolysis experiment (8). Briefly, soy PI (Avanti Polar Lipids Inc.) was resuspended in 50 mM sodium phosphate, pH 7.5, forming a 2 mg/ml suspension of PI. Reactions consisted of 300 µg of PI suspension and 50 µg of the respective enzyme or an equivalent volume of buffer, and were brought to a final volume of 400 µl with 50 mM sodium phosphate buffer, pH 7.5. Reactions were incubated for 1 h at 37°C, with mixing every 10 min, and quenched with 200 µl of 50:50:0.3 CHCl3/CH3OH/HCl. The lower organic phase was extracted, spotted onto a silica TLC plate, and resolved using a mobile phase of 80:20:2 CHCl3/CH3OH/NH4OH. Inositol-containing species were visualized by treating the TLC plate with 5% H2SO4 in methanol and charring.
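The pipetting volumes implied by the reaction composition above can be worked out with a small helper. This is only a sketch: the enzyme stock concentration is not stated in the procedure, so the 1 mg/ml default is an assumption chosen for illustration, and the function name `pla_reaction_volumes` is hypothetical.

```python
def pla_reaction_volumes(pi_ug=300.0, pi_stock_mg_per_ml=2.0,
                         enzyme_ug=50.0, enzyme_stock_mg_per_ml=1.0,
                         final_ul=400.0):
    """Return the volumes (µl) of PI suspension, enzyme, and buffer.

    mg/ml is numerically equal to µg/µl, so mass (µg) divided by
    stock concentration (mg/ml) gives volume in µl.
    The enzyme stock concentration default is an assumed value.
    """
    pi_ul = pi_ug / pi_stock_mg_per_ml
    enzyme_ul = enzyme_ug / enzyme_stock_mg_per_ml
    buffer_ul = final_ul - pi_ul - enzyme_ul
    if buffer_ul < 0:
        raise ValueError("components exceed the final reaction volume")
    return {"pi_ul": pi_ul, "enzyme_ul": enzyme_ul, "buffer_ul": buffer_ul}


volumes = pla_reaction_volumes()
print(volumes)
```

For the stated 2 mg/ml PI suspension, 300 µg of PI corresponds to 150 µl, which leaves 250 µl to be split between enzyme and buffer in a 400-µl reaction.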

Structural and bioinformatic analysis
X-ray crystal structure alignments, measurements, and subsequent analysis were performed using PyMOL (35). Sequence alignments were performed with ClustalW, and figures were generated through the use of T-COFFEE and BOXSHADE (36,37). Jpred4 was used for secondary structure prediction (38). Transmembrane regions were predicted using the TMHMM server (39).