* This work was supported by program and project grants from the Wellcome Trust (Grants 080085 and 083629, respectively). The on-line version of this article (available at http://www.jbc.org) contains supplemental Fig. 1. 1 Both authors contributed equally to this work. 2 Funded by a PhD student grant awarded by the Iran Ministry of Health and Medical Education. Present Address: Dept. of Pharmacology, School of Medicine, New York University Medical Center, New York, NY 10016. 4 Supported by UCB Celltech.
Mycobacterium tuberculosis encodes five type VII secretion systems that are responsible for exporting a number of proteins, including members of the Esx family, which have been linked to tuberculosis pathogenesis and survival within host cells. The gene cluster encoding ESX-3 is regulated by the availability of iron and zinc, and secreted protein products such as the EsxG·EsxH complex have been associated with metal ion acquisition. EsxG and EsxH have previously been shown to form a stable 1:1 heterodimeric complex, and here we report the solution structure of the complex, which features a core four-helix bundle decorated at both ends by long, highly flexible, N- and C-terminal arms that contain a number of highly conserved residues. Despite clear similarities in the overall backbone fold to the EsxA·EsxB complex, the structure reveals some striking differences in surface features, including a potential protein interaction site on the surface of the EsxG·EsxH complex. EsxG·EsxH was also found to contain a specific Zn2+ binding site formed from a cluster of histidine residues on EsxH, which are conserved across obligate mycobacterial pathogens including M. tuberculosis and Mycobacterium leprae. This site may reflect an essential role in zinc ion acquisition or point to Zn2+-dependent regulation of its interaction with functional partner proteins. Overall, the surface features of both the EsxG·EsxH and the EsxA·EsxB complexes suggest functions mediated via interactions with one or more target protein partners.
Mycobacterium tuberculosis is the primary causative agent of human tuberculosis and one of the oldest pathogens known to man, yet tuberculosis remains a major global health problem with an estimated 9.4 million new cases and over 1.3 million tuberculosis-related deaths annually (
) and comparative studies with attenuated M. bovis BCG strains identified a number of secreted proteins, including members of the Esx or CFP-10/ESAT-6 (10-kDa culture filtrate protein/6-kDa early secreted antigenic target) protein family, PE/PPE (proline-glutamic acid/proline-proline-glutamic acid) proteins, and MPT70/MPT83, which play essential, but as yet undefined, roles in mycobacterial pathogenesis.
M. tuberculosis encodes 23 Esx proteins, EsxA–W, which are generally characterized by their small size (∼100 residues), the presence of a central WXG motif, and their organization in pairs within the genome (
). The genes encoding the Esx pairs EsxA/EsxB (ESAT-6/CFP-10, Rv3875/Rv3874) and EsxG/EsxH (Rv0287/Rv0288) have been shown to be coordinately regulated forming small operons, and it is expected that all M. tuberculosis Esx genome pairs will behave similarly (
). Studies have also shown that the protein products of several Esx pairs, including EsxA/EsxB, EsxG/EsxH, EsxR/EsxS (Rv3019c/Rv3020c), and EsxO/EsxP (Rv2346c/Rv2347c) form tight complexes, which are likely to be the functional form of these proteins (
Five of the 11 ESX loci (ESX-1 to ESX-5) within the M. tuberculosis genome appear to encode examples of the recently identified type VII secretion systems (T7SS), which have been shown to export a number of proteins, including Esx protein complexes and PE/PPE proteins. The best characterized of these systems is ESX-1 (rv3868/eccA to rv3883c/mycP), which is known to secrete the EsxA·EsxB protein complex, as well as at least seven other mycobacterial proteins, including EspA (Rv3616c), EspB (Rv3881c), EspC (Rv3615c), EspE (Rv3864), EspF (Rv3865), PE35 (Rv3872), and EspR (Rv3849) (
). Core components of type VII secretion systems include FtsK/SpoIIIE-like ATPases (Rv3870 and Rv3871 in ESX-1), transmembrane proteins (Rv3877 in ESX-1), and a subtilisin-like serine protease (MycP1 in ESX-1) (
). The ESX systems are well conserved in M. tuberculosis and closely related mycobacteria, such as M. bovis, M. leprae, and Mycobacterium marinum, as well as more distantly related organisms such as Streptomyces coelicolor (
It appears that even within M. tuberculosis, the various ESX systems are differentially regulated and are likely to play different roles in infection. For example, ESX-1 and ESX-5 are thought to play a role in mycobacterial virulence and have been linked to granuloma formation, cell-to-cell spread of the mycobacteria, and escape from arrested phagosomes (
), with ESX-1 reported to be under the control of multiple regulators, such as the DNA binding transcription factor EspR (Rv3849), the two-component system regulator PhoP, and the serine protease MycP1 (
Despite growing evidence for diverse functional roles, secondary structure analysis by circular dichroism (CD), sequence comparisons, and helical wheel predictions suggest that M. tuberculosis Esx protein complexes are likely to adopt similar backbone topologies to the previously reported EsxA·EsxB complex, with distinct surface properties and features reflecting different functional roles (
). Here we report the high resolution solution structure of the M. tuberculosis EsxG·EsxH protein complex, which confirms the expected similarity to the core structure of the EsxA·EsxB complex but reveals striking differences in surface features and properties, including the identification of a potential functional site and a specific Zn2+ binding site. In contrast to EsxA·EsxB, we obtained no evidence for a specific interaction between fluorescently labeled EsxG·EsxH complex and the surface of macrophage/monocyte-like cells. The surface features of both complexes point to roles mediated via interactions with target proteins or complexes. However, striking differences clearly suggest different binding partners, reflecting proposed roles for EsxA·EsxB in pathogen-host cell signaling and for EsxG·EsxH in iron and zinc acquisition by infecting mycobacteria.
Protein Expression Vectors
The full-length coding regions for EsxG (Rv0287) and EsxH (Rv0288) were amplified by PCR from pET28a expression vectors containing EsxG and EsxH, respectively (
). EsxG was ligated into the pET23a Escherichia coli expression vector and expressed as a full-length protein without the N-terminal His tag. EsxH was cloned into the pLeic01 E. coli expression vector by ligation-independent cloning using the In-Fusion dry down PCR cloning kit (Clontech). EsxH was expressed as a full-length protein with an N-terminal His tag and a tobacco etch virus cleavage site (ENLYFQSM).
Protein Expression, Refolding, and Purification
Unlabeled and uniformly 15N-, 13C-, and 15N/13C-labeled EsxG and EsxH were expressed individually from pET23a- and pLeic01-based vectors in E. coli BL21(DE3) as described previously (
). The two proteins were obtained as inclusion bodies, which were solubilized in buffer containing guanidine hydrochloride and co-refolded to produce soluble EsxG·EsxH complex, essentially as reported previously (
). The refolded EsxG·EsxH complex was purified by nickel affinity chromatography followed by gel filtration. The N-terminal His tag attached to EsxH, expressed from the pLeic01 vector, was removed by cleavage with tobacco etch virus protease.
NMR spectra were acquired from 0.35-ml samples of 0.7–1.0 mM EsxG·EsxH complex in 25 mM NaH2PO4, 100 mM NaCl, 0.02% (w/v) NaN3, 0.1 mM 4-(2-aminoethyl) benzenesulfonyl fluoride hydrochloride, pH 6.5, containing either 10% D2O, 90% H2O or 100% D2O as appropriate. All NMR data were acquired and processed as described by Ilghari et al. (
15N/1H HSQC spectra of EsxG·EsxH were acquired in the presence and absence of equimolar Zn2+ and Fe3+ to determine whether the complex contained a specific metal ion binding site. In these experiments, equimolar amounts of ZnCl2 or FeCl3 were added to either 100 μM 15N-labeled EsxG·unlabeled EsxH or 100 μM unlabeled EsxG·15N-labeled EsxH in a buffer of 20 mM Bis-Tris, 100 mM NaCl, 0.02% (w/v) NaN3, 0.1 mM 4-(2-aminoethyl) benzenesulfonyl fluoride hydrochloride, pH 6.5. The spectra were acquired at 35 °C on 600-MHz Bruker Avance or DRX systems. Typical acquisition times for the HSQC spectra were 60 ms in F2 (1H) and 30 ms in F1 (15N), with the spectra collected over ∼1.5 h. The spectra were processed using Topspin (Bruker Biospin Ltd.), with linear prediction used to extend the effective acquisition times by up to 2-fold in F1.
The family of converged EsxG·EsxH structures was determined in a two-stage process using the program CYANA (
). Initially, the combined automated NOE assignment and structure determination protocol (CANDID) was used to automatically assign the intra- and intermolecular NOE cross-peaks identified in three-dimensional 15N- and 13C-edited NOESY spectra. Subsequently, several cycles of simulated annealing combined with redundant dihedral angle constraints to increase convergence were used to produce the final converged EsxG·EsxH structures (
), four manually picked three-dimensional NOE peak lists corresponding to all NOEs involving amide protons (1538) and all NOEs between aliphatic protons (2222), and two manually picked two-dimensional NOE peak lists corresponding to all NOEs involving aromatic side chain protons (127). The CANDID stage also included 264 backbone torsion angle constraints (Φ/Ψ) for the EsxG·EsxH complex determined by the protein backbone dihedral angle prediction program TALOS (
). CANDID calculations were carried out using the default parameter settings in CYANA, with chemical shift tolerances set to 0.04 ppm (direct and indirect 1H) and 0.4 ppm (15N and 13C). The final converged EsxG·EsxH structures were produced from 100 random starting coordinates using a standard torsion angle-based simulated annealing protocol combined with five cycles of redundant dihedral angle constraints (
). The calculations were based upon 2323 non-redundant NOE-derived upper distance limits (maximum value 6.0 Å), assigned to unique pairs of protons using CANDID, 264 Φ/Ψ torsion angle constraints derived from TALOS, and 24 hydrogen bond constraints involving slowly exchanging backbone amides in regions of regular helical structure (four per hydrogen bond). Analysis of the family of structures obtained was carried out using the programs CYANA and MOLMOL (
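The chemical shift tolerances quoted above determine which protons can be considered as candidate assignments for a given NOE cross-peak. The following is a minimal sketch of that filtering step only, not of the CANDID algorithm itself, which goes on to resolve the remaining ambiguity. The function name, the data layout, and all shift values are hypothetical and chosen purely for illustration; only the two tolerance values come from the text.

```python
H_TOL = 0.04  # ppm, direct and indirect 1H (value from the text)
X_TOL = 0.4   # ppm, 15N and 13C (value from the text)


def candidate_assignments(peak, protons):
    """List (detected proton, partner proton) pairs compatible with a peak.

    peak    -- (indirect 1H, heavy atom, direct 1H) position in ppm
    protons -- dicts with a name, a 1H shift "h" and the shift of the
               attached heavy atom "heavy" (hypothetical layout)
    """
    h_indirect, heavy, h_direct = peak
    # The directly detected proton must match in both 1H and heavy-atom shift.
    detected = [
        p["name"] for p in protons
        if abs(p["h"] - h_direct) <= H_TOL and abs(p["heavy"] - heavy) <= X_TOL
    ]
    # The NOE partner is constrained only by the indirect 1H shift.
    partners = [p["name"] for p in protons if abs(p["h"] - h_indirect) <= H_TOL]
    return [(d, q) for d in detected for q in partners if d != q]


# Hypothetical assignments, for illustration only (not data from this study).
protons = [
    {"name": "res18.HN", "h": 8.31, "heavy": 119.6},
    {"name": "res19.HN", "h": 8.33, "heavy": 121.2},
    {"name": "res18.HA", "h": 4.52, "heavy": 57.0},
    {"name": "res22.HA", "h": 4.55, "heavy": 58.1},
    {"name": "res25.HB", "h": 1.90, "heavy": 30.0},
]
peak = (4.53, 119.8, 8.32)  # hypothetical 15N-edited NOESY cross-peak

candidates = candidate_assignments(peak, protons)
print(candidates)
```

In this toy case two partner protons fall within the 1H tolerance, so the peak remains ambiguous after shift matching alone, which is the situation an automated assignment protocol has to resolve using the emerging structure.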
Monocyte-like U937 cells (85011440, European Collection of Cell Cultures (ECACC)) were maintained in RPMI 1640 medium (Invitrogen) containing 10% fetal calf serum (FCS) (Invitrogen) and 2 mM glutamine (Invitrogen) at 37 °C and 5% CO2. For fluorescence microscopy experiments, cells were attached to glass coverslips precoated with either 160 μg/ml poly-L-lysine (Sigma) or 5 μg/ml fibronectin (Sigma) by incubation at 37 °C and 5% CO2 for ∼10 min.
Alexa Fluor 546-labeled full-length EsxG·EsxH complex was prepared as described for EsxA·EsxB, and potential binding of the full-length complex to the surface of U937 cells was investigated as reported previously for the EsxA·EsxB complex (
) was used to determine unique assignments for the NOEs identified in two- and three-dimensional NOE-based spectra. Assignments were obtained for 87% of the total NOE peaks identified, producing 2323 non-redundant 1H to 1H upper distance limits. The final family of EsxG·EsxH structures was determined using a total of 2611 NMR-derived structural constraints (an average of 13.5 per residue), which are summarized in Table 1. Following the final round of CYANA calculations, 30 satisfactorily converged structures were obtained from 100 random starting structures. The residual constraint violations and the structural statistics for the family of converged EsxG·EsxH structures are shown in Table 1.
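The constraint totals quoted above can be cross-checked with simple arithmetic. The sketch below sums the three classes of constraint given in the text; the residue count of 193 is an assumption that is not stated in this section and is used here only to illustrate how a per-residue average of this kind is obtained.

```python
# Constraint counts quoted in the text.
noe_upper_limits = 2323     # non-redundant NOE-derived upper distance limits
torsion_constraints = 264   # TALOS-derived phi/psi torsion angle constraints
hbond_constraints = 24      # hydrogen bond constraints

total_constraints = noe_upper_limits + torsion_constraints + hbond_constraints

# Assumption: 193 residues in the EsxG·EsxH complex (not given in the text).
assumed_residues = 193
constraints_per_residue = total_constraints / assumed_residues

print(total_constraints)                  # 2611
print(round(constraints_per_residue, 1))  # 13.5
```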
TABLE 1. NMR constraints and structural statistics for the EsxG·EsxH complex
No. of constraints used in final structural calculation
Sequential (short range) NOEs (i, i + 1)
Medium range NOEs (i, i ≤ 4)
Long range NOEs (i, i ≥ 5)
Backbone torsion angle constraints (TALOS): 264 (132 Φ and 132 Ψ)
Maximum and total constraint violations in 30 converged EsxG·EsxH structures
Upper distance limits (Å): maximum 0.32 ± 0.00; total 6.95 ± 0.05
Lower distance limits (Å): maximum 0.10 ± 0.00; total 0.90 ± 0.04
van der Waals contacts (Å): maximum 0.35 ± 0.00; total 6.04 ± 0.04
Torsion angle ranges (°): maximum 3.09 ± 0.32; total 52.8 ± 0.74
Average CYANA target function (Å2): 0.62 ± 0.15
Structural statistics for the family of converged EsxG·EsxH structures
Residues within regions of the Ramachandran plot
r.m.s.d. for structured region (residues 15–79 of EsxG and residues 17–75 of EsxH)
1.11 ± 0.21 Å
1.48 ± 0.25 Å
r.m.s.d. for α-helical regions (residues 18–42 and 49–76 of EsxG and 19–38 and 51–73 of EsxH)
The overlays of the protein backbones for the 30 converged structures obtained are shown in Fig. 1, and together with relatively low root mean square deviation (r.m.s.d.) values to the mean structure for both the backbone and the heavy atoms (Table 1), indicate that the solution structure of the EsxG·EsxH complex has been determined to fairly high resolution. Within the complex, both EsxG and EsxH adopt helix-turn-helix hairpin structures, which are arranged antiparallel to each other, forming a four-helix bundle. The core of the complex is well defined (residues 15–79 in EsxG and 17–75 in EsxH), although there appears to be significant flexibility in the hairpin loop of EsxG (residues 40–47) as well as the N and C termini of EsxG and EsxH, which form flexible arms at both ends of the complex (Fig. 1). The two long helices in the hairpin structures are formed from residues Phe-18–Phe-42 (α1) and Ala-49–Leu-76 (α2) in EsxG and Ala-19–Ala-38 (α1) and Tyr-51–Ser-73 (α2) in EsxH. The helices in EsxG are completely α-helical, whereas in EsxH, helix α2 terminates with a single turn of 3₁₀ helix (Ser-74–His-76) followed by another single turn of α-helix (Glu-77–Met-81). The exposed C-terminal region of EsxH shows some propensity to adopt a helical conformation (reflected in both patterns of NOEs and backbone chemical shifts reported by Ilghari et al. (
)), and the conservation of aromatic and hydrophobic residues located in the C-terminal regions of both EsxG and EsxH implies some functional significance for these flexible regions. In particular, Tyr-94 and Phe-97 of EsxG are well conserved in EsxG orthologues from other mycobacteria (Fig. 2C) as well as in the closely related M. tuberculosis EsxS protein; however, these residues are not conserved throughout the M. tuberculosis Esx protein family, implying a functional role specific to the EsxG·EsxH complex.
), the contact surface between EsxG and EsxH is essentially hydrophobic in nature and accounts for ∼15% (∼1340 Å2) of the total surface area of both proteins. The residues found at the intermolecular interface include 19 residues from EsxG (Phe-18, Lys-21, Met-25, Thr-28, Ala-32, Ala-35, Ala-50, Phe-51, Ala-54, Arg-57, Phe-58, Ala-61, Lys-64, Val-65, Leu-68, Val-71, Ala-72, Asn-75, and Leu-76) and 21 residues from EsxH (Met-18, Tyr-21, Leu-25, Leu-28, Glu-31, Ile-32, Glu-35, Leu-39, Ala-42, Trp-43, Thr-47, Ile-49, Trp-54, Gln-57, Trp-58, Ala-61, Leu-65, Ala-68, Tyr-69, Ala-71, and Met-72). The stabilizing interactions between EsxG and EsxH appear to rely almost entirely on favorable van der Waals contacts; however, an intermolecular salt bridge (Lys-21–Glu-31) appears to stabilize the interaction between the N-terminal region of helix α1 in EsxG and the C-terminal region of the equivalent helix in EsxH. The interactions within the helical hairpins are also primarily based on van der Waals interactions; however, close inspection of the structure reveals the potential for the formation of two salt bridges within the EsxG hairpin (Arg-26–Asp-70 and Glu-33–His-55).
Analysis of the electrostatic surface of the complex reveals a fairly even distribution of positive and negative charge (Fig. 2A) with no significant hydrophobic patches on the surface of the complex, which together with solubility to over 1 mM in aqueous solution argues against a membrane-spanning role. In addition to the long flexible N- and C-terminal arms of the proteins, another notable feature of the EsxG·EsxH complex is the presence of a cleft on the surface of the structure, which could indicate a potential binding site for an interaction partner (Fig. 2, A and B). The cleft is formed by elements corresponding to the flexible N-terminal region of EsxH (specifically residues Met-1, Ile-4, Met-5, and Met-18), the hairpin turn region of EsxG (residues Phe-42, Ser-48, Ala-50, and Phe-51), and the C-terminal region of the α2 helix in EsxH (residues Met-72, His-76, and Ala-78) (Fig. 2, B–D). Met-18 and Met-72 from EsxH form the base of the cleft, with the other residues remaining partially exposed to the solvent and therefore also accessible to any potential binding partner. The cleft is predominantly hydrophobic in nature, and although relatively narrow, the precise shape is variable across the family of EsxG·EsxH structures, reflecting the flexibility in the N-terminal region of EsxH. The hydrophobic and aromatic residues that form the cleft are conserved in EsxG·EsxH orthologues from M. bovis, M. marinum, Mycobacterium ulcerans, and M. leprae (Fig. 2, C and D), as well as the closely related M. tuberculosis EsxR and EsxS molecules (Fig. 3). However, they are not conserved throughout other members of the M. tuberculosis ESX protein family (
), which suggests a functional site specific to EsxG·EsxH and EsxR·EsxS complexes. Overall, the surface features of the EsxG·EsxH complex suggest a function most probably mediated via interactions with one or more target proteins, involving either the cleft and/or the flexible N- or C-terminal arms at either end of the complex.
Previous studies have shown the importance of the flexible C-terminal region of EsxB in binding the EsxA·EsxB complex to the surface of host monocyte/macrophage cells (
) were carried out using Alexa Fluor 546-labeled EsxG·EsxH. Interestingly, the experiments provided no convincing evidence of a specific interaction of labeled EsxG·EsxH with the surface of host cells as the low level of cell-associated fluorescence seen (500-ms exposure times as compared with 100–200 ms for EsxA·EsxB) showed no significant reduction in the presence of a 15-fold molar excess of non-labeled EsxG·EsxH. These results, combined with the reported up-regulation of all ESX-3 genes under conditions of low iron/zinc (
), suggest that potential host cell binding partners for the EsxG·EsxH complex are more likely to be found within the host cell rather than on the cell surface.
Specific Zn2+ Binding by the EsxG·EsxH Complex
The role of ESX-3 and its secreted soluble factors, such as EsxG·EsxH, in iron and zinc acquisition is not yet fully understood, although ESX-3 has been linked to iron acquisition via the mycobactin pathway (
). It is possible that EsxG·EsxH could play a role in iron acquisition; however, it is unlikely that the complex interacts with the M. tuberculosis siderophore mycobactin. The chemical structure of mycobactin T from M. tuberculosis (
; PDB 1XZO), suggests that the width and depth of the cleft on the surface of EsxG·EsxH would not be large enough to accommodate mycobactin.
To assess potential Zn2+ or Fe3+ binding by EsxG·EsxH, samples of the complex in which either EsxG or EsxH was uniformly 15N-labeled were prepared containing equimolar amounts of the individual metal ions. The addition of Zn2+ resulted in dramatic changes in the 15N/1H HSQC spectra of both proteins in the complex, with substantial shifts and/or line broadening observed for a significant number of backbone amide signals (Fig. 3, A and B), which clearly indicates relatively tight binding of Zn2+. In marked contrast, the addition of Fe3+ had no effect on the 15N/1H HSQC of either protein in the complex. The location of the specific Zn2+ binding site on EsxG·EsxH was mapped by minimal shift analysis of the changes seen in the backbone amide signals (
), which is summarized for both proteins in the histograms shown in Fig. 3, C and D. In the case of EsxG, this reveals that the majority of the substantially perturbed signals arise from residues located toward the closed end of the hairpin structure, in particular, from residues near the C terminus of helix 1 and in the flexible loop between the two helices. In contrast, signals from residues positioned close to the open ends of the EsxH hairpin are most affected by Zn2+ binding, which is consistent with the antiparallel arrangement of the two hairpin structures within the EsxG·EsxH complex and localizes the effects of Zn2+ coordination to one end of the complex, as illustrated in Fig. 3E.
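The minimal shift mapping described above can be illustrated with a short sketch. For each assigned backbone amide peak in the free spectrum, the combined 1H/15N distance to every peak in the Zn2+-bound spectrum is calculated and the smallest value is taken, which gives a lower bound on the true perturbation without requiring assignment of the bound spectrum. The 15N scaling factor of 1/6, the function names, and the peak positions are assumptions made for illustration; the text does not specify the weighting used, and the numbers are not measured data.

```python
import math

N_SCALE = 1 / 6  # assumed 15N scaling factor; not specified in the text


def combined_shift(free, bound, n_scale=N_SCALE):
    """Combined 1H/15N distance (ppm) between two (1H, 15N) peak positions."""
    d_h = free[0] - bound[0]
    d_n = (free[1] - bound[1]) * n_scale
    return math.hypot(d_h, d_n)


def minimal_shifts(free_peaks, bound_peaks, n_scale=N_SCALE):
    """Map each assigned free-state residue to its smallest possible shift."""
    return {
        residue: min(combined_shift(pos, b, n_scale) for b in bound_peaks)
        for residue, pos in free_peaks.items()
    }


# Hypothetical peak positions (1H ppm, 15N ppm), for illustration only.
free_peaks = {"res_A": (8.20, 120.0), "res_B": (7.90, 115.0)}
bound_peaks = [(8.21, 120.1), (8.40, 118.0)]

shifts = minimal_shifts(free_peaks, bound_peaks)
for residue, value in shifts.items():
    print(residue, round(value, 3))
```

Here the hypothetical res_A has a bound-state peak almost on top of its free-state position and so scores as essentially unperturbed, whereas res_B has no nearby peak and scores as substantially shifted. Plotting such values against residue number gives a histogram of the kind used to localize a binding site.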
Examination of the surface regions of the EsxG·EsxH complex perturbed by Zn2+ binding revealed a cluster of three histidine residues from EsxH (His-14, His-70, and His-76) with side chains able to accommodate tetrahedral coordination of a single zinc ion. This region of EsxH also contains an appropriately positioned glutamic acid residue (Glu-77), which is likely to provide the fourth Zn2+-coordinating group (Fig. 3F). The identification of this cluster of residues as the probable zinc ion binding site was further supported by analysis of the family of EsxG·EsxH structures using the program FEATURE (
), which recognized this region as a very likely Zn2+ coordination site. The three histidine residues involved in Zn2+ binding are conserved across mycobacterial species with an obligate pathogenic lifestyle, such as M. tuberculosis and M. leprae, but His-70 is replaced by glutamine or arginine in environmental mycobacteria and opportunistic pathogens (Fig. 2D). Similarly, His-70 of EsxH is substituted by a glutamine residue in the very closely related EsxR (Fig. 4E). This suggests that the functional importance of zinc ion binding by the EsxG·EsxH complex may be restricted to survival or growth within infected host cells and implies that the very closely related EsxR·EsxS complex may not be fully functionally equivalent.
All the residues of EsxH with backbone amide resonances substantially perturbed by zinc ion binding lie either within or adjacent to the identified coordination site, and the spectral changes seen almost certainly reflect direct involvement of this region in Zn2+ chelation. In contrast, the affected but fairly distant residues within the hairpin loop of EsxG appear to have no direct role in zinc ion binding (Fig. 3, E and F), and the significant line broadening observed for backbone amide signals here suggests that Zn2+ coordination leads to a significant change in the mobility of this region, which may affect the interaction of potential functional partners with the cleft discussed previously. The specific Zn2+ binding site present on EsxG·EsxH may reflect a direct role in zinc ion acquisition but could also point to Zn2+-dependent regulation of its interaction with one or more functional partner proteins.
Comparative Analysis between EsxG·EsxH and Other M. tuberculosis Esx Complexes
Close analysis of the structure of the EsxA·EsxB complex, combined with optimized sequence alignments for members of the M. tuberculosis Esx protein family and helical wheel predictions, indicates that all M. tuberculosis Esx complexes are likely to form 1:1 heterodimers with core structures similar to the four-helix bundle of EsxA·EsxB (
). Key hydrophobic residues are well conserved throughout the M. tuberculosis Esx family, allowing the complexes to adopt similar backbone structures. Residues located on the external surfaces and at the N and C termini of the proteins are somewhat less well conserved, which suggests significant differences in surface features and distinct functional sites, potentially reflecting diverse roles for Esx family complexes (
Comparative analysis of the solution structure of EsxG·EsxH with EsxA·EsxB reveals that the overall backbone folds of the two complexes are highly similar. In both cases, the individual proteins adopt helix-turn-helix structures, which are arranged antiparallel to each other, resulting in the formation of stable four-helix bundles (Fig. 4). The similarity is reflected in comparisons of the backbone atom coordinates, which yield an r.m.s.d. of 1.8 Å for the superposition of residues Phe-18–Gln-40 and Ala-50–Leu-76 from EsxG and Ala-19–Ala-38 and Tyr-51–Ser-73 from EsxH with residues Phe-18–Gln-40 and Ala-50–Ile-76 of EsxB and Gln-19–Lys-38 and Tyr-51–Ala-73 of EsxA. However, despite obvious similarities, such as the overall folds of the protein complexes, as well as disordered N and C termini, there are some striking differences between these two related complexes. Firstly, helices of the proteins forming the EsxG·EsxH complex, in particular the N-terminal helices of EsxG and EsxH, are significantly shorter (10–11 residues) than those in the EsxA·EsxB complex (Fig. 4). Secondly, the C-terminal region of EsxB in the EsxA·EsxB complex shows a distinct tendency to adopt a helical conformation; however, in the EsxG·EsxH complex, it is the C-terminal region of EsxH (EsxA-related) that has a propensity to adopt a helical conformation. Finally, other than the flexible arms of EsxA and EsxB, no obvious functional sites were apparent on the surface of the EsxA·EsxB complex (
), whereas the EsxG·EsxH structure reveals a noticeable cleft (Fig. 2, A and B), suggesting the presence of a functional binding site, and contains a specific Zn2+ binding site not present on EsxA·EsxB.
The EsxG protein contains a modified WXG motif, where the tryptophan residue has been substituted with a histidine residue. Comparison of the WXG loops from EsxH, EsxA, and EsxB reveals significant variability in conformation, which seems to be influenced by long range contacts with their partner protein, as illustrated by EsxH (Figs. 1–4). The most notable difference with the modified HXG loop of EsxG is an apparent increase in flexibility as compared with the WXG loops of EsxH (Fig. 1A), EsxA, and EsxB (
). The HXG motif is conserved in EsxG orthologues from other mycobacterial species (Fig. 2C), as well as the closely related M. tuberculosis EsxS protein (Fig. 3D), suggesting that the additional flexibility of the WXG loop may be important to the function of the EsxG·EsxH complex.
Interestingly, recently reported studies of the M. tuberculosis EsxR and EsxS proteins, which share 85 and 95% sequence identity with EsxH and EsxG, respectively, have shown the formation of a major heterodimeric complex and a minor heterotetrameric complex (15:1 ratio) (
). The structure of the predominant EsxR·EsxS heterodimer has not been determined. However, due to the high level of amino acid conservation with EsxG and EsxH, it is very likely that this complex will form a structure that is essentially identical to the one reported here for the EsxG·EsxH complex. Arbing et al. (
) solved the crystal structure of the minor heterotetrameric EsxR·EsxS complex, which revealed that the two EsxS molecules each formed single, long α-helices arranged antiparallel to each other, whereas the two copies of EsxR both formed helix-turn-helix hairpin structures, which were closely associated with the ends of the EsxS pair (Fig. 4C). Overall, the positions of helical regions, as well as intra- and intermolecular salt bridges reported for the heterotetrameric complex, fit well with the helical regions and salt bridges observed for the EsxG·EsxH heterodimeric complex (Fig. 4). Comparison of the backbone coordinates shows that the EsxH and EsxR molecules are highly similar, as the superposition of residues 20–74 from both molecules yields an r.m.s.d. of 1.06 Å. This superposition also yields backbone r.m.s.d. values of 2.60 and 1.71 Å for residues 18–41 (α1) and residues 49–76 (α2), respectively, of EsxG as compared with EsxS. These r.m.s.d. values indicate significant similarities between the structures of the EsxG·EsxH heterodimer and the EsxR·EsxS heterotetramer, as expected for a higher order complex produced by domain swapping.
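Backbone r.m.s.d. values of the kind quoted above are obtained by optimally superimposing equivalent atoms from two structures and measuring the residual deviation. The text does not state which program or algorithm was used for these comparisons; the sketch below uses the Kabsch algorithm with NumPy as one common way of doing this. The function name and the coordinates are hypothetical and serve only to show that a rotated and translated copy of a structure gives an r.m.s.d. of essentially zero.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """r.m.s.d. between two (N, 3) coordinate sets after optimal superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Remove translation by centring both sets on their centroids.
    P0 = P - P.mean(axis=0)
    Q0 = Q - Q.mean(axis=0)
    # The SVD of the covariance matrix gives the optimal rotation of P onto Q.
    U, _, Vt = np.linalg.svd(P0.T @ Q0)
    # Flip one axis if needed so the result is a proper rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    diff = (R @ P0.T).T - Q0
    return float(np.sqrt((diff ** 2).sum() / len(P)))


# Hypothetical coordinates (Å), for illustration only.
coords = np.array([
    [0.0, 0.0, 0.0],
    [1.5, 0.0, 0.0],
    [1.5, 1.5, 0.0],
    [0.0, 1.5, 1.0],
    [2.0, 0.5, 3.0],
])
rot_z_90 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
moved = coords @ rot_z_90.T + np.array([5.0, -2.0, 1.0])

rmsd_identical = kabsch_rmsd(coords, moved)
print(round(rmsd_identical, 6))
```

In practice the two input arrays would hold the backbone atom coordinates of the residue ranges being compared, listed in the same order in both structures.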
Despite clear evidence from reported gel filtration experiments for the formation of EsxR·EsxS heterotetrameric complexes (
), our studies with M. tuberculosis EsxA·EsxB and EsxG·EsxH complexes have so far found no indication of higher order complex formation. For example, during gel filtration purification (pH 6.5), we see only a single EsxA·EsxB or EsxG·EsxH peak (supplemental Fig. 1), which corresponds to the expected size of the heterodimeric complex. However, under different solution conditions, the proteins may have the potential to form domain-swapped heterotetramers like EsxR·EsxS.
The findings reported here, together with previous work (
), clearly show that it is possible to predict with confidence a core structure for the Esx family proteins. However, significant differences seen in surface features mean that it will be necessary to solve the high resolution structures of individual complexes to identify the functionally important, complex-specific surface features. The surface features and properties of both the EsxA·EsxB and the EsxG·EsxH complexes suggest roles mediated via binding to one or more functional partner proteins, which remain to be identified and could be of either host cell or mycobacterial origin. The distinct surface features and expression profiles for the two Esx family complexes clearly point to distinct functional roles, with EsxA·EsxB strongly implicated in pathogen-host cell signaling (
). Clearly, multiple gene duplication events in M. tuberculosis have allowed the evolution of diverse functions for Esx family complexes, which presumably exploit the flexibility offered by functional complexes.
The atomic coordinates and structure factors (code 2KG7) have been deposited in the Protein Data Bank, Research Collaboratory for Structural Bioinformatics, Rutgers University, New Brunswick, NJ (http://www.rcsb.org/).