The ghrelin O-acyltransferase structure reveals a catalytic channel for transmembrane hormone acylation

Integral membrane proteins represent a large and diverse portion of the proteome and are often recalcitrant to purification, impeding studies essential for understanding protein structure and function. By combining co-evolutionary constraints and computational modeling with biochemical validation through site-directed mutagenesis and enzyme activity assays, we demonstrate here a synergistic approach to structurally model purification-resistant topologically complex integral membrane proteins. We report the first structural model of a eukaryotic membrane-bound O-acyltransferase (MBOAT), ghrelin O-acyltransferase (GOAT), which modifies the metabolism-regulating hormone ghrelin. Our structure, generated in the absence of any experimental structural data, revealed an unanticipated strategy for transmembrane protein acylation with catalysis occurring in an internal channel connecting the endoplasmic reticulum lumen and cytoplasm. This finding validated the power of our approach to generate predictive structural models for other experimentally challenging integral membrane proteins. Our results illuminate novel aspects of membrane protein function and represent key steps for advancing structure-guided inhibitor design to target therapeutically important but experimentally intractable membrane proteins.

Integral membrane proteins represent a large and essential portion of the proteome, including a growing number of enzymes, receptors, and transporters that serve as desirable drug targets (1, 2). However, these proteins often prove recalcitrant to purification and structural analysis due to their hydrophobic nature and reliance on interactions with lipid bilayers for both stability and activity (3,4). We report here a synergistic approach to develop a structural model of a topologically complex integral membrane protein by combining coevolutionary contact constraints and computational modeling with biochemical validation. Building solely upon the protein's primary sequence and a biochemical assay for its function, the approach provides an accessible and efficient route to build structural models of intractable membrane protein targets.
We demonstrate this approach by developing a structural model for ghrelin O-acyltransferase (GOAT), a member of the membrane-bound O-acyltransferase (MBOAT) enzyme family responsible for octanoylation of the peptide hormone ghrelin (Fig. 1) (5,6). One of three protein-modifying MBOAT family members alongside Hedgehog acyltransferase (Hhat) and Porcupine (Porcn) (7-9), GOAT plays a central role in regulating energy homeostasis and metabolism through octanoylated ghrelin-dependent signaling pathways (10). Whereas the unique chemistry and biology of ghrelin and GOAT have inspired continued efforts to target this system for therapeutic benefit, the inability to purify active GOAT and determine its structure has hampered progress toward this goal (11-13). In this work, we report the first structural model for a eukaryotic MBOAT family member. Our human GOAT (hGOAT) structure is highly consistent with a recently reported crystal structure for the bacterial MBOAT homolog D-alanyl transferase DltB (14). Our structure suggests a novel strategy for solving the topological challenge presented by transmembrane protein acylation, where protein targets and co-substrates are separated by a cellular membrane. In an unanticipated mechanism, ghrelin octanoylation occurs in an internal channel within hGOAT without the octanoyl-CoA donor being transported into the endoplasmic reticulum (ER) lumen. The availability of this therapeutically interesting enzyme's structure opens the door to the structure-guided design of inhibitors targeting GOAT and other MBOAT family members. Looking beyond the MBOATs, our success in modeling GOAT and predicting specific protein-ligand interactions validates the power of our approach for creating molecular models for other experimentally challenging integral membrane proteins.

Computational model for human GOAT structure
In generating our hGOAT structural model, we utilized state-of-the-art co-evolutionary contact predictions with computational protein folding and structure optimization methods (Fig. S1) (15,16). Coevolutionary contact analysis exploits the tendency of residues interacting with each other within folded proteins to co-evolve to maintain energetically beneficial interactions (17-19). Analysis of many protein sequences employing a multiple-sequence alignment identifies pairs of co-evolving residues, from which it is inferred that these residues lie in proximity to each other. Assigning pairs of residues as co-evolving supports assignment of spatial interactions between them, providing constraints that can define major features of protein structure. Using metagenomics protein databases, we generated a multiple-sequence alignment to predict residues that are potentially in contact (defined as Cβ-Cβ < 8 Å) with each other in the folded structure of hGOAT (File S1) (15,20). This set of contacts (File S2), represented by the contact map (Fig. 1C and Fig. S2), guided our hGOAT structural modeling (17,20,21). Experimental information on the membrane topology of mouse GOAT and co-evolutionary contact constraints were iteratively combined in protein-folding simulations to generate ~30,000 potential hGOAT structures (11,17,22). The generated structures were clustered, and the lowest-energy structures that satisfied the contact map were isolated (Fig. S3) (22). Representative structures from the top five clusters were then subjected to further structural refinement to yield the optimal hGOAT model (23). The optimal model was embedded in a lipid membrane and subjected to structural relaxation in explicit solvent using all-atom molecular dynamics simulations (24-26). This simulation used an ER-mimetic lipid bilayer to ensure optimization of hydrophobic protein-lipid interactions (27).
Figure 1. … (43). B, octanoylation of a ghrelin-mimetic fluorescent peptide by recombinant hGOAT. C, contact maps for hGOAT showing the probability for a co-evolutionary contact from RaptorX analysis (i) and amino acid contacts in the final optimized hGOAT structure (ii). D, structure of hGOAT in an ER-mimetic lipid membrane, correlated to color-coded membrane topology in A. E, illustration of the internal channel within hGOAT (green) transiting from the ER lumen to the cytoplasm, with the channel determined by the CAVER 3.0 plugin in PyMOL (33). F, structural overlay of hGOAT and DltB showing the absolutely conserved histidine residues (hGOAT His-338 (teal) and DltB His-336 (purple); Protein Data Bank code 6BUG, chain C) within these acyltransferases.
ACCELERATED COMMUNICATION: Molecular structure of GOAT
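The Cβ-Cβ < 8 Å contact definition and the idea of scoring a candidate structure by how many predicted contacts it satisfies can be illustrated with a minimal sketch. This is not the pipeline used in this work: the function names, the toy coordinates, and the simple "fraction satisfied" score are all hypothetical and chosen only for illustration.

```python
import math

def contact_map(cb_coords, cutoff=8.0, min_separation=1):
    """Return residue index pairs (i, j), i < j, whose C-beta atoms are
    closer than `cutoff` angstroms. Indices are 0-based list positions."""
    contacts = set()
    n = len(cb_coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(cb_coords[i], cb_coords[j]) < cutoff:
                contacts.add((i, j))
    return contacts

def satisfied_fraction(predicted, model_contacts):
    """Fraction of predicted contacts that are present in a model."""
    if not predicted:
        return 0.0
    return len(predicted & model_contacts) / len(predicted)

# Toy C-beta coordinates (hypothetical, not taken from the hGOAT model).
toy_cb = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
model = contact_map(toy_cb)

# Hypothetical predicted contacts, e.g. pairs above a probability threshold.
predicted = {(0, 2), (1, 3)}
score = satisfied_fraction(predicted, model)
```

In practice, contact predictions are usually restricted to residue pairs separated by several positions in sequence; the `min_separation` parameter is included here for that purpose, with a default that is only an assumption.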

Features of the human GOAT structure
Our computationally derived structure for hGOAT is consistent with the previously reported topological model of the mouse GOAT ortholog containing a total of 11 transmembrane helices with slightly altered helix boundaries (Fig. 1A) (11), indicating that the two sets of constraints from our coevolutionary contact analysis and previous topological studies support a common hGOAT structural model. To determine how strongly our hGOAT structure depends on the experimental topological constraints from mouse GOAT (11), we excluded these constraints and repeated our analysis, which generated an identical hGOAT membrane topology. This indicates that co-evolutionary contact constraints alone are sufficient to predict the membrane topology of hGOAT, suggesting that this approach to topology modeling of integral membrane proteins can complement established algorithms for predicting membrane protein topology.
Ramachandran analysis indicates that 92.4% (400 of 433) of hGOAT residues lie in favored (98%) regions, 98.2% (425 of 433) lie in allowed (>99.8%) regions, and 1.9% (8 of 433) are outliers (Fig. S4) (28). The enzyme forms an ellipsoidal cone composed of transmembrane helices, with the narrow end facing the ER lumen (Fig. 1D). The exposed ends of five transmembrane helices (TM1, TM4, TM5, TM7, and TM11) converge to form a pore through which the interior of hGOAT is connected to the ER lumen. At the cytoplasmic membrane interface, the predicted cytoplasmic loops fold up to form a core region bounded by the lipid-contacting perimeter helices. As a result, there is minimal cytoplasmic exposure of hGOAT residues beyond the plane of the membrane.
The hGOAT structure contains a contiguous internal channel through the enzyme core that transits from the ER lumen space to the cytoplasm (Fig. 1E). The channel is bent within hGOAT, with the restriction formed by the C-terminal end of helix TM8 and the N-terminal end of TM9. This positions an absolutely conserved histidine residue (His-338) in direct contact with the internal channel (7), consistent with proposals for this histidine to serve as a general base for catalyzing ghrelin acylation. Following completion of our hGOAT structure and during subsequent biochemical validation experiments (described below), the release of a crystal structure for bacterial MBOAT alanyl transferase DltB provided an independent basis for comparison and validation of our hGOAT structure (14). The His-338 residue in hGOAT closely matches the location of the analogous histidine residue (His-336) in the DltB structure (Fig. 1F) (5,14). Further comparison of the hGOAT model and the DltB structure reveals remarkable similarities in overall topology and structure, with a TM-score of 0.6 and root mean square deviation of 2.23 Å for ~100 aligned conserved residues between the structural models for these distantly related MBOAT family members (12.3% sequence identity, 26.8% sequence similarity, E-value 2.7 × 10⁻⁸ and bit score 48.7; Figs. S5 and S6 and Table S1) (29). However, the low overall homology between DltB and hGOAT leads to very poor structure prediction for nonhomologous sequence positions, as would be expected for this type of comparison. The demonstrated ability of our hGOAT modeling based on coevolutionary contact restraints to arrive at the same protein fold as DltB, in the absence of any experimental structural information, underscores the power of this approach to accurately predict protein structures.
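For readers unfamiliar with the two similarity measures quoted above, the following sketch shows how a TM-score and an RMSD are computed from per-residue distances between aligned residue pairs after superposition. It uses the standard TM-score length normalization d0 = 1.24(L - 15)^(1/3) - 1.8. It is a simplified illustration, not the tool used for the reported comparison: the superposition and alignment search that a full TM-score calculation performs is assumed to have been done already, and the clamp for short chains is a simplifying assumption.

```python
import math

def tm_d0(l_target):
    """Length-dependent distance scale used by the TM-score (angstroms)."""
    if l_target <= 21:
        # Assumption: clamp for short chains, where the formula is ill-behaved.
        return 0.5
    return 1.24 * (l_target - 15) ** (1.0 / 3.0) - 1.8

def tm_score(distances, l_target):
    """TM-score for one fixed superposition.

    distances: distance in angstroms for each aligned residue pair.
    l_target:  length of the reference protein used for normalization.
    """
    d0 = tm_d0(l_target)
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / l_target

def rmsd(distances):
    """Root mean square deviation over the aligned residue pairs."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))
```

Because the TM-score sums over aligned pairs but divides by the full reference length, unaligned residues lower the score, whereas the RMSD reflects only the aligned subset. This is why the two numbers are reported together.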

Mutagenesis analysis of hGOAT structural model
To validate our computational hGOAT structural model biochemically, we mutated ~10% of the residues within hGOAT to alanine and determined the impact of these mutations on hGOAT octanoylation activity in a peptide-based acylation assay (Fig. 2 and Fig. S7) (30,31). These 42 alanine mutations were spread across a range of amino acids and degrees of conservation, with the majority of sites chosen conserved at >75% among GOAT orthologs (File S3). In narrowing the pool of mutations to ~40 positions, residues with surface-exposed side chains were deemphasized compared with residues predicted to lie within the enzyme interior. Approximately half of the mutation sites were selected based on the residue's side chain contacting the internal void, as we propose this channel will likely contain the substrate-binding sites and catalytic residues within hGOAT.
In this pool of alanine mutants, we observed a range of activities from near/above WT ghrelin octanoylation activity to complete loss of detectable activity (Table S2 and Fig. S8). When mapped onto the hGOAT structural model, mutations leading to a marked decline (>3-fold; purple) or loss of enzyme activity (red) appear clustered within the core of hGOAT (Fig. 2). For quantitative analysis of the impact of these mutations, we determined whether alanine mutagenesis of residues contacting the internal void is more likely to yield reduced enzyme activity compared with non-void-contacting mutations. Within the pool of mutations, the void-contacting alanine mutations were significantly more likely to result in loss of enzyme activity (p < 0.03; Fig. 2D). This mutation activity mapping defines a functionally essential core within hGOAT and expands the number of residues within hGOAT known to be required for enzyme activity (5,6,11).

The octanoyl-CoA-binding site within hGOAT
We expect the octanoyl-CoA acyl donor to enter the hGOAT active site through interaction with the cytoplasmic face of the enzyme, based on the availability of acyl-CoAs within the cell. When docked into our hGOAT model, octanoyl-CoA binds to hGOAT through interactions of both its CoA and octanoyl chain regions with residues in TM6, the TM7-TM8 connecting loop, TM8, and TM9 (Fig. 3). In the docked complex, the CoA portion forms both polar and nonpolar interactions with multiple hGOAT residues while remaining exposed to the cytoplasm (Fig. 3, A and B). The phosphoadenosine group binds into a discrete pocket while the phosphopantetheine chain is in contact with multiple polar amino acid side chains (Fig. 3, C-E). Among these CoA-contacting amino acids, all alanine mutations examined except one lead to a loss of hGOAT activity.
In the docked hGOAT:octanoyl-CoA structure, the acyl chain of octanoyl-CoA makes a sharp turn and penetrates upward into the interior of hGOAT following a channel that terminates at Trp-351 (Fig. 3, C-E). Given the unique preference of hGOAT for an octanoyl acyl donor (6, 13, 32), we examined alanine mutagenesis of predicted contacts within this acyl-binding pocket to determine the impact of those mutations on hGOAT acyl donor selectivity. As alanine mutagenesis would provide additional space within the acyl-binding site, we determined the ability of hGOAT alanine variants to accept 12-carbon (lauryl-CoA) and 14-carbon (myristoyl-CoA) acyl donors in place of octanoyl-CoA. The WT enzyme and the majority of hGOAT alanine variants exhibited the expected preference for an eight-carbon acyl donor, but alanine mutagenesis of Trp-351 and Phe-331 resulted in loss of appreciable reactivity with octanoyl-CoA and engendered new activity with the longer acyl donors (Fig. 3F and Fig. S9). The F331A variant gained activity with the C12 donor, whereas W351A hGOAT could acylate a ghrelin-derived peptide with both C12 and C14 acyl chains. This altered selectivity supports the modeled positions of Trp-351 and Phe-331 as forming the end of the acyl-binding pocket. The altered preference for longer acyl donors by the F331A and W351A variants was also observed in a direct competition assay where hGOAT variants were provided acyl donors ranging from six to 12 carbons (Fig. 3G). This altered selectivity was not observed for any other alanine variants with detectable ghrelin acylation activity (Fig. S10). Acyl donor reengineering upon targeted alanine mutagenesis localizes Phe-331 and Trp-351 to the distal end of the acyl donor-binding site within hGOAT and provides further biochemical validation of our hGOAT structural model.
Although structural studies play a central role in developing our understanding of protein function, the limited availability of integral membrane proteins within structural databases creates a particularly acute challenge for structurally modeling these proteins (see http://blanco.biomol.uci.edu/mpstruc/). In this work, we demonstrate the development and validation of a structural model for an integral membrane protein that leverages bioinformatics constraints from coevolutionary contact analysis and model evaluation by biochemical analysis while circumventing the requirement of protein purification.
Our model provides indispensable and novel insights into several long-standing questions regarding the mechanism for MBOAT-catalyzed transmembrane protein acylation. The topological separation of two essential conserved residues, His-338 and Asn-307, is explained by these two residues playing roles in distinct aspects of GOAT activity. The location of His-338 within the central channel of GOAT, identical to the position observed for the analogous histidine in DltB (14), is consistent with this residue acting as a general base to activate the ghrelin serine hydroxyl side chain for octanoyl transfer. In contrast, our hGOAT:octanoyl-CoA model implicates Asn-307 in the binding site for the acyl donor. Based on our models of hGOAT and the hGOAT:octanoyl-CoA complex, we propose that hGOAT catalyzes transmembrane acylation of ghrelin by binding both substrates within the hGOAT internal channel and "handing off" the octanoyl group from CoA to ghrelin within this channel (Fig. 4). Whereas many aspects of this proposed pathway, such as the ghrelin-binding site and location of catalytic residues, remain to be functionally validated by ongoing studies, the established ability of our hGOAT model to efficiently guide biochemical studies demonstrates a novel approach to advance investigations of similar membrane proteins that are intractable to current structural approaches.
Figure 3. … D and E, interactions between the octanoyl-CoA acyl donor and hGOAT residues; hGOAT residues shown in purple reduce acylation activity under standard reaction conditions when mutated to alanine, and residues shown in red abolish acylation activity upon alanine mutation. F, acylation activity of WT hGOAT and selected hGOAT alanine variants using octanoyl-, lauryl-, or myristoyl-CoA as the sole acyl donor. Activities are normalized to the most reactive hGOAT variant with each acyl donor; individual data points indicate independent trials, and the dotted line indicates the average of three independent trials. G, acyl donor competition demonstrates altered selectivity to a longer acyl donor for F331A and W351A hGOAT variants, consistent with the predicted interaction of these amino acid side chains with the distal end of the octanoyl acyl chain.

Co-evolutionary contact analysis of hGOAT
A multiple-sequence alignment (MSA) was performed with the hGOAT sequence against the UNIREF90 database utilizing the JackHMMER tool (34). The MSA parameters were set to eight iterative searches (n = 8) with an e-value threshold of 1 × 10⁻⁴⁰. The resulting alignment was filtered to exclude highly similar sequences using the HHfilter tool with 90% identity and 75% sequence coverage cut-offs. This MSA was used as the input for the hmmbuild tool to construct a hidden Markov model (hmm) curated specifically for the MSA (35), which would represent the consensus sequence of hGOAT and its closest homologs. This hmm was then utilized to search against a master database that included uniref100 and a metagenome database (metaclust_2018_01) using the hmmsearch tool with a bit score cut-off of 27 (15,21). The resulting MSA was filtered again using the HHfilter tool with 90% identity and 75% sequence coverage against hGOAT. Furthermore, sequences with unidentified amino acids (X; removed to accommodate RaptorX) and sequence positions with >50% gaps were also filtered from the MSA using trimAL (36). The resulting MSA for hGOAT was primarily used to perform co-evolutionary contact analysis using the RaptorX server and GREMLIN (15,17,20,37). The resulting contact maps are provided as Fig. 1C (RaptorX) and Fig. S2 (GREMLIN). The resulting MSA had an Meff-0.8/√N of 551.7, which is greater than the recommended value of 64 for reliable model prediction using co-evolutionary contacts (15,17). These contacts were used to guide the hGOAT folding. The MSA analysis and curations were performed using in-house Python scripts and the ConKit Python library (38).
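The final filtering steps and the alignment-depth metric described above can be sketched in a few lines of Python. This is not the in-house script referred to in the text, and it does not use trimAL or ConKit. The function names are hypothetical, pairwise identity is computed naively over all aligned columns, and reading "0.8" as an 80% identity threshold for sequence reweighting is an assumption about how the depth metric is defined.

```python
import math

GAP = "-"

def drop_sequences_with_unknown(msa):
    """Remove aligned sequences that contain the unidentified residue 'X'."""
    return [seq for seq in msa if "X" not in seq.upper()]

def drop_gappy_columns(msa, max_gap_fraction=0.5):
    """Remove alignment columns whose gap fraction exceeds the cutoff."""
    n_seq = len(msa)
    keep = [
        col for col in range(len(msa[0]))
        if sum(seq[col] == GAP for seq in msa) / n_seq <= max_gap_fraction
    ]
    return ["".join(seq[col] for col in keep) for seq in msa]

def identity(a, b):
    """Fraction of identical positions between two aligned sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def effective_sequences(msa, threshold=0.8):
    """Effective sequence count: each sequence is down-weighted by the
    number of sequences (itself included) at or above the identity threshold."""
    total = 0.0
    for a in msa:
        neighbors = sum(identity(a, b) >= threshold for b in msa)
        total += 1.0 / neighbors
    return total

def depth_metric(msa, threshold=0.8):
    """Effective sequence count divided by sqrt(alignment length)."""
    return effective_sequences(msa, threshold) / math.sqrt(len(msa[0]))

# Toy alignment (hypothetical sequences, for illustration only).
toy_msa = ["ACDE", "ACDE", "ACDF", "WYWY"]
toy_depth = depth_metric(toy_msa)
```

The pairwise loop is quadratic in the number of sequences, so an alignment of realistic size would need a vectorized or clustered implementation. The sketch is meant only to make the definition of the metric concrete.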

Folding simulations
The folding simulations were performed in two stages. In both stages, contact restraints were used, and the models were iteratively clustered, refined, and scored based on their overall backbone energy. Full details of the folding simulation protocols and software are provided in the supporting information.

Refinement and relaxation using molecular dynamics
The optimized hGOAT model from stage 2 was oriented with respect to a membrane bilayer using the PPM server (25). The calculated hydrophobic thickness of the hGOAT structural model is 25.2 ± 2.4 Å, with a tilt angle of 3° relative to the membrane normal vector. The oriented protein was then embedded in an ER-mimetic lipid bilayer (1:1 dipalmitoylphosphatidylcholine/dioleoylphosphatidylcholine) using the CHARMM-GUI web server and subjected to an all-atom equilibration at 310.15 K in explicit solvent and 150 mM NaCl counterions (24,39). The simulation was carried out for 500 ns using GROMACS 2016.4, and the structural deviations were monitored (40). The equilibrated structure was isolated and utilized for prediction of internal channels and docking studies.
Figure 4. … Ghrelin (GSSFL-ghrelin) and octanoyl-CoA enter the GOAT internal channel from the ER lumenal pore and cytoplasmic acyl donor-binding sites, respectively, followed by acyl transfer to the ghrelin serine side chain hydroxyl. Octanoylated ghrelin dissociates to the ER lumen, resulting in the octanoyl chain transiting through the GOAT interior, and CoA is released back to the cytoplasm. The red and blue rectangles represent perimeter helices, the green rectangle represents intramembrane domains forming the cytoplasmic surface of hGOAT, and dotted lines represent binding interactions between the octanoyl-CoA acyl donor and its binding site within hGOAT.

Molecular docking and relaxation of hGOAT:octanoyl-CoA complex
To build a model of the hGOAT:octanoyl-CoA bound complex, we performed docking using AutoDock Vina implemented in the YASARA software suite (41,42); full details of the docking procedure are provided in the supporting information.

Construction of hGOAT mutants
Site-directed mutagenesis was performed on our previously reported hGOAT expression construct as described in the supporting information and File S4 (30). This construct was commercially synthesized by Integrated DNA Technologies (Coralville, IA) containing a C-terminal FLAG epitope tag, a polyhistidine (His6) tag, and 3× human influenza hemagglutinin tags appended downstream of a tobacco etch virus protease site (48).

Expression and enrichment of hGOAT in membrane protein fractions
hGOAT WT and mutants were expressed in insect (Sf9) cell membrane fractions using procedures published previously (13,30,46,49).

hGOAT expression analysis by anti-FLAG Western blotting
Expression of hGOAT was determined by anti-FLAG Western blotting using published protocols (Fig. S7) (46). Each gel contained an empty vector microsomal protein as a negative control and N-terminal FLAG-BAP fusion protein as a positive control (Millipore Sigma, P7582-100UG, 1:150 dilution, 30-μl total volume).

hGOAT activity assay: Standard reaction conditions
hGOAT activity assays under standard conditions were performed with 50 μg of membrane protein, 1.5 μM fluorescent peptide substrate, 300 μM octanoyl-CoA, 1 μM methoxy arachidonyl fluorophosphonate, and 50 μM HEPES, pH 7.0, in a total volume of 50 μl as described previously (46). All components except for the peptide and acyl-CoA substrates were incubated at room temperature for 30 min prior to reaction initiation by the addition of peptide and acyl-CoA substrates. Reactions were incubated at room temperature for 2 h in the dark and then stopped by the addition of 50 μl of 20% acetic acid in isopropyl alcohol. Reaction solutions were clarified and analyzed by reverse-phase HPLC (46). Substrate and acylated products were detected by fluorescence (λex 360 nm, λem 485 nm), with the substrate eluting with a retention time of 5-6 min and the octanoylated peptide eluting with a retention time of 11-12 min. Chromatogram analysis and peak integration were performed using Chemstation for LC (Agilent Technologies) (30). Product conversion was calculated by dividing the integrated fluorescence for the product peak by the total integrated peptide fluorescence (substrate and product) in each run. Percentage activity for each hGOAT mutant was calculated by normalizing the product conversion for the mutant to that of WT hGOAT in a reaction run in parallel on the same day using the same reagents.
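The two calculations described above, product conversion from integrated peak areas and percentage activity relative to WT, amount to the following arithmetic. The function names and the peak areas in the example are hypothetical placeholders and are not values from this study.

```python
def product_conversion(substrate_area, product_area):
    """Product peak area divided by total peptide fluorescence
    (substrate plus product) from one HPLC run."""
    total = substrate_area + product_area
    if total <= 0:
        raise ValueError("total integrated fluorescence must be positive")
    return product_area / total

def percent_activity(mutant_conversion, wt_conversion):
    """Mutant conversion normalized to WT conversion measured in parallel."""
    if wt_conversion <= 0:
        raise ValueError("WT conversion must be positive")
    return 100.0 * mutant_conversion / wt_conversion

# Hypothetical integrated peak areas (arbitrary fluorescence units).
wt = product_conversion(substrate_area=600.0, product_area=400.0)
mutant = product_conversion(substrate_area=900.0, product_area=100.0)
relative = percent_activity(mutant, wt)
```

Normalizing each mutant to a WT reaction run on the same day with the same reagents, as the text specifies, keeps day-to-day variation in membrane preparations and reagents out of the comparison.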

Statistical analysis of hGOAT alanine variant reactivity
Full details of hGOAT alanine variant statistical testing using a Wilcoxon signed-rank test (n = 42, test statistic W = 294.5, p = 0.02978) are provided in the supporting information, including the R script utilized (50).

Single acyl donor reactivity assay
To determine the reactivity of hGOAT variants with different length acyl donors, hGOAT activity was measured in the presence of a 100 μM concentration of a single acyl donor (octanoyl-CoA, lauryl (dodecanoyl)-CoA, or myristoyl (tetradecanoyl)-CoA). Characteristic retention times for each acylated form of the peptide substrate provided confirmation of the nature of the attached acyl chain, with dodecanoyl-GSSFLC AcDan eluting at ~17 min and tetradecanoyl-GSSFLC AcDan eluting at ~19 min. For each acyl donor, relative activity was calculated by normalizing to the highest activity observed across the panel of WT hGOAT and hGOAT variants.
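The per-donor normalization described above differs from the WT-relative normalization used under standard conditions: here each acyl donor is scaled to the most active enzyme in the panel. A minimal sketch follows, with a hypothetical function name, placeholder variant labels, and invented conversion values used only to show the calculation.

```python
def normalize_to_panel_max(conversions):
    """Scale product conversions for one acyl donor so that the most
    active enzyme in the panel has a relative activity of 1.0.

    conversions: dict mapping enzyme label -> product conversion.
    """
    top = max(conversions.values())
    if top <= 0:
        raise ValueError("no detectable activity in the panel")
    return {label: value / top for label, value in conversions.items()}

# Invented conversions for illustration; labels are placeholders.
panel = {
    "C8 donor": {"WT": 0.40, "variant_A": 0.10, "variant_B": 0.00},
    "C12 donor": {"WT": 0.00, "variant_A": 0.05, "variant_B": 0.20},
}
relative = {donor: normalize_to_panel_max(vals) for donor, vals in panel.items()}
```

Because each donor is normalized separately, relative activities are comparable between variants for a given donor but not between donors.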

Acyl donor competition assay
To determine the relative preference of each hGOAT variant for acyl donors ranging from six to 12 carbons, hGOAT activity was measured in the presence of a 100 μM concentration each of four potential acyl donors (hexanoyl-CoA, octanoyl-CoA, decanoyl-CoA, and lauryl (dodecanoyl)-CoA). Each potential product peak was assigned by comparing its retention time with that from a standard reaction containing only one acyl donor for each potential product. Competition experiments including myristoyl-CoA were unsuccessful, potentially because the critical micelle concentration of this acyl donor lies near 100 μM (51).