Structure-based Minimization of Transforming Growth Factor-α (TGF-α) through NMR Analysis of the Receptor-bound Ligand

The investigation of an N-terminally truncated human transforming growth factor-α (TGF-α; residues 8–50) has been completed to determine the contribution of the N terminus to receptor binding and activation. The deletion protein was proposed and designed through study of NMR relaxation and nuclear Overhauser enhancement data obtained from the TGF-α-epidermal growth factor (EGF) receptor complex, which indicated that the residues N-terminal to the A loop remain flexible in receptor-bound TGF-α and thus suggested their lack of involvement in receptor binding (Hoyt, D. W., Harkins, R. N., Debanne, M. T., O’Connor-McCourt, M., and Sykes, B. D. (1994) Biochemistry 33, 15283–15292; McInnes, C., Hoyt, D. W., Harkins, R. N., Pagila, R. N., Debanne, M. T., O’Connor-McCourt, M., and Sykes, B. D. (1996) J. Biol. Chem. 271, 32204–32211). TGF-α 8–50 was shown to have approximately 10-fold lower affinity for the receptor than the native molecule in an assay quantifying the ability to compete with EGF for binding and to have a similar reduction in activity as indicated by a cell proliferation assay. NMR solution structural calculations on this molecule demonstrate correct formation of the three disulfide bonds of TGF-α 8–50 and have established the presence of native secondary structure in the B and C loops of the protein. However, some perturbation of the global fold with respect to the orientation of the subdomains was observed. These results suggest that although the N-terminal residues do not contribute directly to binding, they make a significant contribution in defining the conformation of the growth factor, which is required for complete binding and activity and is therefore significant in terms of producing native folding of TGF-α. They also show that information obtained from the receptor-bound ligand can be used to guide the design and minimization of TGF-α analogues. The implications of the study of TGF-α 8–50 for the design and synthesis of reductants of this growth factor are therefore discussed.

The binding of human TGF-α to the EGF receptor initiates a number of cell proliferation events including wound healing and embryogenesis (1,2). Furthermore, this mitogenic protein of 50 residues is involved in the transformation of normal cells into malignant growths (3–5) and is also believed to promote angiogenesis (6). It is thus apparent that this polypeptide is a significant target from a pharmaceutical and drug design perspective, and to this end, considerable effort has been made to elucidate the essential structural features of this tricyclic growth factor required for binding and function (7,8). Many attempts have also been made to synthesize reductant molecules that display a similar biological profile to the native TGF-α, although for the most part these efforts have met with limited success (9–11). NMR structures of the free ligand have been determined by several groups (12–15), and recently studies have been published from this laboratory on the use of NMR relaxation and NOE measurements in determining the essential components of the TGF-α-EGFR extracellular domain complex (16,17). The data from the latter experiments suggest that the A and C loops and the C-terminal tail of TGF-α contain residues (for the sequence of TGF-α and location of disulfides forming the three loops, see Fig. 1) that form the major binding interface with the receptor and that the N-terminal amino acids outside the A loop remain flexible in the receptor-bound species. Since the consensus from the relaxation and NOE data was that the N terminus of TGF-α does not play a role in the receptor-ligand interaction, the N-terminal deletion mutant, TGF-α 8–50, was synthesized and characterized in terms of NMR structure and biological activity.
These experiments were performed to clarify the requirements of the growth factor structure necessary for receptor binding and activation and to determine the contribution of the N-terminal tail to the formation of the global fold of the native molecule. Thus the rationale is to assess any functional changes of the truncated protein with respect to native TGF-α in terms of structural variation and, in doing so, to ascertain whether the N-terminal tail residues are required for establishing the native conformation of the protein. The truncated TGF-α was also studied to demonstrate that structural information obtained from NMR experiments on the receptor-bound ligand can be used toward the design of a "minimized" TGF-α in which nonessential regions of the protein are deleted while retaining near native levels of receptor binding and activation.

Synthesis and Purification of TGF-α 8–50
TGF-α 8–50 was synthesized using standard solid phase synthesis techniques on an Applied Biosystems model 430A peptide synthesizer (Foster City, CA) utilizing t-butoxycarbonyl methodology. The N terminus was acetylated using acetic anhydride, and the peptide was liberated from the resin using a mixture of hydrogen fluoride, anisole, and 1,2-ethanedithiol. The crude material was extracted with glacial acetic acid, lyophilized, and then dissolved in 10% acetic acid. The peptide was then purified by reversed phase HPLC (Beckman System Gold, Fullerton, CA) on a Synchropak RP-4 (250 × 21.2 mm inner diameter) column (Synchrom, Lafayette, IN) at a flow rate of 5 ml/min and subsequently oxidized in air using a solution containing 1.0 M urea, 0.1 M Tris, 1.5 nM oxidized and 0.75 nM reduced glutathione, pH 8.0 (10). A final purification step was used to remove oxidation byproducts, and the final purity was confirmed by reversed phase analytical HPLC. Electrospray mass spectrometry (VG Biotech, Cheshire, UK) verified the presence of a species of the desired molecular weight.

NMR Sample Preparation
To a lyophilized sample of TGF-α was added 460 μl of buffer containing 50 mM potassium phosphate, 10 mM potassium chloride, 1 mM ethylenediaminetetraacetic acid, 0.5 mM sodium azide, 0.15 mM sodium 2,2-dimethyl-2-silapentane-5-sulfonate (internal standard), and 99.9% D2O or 90%/10% (v/v) H2O/D2O. The solution was adjusted to pH 6.5 by the addition of small aliquots of 0.5 N NaOD or 0.5 N HCl, bringing the final volume to 500 μl.

NMR Spectroscopy
One- and two-dimensional 1H NMR spectra for TGF-α 8–50 were collected on a Varian Unity spectrometer operating at 599.9 MHz. TOCSY (18) and NOESY (19) experiments were acquired at 298 K and referenced relative to an internal sodium 2,2-dimethyl-2-silapentane-5-sulfonate standard. Pulsed field gradients were implemented in the WATERGATE pulse sequence to suppress the water resonance (20). For TOCSY and NOESY experiments, 64 transients were acquired for each of 256 increments using the hypercomplex method of States et al. (21), and a total of 4096 data points were collected over a spectral width of 8000 Hz for H2O spectra and 6000 Hz for D2O spectra. The SCUBA-NOESY (22) experiment was utilized for acquisition of D2O NOE data. Each two-dimensional FID was processed using a shifted sinebell and zero filling to 4096 points in both F1 and F2. A complete resonance assignment was obtained from the TOCSY and NOESY data, with the exception of the fast exchanging His12 amide proton, and is shown in Table I. The assignment was assisted by chemical shift information previously published for native TGF-α (23).
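The acquisition parameters above can be related to one another with simple arithmetic. The following is a minimal sketch, not part of the original protocol: it converts the quoted spectral widths to ppm at 599.9 MHz and estimates the acquisition time under the assumption (not stated in the text) that the 4096 data points are complex points sampled at a rate equal to the spectral width. The function names are hypothetical.

```python
def spectral_width_ppm(sw_hz, spectrometer_mhz):
    """Convert a spectral width in Hz to ppm at the given 1H frequency in MHz."""
    return sw_hz / spectrometer_mhz


def acquisition_time_s(n_complex_points, sw_hz):
    """Acquisition time, ASSUMING complex points sampled at a rate equal to sw_hz."""
    return n_complex_points / sw_hz


SPECTROMETER_MHZ = 599.9  # from the text

h2o_ppm = spectral_width_ppm(8000.0, SPECTROMETER_MHZ)  # H2O spectra
d2o_ppm = spectral_width_ppm(6000.0, SPECTROMETER_MHZ)  # D2O spectra
h2o_at = acquisition_time_s(4096, 8000.0)               # assumption: complex points

print(f"H2O width: {h2o_ppm:.2f} ppm, D2O width: {d2o_ppm:.2f} ppm")
print(f"H2O acquisition time (under the stated assumption): {h2o_at:.3f} s")
```

If the 4096 points were instead counted as real points, the acquisition time would be half the value printed here.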

Structure Calculation of TGF-α 8–50
For TGF-α 8–50, all cross-peaks were volume integrated using the program VNMR (VNMR 5.1A, Varian Associates, Palo Alto, CA), and a subset of these was assigned based on the chemical shift assignments and possible NOE distances in the native protein (Protein Data Bank code 1yug; Ref. 15) calculated using the programs NMRPipe (24) and PIPP (25) and a cut-off distance of 12 Å. The suite of programs CAMRA (26) was implemented to convert the volume integral information to distance restraints. NOE cross-peak intensities were classified as strong (upper boundary, 2.3 Å), medium (3.0 Å), weak (4.0 Å), and very weak (5.0 Å), and the appropriate pseudoatom correction was added. The molecular dynamics/simulated annealing protocol in X-PLOR version 3.8 (27, 28) was utilized to calculate an initial set of structures derived only from the unambiguous distance restraints, with native disulfides assumed in the starting structure. A program developed in house was then applied in conjunction with X-PLOR to modify NOE distances causing restraint violations and also to make use of those that have ambiguous assignments to generate a more refined structure. This program uses an approach similar to that of Nilges et al. (29) in that it allows the use of NOE distances that are ambiguous, because in most cases only one restraint will be possible based on distances from the ensemble of structures already calculated using unambiguous restraints. The program therefore removes restraints that are violated by greater than a user defined cut-off distance (0.5 Å), because the impossible assignments for the ambiguous restraints will give distance violations greater than the cut-off. The restraints that satisfy the distances in the initial structures will be retained. Because it is possible that a correct restraint can be violated in initial but not final structures, NOEs that are removed can be returned to the restraint file in later runs for reevaluation by the program.
As the structures are refined in successive runs, these restraints should now be satisfied and are incorporated in the final data set.
This procedure as a whole allows the restraints that satisfy the distances in the successive structural ensembles to be retained and thus facilitates the incorporation of a greater number of restraints in the X-PLOR calculation. For the distance restraints that are consistently violated by less than the specified cut-off in the calculated structures, the program automatically increases the upper bound of the distance restraint by the average value of the distance violation for a user specified number of structures in which the violation occurs. The modified restraint file is then used in the next round of X-PLOR calculations in an iterative procedure until a structural ensemble of the desired quality is obtained. The reproducibility and accuracy of the ensemble generated for TGF-α 8–50 were examined using the program PROCHECK (30), which checks the distribution of backbone dihedral angles in the structures.
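One round of the restraint-editing logic described above might be sketched as follows. This is an illustrative reconstruction, not the in-house program itself: the names `Restraint`, `refine_round`, and `min_violating` are hypothetical, the restraint labels and distances are made-up values, and the choice to compare the average violation (over the violating structures) against the 0.5 Å cut-off is an assumption, since the text does not specify how violations are pooled across the ensemble.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Restraint:
    name: str     # hypothetical label for an NOE restraint
    upper: float  # upper distance bound in angstroms


def refine_round(restraints, ensemble_distances, cutoff=0.5, min_violating=2):
    """Return (kept, set_aside) after one round of restraint editing.

    ensemble_distances maps a restraint name to the distance measured in each
    structure of the current ensemble. Restraints whose average violation
    exceeds `cutoff` are set aside (they may be re-tested in a later round);
    restraints violated by less than `cutoff` in at least `min_violating`
    structures have their upper bound loosened by the average violation.
    """
    kept, set_aside = [], []
    for r in restraints:
        violations = [d - r.upper for d in ensemble_distances[r.name] if d > r.upper]
        if not violations:
            kept.append(r)                      # satisfied everywhere
        elif mean(violations) > cutoff:
            set_aside.append(r)                 # likely an impossible assignment
        elif len(violations) >= min_violating:
            kept.append(Restraint(r.name, r.upper + mean(violations)))
        else:
            kept.append(r)                      # sporadic small violation
    return kept, set_aside


# Made-up example: three restraints measured in a three-structure ensemble.
restraints = [Restraint("r1", 3.0), Restraint("r2", 4.0), Restraint("r3", 5.0)]
distances = {
    "r1": [2.8, 2.9, 2.7],  # never violated
    "r2": [6.1, 5.8, 6.4],  # violated by about 2 angstroms: set aside
    "r3": [5.2, 5.3, 4.9],  # small, consistent violation: bound loosened
}
kept, set_aside = refine_round(restraints, distances)
print([(r.name, round(r.upper, 2)) for r in kept], [r.name for r in set_aside])
```

In an iterative scheme the restraints in `set_aside` would be added back to the input of a later round, once the ensemble has been recalculated with the edited restraint list.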

Binding and Activity of TGF-α 8–50
Cell Culture-The HBE4-E6E7 cell line was used and was established by transfection of primary human bronchial epithelial cells with a plasmid construct expressing the human papillomavirus type 16 E6 and E7 genes (31). The cell line was maintained in Keratinocyte-SF (KSF) medium without supplements and incubated at 37°C in an atmosphere containing 5% CO2.
Proliferation Assay-Growth curves were established by measuring the incorporation of [3H]methylthymidine into the acid insoluble fraction of cellular extracts. Briefly, approximately 2 × 10^4 cells were plated in 1 ml of KSF into each well of 12-well tissue culture plates. After 1 day, the medium was replaced with either fresh KSF (control) or medium containing TGF-α or TGF-α 8–50. Cells were allowed to grow for 3 days, with [3H]methylthymidine (1 μCi/ml, Amersham Canada) being added during the last 20–24 h. After sequential and multiple washings in cold phosphate-buffered saline, 10% trichloroacetic acid, and 95% ethanol, the plates were allowed to dry, the cells were solubilized in 2% SDS, and the radioactivity was quantitated by liquid scintillation counting. The experimental points were determined in triplicate.
EGF Competition Assay-Cells (2 × 10^3) were plated in 24-well plates in KSF and grown to confluence. 24 h before performing the experiment, the medium was replaced with fresh KSF. Cells were washed three times with binding buffer containing Dulbecco's modified Eagle's medium, 25 mM Hepes, 5 mM MgCl2, 100 μg/ml bacitracin, 0.1% bovine serum albumin, pH 7.4, and incubated in 0.5 ml of binding buffer containing various concentrations of TGF-α or TGF-α 8–50 with 0.05 nM 125I-EGF at 0 °C for 2 h. After incubation, the cells were washed twice with ice-cold binding buffer and three times with ice-cold phosphate-buffered saline. Cells were solubilized with 1 N NaOH and counted in a γ counter. The nonspecific binding was determined in the presence of 1 μM unlabeled TGF-α or TGF-α 8–50. Data were analyzed using the method of Scatchard.
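For readers unfamiliar with the Scatchard method, the following is a minimal sketch of the underlying calculation for a single class of binding sites: bound/free is regressed on bound, the slope gives -1/Kd, and the x-intercept gives Bmax. The function name `scatchard_fit` and all numerical values are hypothetical; the data are synthetic points generated from a one-site model and are not measurements from this study.

```python
def scatchard_fit(bound, free):
    """Least-squares Scatchard analysis: regress bound/free on bound.

    For one class of sites, bound/free = (Bmax - bound) / Kd, so the
    slope is -1/Kd and the x-intercept is Bmax.
    """
    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratios))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope
    bmax = -intercept / slope
    return kd, bmax


# Synthetic one-site data (hypothetical Kd = 2.0 nM, Bmax = 100 arbitrary units).
TRUE_KD, TRUE_BMAX = 2.0, 100.0
free = [0.5, 1.0, 2.0, 4.0, 8.0]
bound = [TRUE_BMAX * f / (TRUE_KD + f) for f in free]

kd, bmax = scatchard_fit(bound, free)
print(f"Kd = {kd:.3f} nM, Bmax = {bmax:.1f}")
```

With real data the Scatchard transform distorts the error structure, so nonlinear fitting of the untransformed binding curve is often preferred; the linear form is shown here only because it is the method named in the text.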

Analysis of Chemical Shift Variations between Truncated and Native TGF-α-The deletion of the N-terminal tail of TGF-α raises the question of whether or not native disulfide bonds are present in the truncated form of the protein and whether it is able to assume a similar global fold to the intact protein. To determine whether the correct pairing of the disulfides in TGF-α 8–50 is present after synthesis and oxidation, the chemical shifts of the α-protons were compared with those of the native protein acquired under similar conditions, i.e. pH 6.5 and 30°C. The Δδ values of the comparison (chemical shift differences) are illustrated in Fig. 2, and it can be seen that there are no large deviations in these shifts. This suggests that the three disulfide links between Cys8 and Cys21, Cys16 and Cys32, and Cys34 and Cys43 are present as observed in the intact TGF-α. The presence of the native disulfide linkages was confirmed through observation of NOEs between cysteines 8 and 21 (Fig. 3) and 34 and 43 (not shown). The absence of major deviations also suggests that the structure is similar to that of the native molecule. However, despite the lack of large differences in the chemical shifts of the deletion mutant, significant changes of between 0.05 and 0.2 ppm are observed. These occur for the most part in the A and B loops and indicate that deletion of the residues N-terminal to the A loop effects changes in the secondary structure and perhaps in the overall fold of the molecule. In particular, the residues undergoing the largest variation are those of the A loop, including Tyr13, Gln14, and Phe15, which have resonances differing from the native protein by 0.17, 0.13, and 0.13 ppm, respectively. Of the B loop α-protons, those that exhibited the most significant changes were Gly19 and Thr20, in addition to the α-resonances of all the residues between Val25 and Lys29.
For the hinge region of TGF-α 8–50 (residues close to Val33), which allows flexibility between the two subdomains (A and B loops comprise the N subdomain, whereas the C loop and tail form the C subdomain), Δδ values of between 0.04 and 0.09 ppm were observed for the residues between Cys32 and Ser36 and thus are indicative of a possible variation in the conformation of this region. An important observation of the shifts for the residues of the C loop is that, with the exception of Gly40 and Glu44, there are no differences larger than 0.05 ppm, suggesting that the reverse double hairpin secondary structure is present in the truncated protein. The Δδ α-H value for Glu44 of 0.08 ppm is of special interest because this residue forms interactions critical to the stabilization of the orientation of the N- and C-terminal subdomains.
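The chemical shift comparison described above amounts to a per-residue subtraction followed by a threshold test. A minimal sketch is given below. The absolute α-proton shifts are placeholder values chosen only for illustration (the measured shifts are in Table I and Ref. 23, which are not reproduced here); they are constructed so that the differences for Tyr13, Gln14, Phe15, and Glu44 match the magnitudes quoted in the text, and "ResX" is a hypothetical unperturbed residue. The function names and the use of absolute differences are assumptions.

```python
def shift_differences(native_ppm, truncated_ppm):
    """Absolute alpha-proton shift difference (ppm) for residues present in both sets."""
    return {
        res: round(abs(truncated_ppm[res] - native_ppm[res]), 2)
        for res in native_ppm
        if res in truncated_ppm
    }


def perturbed_residues(differences, threshold_ppm=0.05):
    """Residues whose shift difference exceeds the threshold used in the text."""
    return [res for res, d in differences.items() if d > threshold_ppm]


# Placeholder shifts (ppm); only the DIFFERENCES for the named residues
# correspond to values quoted in the text.
native = {"Tyr13": 4.50, "Gln14": 4.20, "Phe15": 4.60, "Glu44": 4.30, "ResX": 4.80}
truncated = {"Tyr13": 4.67, "Gln14": 4.33, "Phe15": 4.73, "Glu44": 4.38, "ResX": 4.82}

diffs = shift_differences(native, truncated)
flagged = perturbed_residues(diffs)
print(diffs)
print("Residues above 0.05 ppm:", flagged)
```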
Receptor Binding and Activation by TGF-α 8–50 -Investigation of the biological potency of TGF-α 8–50 was performed to assess the contribution of the N-terminal tail to receptor binding and activation. This was accomplished through a competition assay of TGF-α 8–50 with EGF on HBE cells displaying the EGF receptor and by measuring the ability of the truncation analogue to effect cell proliferation. The results from the competition assay are shown in Fig. 4. From the plot it is apparent that approximately 10-fold more TGF-α 8–50 than native TGF-α is required to displace iodinated EGF from HBE cells, indicating that the truncated protein has 10 times lower affinity for the EGF receptor. The quantitation of the receptor activation assay illustrates that the reduction in the ability of the truncated protein to effect cell proliferation is of the same order as the corresponding decrease in receptor binding (data not shown).
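A fold difference in apparent affinity of the kind quoted above can be estimated from two competition curves by locating the competitor concentration at which half of the labeled EGF is displaced. The sketch below uses log-linear interpolation between bracketing points. The function name `ic50` is hypothetical, and both curves are invented values constructed to mimic a 10-fold shift; they are not the data of Fig. 4.

```python
import math


def ic50(concs, fraction_bound):
    """Concentration giving 50% bound label, by log-linear interpolation.

    concs must be ascending and fraction_bound must fall from above 0.5 to
    below 0.5 somewhere along the curve.
    """
    for i in range(len(concs) - 1):
        c1, c2 = concs[i], concs[i + 1]
        f1, f2 = fraction_bound[i], fraction_bound[i + 1]
        if f1 >= 0.5 >= f2 and f1 != f2:
            t = (f1 - 0.5) / (f1 - f2)
            log_c = math.log10(c1) + t * (math.log10(c2) - math.log10(c1))
            return 10 ** log_c
    raise ValueError("curve does not cross 50% bound")


# Invented competition curves (competitor concentration in nM).
concs = [0.1, 1.0, 10.0, 100.0, 1000.0]
native_curve = [0.90, 0.60, 0.20, 0.05, 0.01]
truncated_curve = [0.98, 0.90, 0.60, 0.20, 0.05]

fold = ic50(concs, truncated_curve) / ic50(concs, native_curve)
print(f"Fold increase in IC50 for the truncated ligand: {fold:.1f}")
```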
NMR Structure Determination of TGF-α 8–50 -As evidenced by the variation in the α-H resonances between the N-terminally deleted and native TGF-αs, the conformation of the truncated TGF-α did not appear to be identical to that of previously determined structures of TGF-α. With the objective of determining more precisely the structural differences that occur in TGF-α 8–50, an ensemble of solution structures was calculated for the N-terminally deleted protein using NOE-derived distance restraints. Thirty-five best fit structures were generated from 407 NOE distances using the program X-PLOR after nine rounds of calculation to minimize restraint violations and make the best use of ambiguous NOEs. The structure statistics for the ensemble of 35 structures are shown in Table II and illustrate their quality in terms of high definition, fit to the experimental data, and the component and total energies. Analysis of the average of the 35 structures using the program PROCHECK (30) indicated that 92% of the residues excluding glycine, proline, and end residues are found in the allowed regions of phi-psi space. The residues found in non-allowed regions, which are Phe15 and Cys16, are located in the highly flexible portions of the truncated molecule. Fig. 5 shows a superposition of the 35 best fit structures to the average structure and depicts the excellent reproducibility of the conformational ensemble. The values for the superposition of residues 15–47 for the 35 structures to the average structure are 0.844 Å for the backbone atom r.m.s. deviations and 1.304 Å for the heavy atom r.m.s. deviations, which compare very favorably with the ensemble recently generated for TGF-α using similar conditions for data acquisition (15).
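The r.m.s. deviation of an ensemble to its average structure can be computed as sketched below. This is a simplified illustration rather than the procedure used for Table II: it assumes the structures have already been superimposed (no fitting step is performed), it reports the mean of the per-structure r.m.s. deviations (the text does not state how the ensemble value was pooled), and the function names and toy coordinates are hypothetical.

```python
import math


def mean_structure(ensemble):
    """Average coordinates over an ensemble of pre-superimposed structures."""
    n_structures = len(ensemble)
    n_atoms = len(ensemble[0])
    return [
        tuple(sum(s[i][k] for s in ensemble) / n_structures for k in range(3))
        for i in range(n_atoms)
    ]


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally sized coordinate lists."""
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


def ensemble_rmsd_to_mean(ensemble):
    """Mean of the per-structure r.m.s. deviations to the average structure."""
    average = mean_structure(ensemble)
    return sum(rmsd(s, average) for s in ensemble) / len(ensemble)


# Toy ensemble: two structures of two atoms each, offset by 2 angstroms in z.
toy_ensemble = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],
]
toy_value = ensemble_rmsd_to_mean(toy_ensemble)
print(f"r.m.s. deviation to the mean: {toy_value:.3f} angstroms")
```

For real coordinates, each structure would first be fitted to the average over the chosen residue range (15–47 in the text) using a least-squares superposition before the deviations are evaluated.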
DISCUSSION
Biological Potency of TGF-α 8–50 -TGF-α 8–50 was proposed and designed based on the results of the relaxation and NOE analysis of the ligand-receptor complex, which suggested that the N terminus does not contribute to the complexation process and that the truncated protein should retain the ability to associate with the EGF receptor and initiate cell proliferation. Our results indicate that although the binding and activation are diminished 10-fold, TGF-α 8–50 still retains a significant level of potency in terms of competition with EGF for the receptor and ability to effect cell proliferation. The many synthetic fragments of TGF-α studied previously have for the most part failed to exhibit biological activity close to that observed for the native molecule. Our results differ slightly from those of Tam et al. (10), who previously synthesized TGF-α 8–50 and reported binding and mitogenic activity of 3%. Our data imply that TGF-α 8–50 is more potent; however, this variation may be the result of different methods used for assaying activity. In our case, the protein synthesized in this laboratory was fully characterized by NMR; there is thus no ambiguity regarding the structure and the correct pairing of the disulfides, and the biological data are therefore reliable in terms of the integrity of the deletion protein.
Solution Structure of TGF-α 8–50 -The NMR-derived solution structure of TGF-α 8–50 exhibits structural characteristics similar to the global fold of the native protein as determined by several groups to date (12–15). The overall topology of the truncated protein superimposes favorably on that of the intact TGF-α, as shown by backbone r.m.s. deviations of 2.78 Å for residues 15–47. The similarity between the two structures is demonstrated in Fig. 6, which illustrates the Richardson diagrams of the native and N-terminally deleted TGF-αs. The secondary structure elements present in TGF-α remain in the structural ensemble of the N-terminally truncated molecule since the B loop β-sheet comprised of residues 19–24 and 29–34 and the C-terminal [...] proteins are evident. In particular, the β-turn of the B loop sheet of the truncated TGF-α is oriented away from the N-terminal tail when overlaid with the β-sheet of the native protein. Further evidence for this structural change is given by the Δδ values for the turn residues, i.e. Val25 to Asp28, and for Lys29, which are all between 0.07 and 0.14 ppm. This effect is most probably the consequence of the absence or disruption of the third strand of the β-sheet formed by the interactions of Phe5 and Asn6 of the N terminus with Phe23 and Leu24 of the second strand. Disruption of these interactions does not destabilize the B loop β-sheet; however, it does allow the turn greater freedom to adopt a dissimilar conformation.
The most significant and consequential variation in the structure of the two proteins is in the relative orientation of the C-terminal subdomains comprised of residues 34–50. As can be observed from the superposition of TGF-α 8–50 to TGF-α in Fig. 6, this region in the deletion protein is angled almost perpendicular to the N-terminal subdomain, with the first strand of the reverse double hairpin folded away from the hinge region from Cys32 to Ser36. The differences in the arrangement of the C-terminal subdomain for the two molecules as illustrated by the calculated structures are corroborated by the chemical shift data, which indicate that although notable variations in the α-proton resonances occur in the region between Cys32 and Ser36 (Fig. 2), less significant variation in the α-H frequencies of the C loop and tail secondary structure is observed. The conformational transition of the C-terminal subdomain that occurs in the N-terminally deleted TGF-α is apparently a propagated effect and is the result of disruption of the interactions between the A and C loops. As previously mentioned, the partial helix present in the intact protein is no longer present in TGF-α 8–50. The consequence of deletion of the N-terminal tail is variation in the conformation of the A loop, which in turn results in altered interactions between the A and the C loops. The specific interactions that differ from the native protein in defining the fold of the A and C subdomains in TGF-α 8–50 appear mostly to be in the packing of the residues that stabilize the orientation of the two regions. The hydrophobic cluster of Phe17, His18, and Tyr38 is more closely packed in the truncated TGF-α, and there is in addition a more defined interaction between Phe15 and Arg42.
The conformation of the interdomain residues His18 and His35 also varies in the N-terminally deleted molecule, and the association of the side chains of Glu44 and Phe17 is different for TGF-α 8–50, an observation that is confirmed by the chemical shift difference for Glu44 α-H of 0.08 ppm. These changes indicate the transition in domain structure between the two molecules that results from novel interactions between the residues of the A and the C loops.
It is interesting that the new conformation of TGF-α 8–50 and orientation of the subdomains in the truncated molecule are of comparable if not better definition than that seen in the structural ensemble generated for native TGF-α. This suggests that the N-terminal tail is not required to give a stable protein fold, although it is necessary to produce the native structure of TGF-α in terms of orientation of the subdomains.
Implications of TGF-α 8–50 Solution Structure for Functional Differences-With the solution structure of TGF-α 8–50 available, the biological rationale for decreased activity of the truncated protein as indicated by the reduction in receptor binding affinity and mitogenic activity can be envisaged. The previous studies from this laboratory implementing relaxation and NOE analysis techniques to elucidate the residues and regions of TGF-α involved in complexation with EGFR (16,17) were used to design the N-terminally deleted protein. The results obtained implicated residues from the A and C loops and C tail as forming the predominant receptor binding interface. Fig. 7 illustrates the average ensemble structures of TGF-α and TGF-α 8–50 shown as molecular Connolly surface representations and color coded according to the percentage of NOEs absent in the receptor-bound wild type TGF-α, i.e. those present in the free ligand but not in the bound protein.
FIG. 7. Grasp (36) molecular surface representation of the native and truncated TGF-α average structures illustrating the residues color coded according to the percentage of NOEs that disappear in the receptor-bound form of the native ligand. The residues are colored according to the following percentages of absent NOEs in the bound TGF-α: 0%, blue; 0–20%, purple; 20–30%, white; 30–40%, green; and 40–60%, red.
The structure utilized for the native molecule in Fig. 7 is the most recently published structure for TGF-α as determined by Moy et al. (15), which was unavailable for our previous studies (17). When this structure is used to display the changes in NOE intensity upon addition of the EGFR, an almost completely contiguous receptor binding face is observed. This visualization gives an even more plausible scenario for the interactions of TGF-α with its receptor than indicated in the previous publication from this laboratory (17) since Arg42, Leu48, and Leu49 are now present on the same face of the protein structure as the other residues that lose the highest percentage of NOEs upon receptor binding, i.e. those colored in red and green. As the regions of the truncated protein that exhibit the most significant structural changes are the A and C loops, it can easily be seen that perturbation of the conformation of these regions