Biophysical Evidence for Intrinsic Disorder in the C-terminal Tails of the Epidermal Growth Factor Receptor (EGFR) and HER3 Receptor Tyrosine Kinases*

The epidermal growth factor receptor (EGFR)/ErbB family of receptor tyrosine kinases includes oncogenes important in the progression of breast and other cancers, and they are targets for many drug development strategies. Each member of the ErbB family possesses a unique, structurally uncharacterized C-terminal tail that plays an important role in autophosphorylation and signal propagation. To determine whether these C-terminal tails are intrinsically disordered regions, we conducted a battery of biophysical experiments on the EGFR and HER3 tails. Using hydrogen/deuterium exchange mass spectrometry, we measured the conformational dynamics of intracellular half constructs and compared the tails with the ordered kinase domains. The C-terminal tails demonstrate more rapid deuterium exchange behavior when compared with the kinase domains. Next, we expressed and purified EGFR and HER3 tail-only constructs. Results from circular dichroism spectroscopy, size exclusion chromatography with multiangle light scattering, dynamic light scattering, analytical ultracentrifugation, and small angle X-ray scattering each provide evidence that the EGFR and HER3 C-terminal tails are intrinsically disordered with extended, non-globular structure in solution. The intrinsic disorder and extended conformation of these tails may be important for their function by increasing the capture radius and reducing the thermodynamic barriers for binding of downstream signaling proteins.

The epidermal growth factor receptor (EGFR)/ErbB family of receptor tyrosine kinases (RTKs) contains four member proteins: EGFR/ErbB1, HER2/ErbB2/neu, HER3/ErbB3, and HER4/ErbB4. These RTKs carry out important signaling functions via the sequential process of ligand binding by the extracellular domain, homo- or heterodimerization, activation of their intracellular kinase domain, and recruitment of downstream signaling proteins. These RTKs are also important oncogenic drivers in many breast, lung, and other human cancers (1). Several structural biology studies on the ErbB family have been published, and these have helped advance drug development for HER2-positive breast cancer (2). Protein crystallography studies published in 2004 showed the structure of pertuzumab bound to the extracellular domain of HER2 and lapatinib bound to the kinase domain of EGFR (3,4). Since that time, growing structural knowledge of how EGFR, HER2, and HER3 function at the atomic level has dramatically reshaped our understanding of RTKs (1,2). Despite these advances, there is a domain in each EGFR/ErbB family protein for which little structural biology information is available: the C-terminal tail (CTT) domain. The CTTs contain numerous autophosphorylation sites that are essential for recruiting downstream signaling proteins and initiating intracellular signaling (5,6). The CTT can also contribute to autoinhibition of the kinase domain of the RTK (7). The lack of available crystallographic information on the CTT region of EGFR/ErbB family RTKs led us to examine whether these regions lack a stable secondary and/or tertiary structure.
Intrinsically disordered regions (IDRs) represent an emerging area of interest in medicine. IDRs are regions within proteins that exhibit high flexibility and may lack a secondary or tertiary structure but are still able to carry out important biological functions (8-15). Algorithm prediction methods indicate that around 25-30% of eukaryotic proteins can be categorized as having disordered regions (16). These disordered regions provide certain advantages in protein-protein interactions, including a larger hydrodynamic radius (17,18), faster on- and off-rates of binding (19), high binding specificity (20), and the ability to adopt different conformations depending on the binding partner (9,21). Correlation studies have revealed a high propensity for disordered regions to undergo posttranslational modification, particularly phosphorylation (22). Because traditional methods, such as NMR and X-ray crystallography, were developed to study stable protein structures, IDRs have been harder to analyze, in part owing to difficulties in obtaining high concentrations or representative protein crystals (11,23). Correlation studies suggest a strong association between IDRs and human cancer-associated proteins (24); therefore, identifying and analyzing IDRs in cancer-related proteins is vital to understanding how they function.
Protein kinases demonstrate a high degree of specificity in facilitating phosphorylation; however, many are able to perform such interactions with multiple substrate partners (25). An analysis of the human kinome shows that as many as 83% of kinase genes contain IDRs, which could facilitate these multiple interactions. RTKs are involved in more protein-protein interactions than any other kinase group, thus potentially pointing to the involvement of IDRs in their recognition mechanisms (26). Multisequence alignment (27) of EGFR/ErbB family RTKs shows that the kinase domain sequences are highly conserved among all four family members. However, the CTT regions are highly divergent among EGFR/ErbB family members. One previous study using circular dichroism (CD) spectroscopy indicated that the EGFR CTT is rich in α-helical and β-sheet content (6), but a later study used coarse grained modeling to show many possible conformations of the EGFR CTT based on the assumption that it is naturally disordered (28). A high degree of flexibility in the CTT would provide distinct advantages to EGFR interaction with downstream Src homology 2 (SH2) and phosphotyrosine-binding domains during signaling functions (6).
In this work, we examine the biophysical properties of the CTT of EGFR and HER3. The motivation of this study is to gather structural information on the CTT region to determine whether the tails are highly dynamic, disordered regions. First, we measured the conformational dynamics of intracellular half (ICH) constructs of EGFR/ErbB family members using amide hydrogen/deuterium exchange mass spectrometry (HDX-MS). This information is used to compare the CTTs with the kinase domains. We also expressed and purified EGFR and HER3 CTT-only constructs and demonstrated that they are functional because they are recognized and phosphorylated by EGFR family kinases and once phosphorylated can be bound by the Grb2 SH2 domain. Using these CTT constructs, we performed multiple biophysical analyses, including CD spectroscopy, size exclusion chromatography with multiangle light scattering (SEC-MALS), dynamic light scattering (DLS), analytical ultracentrifugation (AUC), and small angle X-ray scattering (SAXS). The results of these methods support the hypothesis that the EGFR and HER3 CTTs are IDRs with extended, non-globular structure in solution.

Results
Disorder Predictions on EGFR, HER2, and HER3-We used computational algorithms to provide an initial survey of disordered regions in the EGFR/ErbB family of RTKs. Predictor of natural disordered regions (PONDR) is a collection of algorithms that use an amino acid sequence to predict native disorder (29,30). The VL-XT algorithm assigns a disorder propensity score on a residue by residue basis (29-31). The VL-XT prediction for EGFR, HER2, and HER3 can be seen in Fig. 1. PONDR scores greater than 0.5 indicate predicted disorder, and scores less than 0.5 indicate predicted order. Kinase domains of the three EGFR/ErbB family members show mostly predicted order, whereas the CTT in each shows large regions of predicted disorder. The amino acid composition of the CTT and kinase domains of both EGFR and HER3 is shown in Table 1. When comparing the CTTs with the kinase domains in both proteins, we observe that the CTTs are more enriched in polar, uncharged residues and prolines but relatively depleted in hydrophobic residues. For EGFR, polar, uncharged residues make up 30.1% of the CTT versus 14.9% of the kinase domain. Prolines make up 10.6% of the EGFR CTT versus 5.4% in the kinase domain, and hydrophobic residues (excluding tyrosine, which can become phosphorylated) are 27.4% of the EGFR CTT versus 38.9% of the kinase domain (Table 1). Similar values are seen with HER3 CTT (Table 1) despite its primary sequence divergence from EGFR CTT (22% sequence identity between EGFR and HER3 CTT; calculated with Clustal Omega) (74). The lowered hydrophobicity provides a simple explanation as to why the tails would not form a hydrophobic core and would therefore be disordered in an aqueous environment. These predictions are a useful tool for gaining a general view of disordered regions in a protein sequence, but direct empirical evidence would better support the hypothesis of the CTTs as being IDRs.

FIGURE 1. Disorder prediction by PONDR VL-XT algorithm. Shown are the disorder predictions for EGFR, HER2, and HER3 ICD construct sequences. In all three graphs, results for the kinase domain residues are colored blue, and the tail domain residues are colored in red. A score above 0.5 indicates predicted disorder, whereas a score below 0.5 indicates predicted order.
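The composition comparison and the 0.5 score threshold described above can be illustrated with a minimal sketch. The residue groupings below are assumptions made for illustration (the text states only that tyrosine is excluded from the hydrophobic class), and the function names are hypothetical; this is not the PONDR algorithm itself, only a way to summarize a sequence and a list of per-residue scores.

```python
from collections import Counter

# Residue classes are assumptions of this sketch; the source does not list
# the exact members of each class (only that tyrosine is not counted as
# hydrophobic because it can become phosphorylated).
POLAR_UNCHARGED = set("STNQ")
HYDROPHOBIC = set("AVLIMFWC")
PROLINE = set("P")


def composition(seq):
    """Return the percentage of residues in each class for a one-letter sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)

    def pct(group):
        return 100.0 * sum(counts[aa] for aa in group) / len(seq)

    return {
        "polar_uncharged": pct(POLAR_UNCHARGED),
        "proline": pct(PROLINE),
        "hydrophobic": pct(HYDROPHOBIC),
    }


def disordered_fraction(scores, threshold=0.5):
    """Fraction of residues whose per-residue disorder score exceeds the threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    return sum(1 for s in scores if s > threshold) / len(scores)
```

Applying `composition` separately to a kinase domain sequence and a tail sequence would give a side-by-side comparison of the kind summarized in Table 1, although the exact percentages depend on the class definitions chosen.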
Hydrogen/Deuterium Exchange Mass Spectrometry of EGFR/ErbB Family RTKs-HDX-MS is a highly useful technique to identify and study IDRs (32-36). HDX-MS provides kinetic information about the exchange rate of amide hydrogen atoms along the backbone of a protein for solvent deuterons. This rate of hydrogen/deuterium exchange is affected by factors such as the presence of strong hydrogen bonds, secondary structure, and tertiary structure (37-40). IDRs show very rapid hydrogen/deuterium exchange rates on the order of milliseconds to seconds (32-36). HDX-MS provides local information about different regions of the protein and readily distinguishes folded, globular regions from IDRs. We performed HDX-MS experiments by exposing the protein to deuterium oxide (D2O) over two ranges of time points, 108-2333 ms using quench flow and 5 s-2 h using manual labeling. HDX was quenched by lowering pH and rapidly cooling. This was followed by digesting the protein with immobilized pepsin and measuring the mass increase that results from incorporation of deuterons into peptic peptides by MS. In this study, we present HDX data as %n_ex, a percentage of the maximum observable mass increase. %n_ex is based on the number of amide hydrogens available for exchange and the H2O:D2O ratio during the labeling step.
We show that hydrogen/deuterium exchange occurs more rapidly in peptides from the CTT than those from the kinase domain (Fig. 2). Many peptides from the kinase domains (shown in blue) do not reach a maximum exchange plateau even within 2 h of exchange. However, peptides from the CTTs (shown in red) exchanged much more rapidly, often reaching a plateau before 5 s of exchange. This behavior was observed for all three ErbB ICH constructs analyzed. This indicates that the CTT region has faster hydrogen/deuterium exchange kinetics than the kinase domain with amide hydrogen protection from exchange being much weaker in the tail.
Comparisons between HDX data and the domain and secondary structure of EGFR, HER2, and HER3 are shown in Fig. 3. The heat map was constructed by measuring deuterium uptake in individual peptides and then, on a residue by residue basis across the entire sequence, averaging the %n_ex measurement of all peptides covering specific amino acid residues. Residues and time points with no acquired peptide coverage and proline residues are shown as gaps in the data. This averaging method simplifies the observation of variations in deuterium exchange protection for localized protein regions. In the EGFR heat map, for example, we observe low exchange across much of the kinase domain, indicated by blue coloring. Exceptions to this observation include regions within the αC-helix and activation loop (A-loop), which show more rapid exchange within the millisecond time scale of exchange, indicated by yellow colors. The most striking result can be seen within the EGFR CTT where we observe that exchange reaches a plateau very rapidly, often within 5 s of exchange, and the %n_ex values are indicated by orange and red colors on the heat map. This rapid exchange indicates that the tail is unprotected from exchange when compared with the kinase domain. The observed rapid exchange behavior is characteristic of a highly dynamic and/or frequently exchange-competent conformational state, meaning amide hydrogen bonding between residues is transient or weak in these regions.

Expression of EGFR and HER3 C-terminal Tails-To further characterize the conformational and physical state of the EGFR and HER3 CTTs, we expressed and purified EGFR and HER3 CTT-only constructs in a bacterial Escherichia coli system. EGFR residues 961-1186 with a C-terminal His tag were purified using nickel-nitrilotriacetic acid column chromatography and gel filtration. Estimated purity by SDS-PAGE was >95% (Fig. 4A) with a concentration of 5 mg ml⁻¹. The HER3 CTT, HER3 residues 981-1342, was also expressed in E. coli and purified to a concentration of 2 mg ml⁻¹ with >95% purity in the same manner (Fig. 4B).
Confirmation of EGFR C-terminal Tail Function-Phosphorylation of the CTT region and subsequent binding of SH2 domain-containing proteins are essential steps in EGFR signaling. We first validated the ability of our CTT constructs to be both recognized and phosphorylated at specific tyrosine sites by separate EGFR family kinase domains in solution just as they would be if they were part of their respective ICH constructs. To test whether the E. coli-expressed EGFR CTT construct is functional, we incubated it with recombinant EGFR kinase domain and measured tail phosphorylation using reaction with ATP and blotting with phosphospecific antibodies. We show, through Western blotting identification of specific phosphorylated tyrosine residues (EGFR Tyr-1068 and Tyr-1173) in the EGFR CTT (Fig. 4C), that the EGFR CTT is recognized and phosphorylated by the EGFR kinase domain even when these two domains are expressed as separated constructs. Furthermore, we used anti-HER3 Tyr(P)-1289 and general PY20 antibodies to verify that the HER3 CTT was phosphorylated by both the EGFR kinase domain and HER2 kinase domain via Western blotting (Fig. 4D).
Next, we tested whether EGFR CTT can be bound by an SH2 domain-containing protein. We recombinantly expressed GST-tagged Grb2 SH2 domain and performed a GST pulldown assay (Fig. 4E). Incubating EGFR CTT with EGFR kinase domain produced a phosphorylated CTT that appeared as multiple bands of higher molecular weight, and this phosphorylated EGFR CTT did not bind on its own to glutathione-agarose (Fig. 4E, flow-through in lane 1 and bound in lane 2). Similarly, incubating unphosphorylated EGFR CTT with the GST-Grb2 SH2 domain construct did not result in binding, and EGFR CTT eluted in the flow-through (lane 3). However, once phosphorylated, EGFR CTT was bound by GST-Grb2-SH2 and was pulled down by glutathione-agarose (Fig. 4E, lane 6). These experiments provide evidence for the functionality of the EGFR and HER3 CTT constructs.

FIGURE 4 (legend fragment). ... CTT is recognized and phosphorylated by the EGFR and HER2 kinase domains just as it would be if it were part of its ICH construct. Phosphospecific antibodies to HER3 Tyr(P)-1289 and general anti-phosphotyrosine PY20 were used to detect HER3 CTT phosphorylation. E, phosphorylated EGFR CTT binds to the Grb2 SH2 domain. GSH-agarose beads were preloaded with GST-tagged Grb2 SH2 domain protein. Phosphorylated or unphosphorylated EGFR CTT was then incubated with the beads for 2 h at 4°C, flow-through (FL) was collected and washed, and then the beads were boiled in sample buffer (Bound). Fractions were separated by SDS-PAGE and then transferred to nitrocellulose. Anti-His6-HRP antibody was used to detect EGFR CTT in the FL and bound fractions.
Analysis of EGFR CTT by CD Spectroscopy-Far-UV CD spectroscopy provides information about the secondary structure characteristics of protein molecules in solution based on the absorption of circularly polarized light by amide bonds in the polypeptide backbone. Because secondary structure types absorb circularly polarized light differently and each amide bond contributes to the UV absorption, the resulting spectrum will reflect a global average of the secondary structure content in a protein. Due to a lack of α-helix and β-sheet content, mostly or fully disordered and unfolded proteins will have CD spectra that are distinct from highly ordered proteins. The CD spectra between 190 and 260 nm for EGFR CTT and HER3 CTT are shown in Fig. 5, A and B, respectively. Distinctive features expected for a mostly β-strand protein would include a positive ellipticity maximum at 195 nm with a negative ellipticity minimum near 215 nm. For a mostly α-helical protein, one would expect two minima, one at 208 nm and the other at 224 nm, along with a pronounced maximum near 192 nm (15,41). In contrast, the CD spectrum observed for EGFR CTT shows a minimum between 195 and 200 nm, which is characteristic of disordered proteins (Fig. 5A). The slightly negative ellipticity at 222 nm nevertheless indicates the possibility of residual secondary structure. Analysis via the CDSSTR algorithm, an algorithm used to assign secondary structure composition to CD spectra, shows 77% unordered character in the EGFR CTT spectrum (Fig. 5C). These results suggest that much of the EGFR CTT construct has an unfolded conformation in solution. The CD spectrum for HER3 CTT shows features similar to those of the EGFR CTT with a minimum between 195 and 200 nm and slightly negative ellipticity at 222 nm (Fig. 5B). The HER3 CTT has mostly unordered content (60%) but slightly higher β-sheet content than EGFR CTT: regular β-strand (β_R) = 15% and distorted β-strand (β_D) = 7% for HER3 CTT as compared with β_R = 6% and β_D = 3% for EGFR CTT (Fig. 5C).
Size Exclusion Chromatography with Multiangle Light Scattering-Another property of intrinsically disordered regions is a higher than expected apparent molecular mass during gel filtration separations (42). SEC can be used to separate proteins based on their hydrodynamic size, which is increased in IDRs depending on the degree of conformational extension. Proteins with larger hydrodynamic size elute earlier than smaller proteins, and accurate, absolute measurement of molecular weight can be obtained by coupling SEC to a MALS detector (43). Fig. 6A shows the SEC elution chromatograms for both the EGFR CTT and carbonic anhydrase with the coupled MALS detection inserted above. Our analysis shows that EGFR CTT and carbonic anhydrase have similar molecular masses of 26 and 29 kDa, respectively, and both are monomeric under these conditions. EGFR CTT elutes from the SEC column before 13 ml of buffer volume. Despite having a lower molecular weight, the EGFR CTT elutes earlier than carbonic anhydrase, which elutes just after 15 ml of buffer. The expected Stokes radius of carbonic anhydrase is 2.4 nm (44). This earlier elution indicates that the EGFR CTT conformational ensemble is more extended than that of carbonic anhydrase, which has a globular structure. Fig. 6B shows the SEC-MALS chromatogram profiles for HER3 CTT and EGFR kinase domain. The molecular masses, determined by MALS, of the HER3 CTT and EGFR kinase domain were 42 and 38 kDa, respectively. HER3 CTT eluted at about 14 ml of buffer, which is earlier than the EGFR kinase domain, which eluted between 16 and 17 ml of buffer. As with EGFR CTT, no multimeric peaks were observed in the HER3 CTT chromatogram.

FIGURE 5. Circular dichroism spectroscopy. A, the CD spectrum for the EGFR CTT construct does not show prominent spectral features of α-helices and β-sheets. B, similarly, the HER3 CTT CD spectrum shows high unordered content with slightly higher calculated β-sheet content than in the EGFR CTT spectrum. C, secondary structure assignments use the following symbols: regular α-helix (α_R), distorted α-helix (α_D), regular β-strand (β_R), distorted β-strand (β_D), turns (T), and unordered (U). deg, degrees.
Dynamic Light Scattering-The DLS technique is used to analyze hydrodynamic properties of proteins in solution. Using the hydrodynamic radius, or Stokes radius, derived from measurements of translational diffusion coefficients, we are able to differentiate between a protein ensemble with mostly compact, globular conformations and one with more extended conformations as would be seen for IDRs (45). DLS is complementary to the static light scattering used in SEC-MALS in that DLS provides information on molecular size, whereas MALS provides a molecular weight and verification of a monomeric state. A histogram showing a distribution of measured hydrodynamic radii within the EGFR CTT and bovine serum albumin (BSA) populations is shown in Fig. 6C. The EGFR CTT, which has a molecular mass of 26 kDa, was determined to have a hydrodynamic radius of 5.3 nm. For comparison, we also analyzed BSA, a globular protein with a molecular mass of 66 kDa, and the hydrodynamic radius of BSA was measured to be 4.8 nm. This demonstrates that EGFR CTT has a larger hydrodynamic radius than a globular protein with more than 2.5 times its mass. Similarly, HER3 CTT has a very large hydrodynamic radius (Fig. 6D). The hydrodynamic radius (R_H) of HER3 CTT is 6.4 nm, although it has higher polydispersity than EGFR CTT.
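The relation between a translational diffusion coefficient and the hydrodynamic radius used in DLS is the Stokes-Einstein equation, R_H = k_B·T / (6π·η·D). The sketch below is illustrative only: the temperature and solvent viscosity defaults (20°C, water) are assumptions, because the measurement conditions are not given in this section, and the function names are hypothetical.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def hydrodynamic_radius_nm(diff_coeff_m2_s, temp_k=293.15, viscosity_pa_s=1.002e-3):
    """Stokes-Einstein radius (nm) from a translational diffusion coefficient (m^2/s).

    The default temperature and viscosity (water at 20 C) are assumptions.
    """
    radius_m = K_B * temp_k / (6.0 * math.pi * viscosity_pa_s * diff_coeff_m2_s)
    return radius_m * 1e9


def diffusion_coefficient_m2_s(radius_nm, temp_k=293.15, viscosity_pa_s=1.002e-3):
    """Inverse relation: diffusion coefficient (m^2/s) for a given Stokes radius (nm)."""
    return K_B * temp_k / (6.0 * math.pi * viscosity_pa_s * radius_nm * 1e-9)
```

Under these assumed conditions, a particle with the 5.3 nm radius reported for EGFR CTT diffuses more slowly than one with the 4.8 nm radius reported for BSA, even though the tail has the lower molecular mass.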
Analytical Ultracentrifugation-We used sedimentation velocity AUC to determine the sedimentation coefficient, s, of EGFR CTT, which provides information about the molecular weight of the tail, M_f, as it relates to its R_H (Fig. 6E). We also determined that our sample is homogenous due to the absence of secondary peaks in Fig. 6C as secondary populations or impurities would have different sedimentation velocities. From these AUC measurements, we obtained the frictional coefficient of the tail, f, which can be used to relate its molecular shape to that of a globular protein. The frictional ratio, f/f_min, provides an indication of the shape of a macromolecule based on the ratio of its measured frictional coefficient to a theoretical minimum frictional coefficient for a particle with minimum Stokes radius at a given molar mass (46). For a globular protein, the expected f/f_min would fall between 1.15 and 1.3 (47-49). We determined the f/f_min value for EGFR CTT to be much greater, at 1.77, which is consistent with this being an IDR. Similarly, HER3 CTT has an f/f_min value of 1.52, which is also greater than the expected frictional ratio for globular proteins (Fig. 6F).

FIGURE 6 (legend fragment). ... Again, the two molecules share similar molecular weights, but HER3 CTT elutes sooner than the EGFR KD. C, hydrodynamic radius determination of the EGFR CTT and BSA by dynamic light scattering. The measured DLS hydrodynamic radius distribution for EGFR CTT is shown as yellow bars. The measured hydrodynamic radius of EGFR CTT is 5.3 nm. The histogram for BSA (monomeric fraction obtained by gel filtration) is shown as red bars. The R_H of monomeric BSA is 4.8 nm. D, hydrodynamic radius determination of the HER3 CTT. The measured DLS hydrodynamic radius for HER3 CTT is 6.4 nm. E, frictional coefficient, frictional ratio, and molecular weight determined by analytical ultracentrifugation. The sedimentation coefficient was used to measure an apparent molecular mass, M_f, of 23.5 kDa. The frictional ratio for the EGFR CTT is 1.77, indicating a large Stokes radius relative to globular proteins. F, frictional coefficient, frictional ratio, and molecular weight determined by analytical ultracentrifugation. The measured apparent molecular mass of HER3 CTT is 31.6 kDa. The frictional ratio is 1.52.
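To make the frictional ratio concrete, the following sketch computes f from the Svedberg relation, f = M(1 − v̄ρ)/(N_A·s), and f_min from the Stokes drag on an anhydrous sphere of the same mass, f_min = 6π·η·R_min with R_min = (3M·v̄ / 4π·N_A)^(1/3). The partial specific volume, solvent density, and viscosity defaults are assumptions (typical protein and water values at 20°C), the function names are hypothetical, and this is not necessarily how the analysis software used in the study performs the calculation.

```python
import math

N_A = 6.02214076e23  # Avogadro constant, 1/mol


def f_min(mass_kda, vbar_ml_g=0.73, viscosity_pa_s=1.002e-3):
    """Frictional coefficient (kg/s) of an anhydrous sphere of the given mass.

    vbar and viscosity defaults are assumed values, not taken from the source.
    """
    mass_kg_mol = mass_kda            # 1 kDa corresponds to 1 kg/mol
    vbar_m3_kg = vbar_ml_g * 1e-3     # ml/g -> m^3/kg
    r_min = (3.0 * mass_kg_mol * vbar_m3_kg / (4.0 * math.pi * N_A)) ** (1.0 / 3.0)
    return 6.0 * math.pi * viscosity_pa_s * r_min


def frictional_coefficient(mass_kda, s_svedberg, vbar_ml_g=0.73, density_kg_m3=998.2):
    """Frictional coefficient (kg/s) from the Svedberg equation."""
    s_seconds = s_svedberg * 1e-13    # 1 S = 1e-13 s
    buoyancy = 1.0 - vbar_ml_g * 1e-3 * density_kg_m3
    return mass_kda * buoyancy / (N_A * s_seconds)


def frictional_ratio(mass_kda, s_svedberg, **solvent):
    """f/f_min for a particle of known mass and sedimentation coefficient."""
    vbar = solvent.get("vbar_ml_g", 0.73)
    rho = solvent.get("density_kg_m3", 998.2)
    eta = solvent.get("viscosity_pa_s", 1.002e-3)
    f = frictional_coefficient(mass_kda, s_svedberg, vbar, rho)
    return f / f_min(mass_kda, vbar, eta)
```

Because f is inversely proportional to s at fixed mass, an extended molecule sediments more slowly than a compact sphere of the same mass, which is what a ratio well above the globular range of 1.15 to 1.3 reflects.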
Small Angle X-ray Scattering-Small angle X-ray scattering methods have been particularly useful in analyzing conformational ensemble properties of IDRs (50). In small angle X-ray scattering analyses, we measured the isotropic scattering intensity, I(s), as a function of the momentum transfer, s, of monochromatic X-ray light scattered by protein macromolecules in solution. In protein structure studies, SAXS has been used to characterize structure and dynamics of monodisperse molecular species with a dominating structural configuration. IDRs show larger average sizes compared with globular proteins that contain tightly packed cores. By comparing the measured radius of gyration (R_g) and maximum dimension (D_max) from SAXS with those of known structured globular proteins, together with Kratky plots (I(s)·s^2 as a function of s), which characterize the flexibility state of the proteins, one is able to identify the flexibility of the IDRs (51). ... Table 2). The Kratky plot (Fig. 7C) of BSA shows a parabolic shape with a well defined peak. In contrast, EGFR CTT shows a hyperbolic shape on the Kratky plot, indicating an intrinsically disordered region (Fig. 7C). Fig. 7D shows the pair distribution function (P(r)) of EGFR CTT generated by Datgnom (52). The D_max of EGFR CTT is 156 ± 6 Å, which is much larger than that of BSA (Table 2). Thus, the R_g and D_max measurements, Kratky plot, and P(r) distribution of EGFR CTT indicate that the conformational ensemble adopted by EGFR CTT is extended and flexible.
When we analyzed the HER3 CTT with SAXS, we also included samples containing urea to determine whether a chemical denaturant would have an effect on the conformational ensemble. It is also important to note that, unlike in EGFR CTT buffer, 5% glycerol was also included in the HER3 CTT buffer. The glycerol was necessary to maintain protein solubility and stability of the HER3 CTT, but it may also act to stabilize conformations with diminished extendedness. We do not directly compare derived R_g values between EGFR CTT and HER3 CTT because of this difference in buffer conditions. The scattering profiles for HER3 CTT without urea and with 4 M urea are shown in Fig. 8A, and Guinier plot regions are shown in Fig. 8B. From these data, we calculated an R_g of 45.4 ± 0.4 Å in the absence of urea, and that increases slightly to 50.5 ± 1.2 Å upon the addition of 4 M urea (Table 2). The derived Kratky profiles for these two conditions are shown in Fig. 8C. Without urea present, the HER3 CTT produces a slightly parabolic Kratky profile, with a defined local maximum observed at lower momentum transfer values. Moving toward greater momentum transfer values then gives a more hyperbolic increase in I(s)·s^2. For HER3 CTT with 4 M urea, the Kratky profile has a hyperbolic shape indicative of disorder, similar to the profile observed for the EGFR CTT (Fig. 7C). The difference in Kratky plots between HER3 CTT with or without urea, together with increased R_g, indicates that HER3 CTT possesses characteristics of IDRs but not of random coil structures when in solution.
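The two SAXS quantities discussed above, R_g from the Guinier region and the Kratky transform, can be sketched as follows. This is a minimal illustration, assuming the common convention s = 4π·sin(θ)/λ and the Guinier approximation ln I(s) ≈ ln I(0) − (R_g²/3)·s² at low s; it does not reproduce the software used in the study, the function names are hypothetical, and choosing which low-s points to include in the fit is left to the caller.

```python
import math


def guinier_rg(s, intensity):
    """Least-squares Guinier fit: ln I(s) = ln I0 - (Rg^2 / 3) * s^2.

    Returns (Rg, I0). Rg is in the reciprocal units of s (e.g. Angstrom if s
    is in 1/Angstrom). Only points inside the Guinier region should be passed.
    """
    if len(s) != len(intensity) or len(s) < 2:
        raise ValueError("need at least two matching (s, I) points")
    x = [v * v for v in s]
    y = [math.log(i) for i in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("non-negative Guinier slope; Rg is undefined")
    intercept = mean_y - slope * mean_x
    return math.sqrt(-3.0 * slope), math.exp(intercept)


def kratky(s, intensity):
    """Kratky transform: I(s) * s^2 for each data point."""
    return [i * v * v for v, i in zip(s, intensity)]
```

Plotting the output of `kratky` against s gives the profile whose shape (a defined peak versus a continued rise at higher s) is used above to distinguish compact from disordered ensembles.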

Discussion
We have shown that the CTT domains in EGFR family proteins are intrinsically disordered regions that have an extended, highly dynamic conformational state. HDX-MS demonstrates that the CTT domains have a highly dynamic and exchange-competent conformation. CD spectroscopy shows that they have high unordered character as expected for IDRs. SEC and DLS demonstrate that the hydrodynamic size of the CTT is larger than that of protein standards of comparable molecular weight, which is another property of IDRs. AUC and the Kratky plot from SAXS demonstrate that the CTTs are IDRs, and R_g and D_max measurements from SAXS demonstrate that the CTTs have an extended conformation. The R_g and D_max values of the EGFR kinase domain were estimated from the crystal structure (Protein Data Bank code 1XKK), and we observed that the EGFR CTT had R_g and D_max values that were ~2.4 times larger than those of the EGFR kinase domain.
A striking finding on the EGFR and HER3 CTTs is that despite their low primary sequence homology (22% sequence identity) the structural features of the CTTs are strongly conserved. This implies that the intrinsic disorder of these tails is important for their function, and the multiple conformers that a disordered region can adopt could reduce the thermodynamic barriers for protein binding and thereby increase both the on- and off-rates for binding of SH2 domain-containing proteins. Furthermore, the extended conformation of the CTT could increase the capture radius for recruitment of these signaling proteins (53,54). Both of these mechanisms would facilitate signal propagation by these receptor tyrosine kinases.
The intrinsically disordered properties of these CTTs also explain why these regions are mostly absent from crystallographic studies on EGFR (7,55). Portions of the EGFR tail located close to the kinase domain have been shown to be  responsible for EGFR autoinhibition (56). The deletion of residues 982-1054 in the EGFR vIVb mutant is of particular importance in downstream signaling and kinase activation and has been shown to promote tumor formation. This region has been characterized via crystallography as containing an AP2-helix and other secondary structure elements, which appear to interact with the kinase domain in a manner that inhibits activation (4,55,56). In a study by Kovacs et al. (55), serial deletions of tail regions revealed specific regions important for autoinhibition as they relate to the oncogenic vIVb mutation. We observe increased protection from deuterium exchange for the EGFR 982-1054 region in the HDX-MS heat map (Fig. 3) relative to the rest of the C-terminal tail. The application of HDX-MS allows us to locally characterize conformational properties for protein regions that may not be easily analyzable with crystallography. In conclusion, these methods demonstrate that the EGFR and HER3 CTTs are IDRs with extended, non-globular structure in solution, and this finding may have important implications for the recruitment of downstream signaling proteins and signal propagation from these RTKs.
Hydrogen/Deuterium Exchange Mass Spectrometry-In HDX-MS experiments, we used a database of pepsin-digested peptide fragments identified via mass spectrometry to measure localized deuterium uptake in a protein. EGFR, HER2, and HER3 ICH construct stock solutions were each transferred into 200 µl of 0.1% TFA to a concentration of 1 µM and passed through an immobilized pepsin column (59). 30 µl of each collected digested protein eluent was analyzed by electrospray tandem mass spectrometry (MS/MS) using an LTQ FT Ultra mass spectrometer (Thermo Finnigan). A Mascot peptide search was used to build three peptic peptide databases, one for each protein, for subsequent MS analyses.
In the HDX experiments, EGFR, HER2, and HER3 ICH stocks were diluted into an H2O solvent protein storage buffer of 20 mM HEPES, 150 mM NaCl, pH 7.4. The concentration of protein prior to labeling was between 0.1 and 0.2 mg ml⁻¹ for all experiments. A matching D2O solvent labeling buffer with 20 mM HEPES, 150 mM NaCl, pD 7.4, was used to label the protein samples for a set exposure time at room temperature.
HDX samples were prepared in triplicate as two sets. The first set was prepared via quench flow (QF) using a 1:5 volume ratio between protein sample in H2O solvent protein storage buffer and D2O labeling buffer with D2O exposure times between 108 and 2333 ms (36). For the second set of samples with D2O exposure times between 5 and 7200 s, protein samples were diluted in a 1:20 ratio of 2 µl of H2O protein storage buffer and 40 µl of D2O labeling buffer via manual labeling (ML).
In each sample, following the measured exposure time for exchange, a quench solution of 0.4% formic acid and 3 M urea, equal in volume to the D2O buffer, was used to lower the sample pH to about 2.6 to halt the forward exchange reaction and minimize back-exchange of deuterium-labeled amides for solvent hydrogen. Quenched samples were then flash frozen in liquid nitrogen and stored at −80°C until we were ready to run online digestion and HPLC separation. Pepsin digestion was performed via an immobilized pepsin column. Separation of peptides was accomplished using a water/acetonitrile gradient across a Poroshell 120 EC-C18 2.1 × 50-mm analytical column (Agilent Technologies) prior to mass spectrometry analysis on a Bruker MaXis 4G electrospray ionization quadrupole time-of-flight mass spectrometer (Bruker Corp.). All chromatography solvents contained 0.1% formic acid.
HDX-MS data were analyzed using HDExaminer software (Sierra Analytics). A measurement of deuterium uptake, Δm, was made by calculating the average mass for a specific peptide's isotopic envelope at each HDX time point, m_t, and subtracting the average mass of an undeuterated control's isotopic envelope, m_und. Samples were analyzed in triplicate, and mass increase measurements presented throughout this work are the average uptake measurements for available data from these triplicate sample runs, Δm_avg. The maximum number of exchangeable amide hydrogens, n_amide, was calculated for each observed peptic peptide in the peptide databases. The first two residues are not included among the exchangeable amide hydrogens due to rapid back-exchange of the deuterium label for these residues (32,60,61). Similarly, proline residues are excluded because they lack a peptide chain amide hydrogen.
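The uptake and amide-count rules described above can be sketched in a few lines. This is an illustration only, not the HDExaminer implementation; the function names, the example peptide sequence, and the example masses are all hypothetical.

```python
def n_amide(peptide):
    """Maximum number of exchangeable backbone amide hydrogens.

    The first two residues are skipped (rapid back-exchange), and
    prolines are skipped because they have no backbone amide hydrogen.
    """
    return sum(1 for residue in peptide[2:] if residue != "P")


def delta_m_avg(m_t_replicates, m_und):
    """Average uptake over the available replicates: mean of (m_t - m_und)."""
    uptakes = [m_t - m_und for m_t in m_t_replicates]
    return sum(uptakes) / len(uptakes)


# Hypothetical 11-residue peptide containing two prolines.
example_peptide = "LVEPLTPSGEA"
example_n_amide = n_amide(example_peptide)

# Hypothetical centroid masses (Da) for three replicates at one time point.
example_uptake = delta_m_avg([1203.9, 1204.1, 1204.0], m_und=1200.0)

print(example_n_amide, round(example_uptake, 2))
```

Passing only the replicates that actually produced data to `delta_m_avg` mirrors the statement that averages were taken over available data from the triplicate runs.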
A deuterium uptake heat map for EGFR, HER2, and HER3 was generated using the residue-by-residue average, ⟨%n_ex⟩, of representative peptides' %n_ex measurements at each time point (Fig. 3) (36, 62). This residue-specific averaging shows more localized HDX information across entire ICH protein sequences and accounts for overlapping peptide data. Peptides were not weighted based on their respective lengths. The sequences of EGFR, HER2, and HER3 were each aligned to crystallographic secondary structure assignments of the respective kinase domains. The crystal structures used were Protein Data Bank codes 3PP0 for HER2, 3KEX for HER3, and 2GS2 for EGFR (63-65).
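One way to picture the residue-by-residue averaging is the sketch below. It assumes that each peptide contributes its %n_ex value equally to every residue it spans and that overlapping peptides are averaged without length weighting; the exact residue assignment used in the original analysis is not specified here, and the function name and example values are hypothetical.

```python
from collections import defaultdict


def residue_average(peptides):
    """Unweighted per-residue mean of peptide-level %n_ex values.

    peptides: iterable of (start, end, pct_ex) with 1-based, inclusive
    residue numbers. Returns {residue_number: mean pct_ex}.
    """
    covering = defaultdict(list)
    for start, end, pct_ex in peptides:
        for residue in range(start, end + 1):
            covering[residue].append(pct_ex)
    return {res: sum(vals) / len(vals) for res, vals in sorted(covering.items())}


# Two hypothetical overlapping peptides at a single time point.
heat_map_row = residue_average([(1, 5, 20.0), (4, 8, 40.0)])
print(heat_map_row)
```

Residues covered by both peptides receive the mean of the two values, which is how overlapping peptide data sharpen the localization of exchange along the sequence.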
Expression and Purification of EGFR CTT and HER3 CTT-DNA encoding residues 961-1186 of human EGFR was cloned into a pET30b vector using NdeI and XhoI restriction sites in-frame with a C-terminal His tag in the vector. The E. coli strain BL21(DE3) was used for expression of the construct. Transformed cells were grown in Luria-Bertani medium containing 50 μg ml⁻¹ kanamycin at 37°C until an OD600 of between 0.5 and 0.8 was reached. Expression was induced through the addition of 0.5 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) for 3 h at 37°C. Bacterial cells were centrifuged, and the pellets were frozen. Cell lysis was performed at 4°C with a probe sonicator on frozen pellets using 50 mM sodium phosphate, pH 8.0, 300 mM NaCl, 20 mM imidazole, 1 mM phenylmethylsulfonyl fluoride (PMSF), 1 mM β-mercaptoethanol, Roche Applied Science protease inhibitor mixture tablets, and 4 mM benzamidine·HCl. The lysate was clarified by centrifugation at 48,000 × g for 30 min at 4°C. Purification was performed using Ni²⁺-chelating chromatography followed by gel filtration chromatography on a Superdex-75 column. The protein was concentrated to 5 mg ml⁻¹ in 20 mM phosphate, 150 mM sodium chloride, pH 8.0. Aliquots were snap frozen in liquid N₂ and stored at −80°C. Protein concentration was determined using UV-visible spectroscopy with a molar extinction coefficient of EGFR CTT at 280 nm of ε = 18,910 M⁻¹ cm⁻¹. For specific subsequent applications, the addition of DTT or tris(2-carboxyethyl)phosphine to EGFR CTT samples was critical. A similar procedure was followed to express and purify a HER3 CTT construct with residues 981-1342. The HER3 CTT was purified at a concentration of 2 mg ml⁻¹ in 20 mM Tris, 300 mM sodium chloride, 5% (w/v) glycerol, pH 8.0. HER3 CTT concentration was determined using a molar extinction coefficient at 280 nm of ε = 25,245 M⁻¹ cm⁻¹.
C-terminal Tail Function via Western Blotting Analysis-Two EGFR tail region-specific phosphotyrosines were analyzed by Western blotting. Following the phosphorylation reaction, EGFR CTT samples were separated by SDS-PAGE on a 10% polyacrylamide gel followed by electroblotting onto nitrocellulose membrane. Two anti-EGFR phosphotyrosine primary antibodies were used in separate Western blotting analyses: 1:1000 anti-EGFR Tyr(P)-1068 and 1:5000 anti-EGFR Tyr(P)-1173. The secondary antibody was 1:5000 anti-rabbit in both blots. The results of the Western blotting analyses are shown in Fig. 4C. The positive control was analyzed on the same two Western blots as the EGFR CTT samples. Bands from this positive control were observed on both Western blots but are not shown herein.
HER3 CTT was phosphorylated under the same buffer conditions as the EGFR CTT above. HER3 CTT at 0.50 μM was incubated in the presence of 0.50 μM EGFR or 0.50 μM HER2 kinase domains. Negative controls containing only CTT or only kinase domain were incubated under the same conditions. Positive controls pairing HER3 ICH with EGFR kinase domain or HER2 kinase domain were also incubated. Samples were separated by SDS-PAGE and electroblotted onto nitrocellulose membrane. A 1:1000 anti-HER3 Tyr(P)-1289 primary antibody with a 1:10,000 anti-rabbit secondary antibody was used for the first blot. A 1:1000 general anti-phosphotyrosine antibody, PY20, was used for the second blot with a 1:10,000 anti-mouse secondary antibody. The two Western blots are shown in Fig. 4D.
Interaction of Phosphorylated EGFR CTT and GST-Grb2-SH2 by Pulldown Assay-EGFR CTT was phosphorylated by EGFR KD in vitro for 60 min at room temperature. 10 μg of GST-Grb2-SH2 was prebound to a 10-μl bed volume of glutathione-agarose beads. 2 μg of either phosphorylated CTT or non-phosphorylated CTT was added to the beads in a volume of 100 μl of buffer and incubated for 2 h at 4°C. Beads were then spun briefly, and the flow-through (FL) was collected. This was followed by two washes with buffer. Beads (bound) and aliquots of the FL were then boiled in SDS-PAGE sample buffer. Western blotting was performed on the fractions using anti-His6-HRP antibody (Thermo Fisher, MA1-21314-HRP) as a probe.
Circular Dichroism Spectroscopy-EGFR CTT protein was prepared at a concentration of 5 μM in 10 mM sodium phosphate, pH 7.4, 5 mM NaCl. HER3 CTT was prepared under the same conditions. A 0.1-cm pathlength quartz cell was used for samples and buffer blanks. Cells were placed in a Jasco CD spectrometer set at room temperature. The cell in the CD spectrometer was allowed to equilibrate for 15 min with nitrogen flowing into the CD spectrometer before scans. The CD spectrometer scanned from 260 to 190 nm at 20 nm min⁻¹. Five individual scans were averaged, and the buffer blank was subtracted before analysis.
The CD spectra were analyzed on the DichroWeb online analysis website (41,66). The method chosen for this analysis used the CDSSTR algorithm with reference database 7 as the reference protein set (67-70). Secondary structure assignment was divided into six categories: regular α-helix (α_R), distorted α-helix (α_D), β_R, β_D, turns, and unordered. CDSSTR structure assignment results are presented in Fig. 5C.
Size Exclusion Chromatography with Multiangle Light Scattering-100 μl of EGFR CTT or HER3 CTT (A280 ~ 1) was loaded onto a pre-equilibrated Superdex-75 10/300 GL (GE Healthcare) column. Elution buffer of 20 mM sodium phosphate, pH 8.0, 150 mM NaCl, 0.5 mM DTT was used for EGFR CTT, and 20 mM Tris, 300 mM sodium chloride, 5% (w/v) glycerol, pH 8.0 was used for HER3 CTT. The buffer flow rate was 0.5 ml min⁻¹ for the EGFR CTT and carbonic anhydrase elutions. For the HER3 CTT and EGFR KD elutions, the buffer flow rate was 0.3 ml min⁻¹. The column was coupled to a Wyatt Optilab rEX and Dawn Helios II, which are refractive index and multiangle light scattering detectors, respectively. Carbonic anhydrase was used as a globular reference for the EGFR CTT elution because of the similar molecular weights of the proteins. Similarly, the EGFR KD was used as a reference for the HER3 CTT elution. Astra (Wyatt) software was used to calculate the molar mass.
Dynamic Light Scattering-EGFR CTT samples at a concentration of 0.5 mg ml⁻¹ in 20 mM Tris, 150 mM NaCl, 1 mM DTT, pH 8.0, were centrifuged for 5 min. 12 μl of sample was aliquoted into a clean microcuvette, which was placed into a Wyatt Dynapro MSX. The cuvette was equilibrated for 5 min at 25°C before data acquisition. BSA was used as a standard at 1.0 mg ml⁻¹ in the same buffer. Dynamics software was used to determine the radius of the protein.
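DLS software generally reports a hydrodynamic radius by converting a measured translational diffusion coefficient with the Stokes-Einstein relation, R_h = k_B T / (6πηD). The sketch below illustrates only that conversion; it is not the Dynamics implementation. The viscosity is taken as that of pure water at 25°C, which is an assumption (the actual buffer will differ slightly), and the diffusion coefficient in the example is hypothetical.

```python
import math

BOLTZMANN = 1.380649e-23      # J/K
WATER_VISCOSITY_25C = 8.9e-4  # Pa*s; assumed, pure water rather than buffer


def hydrodynamic_radius(diffusion_coeff, temperature_k=298.15,
                        viscosity=WATER_VISCOSITY_25C):
    """Stokes-Einstein radius in meters from a diffusion coefficient in m^2/s."""
    return BOLTZMANN * temperature_k / (6 * math.pi * viscosity * diffusion_coeff)


# Hypothetical diffusion coefficient, for illustration only.
example_radius_m = hydrodynamic_radius(6.0e-11)
print(f"R_h = {example_radius_m * 1e9:.2f} nm")
```

Because the radius scales inversely with the diffusion coefficient, an extended, disordered chain diffuses more slowly and appears larger than a globular protein of the same mass.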
Analytical Ultracentrifugation-Samples were dialyzed versus buffer (20 mM sodium phosphate, pH 8.0, 150 mM NaCl, 1 mM DTT) at 4°C overnight. The samples and buffer were recovered and briefly centrifuged before loading into centrifuge cells. The cells were balanced and then placed in an eight-position rotor, which was put into a Beckman XLA ultracentrifuge. The speed was 42,000 rpm, and the temperature was 25°C. The SEDFIT program was used to analyze the data, and the resulting c(s) analysis is shown in Fig. 6, E and F (71).
Small Angle X-ray Scattering Analysis-SAXS experiments were performed on the SIBYLS beamline 12.3.1.2 at the Advanced Light Source, a national user facility operated by the Lawrence Berkeley National Laboratory (Berkeley, CA) and supported by the Director, Office of Science, Office of Basic Energy Sciences of the United States Department of Energy under Contract DE-AC02-05CH11231 (72). To optimize the data quality and minimize radiation damage, exposure series of 0.5, 1, 2, and 5 s were performed. The concentration of EGFR CTT (20 mM phosphate, pH 8.0, 150 mM NaCl) used for these experiments ranged from 0.9 to 3.7 mg ml⁻¹. BSA in 20 mM phosphate, pH 8.0, 150 mM NaCl was used as a standard in the range of 1.0-3.0 mg ml⁻¹. The concentration of HER3 CTT (20 mM Tris-HCl, pH 8.0, 300 mM NaCl, 5% glycerol) ranged from 0.9 to 1.5 mg ml⁻¹. A minimum of three different concentrations for each protein were used for data collection, and five different concentrations were used for EGFR CTT.
The data sets were processed using standard procedures for ATSAS programs (52). At low angles, the scattered intensities of EGFR CTT and HER3 CTT were very well approximated by the Guinier law, although HER3 CTT showed signs of radiation damage at 5-s exposures. Before further analysis, the scattering curves free of radiation damage from a given type of sample with the same protein concentration but different exposure times were merged with either PRIMUS (52) or SCÅTTER (SIBYLS Beamline at Lawrence Berkeley National Laboratory) and then averaged among the same type of sample after normalization by their individual protein concentrations. The average scattering curves are presented in Figs. 7A and 8A. The R_g from data sets free of radiation damage was calculated using AutoRg and is presented in Table 2. We evaluated the molecular weight of the sample by comparing the forward scattering I(0) with that from a reference solution of BSA. The R_g and D_max values for EGFR kinase domain (Protein Data Bank code 1XKK) were calculated using the program CRYSOL (73). The pair distribution functions for EGFR CTT and BSA were generated using DATGNOM (52).
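The Guinier law states that at low angles I(q) ≈ I(0) exp(−q²R_g²/3), so a straight-line fit of ln I(q) against q² yields R_g from the slope and I(0) from the intercept. The sketch below shows that fit and the concentration-normalized I(0) comparison against a reference protein. It is a simplified illustration, not AutoRg: it does not select the fitting range automatically, it assumes the sample and reference share the same scattering contrast and partial specific volume, and the synthetic data and function names are hypothetical.

```python
import math


def guinier_fit(q, intensity):
    """Least-squares fit of ln I(q) vs q^2; returns (R_g, I0).

    The caller is responsible for passing only low-angle points.
    """
    x = [qi ** 2 for qi in q]
    y = [math.log(ii) for ii in intensity]
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("Non-negative Guinier slope; R_g is undefined.")
    intercept = y_mean - slope * x_mean
    return math.sqrt(-3.0 * slope), math.exp(intercept)


def mw_from_i0(i0_sample, conc_sample, i0_ref, conc_ref, mw_ref):
    """Molecular weight from I(0)/c relative to a reference of known mass."""
    return mw_ref * (i0_sample / conc_sample) / (i0_ref / conc_ref)


# Synthetic, noise-free curve with R_g = 30 (units of 1/q) and I(0) = 100.
q_values = [0.005 + 0.0025 * i for i in range(15)]
i_values = [100.0 * math.exp(-(qi * 30.0) ** 2 / 3.0) for qi in q_values]
fit_rg, fit_i0 = guinier_fit(q_values, i_values)

# Hypothetical I(0) values; concentrations in mg/ml, reference mass in kDa.
example_mw = mw_from_i0(2.0, 1.0, 4.0, 1.0, mw_ref=66.4)
print(round(fit_rg, 3), round(fit_i0, 3), round(example_mw, 1))
```

For flexible or disordered chains the Guinier approximation holds over a narrower q range than for compact globular proteins, so the choice of fitting window matters more for the CTT samples than for the BSA standard.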