Hairpin RNA-induced conformational change of a eukaryotic-specific lysyl-tRNA synthetase extension and role of adjacent anticodon-binding domain


Human lysyl-tRNA synthetase (hLysRS) is essential for aminoacylation of tRNA Lys. Higher eukaryotic LysRSs possess an N-terminal extension (Nterm) previously shown to facilitate high-affinity tRNA binding and aminoacylation. This eukaryote-specific appended domain also plays a critical role in hLysRS nuclear localization, thus facilitating noncanonical functions of hLysRS. The structure is intrinsically disordered and therefore remains poorly characterized. Findings of previous studies are consistent with the Nterm domain undergoing a conformational transition to an ordered structure upon nucleic acid binding. In this study, we used NMR to investigate how the type of RNA, as well as the presence of the adjacent anticodon-binding domain (ACB), influences the Nterm conformation. To explore the latter, we used sortase A ligation to produce a segmentally labeled tandem-domain protein, Nterm-ACB. In the absence of RNA, Nterm remained disordered regardless of ACB attachment. Both alone and when attached to ACB, Nterm structure remained unaffected by titration with single-stranded RNAs. The central region of the Nterm domain adopted α-helical structure upon titration of Nterm and Nterm-ACB with RNA hairpins containing double-stranded regions. Nterm binding to the RNA hairpins resulted in CD spectral shifts consistent with an induced helical structure. NMR and fluorescence anisotropy revealed that Nterm binding to hairpin RNAs is weak but that the binding affinity increases significantly upon covalent attachment to ACB. We conclude that the ACB domain facilitates induced-fit conformational changes and confers high-affinity RNA hairpin binding, which may be advantageous for functional interactions of LysRS with a variety of different binding partners.
Aminoacyl-tRNA synthetases (ARSs) catalyze the addition of amino acids to their cognate tRNA substrates. These essential enzymes are critical for maintaining high fidelity in protein synthesis. Human lysyl-tRNA synthetase (hLysRS) is responsible for aminoacylation of all tRNA Lys isoacceptors with lysine. This enzyme is part of a large multi-ARS complex (MSC) containing nine synthetase functions and three other cellular factors known as ARS complex-interacting multifunctional protein 1 (AIMP1), AIMP2, and AIMP3 (1,2). Within the MSC, LysRS is present as a tetramer (dimer of dimers) but is a functional dimer in its aminoacylation activity (3).
In addition to their essential role in protein synthesis, most eukaryotic ARSs both inside and outside the MSC have been shown to play critical roles in a wide variety of noncanonical cellular functions (4)(5)(6)(7)(8)(9). Stimulation of mammalian cells by a variety of signals has been shown to trigger release of hLysRS from the MSC and relocalization. For example, Shiga toxins trigger secretion of LysRS, which increases the production of proinflammatory molecules (10); IgE stimulation of mast cells triggers phosphorylation of LysRS on Ser 207 , release from the MSC, and trafficking to the nucleus where it activates transcription of specific genes (11); a similar transformation of LysRS has recently been reported to occur upon epidermal growth factor receptor signaling in human lung cancer (12). Following stimulation of human cells with laminin, LysRS is phosphorylated on Thr 52 , released from the MSC, and trafficked to the plasma membrane, where it interacts with the laminin receptor, protecting it from ubiquitin-mediated degradation (13).
Human LysRS also plays a role in the HIV-1 life cycle (14). HIV-1 reverse transcriptase uses human tRNA Lys3 as a primer to initiate reverse transcription. This tRNA, as well as the other Lys isoacceptor, tRNA Lys1,2 , are selectively incorporated into HIV-1 particles (15) together with hLysRS (16)(17)(18). Recently, it was reported that HIV-1 infection results in a free pool of phosphorylated LysRS/tRNA Lys that is available for interaction with the retroviral packaging machinery (19). A portion of the free LysRS is also relocalized to the nucleus of target cells, which is consistent with the proposed Ser 207 phosphorylation site, although the function of nuclear LysRS in the HIV-1 life cycle is not yet clear. Collectively, these findings indicate the potential for hLysRS and other ARSs to serve as therapeutic targets for infectious disease (20,21).
Human LysRS is a class IIb ARS consisting of three domains. The tRNA anticodon sequence is a major recognition element for hLysRS, and the anticodon-binding domain (ACB) is responsible for specific tRNA Lys binding (22). The C-terminal catalytic domain carries out amino acid activation and tRNA aminoacylation. Sequence alignments of LysRS from all domains of life reveal a high degree of conservation, with a 70-residue N-terminal extension (Nterm) found only in higher eukaryotic enzymes (23,24).
This article contains supporting information.
Nterm plays key roles in both canonical and noncanonical hLysRS functions. The N-terminal extension was shown to be responsible for increased aminoacylation activity, primarily because of improved tRNA-binding affinity, although it is not directly involved in catalysis (3,25). An acceptor stem-derived minihelix Lys is only charged by the full-length enzyme, not by a variant lacking Nterm; thus, it was proposed that the function of the Nterm is to improve the docking of the CCA end of tRNA into the active site (25). Moreover, covalent continuity between Nterm and the core catalytic domain is required to confer robust binding (26). With respect to noncanonical functions, Nterm contains the nuclear localization signal (27), and deletion of this domain negatively impacts the recruitment of tRNA Lys3 into HIV-1 particles (17). These data suggest that this polypeptide extension plays a key role in HIV-1 infectivity and other nuclear functions of hLysRS. The potential function of this domain as a mediator of synthetase interactions within the MSC has also been proposed previously (28,29).
NMR and X-ray structures of eukaryotic Nterms of other class IIb synthetases have been reported previously (30)(31)(32)(33). A 23-residue peptide corresponding to residues 30-52 of Saccharomyces cerevisiae aspartyl-tRNA synthetase (AspRS) was shown to transition from a random coil to a more α-helical structure only after the addition of polyphosphate or trifluoroethanol (30). A similar structural transition was reported for a 21-residue peptide derived from human AspRS Nterm (31). An NMR-derived structure of a 110-residue polypeptide corresponding to the Nterm of Brugia malayi asparaginyl-tRNA synthetase (AsnRS) indicated that this peptide extension adopts a mostly α-helical fold with some β-strand structure (32). The X-ray structure of the Nterm of human AsnRS reveals that it is structurally quite similar to the folded structure of the B. malayi AsnRS, despite low sequence similarity (<27%) (33). Aside from the SKXXLKKXXK motif reported previously (34), there is little sequence identity between the above-mentioned class IIb Nterm domains by global alignment (<31%) (35).
Despite the functional significance of the hLysRS Nterm domain in both canonical and noncanonical roles, its three-dimensional structure has remained poorly characterized. This domain was absent from the crystallized form of hLysRS used for X-ray structure determination (3). Thus, the structure of hLysRS Nterm, free or in complex with binding partners, is still unknown. Although this extension is predicted to adopt a helical structure (26), spectroscopic studies have shown that the domain is mostly disordered when free in solution and only becomes helical upon interaction with trifluoroethanol (36), a short RNA hairpin (37), or polyphosphate (38).
Nterm binding to a short RNA derived from the anticodon domain of human tRNA Lys3 (anticodon stem-loop, or ACSL) was determined previously by NMR to induce helical Nterm structure; however, the RNA specificity of this effect has not been extensively examined (37). Additionally, the impact of appending Nterm to the adjacent ACB domain of hLysRS, which is the naturally occurring context for Nterm functional interactions, is unknown. Here, we investigated these questions through use of a tandem domain comprised of Nterm linked to the ACB domain (Nterm-ACB).
To specifically monitor the NMR signals from the Nterm domain only within the two-domain construct, a segmental labeling procedure must be employed. An enzymatic ligation procedure using sortase A was chosen for this purpose (39,40) because sortase A ligation is carried out using gentle and physiologically compatible conditions. Furthermore, several enzymatically optimized sortase A variants have been produced to facilitate increased efficiency of this type of protein ligation (41)(42)(43). The overall ligation process involves the expression and purification of the individual protein domains, wherein the domain of interest is isotopically enriched, followed by ligation of them in vitro to produce a segmentally labeled protein construct (44,45).
In this paper, the structural transition of this eukaryotic extension is investigated to determine its dependence upon the type of RNA, as well as its covalent attachment to its native neighboring domain, the ACB domain. In this way, we can evaluate whether the Nterm structural change is truly nonspecific and whether it depends upon its adjacent ACB domain. Overall, this study provides new insights into RNA-induced structural changes in a eukaryotic-specific ARS domain, with implications for both canonical function and novel therapeutic applications.

Nterm assembly state in solution
All of the hLysRS-derived constructs used in this work are shown in Fig. S1. The assembly state of hLysRS Nterm in solution was assessed by sedimentation velocity analytical ultracentrifugation. For these experiments, NtermW, a UV-visible form of Nterm with a single Trp residue appended to its C terminus, was used (see "Experimental procedures"). NtermW sedimented as a single monomeric species, with a corrected sedimentation coefficient s20,w of 0.98 S and an estimated molecular mass of 7.1 kDa (Fig. 1A). The fitted frictional ratio was 1.36, indicating that NtermW sediments as a relatively compact protein. No significant population of higher-order multimers or aggregate species was observed.
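The relationship between the reported sedimentation coefficient, molecular mass, and frictional ratio can be checked with the Svedberg and Stokes equations. The following is a minimal sketch, not the fitting procedure used in this work; the partial specific volume (0.73 cm^3/g) and the density and viscosity of water at 20 °C are generic assumed values, and the function name is hypothetical.

```python
import math

AVOGADRO = 6.02214076e23  # mol^-1


def frictional_ratio(s_svedberg, mass_da, vbar=0.73, rho=0.9982, eta=0.01002):
    """Estimate f/f0 from s20,w and molar mass, in CGS units.

    vbar (cm^3/g), rho (g/cm^3) and eta (poise) are assumed generic values,
    not parameters taken from the paper.
    """
    s = s_svedberg * 1e-13  # Svedberg units -> seconds
    # Svedberg equation rearranged for the frictional coefficient f (g/s)
    f = mass_da * (1.0 - vbar * rho) / (AVOGADRO * s)
    # Radius (cm) of an anhydrous sphere with the same mass and specific volume
    r0 = (3.0 * mass_da * vbar / (4.0 * math.pi * AVOGADRO)) ** (1.0 / 3.0)
    # Stokes' law for the frictional coefficient of that sphere
    f0 = 6.0 * math.pi * eta * r0
    return f / f0


# Values reported for NtermW: s20,w = 0.98 S, mass = 7.1 kDa
ratio = frictional_ratio(0.98, 7100.0)
print(f"f/f0 = {ratio:.2f}")
```

Under these assumptions the computed ratio falls close to the fitted value quoted above, which is expected because the mass estimate in a sedimentation velocity analysis is itself derived from s and f/f0.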
Nterm conformational changes upon RNA binding
[…] (37). In contrast, the resonances corresponding to the ACB domain are well-dispersed and highly resolved, consistent with a well-structured α/β OB-fold protein (3,46).
NMR studies were next conducted to assess the effects of RNA binding upon the backbone structure of Nterm. Titration of Nterm with the nonanucleotide U9 resulted in only minor perturbations of the Nterm resonances (Fig. 2A). Similarly, titration of NtermW (functionally similar to Nterm, see "Experimental procedures") with C9 resulted in only very small changes (Fig. S2C). Thus, binding to single-stranded RNAs has little impact on overall Nterm structure.
We next tested the effect of hairpin RNAs derived from human tRNA Lys3 on Nterm structure. One RNA was derived from the acceptor-TΨC stem (ACC), and the other was derived from the ACSL region of this tRNA (Fig. 3). In contrast to the results obtained for U9 and C9, titration of Nterm with the ACC and ACSL RNAs resulted in significant changes in the Nterm HSQC spectrum (Fig. 2, B and C). As previously reported (37), nearly half (47%) of the Nterm residues exhibited significant backbone resonance changes (Δobs […] Nterm titration with ACC yielded similar effects (Fig. 2C); nearly half of the Nterm backbone resonances exhibited fast exchange effects upon ACC titration, and the residues most strongly affected include Lys 17 -Leu 48 (Δobs > 0.02). Some of the most resolved resonances in this region are indicated in the HSQC spectra (Fig. 2C, arrows). Thus, despite their sequence and length differences, both of the tRNA Lys3 -derived hairpins resulted in a similar pattern of chemical shift perturbations (CSPs) upon Nterm binding. Within the region of residues Lys 17 -Ala 51 , the ACC CSP values were relatively larger than those of the corresponding ACSL values.
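A combined 1H/15N chemical shift perturbation of the kind thresholded above can be sketched as follows. The exact formula used for the observed perturbation is given in the Experimental procedures and is not reproduced in this section, so the weighted Euclidean form and the 15N weighting factor of 0.14 below are assumptions (a commonly used convention), and the function names and shift values are hypothetical illustrations rather than data from this study. Only the 0.02 cutoff comes from the text.

```python
import math


def combined_csp(d_h_ppm, d_n_ppm, n_weight=0.14):
    """Weighted Euclidean 1H/15N perturbation (assumed form, not from the paper)."""
    return math.sqrt(d_h_ppm ** 2 + (n_weight * d_n_ppm) ** 2)


def perturbed_residues(shifts_free, shifts_bound, threshold=0.02):
    """Return {residue: CSP} for residues whose CSP exceeds the threshold.

    shifts_free / shifts_bound map a residue label to (1H ppm, 15N ppm).
    Residues missing from the bound spectrum (for example, broadened
    beyond detection) are skipped.
    """
    hits = {}
    for res, (h_free, n_free) in shifts_free.items():
        if res not in shifts_bound:
            continue
        h_bound, n_bound = shifts_bound[res]
        csp = combined_csp(h_bound - h_free, n_bound - n_free)
        if csp > threshold:
            hits[res] = csp
    return hits


# Invented example shifts (ppm), for illustration only
free = {"K17": (8.30, 122.0), "A29": (8.10, 124.5), "G63": (8.40, 109.0)}
bound = {"K17": (8.25, 122.4), "A29": (8.12, 124.9), "G63": (8.40, 109.05)}

hits = perturbed_residues(free, bound)
for res, csp in sorted(hits.items()):
    print(res, round(csp, 3))
```

The weighting factor compensates for the larger ppm range of 15N relative to 1H; a different factor would change which residues cross the cutoff, so it should be matched to the definition in the Experimental procedures.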
An overall summary of the patterns of Nterm resonance chemical shift perturbations caused by ACC and ACSL RNA binding is provided in Fig. 4A. In this figure, the deviations of the Nterm resonance frequencies caused by ACC or ACSL binding from their random coil values are plotted for all the observable resonances. For these resolvable resonances, positive frequency deviations were observed, most notably involving residues 17-51, regardless of which RNA was used.
Psipred v3 was used to predict the secondary structure of Nterm based on primary sequence alone (49). The results of this analysis indicate that Nterm may adopt a long helix comprised of residues 20-54 or 52% of the sequence (Fig. 4B, top line). The average confidence of this helical assignment is 6 of 9, indicating only a midrange confidence score for formation of this long helix. This central helix was also previously predicted, along with a short helix and strand near the beginning of LysRS protein using an earlier version of Psipred (37).
NMR chemical shift frequency data provide semiempirical estimates of protein secondary structure via use of database algorithms such as CSIpred (50) and TALOS+ (51). CSIpred is based on comparison of the experimentally determined NMR backbone α and carbonyl carbon resonance chemical shifts versus those in a database of known protein structures. TALOS+ is another NMR-based algorithm that predicts protein backbone torsion or (ψ,φ) angles based on protein backbone atom resonance chemical shift values. Using the NMR experimental data for these two Nterm-RNA complexes (previously deposited in the Biological Magnetic Resonance Bank (BMRB); see "Experimental procedures") and each of these algorithms, the structural propensities of Nterm in its free and RNA-bound states were estimated. CSIpred results indicate that Nterm alone in solution contains a small percentage of α-helical structure (Ser 19 -Leu 26 , black) but no other types of secondary structure (Fig. 4B). TALOS+ also indicates that this extension adopts only a very short region of helical structure (residues Glu 22 -Arg 27 ; Fig. 4B). Both NMR predictions (TALOS+ and CSIpred) indicate that the helical region begins at a similar location, but the helix itself is 30 residues shorter relative to the Psipred prediction (Fig. 4B).
Figure 2 (legend, beginning missing). […], and ACC RNA hairpin (C, blue). All three panels show the same amide backbone region of the spectra. In A, one chemical shift perturbation is visible with a probable identity based on free *Nterm assignment. In B and C, selected chemical shift perturbations, all confirmed by NMR assignment of *Nterm/RNA complexes, are shown by black arrows.
The NMR CSP values that were measured for the Nterm-ACC and Nterm-ACSL complexes were next used in conjunction with these algorithms to estimate the overall fold of the Nterm backbone in the presence of these RNAs. Both TALOS+ and CSIpred analyses revealed a significant increase in the percentage of helical secondary structure for Nterm when it forms a complex with either RNA hairpin (Fig. 4B). In the case of CSIpred and relative to Nterm alone, this helical structure is extended by at least 20 (ACSL) or 24 (ACC) residues. This helix is extended by 19 (ACSL) or 22 (ACC) residues in the case of TALOS+. Because only a few CSPs were observed in the case of Nterm titration with either U9 or C9, the calculations of Nterm secondary structure using TALOS+ and CSIpred were not performed. These data suggest that the secondary structure of the extension is unchanged relative to free Nterm upon addition of these linear RNAs.
To investigate whether Nterm is affected by other, non-tRNA Lys -derived hairpin RNAs, NMR titration of Nterm with a 23-nucleotide RNA hairpin termed TLE4C (tRNA-like element 4C) was also carried out. This RNA is derived from an HIV-1 genomic RNA element previously shown to bind hLysRS lacking the N-terminal domain (ΔN65-LysRS) (52) (Fig. S3A). The wild-type HIV-1 TLE contains four U residues in the loop that mimic the tRNA Lys3 anticodon loop and contribute to ΔN65-LysRS binding. These U residues have been mutated to four C residues in TLE4C to abolish specific ΔN65-LysRS binding (53). HSQC overlays of free Nterm versus Nterm-TLE4C are shown in Fig. S3 (B and C). Interestingly, the trend of perturbed resonances was quite similar to that observed for Nterm binding to the ACC and ACSL RNAs (Fig. S3D). There are some differences in terms of the overall magnitudes of the measured Δobs values, which follow the trend ACSL Δobs < TLE4C Δobs < ACC Δobs.
Structural models of the ACSL- and ACC-bound forms of Nterm were calculated using the program CS-Rosetta (54). This program uses NMR chemical shift data as a constraint to calculate structures that are the most consistent with the sequence of the protein and known protein structures using Rosetta methods. The 10 best structures obtained for Nterm bound to ACC or ACSL are shown in Fig. 5 (B and C), respectively. For comparison, the corresponding free Nterm structures calculated using CS-Rosetta are provided in Fig. 5A. The calculated structures for Nterm alone were mostly random coil, because no persistent backbone secondary structure was predicted, based upon the chemical shift perturbations. Thus, NMR changes associated with titration of Nterm with the hairpin RNAs reflect environmental changes of the protein backbone that are consistent with increased helical structure. To better visualize the physicochemical properties of such a folded helix involving these residues, a helical wheel projection was generated (Fig. 5D). This projection shows that positively charged residues within the Nterm sequence are aligned along one face of the helix.
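The clustering of polar residues on one helical face can be illustrated with a simple angular calculation. This is a minimal sketch that assumes an ideal α-helix with 100° of rotation per residue and starts the wheel at Ser 19, as in the Fig. 5D legend; the residue numbers are those named in that legend, while the function and variable names are hypothetical. It is not the NetWheels procedure itself.

```python
def wheel_angle(residue, first_residue=19, step_deg=100.0):
    """Angular position (degrees) of a residue on an ideal alpha-helical wheel."""
    return ((residue - first_residue) * step_deg) % 360.0


# Residues named in the Fig. 5D legend as forming the polar, mostly
# positively charged surface: Lys 20, Asn 21, Lys 24, Arg 25, Lys 28,
# Lys 31 and Lys 32.
polar_face = [20, 21, 24, 25, 28, 31, 32]
angles = {res: wheel_angle(res) for res in polar_face}

# Simple spread estimate; valid here because the angles do not wrap past 360.
arc = max(angles.values()) - min(angles.values())

for res in polar_face:
    print(res, angles[res])
print("arc spanned (degrees):", arc)
```

With these inputs all seven residues fall between 100° and 240°, that is, within a 140° arc of the wheel, which is consistent with the statement that they line one face of the helix.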
CD spectroscopy also has the potential to monitor protein conformational changes caused by nucleic acid binding (55). It was previously observed by CD that Nterm structural changes occur upon addition of ~1000-fold molar excess of polyphosphate (38). Here, we used CD to also monitor the effects on Nterm structure caused by binding of specific RNAs: ACC, ACSL, and U9. In the absence of RNA, the Nterm CD spectrum is similar to that expected from a random coil polypeptide with a deep, negative minimum at ~204 nm (Fig. 6). In the presence of an up to 2-fold molar excess of U9 RNA, the spectrum changes only slightly with the negative peak shifting to a longer wavelength (~206 nm), indicating a largely unchanged conformation of Nterm. ACSL addition resulted in a much larger change; a significant increase in ellipticity of the negative peak was observed along with a further shift in the minimum to longer wavelength (~208 nm). The observed differences in the 208-nm peak are consistent with the increased α-helical character of Nterm in the presence of ACSL relative to U9. Of all the RNAs tested, ACC addition resulted in the most dramatic CD spectral changes. In this case, two minima were observed: a weaker band at ~209 nm and a more intense signal at ~220 nm along with a maximum at ~200 nm. Overall, these CD changes are consistent with a conformational shift toward greater α-helical structure in the Nterm peptide as a result of RNA hairpin binding. The longer ACC hairpin induced a greater conformational change than the shorter ACSL hairpin. Comparison of these results to the previous polyphosphate CD study indicates that the effects caused by polyphosphate addition were relatively weak, especially given the much larger (~1000-fold) molar excess used in the latter studies (38).
NMR was used to estimate Nterm dissociation constants (K d ) with respect to the three hairpin RNAs studied (ACC, ACSL, and TLE4C). The binding constants for all of these Nterm-RNA complexes were calculated using the CSPs of four of the best-resolved Nterm-RNA-bound NMR resonances: the upfield Asn 21 side-chain amide and the backbone amide resonances of Lys 17 , Ala 29 , and Ala 34 . To accommodate the sigmoidal character of the CSP versus ligand concentration curves shown in Fig. 7, the data were fit using the Hill equation, as described under "Experimental procedures." The average Hill coefficient value was 2 for every RNA-binding partner studied, indicating positive cooperativity of binding. Based upon these results, the trend from highest to lowest binding affinity is Nterm-ACC (9.
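A Hill-type analysis of CSP versus ligand concentration, as described above, can be sketched in a few lines. This is an illustration only: the exact form of the equation and the fitting software are given in the Experimental procedures and are not reproduced here, so the standard Hill form, the linearized (Hill plot) fit, and the assumption that the saturating shift is known are all assumptions. The data are synthetic, generated with an arbitrary K d of 20 (in arbitrary concentration units) and a Hill coefficient of 2, and are not measurements from this study.

```python
import math


def hill(conc, kd, n, d_max=1.0):
    """Hill equation: observed shift as a function of ligand concentration."""
    return d_max * conc ** n / (kd ** n + conc ** n)


def hill_plot_fit(concs, csps, d_max):
    """Estimate (Kd, n) by linear regression on the Hill plot.

    Uses log10(theta / (1 - theta)) = n * log10([L]) - n * log10(Kd),
    where theta = CSP / d_max. Assumes d_max (the shift at saturation)
    is known; points with theta outside (0, 1) are dropped.
    """
    xs, ys = [], []
    for c, d in zip(concs, csps):
        theta = d / d_max
        if c > 0 and 0.0 < theta < 1.0:
            xs.append(math.log10(c))
            ys.append(math.log10(theta / (1.0 - theta)))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    n = sxy / sxx                      # slope = Hill coefficient
    intercept = mean_y - n * mean_x    # intercept = -n * log10(Kd)
    kd = 10 ** (-intercept / n)
    return kd, n


# Synthetic, noise-free titration (hypothetical values)
concs = [5.0, 10.0, 20.0, 40.0, 80.0, 160.0]
csps = [hill(c, kd=20.0, n=2.0, d_max=0.10) for c in concs]

kd_fit, n_fit = hill_plot_fit(concs, csps, d_max=0.10)
print(f"Kd = {kd_fit:.1f}, n = {n_fit:.2f}")
```

A Hill coefficient above 1 recovered this way corresponds to the sigmoidal curve shape noted in the text. With real, noisy data a nonlinear least-squares fit that also floats the saturating shift is generally preferable to the linearized form shown here.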

NMR studies of segmentally labeled *Nterm-ACB
Figure 5. A-C, the 10 best CS-Rosetta structures calculated for Nterm alone (A) and Nterm bound to ACC (B) and ACSL (C) RNA hairpins. The backbone structure of Nterm is mostly random coil in all three structures (gray). The central helix adopted by Nterm (orange) upon binding to RNA is orientated the same way in both B and C. Residues 19-21 (near the start of the helix) and residues 45-47 (near the end of the helix) are shown in blue and green, respectively. The Nterm backbone structures shown here are rendered as ribbons using Chimera. D, helical wheel projection of Nterm residues Ser 19 -Glu 45 with residue type indicated by color and shape as indicated. A highly polar and mostly positively charged surface is formed as a result of helix formation involving residues Arg 25 , Lys 32 , Asn 21 , Lys 28 , Lys 24 , Lys 31 , and Lys 20 in particular. The helical wheel was generated using NetWheels.
To establish how the structure and RNA-binding properties of Nterm are affected by the adjacent hLysRS ACB domain, segmental 15N-labeling of the Nterm domain in the context of the two-domain Nterm-ACB protein construct was performed (Fig. S1). The NMR study of *Nterm-ACB ligated (where the asterisk indicates the labeled domain) allows us to selectively examine the effects of RNA binding on Nterm structure in the presence of the adjacent domain. Ligation of *Nterm to the unlabeled ACB domain to produce *Nterm-ACB ligated was accomplished via sortase A. The protein substrate sequence requirements of sortase A mean that this ligation is best performed on proteins with domains separated by flexible linkers. This reaction also requires a pentapeptide motif, LPXTG, on the Nterm for recognition and at least one N-terminal Gly on the C-terminal substrate (ACB) for nucleophilic attack (40). Based upon the primary sequence of hLysRS and the X-ray crystal structure of N-terminally truncated LysRS (3), residues 65-69 were identified as part of the interdomain linker between the Nterm and ACB domains. Therefore, a variant of the N-terminal extension containing three mutations involving this linker region (GPEEE to LPETG) was prepared to produce a sortase-compatible substrate (see "Experimental procedures"). In addition, an extra Gly was introduced into the C-terminal substrate (i.e. the ACB domain) to improve sortase efficiency. Thus, the Nterm-ACB ligated construct differs from the native two-domain protein by four amino acids (GPEEE → LXXTGG, where the remaining native residues are represented by X and the new residues are L, T, G, and G) within the interdomain linker region (see "Experimental procedures" and Fig. S1).
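The sequence logic of the sortase A ligation described above can be sketched at the string level. This is a minimal illustration of the general transpeptidation rule (cleavage between Thr and Gly of LPXTG, then joining to an N-terminal Gly on the second substrate); the function name and the two short sequences are hypothetical placeholders and are not the actual Nterm or ACB constructs.

```python
import re

# Sortase A recognition motif: Leu-Pro-any-Thr-Gly
SORTASE_MOTIF = re.compile(r"LP.TG")


def sortase_ligate(donor, acceptor):
    """Return the ligated product sequence for a donor/acceptor pair.

    The donor must contain LPXTG; it is cut between T and G, so the Gly
    and everything C-terminal to it (for example, a purification tag)
    is released. The acceptor must begin with Gly, which becomes the
    residue immediately following LPXT in the product.
    """
    match = SORTASE_MOTIF.search(donor)
    if match is None:
        raise ValueError("donor lacks an LPXTG motif")
    if not acceptor.startswith("G"):
        raise ValueError("acceptor needs an N-terminal Gly")
    kept = donor[: match.start() + 4]  # keep everything through LPXT
    return kept + acceptor


# Hypothetical placeholder sequences: a donor ending in LPETG plus a
# His tag, and an acceptor carrying two N-terminal Gly residues.
product = sortase_ligate("MKAALPETGHHHHHH", "GGSVDP")
print(product)
```

In this toy example the junction reads LPETGG, matching the LXXTGG linker pattern described in the text, and the residues downstream of the donor Thr are lost from the product.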
Overall, the Nterm resonances in the two-domain ligated construct were broadened slightly (typically <5 Hz) relative to those of Nterm alone. The most significant change observed upon ligation is the disappearance of 9 of the 76 original Nterm resonances; this is a result of their cleavage by sortase A (His 76 , Gly 70 , and Gly 69 ) or additional resonance broadening of already broadened resonances (Thr 60 , Asn 21 , Gly 63 , Thr 68 , Lys 20 , and Asn 21 ).
Aside from the above-mentioned NMR resonance broadening effects, the frequencies of most of the ligated Nterm resonances remained very similar to those of the unligated, free form of Nterm. Approximately 90% of the Nterm resonances exhibited little to no frequency shift after ligation to ACB, based upon the HSQC overlay of the central spectral region (Fig. 8). Thus, in the absence of RNA, the HSQC spectra and structure of free Nterm and Nterm in the construct *Nterm-ACB ligated are very similar. The most highly resolved *Nterm-ACB ligated backbone and side-chain resonances (Lys 17 , Leu 18 , Ser 19 , Asn 21 side chain, Lys 24 , Arg 25 , Lys 28 , Ala 29 , Glu 30 , Lys 32 , Val 33 , Ala 34 , Glu 35 , and Ala 38 ) were followed by NMR to monitor RNA-binding effects upon ligated Nterm.
Analyses of *Nterm-ACB ligated interactions with both linear and hairpin RNAs were conducted via NMR titrations. Titration of *Nterm-ACB ligated with either of the two linear RNAs, U9 or C9, resulted in very little change of the Nterm resonances in the HSQC spectra (Fig. S2, A and B). No significant CSPs resulted even after titration of *Nterm-ACB with up to a 1.5-fold molar excess of each of these linear RNAs.
In contrast to the U9 and C9 linear RNAs, titration with hairpin RNAs ACC and ACSL resulted in significant CSP changes. In Fig. 9, the HSQC regions corresponding to some of the best-resolved resonances (corresponding to residues Lys 17 , Val 33 , and Glu 35 ) are shown. The entire HSQC spectra (corresponding to the regions displayed in Fig. 9, A and B) are shown in Fig. S6. These Nterm and Nterm-ACB ligated resonances were shifted upon ACC and ACSL RNA binding. Perturbations of the Lys 17 backbone amide proton resonance caused by ACSL and ACC binding are shown in Fig. 9 (A and B), respectively. The initial and final resonances measured upon saturation of each protein with these hairpin RNAs are shown. Lys 17 is strongly affected by ACSL based on the significant changes involving an upfield shift of the proton concurrent with a downfield shift of the Lys 17 resonance (Fig. 9A, arrow). In the case of titration with ACC, a similar but more pronounced frequency perturbation of the Lys 17 resonance was observed (Fig. 9B, arrow).
A similar CSP pattern was observed for the Val 33 residue resonance of Nterm and Nterm-ACB ligated (Fig. 9, C and D). Saturation with ACSL resulted in an upfield shift of both the proton and nitrogen frequencies of the Val 33 resonance as indicated by the arrow in Fig. 9C. As observed for Lys 17 , the CSP pattern of the Val 33 resonance is similar whether the titrant is ACSL or ACC, although a somewhat larger perturbation was observed for the latter (Fig. 9D). The pattern of ligated Nterm Glu 35 resonance frequency changes (Fig. 9, E and F) is similar to those of Lys 17 and Val 33 ; the overall directions of the 1H and 15N chemical shift perturbations were the same for ACC, and once again, the overall magnitude of the observed CSP changes was greater in the case of ACC relative to ACSL binding. A summary of the CSP changes observed for all of the resonances monitored is provided in Fig. 10. Relative to ACSL, ACC binding to the Nterm and Nterm-ACB ligated resulted in larger Δobs values for the majority of the monitored backbone resonances that did not disappear because of intermediate exchange (see below). In general, the lower frequency, upfield (Fig. 10, sc right) Asn 21 side-chain amide proton was more greatly shifted relative to the downfield amide proton upon titration with both hairpin RNAs.
Another important NMR difference between Nterm versus Nterm-ACB ligated binding to the hairpin RNAs involves the observed rates of chemical exchange, one source of which potentially includes a change in the rate of Nterm conformational exchange. Another source involves the process of exchange between the free versus the RNA-bound forms of Nterm (56). The type of chemical exchange behavior exhibited by each of the different Nterm and *Nterm-ACB ligated resonances monitored is also summarized in Fig. 10

FA-binding studies
To correlate the observed patterns of chemical exchange behavior and CSPs of Nterm and Nterm-ACB upon hairpin RNA binding with the RNA-binding affinities of these proteins, FA binding assays were carried out. For these experiments, unlabeled forms of Nterm and the two-domain 2D construct Nterm-ACB native were used. The native two-domain construct, Nterm-ACB native , was prepared via direct expression rather than sortase A ligation and differs from the ligated construct by 4 amino acids within the linker region (Fig. S1). The RNA-binding affinities of the Nterm-ACB native and Nterm-ACB ligated proteins are very similar (within 3-fold; data not shown). Binding studies were also carried out with full-length hLysRS (FL-hKRS) and another two-domain construct that lacks Nterm but includes the ACB and catalytic domains (ACBCAT; Fig. S1). Representative FA data obtained for Nterm-ACB native binding to ACC, ACSL, and U9 RNAs are shown in Fig. S5.
The FA-derived RNA-binding data are summarized in Table 1. Because of the relatively weak (micromolar) binding of Nterm to ACSL and ACC, these data could not be reliably determined by FA. Thus, NMR-derived binding constants (as described above) were calculated and included in this table. One general trend that is apparent from these data is that attachment of the ACB domain significantly improves binding of Nterm to these RNAs. The NMR-estimated K d for Nterm binding to ACSL is 45.8 μM, whereas the K d determined for Nterm-ACB binding by FA is 54 nM, corresponding to an 851-fold improvement in binding affinity. Significant but less dramatic changes were observed upon binding to ACC with 34-fold increased affinity to the two-domain construct (summarized in Table 1). ACB alone bound to these RNAs with intermediate affinity, with a 3-fold tighter binding to the ACSL relative to the ACC, as expected (Table 1). Whereas binding of U9 to ACB alone was substantial (53), there was no detectable binding of this RNA with the Nterm alone, although binding was observed after covalent attachment of Nterm to the adjacent ACB domain. Binding to C9 was not detected for any of the constructs tested. Appending the catalytic domain to the ACB domain resulted in only modest (up to 2-fold) changes in binding affinity to ACSL and ACC relative to ACB alone (Table 1).

Discussion
The hLysRS extension was previously shown to undergo a disordered-to-helical transition upon addition of polyphosphate and RNA (37,38). Previous studies of hLysRS have determined its 3D structure lacking the Nterm domain (3), and structural studies of Nterm alone and in the presence of a small RNA have also been carried out (37). Here, we used NMR to specifically probe the effects of different types of RNA molecules on the structure of hLysRS Nterm alone. In addition, using a segmentally labeled protein, we investigated for the first time the effects of these RNAs on Nterm in the context of a two-domain construct containing the adjacent ACB domain.

Effect of RNA binding on Nterm
Consistent with previous studies, we found that hLysRS Nterm alone is mostly unstructured and monomeric in solution but adopts helical structure upon interaction with a hairpin RNA (ACSL) derived from tRNA Lys (37). In this study, we demonstrated that the induction of Nterm helical structure occurred upon binding to several different RNA hairpins, which differed in their sequences and ranged in size from 17 to 35 nucleotides. Although two of the three hairpins were previously demonstrated to bind to LysRS (22), TLE4C does not bind to a LysRS variant lacking the Nterm domain (52) but binds to the N-terminal extension studied here. Nterm remained unstructured when titrated with single-stranded RNAs U9 and C9. Based upon the various RNAs studied here, the induction of Nterm structure appears to depend on the presence of a structured RNA. The induction of Nterm helical structure upon RNA hairpin binding is also consistent with the observation of cooperative Nterm-binding behavior (Fig. 7). Although positive cooperativity suggests the possibility of either a multimeric protein complex or multiple ligand-binding sites, positive cooperativity between monomeric proteins with a single ligand-binding site has been previously reported (57,58). The observed structural shift of Nterm upon RNA hairpin binding is key to our model of binding. The initial binding event occurs between mostly disordered Nterm and RNA hairpin, followed by formation of an α-helix in the center of the protein. The α-helix formation allows increased binding affinity between protein and RNA hairpin.
This cooperativity is present in all three RNA hairpins studied, but based on the observed trend in relative CSP magnitudes caused by ACC, ACSL, and TLE4C binding, there appears to be some correlation with the relative length of the stem portion of these RNA structures; the longer the stem, the greater the observed CSP and induced Nterm helicity (Fig. 4 and Fig. S3D). Additional studies with a wider variety of nucleic acid types and sequences will be needed to further elucidate the trigger for these conformational effects.
The greatest experimental Nterm changes that were observed resulted from binding to the ACC/ACSL hairpin RNAs. In both cases, the central helix involved residues 19-47, whereas the rest of the protein chain remained mostly disordered. For both RNA hairpins, the same unstructured to helical backbone structural transition involving the Nterm central region occurred; only small RNA-dependent differences were observed in terms of the Nterm residues affected and the overall length of the adopted helix.
Based upon previous studies conducted with the hamster LysRS extension, residues Lys 19 , Lys 23 , Arg 24 , and Lys 27 (corresponding to human LysRS residues Lys 20 , Lys 24 , Arg 25 , and Lys 28 , respectively) were found to be critical for RNA binding and aminoacylation (26). Mutations of the equivalent residues within the human extension resulted in decreased binding to bulk DNA (calf thymus) and total tRNA (38). The capacity of Nterm to adopt helical structure in the presence of polyphosphate was also improved upon replacement of these and other basic residues within the 19-40-residue segment of human Nterm (38). Our NMR studies revealed that these same residues are part of the RNA-induced helix. Based upon their alignment along one face of the helix (Fig. 5D), favorable electrostatic interactions between these positive side chains and the negatively charged RNA backbone likely induce formation of the helix. Furthermore, both the backbone and side-chain group of Asn 21 are strongly affected by interaction with the ACC/ACSL RNAs. This residue is in the vicinity of the basic residues Lys 20 , Lys 24 , Lys 28 , Lys 32 , and Arg 25 , and thus its perturbation may help to localize the Nterm RNA interface.

Effect of RNA binding on Nterm in the context of Nterm-ACB ligated
The NMR titrations of *Nterm-ACB ligated resulted in a pattern of RNA-dependent CSPs that was highly similar to that observed for Nterm alone. Nterm, within the two-domain construct, remained unstructured in the absence of RNA, as well as after titration with the C9 and U9 linear RNAs (Fig. S2). In contrast, major shifts in the NMR spectra of *Nterm-ACB ligated were observed upon titration with ACC and ACSL hairpin RNAs (Fig. 9 and Fig. S6).
There were several NMR changes observed for Nterm after its ligation to ACB. One was slight broadening of Nterm resonance linewidths, most likely caused by the increase in mass and overall reorientation correlation time (τ c ) of the two-domain construct. Another important difference between the hairpin RNA titrations of Nterm versus *Nterm-ACB ligated is the observed shift from exclusively fast (observed for Nterm) to intermediate and slow chemical exchange behavior (ligated Nterm). Such a shift in the chemical exchange rate to slower time scales is consistent with the observed increase in RNA-binding affinity of *Nterm-ACB versus Nterm for these hairpin RNAs (Table 1). In general, slow chemical exchange rates correspond to K d values in the 0.5 to 250 nM range. Higher K d values (400-2000 nM) correlate with intermediate exchange rates, whereas fast exchange is observed in the case of relatively weak binding (>15,000 nM) (59). These changes in chemical exchange time scale reflect the existence of stronger interactions between the binding partners (Nterm and RNA) that result from ligation of Nterm to ACB. Indeed, the ACC and ACSL hairpin RNA-binding affinity of Nterm was increased significantly (from micromolar to submicromolar) upon ACB attachment as determined by FA (Table 1) (53). Although this increased affinity may be due, at least in part, to the addition of another RNA-binding domain, for both hairpin RNAs, the two-domain construct displayed higher-affinity binding than either domain alone, suggesting a synergistic effect.
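The K d ranges quoted above for the three exchange regimes can be written as a small lookup. This is only a rule-of-thumb sketch of those ranges: the function name is hypothetical, and K d values that fall in the gaps between the quoted ranges are deliberately left unassigned rather than guessed.

```python
def exchange_regime(kd_nm):
    """Map a dissociation constant (nM) to the NMR exchange regime
    suggested by the approximate ranges quoted in the text."""
    if 0.5 <= kd_nm <= 250:
        return "slow"
    if 400 <= kd_nm <= 2000:
        return "intermediate"
    if kd_nm > 15000:
        return "fast"
    return "unassigned"  # between the quoted ranges, or below 0.5 nM

# Values taken from the text: 54 nM (Nterm-ACB + ACSL, FA) and
# an assumed 45.8 uM (Nterm alone + ACSL, NMR estimate).
for kd in (54.0, 45.8e3):
    print(kd, exchange_regime(kd))
```

Applied to the two ACSL affinities discussed here, the lookup places Nterm-ACB in the slow regime and Nterm alone in the fast regime, which matches the qualitative shift in exchange behavior described above.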
High-affinity and productive binding to cognate tRNA Lys by LysRS depends both on correct recognition of the anticodon (22) and on the RNA-binding properties of Nterm in the context of full-length LysRS (25). It was previously proposed that hamster Nterm behaves as an independent domain (26) but that the weak RNA binding observed for LysRS-ΔNterm is enhanced by appending Nterm to ACB-CAT to generate FL-hLysRS (25). We now show that the weak RNA binding observed for human Nterm alone is enhanced to the level of FL-hLysRS simply by adding the adjacent ACB domain (Table 1). Interestingly, we find that Nterm-ACB binding to the ACSL is even tighter than FL-hLysRS binding. Although the reason for this is unclear, it may be due to more effective binding of ACSL to the ACB domain in the shorter construct.
The Nterm extension has been proposed to enhance tRNA binding in a "nonspecific" fashion (25,30). The work reported here is consistent with this conclusion. We find that the longer ACC hairpin resulted in more dramatic CSPs and CD changes than the ACSL, especially in the case of the two-domain construct. In addition, single-stranded RNAs do not bind to Nterm in either construct. Thus, Nterm binding appears to depend on double helical structure. Further structural studies of the specific interactions between Nterm and these hairpin RNAs, as well as the precise region of these RNAs affected by Nterm, are underway.
Based on structural studies of eukaryotic class IIb tRNA synthetases, hLysRS is similar to yeast AspRS in that the N-terminal extension also exhibits significant disorder in the absence of ligand (30,31,37,38). The intrinsic disorder of the N-terminal extensions may be required to facilitate interactions with a variety of different binding partners within the MSC and beyond. In contrast, the extensions of B. malayi and human AsnRS adopt a mixed structure even in the absence of RNA. Unlike hLysRS, hAsnRS is not part of the MSC, but its N-terminal extension has been implicated in chemokine interactions (32,33). A conformational trigger involving the hLysRS N-terminal extension, which also contains a nuclear localization signal, is likely to have broad biological implications. Several nuclear roles for LysRS have been reported, including activation of gene transcription upon IgE stimulation of mast cells (11) and epidermal growth factor receptor signaling in human lung cancer cells (12). Nuclear LysRS may also play a role in HIV-1 replication (19). A very recent report suggests that the hLysRS N-terminal extension interacts with RNA-DNA hybrids, thereby delaying activation of the STING (stimulator of interferon genes) protein and attenuating inflammatory responses (60). Thus, Nterm represents a novel therapeutic target, and understanding its conformation and nucleic acid binding properties in a more native context, as reported here, may facilitate future drug discovery efforts.

Protein preparation
The proteins studied herein correspond to single or multiple domains of hLysRS (Fig. S1). All proteins were produced via overexpression in Escherichia coli BL21 (DE3) cells transformed with the following plasmids: (a) GED_rrACBcs (ACB), (b) revmodN (Nterm), (c) pNtermW (NtermW), (d) pNterm-ACB (2D), (e) pACB-CAT (N-terminally truncated LysRS), and (f) pFL-hLysRS (full-length human LysRS). Each of these proteins was expressed as a fusion protein consisting of an N-terminal His 6 tag, a small solubility S tag, and a tobacco etch virus protease recognition site allowing for cleavage of the His 6 and S tags. The various proteins prepared for this study are described briefly below.
Nterm: Plasmid revmodN (37) codes for the following 76-residue protein: MAAVQAAEVKVDGSEPKLSKNELKRRLKAEKKVAEKEAKQKELSEKQLSQATAAATNHTTDNGVLPETGGHHHHHH. These residues correspond to the first 69 residues of LysRS plus a -G(H) 6 sequence appended to the C terminus to facilitate purification. In addition, residues 65-69 of this peptide were modified from GPEEE to LPETG because of sortase A sequence preference.
NtermW: Plasmid pmodNW is the same as rmodNhKRS except that a Trp residue is appended to the C terminus instead of the -GHHHHHH sequence: MAAVQAAEVKVDGSEPKLSKNELKRRLKAEKKVAEKEAKQKELSEKQLSQATAAATNHTTDNGVLPETGW. The Trp residue allows the Nterm protein to be monitored by UV at 280 nm for the analytical ultracentrifugation experiments.
2D: Plasmid p2D encodes the first two domains of hLysRS modified at the C terminus so that a Leu is replaced with two residues: Thr and Gly.
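The construct descriptions above can be checked against the stated sequences with a few lines of code. The sketch below is only a consistency check: the variable and helper names are hypothetical, and the sequences are typed from the Nterm and NtermW entries with line-break hyphens removed.

```python
# Residues 1-69 shared by Nterm and NtermW (sortase motif LPETG at 65-69).
CORE = (
    "MAAVQAAEVKVDGSEPKLSKNELKRRL"
    "KAEKKVAEKEAKQKELSEKQLSQATAAATNHTTDNGVL"
    "PETG"
)
NTERM = CORE + "G" + "H" * 6   # -G(H)6 tag appended for purification
NTERMW = CORE + "W"            # single Trp appended for detection at 280 nm

def residue(seq, position):
    """Return the residue at a 1-based sequence position."""
    return seq[position - 1]

print(len(CORE), len(NTERM), len(NTERMW))
print(NTERM[64:69])  # residues 65-69
# Basic residues discussed for the human extension: Lys20, Lys24, Arg25, Lys28
print([residue(NTERM, p) for p in (20, 24, 25, 28)])
```

The same helper can be used to look up any other position mentioned in the Discussion, such as Asn 21 or Lys 32.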
For the NMR experiments, uniformly 15 N-labeled (for RNA titrations) and 13 C, 15 N-doubly labeled (for NMR resonance assignments) forms of the Nterm domain (Nterm and NtermW) were prepared. This was accomplished by growing cells transformed with rmodNhKRS in M9 minimal medium containing 15 N-labeled ammonium chloride as the sole nitrogen source (singly labeled) or 15 N-labeled ammonium chloride and 13 C-labeled glucose as the sole nitrogen and carbon sources, respectively (doubly labeled). The remainder of the purification was as described previously (61). The cleaved, purified protein was then concentrated to 50 μM for RNA titrations via ultrafiltration using a final NMR buffer consisting of 20 mM HEPES, pH 6.8, 20 mM NaCl, 1 mM EDTA, 10% D 2 O (v/v), and 0.02% NaN 3 (w/v).
Preparation of sortase A for enzyme ligation was achieved using a plasmid encoding an N-terminally truncated His-tagged variant of the enzyme (SrtA ΔN59 CHis 6 ) and conferring carbenicillin resistance (obtained from Dr. H. Mao, Ansata Pharmaceuticals). E. coli BL21/DE3 cells were transformed with this plasmid, and the enzyme was purified as previously described (44).

Nterm ligation to ACB using sortase A
The procedure used for sortase A ligation to generate the tandem 15 N-labeled *Nterm-ACB ligated two-domain protein was previously described (44). This method is based on the original method described by Mao et al. (40) but optimized for NMR sample preparation. The component proteins ( 15 N-labeled Nterm at 60 μM and unlabeled ACB at 120 μM) were incubated with 5 μM sortase A enzyme. The proteins and sortase A were combined, transferred to a 3.5-kDa molecular mass cutoff dialysis tube, and dialyzed at room temperature against the sortase reaction buffer (50 mM Tris-HCl, pH 7.5, 150 mM NaCl, 10 mM CaCl 2 , and 2 mM β-mercaptoethanol) to initiate the reaction. The reaction was monitored by SDS-PAGE and quenched when the amount of Nterm protein had decreased to a steady-state value, as determined by the relative intensity of the gel band (Fig. S4A). The ligation reaction mixture was then applied to a 6-ml Hi-Trap SP column (GE Healthcare) to separate the components and recover the desired product (Fig. S4B). The fractions containing the desired product were pooled and confirmed to be Nterm-ACB ligated via MALDI-mass spectrometry (Fig. S4C).

Analytical ultracentrifugation
NtermW was employed for the analytical ultracentrifugation study. A 400-μl solution of NtermW (91 μM protein in 20 mM sodium phosphate, pH 8, 20 mM NaCl, 1 mM EDTA) was spun at 48,000 rpm at 25°C in an An-60Ti rotor, and absorbance data were collected at 280 nm. The data were analyzed with Sedfit (62) using the c(s) model to generate a sedimentation coefficient distribution; the molecular weight of the sedimenting species was estimated after fitting for the frictional ratio. Using the program Sednterp (63), a partial specific volume of 0.7382 ml/g was calculated for NtermW based on the amino acid sequence, and the buffer density and viscosity were calculated to be 1.0007 g/ml and 0.9014 cP, respectively, at 25°C.

Preparation of RNA
Hairpin RNA molecules studied by NMR and FA include the ACSL and acceptor stem minihelix (ACC) of human tRNA Lys3 (22), and TLE4C derived from a sequence in the HIV-1 genomic RNA (53). The sequences of these RNAs are shown in Fig. 3 and Fig. S3. Synthetic oligoribonucleotides used for NMR studies were purchased from Integrated DNA Technologies (ACSL, ACC, and U9) and Midland Certified Reagent Company (C9) and stored in diethylpyrocarbonate-treated distilled, deionized water. RNAs used for FA assays were purchased from Dharmacon and stored in diethylpyrocarbonate-treated water. RNAs were refolded in 20 mM HEPES, pH 6.8, 15 mM NaCl, and 35 mM KCl by heating at 80°C for 2 min and then at 60°C for 2 min, followed by addition of Mg 2+ to 10 mM before cooling on ice for at least 30 min prior to measurements.

FA studies
For the FA studies, RNAs were labeled on the 3´-end with fluorescein-5-thiosemicarbazide as described (64). Folded, labeled RNAs (20 nM) were incubated with increasing amounts (0-2000 nM) of ACBCAT, Nterm-ACB native, or FL-hLysRS in 20 mM Tris-HCl, pH 8, 15 mM NaCl, 35 mM KCl, and 1 mM MgCl 2 . The reactions were incubated at room temperature in the dark for 30 min. The samples were excited at 485 nm, and FA and fluorescence intensity at 525 nm were measured using a SpectraMax M5 plate reader (Molecular Devices). The data were fit to the quadratic binding equation, and the dissociation constants were determined as described (65).
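The quadratic binding fit mentioned above can be sketched as follows. This is a minimal illustration, assuming 1:1 binding with depletion of the 20 nM labeled RNA; the exact equation and fitting procedure of reference 65 are not reproduced here. All function names, the anisotropy end points (0.05 and 0.20), and the simulated titration are hypothetical, and the sketch ignores any correction for changes in fluorescence intensity upon binding. A coarse grid search stands in for a proper nonlinear least-squares routine.

```python
import math

def fraction_bound(protein_nm, kd_nm, rna_nm=20.0):
    """Fraction of labeled RNA bound for 1:1 binding with ligand depletion."""
    b = kd_nm + protein_nm + rna_nm
    return (b - math.sqrt(b * b - 4.0 * protein_nm * rna_nm)) / (2.0 * rna_nm)

def anisotropy(protein_nm, kd_nm, a_free, a_bound, rna_nm=20.0):
    """Observed anisotropy as a population-weighted average (assumed model)."""
    return a_free + (a_bound - a_free) * fraction_bound(protein_nm, kd_nm, rna_nm)

def fit_kd(proteins_nm, observed, a_free, a_bound, rna_nm=20.0):
    """Log-spaced grid search (0.1 nM to 100 uM) minimizing squared error."""
    best_kd, best_sse = None, float("inf")
    for i in range(601):
        kd = 10.0 ** (-1.0 + 0.01 * i)
        sse = sum(
            (anisotropy(p, kd, a_free, a_bound, rna_nm) - y) ** 2
            for p, y in zip(proteins_nm, observed)
        )
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd

# Simulated, noise-free titration spanning the 0-2000 nM range in the text.
titration = [0, 10, 25, 50, 100, 250, 500, 1000, 2000]
simulated = [anisotropy(p, 54.0, 0.05, 0.20) for p in titration]
kd_fit = fit_kd(titration, simulated, 0.05, 0.20)
print(f"recovered Kd ~ {kd_fit:.1f} nM")
```

In practice the free and bound anisotropy values would be fitted together with K d rather than fixed, and the grid search would be replaced by a least-squares optimizer.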

CD spectroscopy
CD measurements were performed on a Jasco J-815 spectrometer. Quartz cells (1-mm path length) were used, and the spectra were recorded from 200 to 260 nm at a scanning speed of 100 nm/min. Nterm was dialyzed into CD buffer (30 mM sodium phosphate, pH 7.5) prior to measurement. Nterm (50 μM) was incubated alone or with 25 μM folded RNA (ACC, ACSL, or U9) in the CD buffer at room temperature for at least 30 min to reach equilibrium. The spectra for Nterm alone, RNAs alone, and Nterm-RNA complexes were obtained separately. The effect of RNA binding on Nterm structure was determined by subtracting CD spectra of RNAs alone from the spectra of complexes. The subtracted spectra were then compared with the spectrum of Nterm alone. All the ellipticity data were converted to molar residue ellipticity in units of deg·cm 2 ·dmol −1 ·res −1 . The change of the helicity of Nterm upon binding to different RNAs was determined from the shift of the peak near 220 nm (66).
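The subtraction and unit conversion described above can be illustrated with a short sketch. It uses the standard conversion from raw ellipticity in millidegrees to per-residue ellipticity; the function names and the example readings are hypothetical, the protein concentration is assumed to be 50 μM, and the residue count of 76 assumes the His-tagged Nterm construct (some conventions divide by the number of peptide bonds instead).

```python
def subtract_rna(complex_mdeg, rna_mdeg):
    """Point-by-point subtraction of the RNA-alone spectrum from the complex."""
    return [c - r for c, r in zip(complex_mdeg, rna_mdeg)]

def per_residue_ellipticity(theta_mdeg, conc_molar, path_cm, n_residues):
    """Convert ellipticity (mdeg) to deg*cm^2*dmol^-1*res^-1."""
    return theta_mdeg / (10.0 * conc_molar * path_cm * n_residues)

# Hypothetical readings (mdeg) at three wavelengths near 220 nm.
complex_spectrum = [-14.0, -15.5, -13.0]
rna_spectrum = [-4.0, -4.5, -3.5]

protein_signal = subtract_rna(complex_spectrum, rna_spectrum)
converted = [
    per_residue_ellipticity(theta, conc_molar=50e-6, path_cm=0.1, n_residues=76)
    for theta in protein_signal
]
print(protein_signal)
print([round(v) for v in converted])
```

Because the conversion is linear, subtracting the RNA spectrum before or after the unit conversion gives the same result as long as the protein concentration is used for normalization in both cases.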

Assignment of Nterm-ACC RNA complex by NMR
The 13 C, 15 N-doubly labeled Nterm protein was concentrated to 200 μM by ultrafiltration (Millipore Ultra-filter 4, 3500-Da molecular mass cutoff) into a final buffer of 20 mM sodium phosphate, 15 mM sodium chloride, 35 mM potassium chloride, and 10% deuterium oxide at pH 6.0. The renatured ACC RNA was added to a final concentration of 400 μM. The assignment of NMR resonances to their correct Nterm residues within the Nterm-ACC complex was accomplished using a combination of three-dimensional NMR experiments as described previously for Nterm and Nterm-ACSL (37). A total of 87% of all backbone and side-chain resonances of the Nterm-ACC complex were assigned unambiguously (deposited as BMRB entry 28113). All NMR experiments were conducted at 298 K using a Varian Inova 600 MHz spectrometer.
All 1 H resonance frequencies were directly referenced to internal 4,4-dimethyl-4-silapentane-1-sulfonic acid, an NMR standard used commonly in aqueous NMR studies, whereas the 13 C and 15 N resonances were indirectly referenced (67). For assignment of the Nterm-ACC resonances, the following two- and three-dimensional NMR experiments were recorded using a Varian Inova at 600 MHz: 1 H-15 N HSQC, 1 H-13 C HSQC, HCCH-total correlation spectroscopy, H(CCO)NH, HNCACB, HNCA, HNCO, CBCA(CO)NH, and HN(CO)CA (56). All NMR data were processed using the program NMRPipe (68) and analyzed with the spectra visualization programs Sparky (69,70) and NMRview (71). The CS-Rosetta calculations were conducted using Rosetta version 3.8 and version 3.3 of the CS-Rosetta toolbox available at the BMRB server (72). All protein backbone structures were displayed using Chimera (73).

NMR titration of Nterm and *Nterm-ACB with RNA
Experiments were conducted at 298 K using a Bruker DMX-500 MHz NMR spectrometer for all HSQC and RNA titration experiments. The 15 N-labeled protein (either Nterm, NtermW, or *Nterm-ACB ligated ) was titrated with a given RNA by adding the RNA (ACC, ACSL, or U9) sequentially over a series of RNA:protein ratios ranging from 0:1 to 1.5:1. Both Nterm and NtermW have been used for NMR RNA titrations because both bind similarly to RNA, and their HSQC spectra are 98% similar (data not shown). In the case of NtermW titration with C9 RNA, the RNA:protein ratio ranged from 0:1 to 2.4:1. An HSQC spectrum was recorded after each addition of RNA.
CS-ROSETTA (54) was employed to calculate the 100 most probable folded structures of Nterm free and bound to the various RNAs studied. In addition, NetWheels was employed to generate a helical wheel projection of the central Nterm helix (74).

Binding constant derivation from NMR data
Chemical shift perturbations of Nterm upon binding to ACC or ACSL RNA were calculated from the frequency differences observed between the free and RNA-bound Nterm resonances. All of these frequency differences were then catalogued and compared using the equation $\Delta_{obs} = \left[\left(\Delta\delta_{HN}^{2} + (\Delta\delta_{N}/5)^{2}\right)/2\right]^{1/2}$, where $\Delta_{obs}$ can be fit relative to the amount of ligand in solution, as described previously (47). These data were fit using the Fielding equation.
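A commonly used form of the Fielding 1:1 binding isotherm is sketched below for reference. The notation is assumed here for illustration ($\Delta_{max}$ is the perturbation at saturation, and $[P]_t$ and $[L]_t$ are the total protein and RNA concentrations), and the exact parameterization used for the fits may differ.

```latex
\Delta_{obs} = \Delta_{max}\,
\frac{\left(K_d + [L]_t + [P]_t\right) -
      \sqrt{\left(K_d + [L]_t + [P]_t\right)^{2} - 4\,[P]_t\,[L]_t}}
     {2\,[P]_t}
```

This expression is hyperbolic in $[L]_t$ when $[P]_t \ll K_d$ and accounts for ligand depletion otherwise.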
In addition to fitting the NMR data using the hyperbolic Fielding equation, we performed nonlinear fitting using the sigmoidal Hill equation.
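For reference, the Hill equation is often written in the following form. The symbols are assumptions for illustration: $K_{0.5}$ is the RNA concentration giving a half-maximal perturbation and $n$ is the Hill coefficient, with $n > 1$ indicating positive cooperativity; the parameterization used for the fits may differ.

```latex
\Delta_{obs} = \Delta_{max}\,
\frac{[L]^{n}}{K_{0.5}^{\,n} + [L]^{n}}
```

With $n = 1$ this reduces to a simple hyperbola, so comparing fits with $n$ fixed at 1 and $n$ free is one way to judge whether a sigmoidal model is warranted.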
Origin 7.5 (OriginLab, Northampton, MA, USA) software was used to fit all of these data using both the Fielding and Hill equations as fitting models.
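As an illustration of the quantities involved in these fits, the sketch below implements the combined chemical shift perturbation together with assumed forms of the two binding models. It is written in Python rather than Origin, uses hypothetical function and parameter names, and takes the common 1:1 Fielding isotherm and the standard Hill expression as assumptions; it does not perform the fit itself.

```python
import math

def combined_csp(d_hn_ppm, d_n_ppm):
    """Combined 1H/15N perturbation with the 15N shift scaled by 1/5."""
    return math.sqrt((d_hn_ppm ** 2 + (d_n_ppm / 5.0) ** 2) / 2.0)

def fielding(ligand, protein, kd, d_max):
    """Assumed 1:1 binding isotherm with ligand depletion (same units throughout)."""
    b = kd + ligand + protein
    return d_max * (b - math.sqrt(b * b - 4.0 * protein * ligand)) / (2.0 * protein)

def hill(ligand, k_half, n, d_max):
    """Assumed Hill model; n > 1 gives a sigmoidal (cooperative) curve."""
    return d_max * ligand ** n / (k_half ** n + ligand ** n)

# Hypothetical titration: 50 uM protein, RNA:protein ratios from 0 to 1.5.
protein_um = 50.0
for ratio in (0.0, 0.5, 1.0, 1.5):
    rna_um = ratio * protein_um
    print(
        ratio,
        round(fielding(rna_um, protein_um, kd=45.8, d_max=0.2), 4),
        round(hill(rna_um, k_half=45.8, n=2.0, d_max=0.2), 4),
    )
```

Either model function can be passed to a least-squares optimizer to recover the binding parameters from a measured titration curve.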

Data availability
The assignments of the Nterm-ACC complex are deposited to the BMRB entry 28113. The Nterm-ACSL resonance assignments were previously deposited to the BMRB entry 18696. All other data are contained within this article.