Cocrystal Structures of Glycyl-tRNA Synthetase in Complex with tRNA Suggest Multiple Conformational States in Glycylation*

Background: The molecular basis for catalysis by human glycyl-tRNA synthetase (hGlyRS) is unclear. Results: hGlyRS-tRNA complex structures are reported, and the molecular details for enzymatic recognition are elucidated. Conclusion: hGlyRS catalysis involves multiple conformational changes, and insertions 1 and 3 may facilitate tRNA binding. Significance: Understanding the catalytic pathway also provides insights into the role of hGlyRS in disease. Aminoacyl-tRNA synthetases are an ancient enzyme family that specifically charges tRNA molecules with cognate amino acids for protein synthesis. Glycyl-tRNA synthetase (GlyRS) is one of the most intriguing aminoacyl-tRNA synthetases due to its divergent quaternary structure and abnormal charging properties. In the past decade, mutations of human GlyRS (hGlyRS) were also found to be associated with Charcot-Marie-Tooth disease. However, the mechanisms of traditional and alternative functions of hGlyRS are poorly understood due to a lack of studies at the molecular basis. In this study we report crystal structures of wild type and mutant hGlyRS in complex with tRNA and with small substrates and describe the molecular details of enzymatic recognition of the key tRNA identity elements in the acceptor stem and the anticodon loop. The cocrystal structures suggest that insertions 1 and 3 work together with the active site in a cooperative manner to facilitate efficient substrate binding. Both the enzyme and tRNA molecules undergo significant conformational changes during glycylation. A working model of multiple conformations for hGlyRS catalysis is proposed based on the crystallographic and biochemical studies. This study provides insights into the catalytic pathway of hGlyRS and may also contribute to our understanding of Charcot-Marie-Tooth disease.

Aminoacyl-tRNA synthetases (aaRSs) catalyze the aminoacylation of tRNA substrates in a two-step reaction by juxtaposing ATP, amino acids, and tRNAs, and the resulting aminoacylated tRNAs are used in protein synthesis by the ribosome. In the first step the specific amino acid is activated by reaction with ATP to produce an aminoacyl-adenylate intermediate (aa-AMP); in the second, the amino acid is covalently linked to the terminal adenosine residue of the cognate tRNA acceptor stem. The 24 aaRS families can be partitioned into two classes, mainly distinguished by their oligomeric structures (1-4). Class I enzymes possess the Rossmann fold as well as two highly conserved sequences, whereas class II enzymes are characterized by three conserved signature motifs at the active site. Glycyl-tRNA synthetase (GlyRS) belongs to class II, but unlike other aaRS members, the quaternary structure of GlyRS is not conserved phylogenetically. Specifically, eukaryotic and archaebacterial GlyRSs mainly form α2 homodimers and belong to subclass IIA, and eubacterial GlyRSs mainly form α2β2 heterotetramers and belong to subclass IIC (5-9). These two distinct types of enzymes do not share significant sequence homology. The IIA aaRSs are specific for hydrophobic and small polar amino acids. Their α2 homodimeric structures generally share a C-terminal anticodon binding domain with the exception of seryl-tRNA synthetase (SerRS). In contrast, the α2β2 tetrameric structures vary greatly within the IIC subclass, and GlyRS is one of the most divergent synthetases among all class II aaRSs (10). In addition, GlyRSs only aminoacylate tRNA molecules from their own domains of life and do not function across species. Interestingly, the only major difference in these tRNAs is the discriminator base at position 73, the base preceding the 3′-CCA end. Eukaryotic tRNA Gly substrates always have an adenosine at this position, whereas their prokaryotic counterparts always have a uridine (5,6,11).
Mutational studies demonstrated that other than the discriminator base, the first three base pairs in the acceptor stem (especially the G1-C72 base pair) as well as the anticodon nucleotides C35 and C36 contribute greatly to glycylation activity and serve as the identity elements of tRNA Gly in bacteria and yeast (8,12). Despite growing research interest in this protein, the structural basis of its biochemical properties remains unexplained.
The first apoGlyRS crystal structure solved was Thermus thermophilus GlyRS (TtGlyRS, Protein Data Bank (PDB) code 1ATI) (13). Although of bacterial origin, TtGlyRS forms the α2 homodimeric structure. The catalytic domain of TtGlyRS contains a core antiparallel β-sheet flanked by α-helices and is identified by three diagnostic sequence motifs. The α-subunit structures of the α2β2 GlyRSs from Thermotoga maritima (deposited in the PDB without a publication, code 1J5W) and Campylobacter jejuni were also reported (PDB codes 3RF1, 3RGL, and 3UFG) (14). C. jejuni is a human pathogen that causes diarrhea and enteritis. C. jejuni GlyRS consists of an N-terminal catalytic domain, a C-terminal three-helix bundle, and a linker in between. The catalytic domain resembles the typical active site of class II aaRSs, and the three-helix bundle domain may contribute to the formation of the heterotetramer. It was proposed that a stable α2β2 tetrameric structure may require extensive interactions between the α- and β-subunits, and thus both subunits are required for full enzymatic activity (14).
In the past decade, missense mutations of human GlyRS (hGlyRS) were found to be associated with Charcot-Marie-Tooth (CMT) subtype 2D (CMT-2D) and distal hereditary motor neuropathy-V (dHMN-V), both of which are hereditary diseases of the peripheral nervous system. They are characterized by progressive weakness and atrophy in the hands and feet, but the latter is distinguished from CMT (especially CMT2) only by the absence of sensory loss (15,16). CMT is one of the most commonly inherited neurological disorders, affecting ~1 in 2500 people (17). CMT can be further divided into two categories; type 1 is a demyelinating neuropathy, whereas type 2 is axonal (18). CMT-2D begins only after young adulthood, and unlike other CMTs, it typically causes more severe symptoms in the hands (19). Recent advances in human genetics and mouse models have indicated that GARS is the disease gene (20). To date, 16 missense mutations have been discovered (20-23), but the etiology is not clear. We previously solved the structures of apohGlyRS as well as a CMT-causing mutant G526R and studied their roles in the disease (PDB codes 2PME and 2PMF) (24,25). Structural analysis suggested that the CMT mutations may disrupt the dimer interface of hGlyRS, and this finding may be connected to disease pathogenesis. The catalytic domain of hGlyRS is conserved, with motif 1 forming part of the dimeric interface, and motifs 2 and 3 contributing conserved charged and polar side chains that recognize the substrates glycine and ATP. Additionally, hGlyRS possesses an N-terminal WHEP-TRS domain (an acronym for synthetases that carry this domain, TrpRS (W), HisRS (H), and GluProRS (EP)) as well as several insertion domains, although the WHEP-TRS and insertion 3 domains are not resolved in the structure. The WHEP-TRS domain is a unique aaRS domain in metazoans. This domain is highly flexible and folds into a helix-turn-helix structure.
It is also found in other human tRNA synthetases and plays critical roles in a variety of processes (26,27). Insertion 1 (Ala-145-Asn-225) is a GlyRS-specific domain absent in other class IIA enzymes. It was proposed to interact with the minor groove of the acceptor stem of tRNA Gly (13). Insertion 1 is disordered in the apoTtGlyRS structure but well ordered in apohGlyRS. This domain in the long form of yeast GlyRS (GRS1) is rich in lysine, and its deletion from GlyRS1 reduced aminoacylation activity by up to 9-fold (28). Like the WHEP-TRS domain, this domain is highly flexible and undergoes conformational changes during catalysis when ATP or analogs are bound (29). Insertions 2 (His-318-Asn-349) and 3 (Val-440-Val-504) are extra domains missing from GlyRSs of lower organisms (24). They have rarely been studied, and their functions remain obscure.
The previous crystallographic structures provide structural information on the enzymatic recognition of glycine and ATP, but the mechanism by which GlyRSs recognize the identity elements of their tRNA substrates is unknown because no cocrystal structure of a GlyRS bound to tRNA Gly has been available. Here, we report the crystal structures of hGlyRS in the tRNA-bound form and describe the recognition of these identity elements as well as of glycine and ATP analogs. In addition, we propose a working model for the aminoacylation pathway of hGlyRS. By studying the glycylation functions of hGlyRS, we hope to shed light on the disease mechanism.
The gene encoding the full-length TtGlyRS protein (GenBank accession number AAS80523.1) was amplified from the genomic DNA of T. thermophilus strain HB27 using the primers 5′-GATAGGGCCATATGCCTGCGAGCAGCCTGGACGAA-3′ and 5′-AATATGGCGGCCGCCCACCTAAGCCTCTCCCGAAGGAA-3′. The DNA fragment was cloned into the expression vector pET-21b(+) using the NdeI and NotI restriction sites. Eleven amino acids (AAALEHHHHHH), including a hexahistidine tag, were added to the C terminus.
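As an illustrative check of the cloning design described above, the short sketch below looks for the NdeI (CATATG) and NotI (GCGGCCGC) recognition sequences in the two primers. The primer strings are taken from the text with the line-break hyphens removed, which is an assumption about the typesetting; the function name `find_sites` is hypothetical and not part of the original work.

```python
# Recognition sequences of the two restriction enzymes named in the text.
SITES = {"NdeI": "CATATG", "NotI": "GCGGCCGC"}

# Primer sequences from the text; hyphens are assumed to be line-break
# artifacts and have been removed.
FORWARD = "GATAGGGCCATATGCCTGCGAGCAGCCTGGACGAA"
REVERSE = "AATATGGCGGCCGCCCACCTAAGCCTCTCCCGAAGGAA"


def find_sites(primer):
    """Return {enzyme: 0-based offset} for each recognition site found."""
    hits = {}
    for enzyme, site in SITES.items():
        pos = primer.upper().find(site)
        if pos != -1:
            hits[enzyme] = pos
    return hits


if __name__ == "__main__":
    print("forward primer:", find_sites(FORWARD))
    print("reverse primer:", find_sites(REVERSE))
```

Under these assumptions the forward primer carries only the NdeI site and the reverse primer only the NotI site, consistent with directional cloning into pET-21b(+).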
The expression and purification of GlyRS was similar to the protocol described by Xie et al. (30) with a few modifications. Briefly, a 2-liter culture of Luria-Bertani broth containing 50 μg/ml ampicillin was inoculated with a 20-ml overnight culture of Escherichia coli BL21 (DE3) and grown at 37°C to an A600 of 0.8. The expression of GlyRS was induced by the addition of 0.5 mM isopropyl 1-thio-β-D-galactopyranoside, and the cells were grown for 18 h at 37°C before harvest. For the final stage of purification, the concentrated protein was loaded onto a Superdex 200 column (GE Healthcare) and eluted with a buffer containing 20 mM HEPES (pH 7.5), 150 mM NaCl, and 1 mM DTT. The pure protein was concentrated to 6 mg/ml before being flash-frozen and stored at −80°C. The purification process for E71GSFΔIns1 and E71GSFΔIns3 was the same as described above except that the buffer pH was changed to 8.5 throughout. For aminoacylation activity assays of the mutants, 10% glycerol was added to the concentrated mutant proteins before they were frozen. Except for the nonexpressing mutants R633A, R633K, D619A, and D619N, all other mutants eluted with a symmetrical peak on the size-exclusion column, suggesting that they were well folded.
In Vitro Transcription of tRNA Substrate-Synthetic DNA oligos corresponding to the T7 promoter plus tRNA Gly(CCC)-encoding sequences from humans and T. thermophilus as well as E. coli (supplemental Table S2) were ligated into the pUC18 vector using the restriction sites HindIII and XbaI. The transcription template was obtained by PCR amplification of the ligated DNA fragments. Each PCR product was extracted by phenol and precipitated with 95% ethanol after storage at −80°C for 2 h. The precipitated dry DNA pellet was redissolved in diethyl pyrocarbonate-treated water to a concentration of ~400 μg/ml. The in vitro transcription was carried out at 37°C for 3 h in a buffer containing 2.5 mM concentrations of each NTP, 20 mM Tris-HCl (pH 8.0), 150 mM NaCl, 20 mM MgCl2, 5 mM DTT, 1 mM spermidine, and 0.3 μM T7 RNA polymerase. The tRNA transcript was purified by a 10% denaturing urea-PAGE gel, extracted, and precipitated by ethanol. The RNA pellet was washed and redissolved in Tris-EDTA buffer containing 20 mM Tris-HCl (pH 7.5) and 1 mM EDTA. The tRNA was annealed by heating to 65°C and allowed to cool to room temperature after the addition of 10 mM MgCl2. The annealed RNA was aliquoted and stored at −80°C for further use.
Crystallization, Data Collection, and Structure Determination-For formation of the complex, E71GSF was mixed with tRNA Gly at a 1:1.2 molar ratio, and 4 mM glycine, 4 mM adenosine 5′-(β,γ-imido)triphosphate (AMPPNP), 5 mM β-mercaptoethanol, and 5 mM MgCl2 were added. The complex was incubated on ice for 30 min and filtered before crystallization. Cocrystals were obtained in 32% PEG 600, 0.1 M NaCl, and 0.1 M MES (pH 6.5). After optimization, the best crystals were produced by mixing the sample of the complex, reservoir solution, and additive E9 from the Silver Bullets screen (31) (0.2% w/v 1,4-diaminobutane, 0.2% w/v cystamine dihydrochloride, 0.2% w/v diloxanide furoate, 0.2% w/v sarcosine, 0.2% w/v spermine, and 0.02 M HEPES sodium (pH 6.8)) at a 2:1:1 ratio (v/v). The GlyRSSF complex was crystallized under the same conditions except that 4 mM glycine and 4 mM AMPPNP were replaced by 1.5 mM glycine and 1.5 mM ATP, respectively. All crystals were grown at 25°C, and the fully grown crystals were soaked for 1-3 min in a cryoprotective solution containing all the components of the reservoir solution plus 20% glycerol (v/v). The soaked crystals were mounted on nylon loops and flash-frozen in liquid nitrogen.
Native data were collected from frozen crystals at −173°C using Beamline 17U (BL17U) at the Shanghai Synchrotron Radiation Facility (SSRF, Shanghai, China). The data were processed with the program HKL2000 (32), and the cocrystals belong to space group P2₁2₁2. The structure of the complex was solved by molecular replacement using Phenix (33). Both the coordinates of wild type apoGlyRS (PDB code 2PME) (19) and the coordinates of tRNA Val (PDB code 1GAX) (34) were used as the models, and both components were searched simultaneously. The initial models generated by molecular replacement were manually built with the program Coot (35) and fed to the refinement program phenix.refine (36). Multiple cycles of refinement alternated with model rebuilding. Translation-Libration-Screw (TLS) refinement was carried out in the later stages of the refinement using nine TLS groups as defined by the TLS motion determination server for both complex structures (37). The final R factor was 23.50% (Rfree = 28.60%) for the E71GSF-tRNA Gly -glycine-AMPPNP complex and 22.60% (Rfree = 27.70%) for the GlyRSSF-tRNA Gly -AMP complex (Table 1). The Ramachandran plots of the final models have 91.7, 6.55, and 1.75% of residues in the most favorable, generously allowed, and disallowed regions for the E71GSF complex and 94.89, 4.44, and 0.67% of residues in the most favorable, generously allowed, and disallowed regions for the GlyRSSF complex, respectively, as indicated by the program MolProbity (38). All the figures were created with PyMOL, and the charge distribution on the E71GSF surface was calculated by APBS (39). The domain architecture was prepared by DOG (40).
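For readers less familiar with the refinement statistics quoted above, the following minimal sketch shows the conventional definition of the crystallographic R factor, R = Σ||Fo| − |Fc|| / Σ|Fo|. Rfree is the same quantity evaluated on a test set of reflections excluded from refinement. The amplitudes in the example are invented toy numbers, not data from this study, and the scale factor between observed and calculated amplitudes is assumed to have been applied already.

```python
def r_factor(f_obs, f_calc):
    """Conventional R factor: sum(| |Fo| - |Fc| |) / sum(|Fo|).

    f_obs and f_calc are equal-length sequences of structure-factor
    amplitudes, assumed to be on the same scale.
    """
    if len(f_obs) != len(f_calc) or len(f_obs) == 0:
        raise ValueError("need two non-empty lists of equal length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical amplitudes for four reflections (illustration only).
    toy_obs = [100.0, 50.0, 25.0, 25.0]
    toy_calc = [90.0, 55.0, 30.0, 20.0]
    print(f"R = {100 * r_factor(toy_obs, toy_calc):.2f}%")
```

Refinement programs such as phenix.refine compute these statistics internally with additional scaling and weighting, so this sketch is only meant to make the reported percentages easier to interpret.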
Aminoacylation Assay-The assay mixture contained 150 mM HEPES (pH 7.5), 20 mM KCl, 4 mM MgCl2, 2 mM DTT, 2 mM ATP, 20 μM L-glycine, 2 μM L-[3H]glycine, and 5 μM annealed tRNA Gly(CCC). GlyRS or mutants were added to 0.5 μM to initiate the reaction. The reaction was carried out at ambient temperature, and aliquots were removed at the designated time points, spotted onto trichloroacetic acid (TCA)-soaked filter pads, and washed twice with 5% cold TCA. The filter pads were dried and measured by scintillation counting.

RESULTS
Overview of the Complex-We crystallized E71GSF in complex with tRNA Gly(CCC) in the presence of glycine and a nonhydrolyzable inhibitor AMPPNP and determined the structure to a resolution of 3.25 Å. We also determined the structure of the GlyRSSF-tRNA-AMP complex, crystallized under a similar condition but to a slightly lower resolution (3.30 Å, Table 1). The two complexes are structurally equivalent (supplemental Fig. S1), with a root mean square deviation of 0.50 Å over 457 Cα atoms.
JULY 18, 2014 • VOLUME 289 • NUMBER 29

JOURNAL OF BIOLOGICAL CHEMISTRY 20361
We therefore describe the structure of the E71GSF-tRNA Gly -glycine-AMPPNP quaternary complex, which has the lower overall temperature factor, unless otherwise specified.
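The root mean square deviation quoted above for the two complexes can be illustrated with a minimal sketch. It assumes the two sets of Cα coordinates are already superimposed and listed in matching order; the optimal superposition step (for example by the Kabsch algorithm) is deliberately omitted, and the coordinates below are toy values rather than atoms from either structure.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) tuples.

    Assumes the coordinate sets are already superimposed and paired
    atom by atom; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Two hypothetical pairs of C-alpha positions, displaced by 0.5 A each.
    model_1 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    model_2 = [(0.0, 0.0, 0.5), (1.0, 0.0, -0.5)]
    print(f"RMSD = {rmsd(model_1, model_2):.2f} A")
```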
To reduce the interference of the flexible N-terminal WHEP-TRS domain on crystallization, we truncated this domain and named the resulting fragment E71GSF (Fig. 1A). The WHEP-TRS domain is disordered in numerous reported hGlyRS crystal structures (24,29), and the removal of this domain does not affect enzymatic activity (24,30). The E71GSF construct without the cloning sites or the C-terminal His6 tag contains 627 residues. We solved the complex structure by molecular replacement and could trace the entire anticodon binding domain and most of the catalytic domain as well as tRNA Gly but failed to resolve insertions 1 and 3 or the last 4 nucleotides of tRNA Gly. In addition, residues 382-386 and the C-terminal His6 tag of the protein are also disordered. The asymmetric unit contains one protein molecule and one tRNA molecule (Fig. 1B). The protein structure in the complex is very similar to that of the apoprotein in the catalytic core, which consists of the three characteristic motifs of class II aaRSs. The active site contains eight mixed β-strands, and the C-terminal anticodon binding domain is a globular α + β domain. We were able to resolve a few more residues than in the apoprotein around the insertion 3 region, most likely due to the binding of the tRNA molecule. The extra density forms two long antiparallel β-strands (Gln-508-Val-516 and Ala-428-Asn-439), extending to the solvent. Therefore, the protein appears to exhibit an elongated shape in the tRNA-bound form. tRNA mainly binds to the active site and the anticodon binding domain of hGlyRS. An analysis of the surface charges indicated that the two molecules complement each other well (Fig. 1C).
hGlyRS is an α2 homodimeric enzyme, and the molecular dimer axis coincides with the crystallographic 2-fold axis. The complex dimerizes the same way as the free protein, utilizing three regions as the interface, which includes the entirety of motif 1 and part of motif 3. tRNA Gly substrate binds to the protein dimer in a symmetrical fashion, interacting with both subunits (Fig. 1D). The majority of the contacts come from subunit 1, burying a surface area of 3243.5 Å². Specifically, tRNA Gly forms contacts with subunit 1 of the dimer through the anticodon loop region, the acceptor stem, and the D-stem (Table 2). The contact area with subunit 2 is much smaller (1358.3 Å²) and mainly occurs via the D-stem and the variable regions. All the cross-subunit interactions involve only the tRNA sugar rings or the phosphate backbone, and none of them is base-specific (Table 2).
Recognition of Acceptor Stem-The 3′ terminus of the tRNA Gly acceptor stem forms a stacked A-form conformation, but the last four residues, ACCA, are not visible. The enzyme accesses the acceptor stem of the tRNA from the major groove, as expected for class II aaRSs. The in vitro transcription was primed by a GTP nucleotide, and the first G shows clear electron density for a triphosphate group. The acceptor stem is positioned at the entrance of the active site, and hydrogen bonds are formed between the invariant residue Arg-283 and G1 and between the highly conserved Ser-281 and G1 (Fig. 2A, supplemental Fig. S2, and Table 2). The R283A and R283K mutants retain only 2 and 20% glycylation activity, respectively, whereas S281A is 25% as active as E71GSF (supplemental Fig. S3 and Fig. 2B), consistent with the findings of Nameki et al. (12). Other interactions between Gln-82, Ser-91, and nucleotides C70, G66, and C67 contribute little to aminoacylation activity (supplemental Fig. S3 and Fig. 2B).
Because tRNA Gly is missing the last four nucleotides in our cocrystal structure, it is still unclear whether its CCA end could reach the active site. To address this question, we generated a model of the nonorthogonal GlyRS-tRNA Thr complex by superimposing tRNA Thr in its productive complex form (PDB code 1QF6) (41) with tRNA Gly in our cocrystal structure (supplemental Fig. S4). The backbones of the two tRNA molecules align well. The acceptor end of the full-length tRNA Thr comes into contact with the catalytic core of GlyRSSF, and A76 is positioned directly into the active site, ready for ligation to the substrates. The results of this modeling suggest that our cocrystal structure represents a productive complex and that tRNA Gly does bind to GlyRS in a proper orientation ready for aminoacylation.
In addition to the RNA disorder, insertion 1 is almost completely missing from the structure. Modeling studies revealed that this domain poses steric clashes with the tRNA acceptor arm, suggesting large conformational changes of insertion 1 upon tRNA binding, which will be discussed later.
Recognition of Anticodon Loop-Human tRNA Gly harbors two identity elements in the anticodon loop, C35 and C36 (12). The anticodon loop is a major region of contact: the buried surface area between the protein and RNA here is 1820 Å², accounting for 56% of the total buried surface. Bases 34-37 are flipped out of the loop and trapped in different isolated pockets (Fig. 2C). Unmodified C34 is the first base in the anticodon triplet of tRNA Gly and forms a base pair with the wobble base G of mRNA codons. Therefore, interactions with this base are not specific, and only one hydrogen bond is formed with the ribose (Table 2). In contrast, C35 and C36 establish a broad network of specific interactions with the surrounding residues, and their hydrogen bonding capacity has been almost fully reached. Specifically, C35 not only interacts with Gln-675 through its phosphate oxygen but also with Tyr-604, Thr-617, Asp-619, and Thr-631 through the pyrimidine ring. Mutations of these invariant residues reduced enzymatic activity by >10-fold except for the T617A mutation (Fig. 2B). Similarly, C36 hydrogen bonds with residues Gln-640, Arg-548, Met-638, and Arg-633 through its base ring as well as with residue Arg-548 through its ribose. To test the importance of these residues, we created the D619A, D619N, R633A, R633K, Y604F, T617A, Q640A, Q675A, and Q675N mutants and analyzed their activities. Except for the nonexpressing mutants R633A, R633K, D619A, and D619N, mutations of these invariant residues reduced enzymatic activities by >10-fold (Fig. 2B). We did not mutate Met-638 because Met-638 interacts with C36 through the carbonyl oxygen, and we did not expect large changes in the backbone position from point mutations. A37 is usually unmodified in tRNA Gly substrate (42), and it forms hydrogen bonds with Glu-609, Arg-548, Gln-547, and Arg-602. Base C38 is stabilized by non-Watson-Crick hydrogen bonds with U32.
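The 56% figure above can be reproduced by simple arithmetic if "total buried surface" is taken to mean the area buried with subunit 1 (3243.5 Å²) rather than with both subunits. That reading is our inference from the numbers, not a statement made in the text. The sketch below, with hypothetical constant names, compares both readings.

```python
# Buried surface areas quoted in the text, in square angstroms.
AREA_SUBUNIT_1 = 3243.5   # tRNA contacts with subunit 1
AREA_SUBUNIT_2 = 1358.3   # tRNA contacts with subunit 2
AREA_ANTICODON = 1820.0   # anticodon loop contacts


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


if __name__ == "__main__":
    vs_subunit_1 = percent(AREA_ANTICODON, AREA_SUBUNIT_1)
    vs_both = percent(AREA_ANTICODON, AREA_SUBUNIT_1 + AREA_SUBUNIT_2)
    print(f"relative to subunit 1 only: {vs_subunit_1:.1f}%")
    print(f"relative to both subunits:  {vs_both:.1f}%")
```

Only the first ratio rounds to 56%, which suggests that the percentage was calculated against the subunit 1 interface.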
Active Site-The substrate glycine and the ATP analog AMP-PNP fit snugly in the active site of the E71GSF protein. AMP-PNP adopts a bent conformation with the purine ring sandwiched between Arg-529 and Phe-292 (Fig. 2D). The N6 atom of the purine contacts Glu-279 and Val-289, whereas N1 contacts Ile-287. The sugar ring hydrogen bonds with Ile-404 and Ser-524, and the γ-phosphate hydrogen bonds with His-378. Glycine is poised for attack on the ATP molecule, within hydrogen-bonding distance from the bridging oxygen of the α-phosphate group. The positively charged amino group of glycine forms hydrogen bonds with Glu-522 and Glu-245, whereas its carboxylate group accepts a hydrogen bond from Ser-524.
The active site of the GlyRSSF complex is very similar to that of E71GSF. We replaced AMPPNP with ATP for crystallization because the latter tended to generate better crystals and produced higher quality diffraction data. In the final refined structure, we found electron density only for AMP in the substrate binding pocket (supplemental Fig. S5). Therefore, the cocrystal structure of the GlyRSSF complex is most likely in a product-bound form, representing the final stage of glycylation. Compared with the E71GSF complex, the positions of the key residues are well conserved, and the backbone interactions of the carbonyl oxygen from residues Val-289, Ile-287, and Ile-404 are retained (supplemental Fig. S6). The adenine ring is still stacked between Arg-529 and Phe-292, and the side chain of Arg-277 forms a salt bridge with the α-phosphate oxygen. The residues that interact with the substrate glycine as well as the β- and γ-phosphates in the E71GSF complex reorient their side chains except for Glu-245, and these reorientations are most likely induced by specific interactions with the small substrates.

The interactions observed in both cocrystal structures are reminiscent of the GlyRS-glycine-ATP ternary complex (29). However, the glycine binding loop is ordered in the GlyRS-glycine-ATP ternary complex, whereas in our cocrystal structures this loop is unstructured even in the presence of AMP-PNP or AMP.
Conformational Changes of the Complex-Both the tRNA substrate and the protein undergo large conformational changes during catalysis. Insertion 3 was disordered in previously determined free GlyRS structures (24,29). Our structural analysis by superimposition of the protein molecules with and without bound tRNA suggests that this domain from subunit 2 interacts with the variable region and the D-stem of the tRNA molecule. Insertion 1 becomes dislodged and disordered to avoid possible steric clashes with the acceptor stem of tRNA (Fig. 3A). Additionally, the anticodon binding domain, especially α14 (Gln-571-Arg-586), α16 (Phe-620-Val-623), and α17 (Leu-648-Ala-656), also undergoes substantial local repositioning. This domain shifts without changing its fold, and the translation around residue Gln-569 is as large as 6.0 Å. Furthermore, the structural changes are not limited to one subunit. The two monomers apparently approach each other, and the tRNA-bound GlyRS dimer in the complex has a larger interface than that of the free GlyRS dimer (4581.1 Å² versus 2721.6 Å²) (Fig. 3B) due to the movement of α2, α12, α13, and α15. This larger dimer interface is most likely induced by tRNA binding because the GlyRSSF-tRNA Gly -AMP complex forms a similar, larger interface.
In addition to the structural alterations observed for the protein, the tRNA molecule also exhibits substantial deformation at the anticodon loop compared with free tRNA Phe (Fig. 3C). The loop undergoes remarkable unwinding, and a rotation of ~85° occurs.
Cross-aminoacylation by hGlyRS-The conservation of U73 among prokaryotic tRNA Gly substrates and A73 among eukaryotic tRNA Gly substrates suggests that base 73 is essential for species-specific recognition of tRNA by GlyRSs. From our cocrystal structure, it is not immediately clear how eukaryotic enzymes achieve selectivity for adenosine at position 73 due to the disorder in the 3′ terminus of tRNA Gly and insertion 1 of hGlyRS. However, examination of the structure identifies a number of potential candidate residues for this specific interaction, including Ser-281, Arg-283, and Arg-288. All three residues are either invariant or highly conserved. Intriguingly, Ser-281 is substituted by a threonine residue in TtGlyRS, which is capable of charging tRNA Gly substrate from humans, T. thermophilus, and E. coli (8). We wondered if the Ser-to-Thr conversion could confer specificity changes during the charging reaction. An activity test demonstrated that S281T did improve the aminoacylation efficiency of hGlyRS by up to 74% toward E. coli tRNA Gly and also improved it slightly toward T. thermophilus tRNA Gly substrate (supplemental Figs. S3 and S7) at the 15-min time point, suggesting that this residue plays a pivotal role in cross-aminoacylation. However, switching the substrate specificity could also rely on the local environment of Ser-281, which may generate synergistic effects.
Impact of Insertions 1 and 3 on Glycylation-We investigated the roles of insertions 1 and 3 in glycylation. After removing insertion 1 or 3 from the E71GSF protein, we tested the charging activities of the resulting deletion mutants E71GSFΔIns1 and E71GSFΔIns3. Various deletion lengths were designed, and we found that the appropriate deletions are Ala-145-Asn-225 for insertion 1 and Val-441-Val-504 for insertion 3 in terms of protein expression and folding properties (supplemental Fig. S3). The charging assay revealed that although E71GSFΔIns3 retained ~70% of the activity of E71GSF, E71GSFΔIns1 had only residual activity (1.5% of E71GSF, supplemental Fig. S8). We further created the equivalent deletion of insertion 1 in TtGlyRS (TtGlyRSΔIns1), and the truncation also led to an ~80% loss of its activity toward T. thermophilus tRNA Gly (supplemental Figs. S3 and S8). These findings reflect the significance of insertions 1 and 3 in aminoacylation. Although both domains contribute to the aminoacylation activity of hGlyRS, the former plays a more important role than the latter.

DISCUSSION
Recognition of the Identity Elements of tRNA-Although it is responsible for the ligation of the simplest amino acid to tRNA, GlyRS is quite complicated in quaternary structure, falling into both IIA and IIC subclasses. To date, of the 13 class II aaRS families, only seven tRNA cocrystal structures have been solved, including the two with only partially resolved tRNA structures for ProRS and SepRS (43). To obtain a suitable construct for crystallization, we analyzed the properties of CMT-2D/dHMN-V-causing variants. Most mutations convert a negatively charged residue to a neutral one or a neutral residue to a positively charged one. We wondered whether these mutants could form tighter complexes with the negatively charged tRNA Gly than the WT enzyme and thereby aid in crystallization. A Dali search of the hGlyRS (PDB code 2PME) for structural neighbors indicates that its closest structural homolog is threonyl-tRNA synthetase in its tRNA-bound form (41). Therefore, we generated the nonorthogonal GlyRS-tRNA Thr complex model by superimposing the two protein structures. Both enzymes belong to class IIA and share a conserved catalytic core. tRNA Thr is well accommodated in the core of hGlyRS except for some minor clashes with insertion 1 in the model. Superposition of the two proteins places four CMT-2D/dHMN-V-causing residues, Glu-71, Ile-280, Gly-598, and Cys-157, in close vicinity to tRNA Thr (supplemental Fig. S9). Among these mutants, only two CMT mutations, E71G and C157R, might increase the electrostatic interactions with the tRNA substrate. Therefore, we chose to pursue E71G, a mutant with full glycylation capability (20), for crystallization trials of the complex. We obtained crystals of the E71GSF complex and the GlyRSSF complex and subsequently determined both cocrystal structures.
FIGURE 2. Substrate recognition by hGlyRS. All the 2Fo − Fc maps were contoured at 1σ. The coloring scheme is the same as in Fig. 1A, and hydrogen bonds are shown as dotted lines. Residues from signature motifs 1-3 are colored hot pink, yellow, and wheat, respectively. A, the specific interactions of hGlyRS with the first base pair G1-C72 from the acceptor stem (purple). G1 is in the triphosphate form. B, time course of the relative aminoacylation activities of GlyRS mutants as well as GlyRSSF. Three sets of data are shown representing the measurements at 2 (brown)-, 5 (green)-, and 15 (purple)-min time points. The activity of E71GSF at the 15-min time point was regarded as 100%, and the activities of the mutants at all time points were normalized against this value. The readings at time point zero were used as blanks, and the error bars represent S.D. calculated from two measurements. C, the recognition of the anticodon loop bases C34-C38 (purple) by the ACBD residues (cyan) and catalytic domain residues (blue). Bases 34-37 are splayed out of the loop. D, the interactions with AMPPNP and glycine substrates at the active site. Glycine is poised to attack the α-phosphate oxygen.
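The normalization described in the Fig. 2B legend (zero-time readings used as blanks, E71GSF at 15 min set to 100%) can be sketched as follows. The function name and the scintillation counts are hypothetical and serve only to illustrate the arithmetic; how duplicate measurements were averaged is not specified in the text and is not modeled here.

```python
def relative_activity(counts, blank, reference_counts, reference_blank):
    """Blank-corrected activity as a percentage of the reference.

    counts / blank:            reading and zero-time reading for the sample
    reference_counts / _blank: the same for the reference enzyme at the
                               reference time point (E71GSF at 15 min)
    """
    reference_signal = reference_counts - reference_blank
    if reference_signal <= 0:
        raise ValueError("reference signal must exceed its blank")
    return 100.0 * (counts - blank) / reference_signal


if __name__ == "__main__":
    # Hypothetical counts per minute, for illustration only.
    ref_15min, ref_blank = 10200.0, 200.0
    mutant_15min, mutant_blank = 2200.0, 200.0
    pct = relative_activity(mutant_15min, mutant_blank, ref_15min, ref_blank)
    print(f"mutant activity: {pct:.0f}% of E71GSF")
```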
The tRNA molecule retains the general inverted "L" shape, and all the nucleotides are visible except for the last four nucleotides. To confirm that we have a productively bound tRNA Gly, we generated the nonorthogonal GlyRS-tRNA Thr complex model by superimposing the two tRNA molecules and validated the correct orientation of the tRNA substrate. In addition, detailed structural information on the extensive interactions between hGlyRS and tRNA Gly followed by the activity assay further supported the productive conformation observed in the cocrystal structures. Specifically, the tRNA molecule binds to a large positive region on the enzyme surface consisting of the anticodon-binding domain, the active site, and very likely insertion 3′ from subunit 2 (across the subunit). The important identity elements on tRNA readily recognized by the enzyme include the first pair G1-C72 and the anticodon bases C35 and C36, which is in agreement with previous findings (8,12). Through activity assays, we found that the residues that contribute most to specific tRNA recognition are mainly located in motif 2, the N terminus, and the anticodon binding domain (Table 2). The contact area from subunit 1 is much larger than that from subunit 2, partly due to the disorder of insertion 3′. Interestingly, all the interactions from subunit 2 are nonspecific interactions with the phosphate backbone and the ribose rings. However, via the specific recognition of the identity elements and the nonspecific cross-subunit interactions with the D-stem and variable region of tRNA, the two hGlyRS subunits steadily hold the inside and outside of L-shaped tRNA molecules (Fig. 1, B and D). Therefore, the binding mode provided by the dimer is necessary and sufficient for the correct tRNA orientation needed for glycylation and also justifies the α2 homodimeric structure of hGlyRS.

Roles of Insertions 1 and 3 in Catalysis and the Catalytic Mechanism-aaRSs are ancient enzymes found in all forms of life. Through millions of years of evolution, the eukaryotic enzymes may have acquired many new domains to enhance their performance (26,27). There are three insertion sequences in hGlyRS, called insertions 1-3, and their roles have not been thoroughly investigated. Insertion 1 normally plays the role of a "gatekeeper" during glycylation; in the apoenzyme form, it partially covers the catalytic pocket and forms contacts with the glycine binding loop. Once the small substrates (ATP/glycine) bind, insertion 1 moves closer to this loop (29), possibly to prevent the hydrolysis of the high energy adenylate intermediate.
As soon as tRNA is bound, insertion 1 moves away to avoid steric clashes with the acceptor stem, switching to an "open" state (Fig. 3A). Additionally, this domain might also interact with the minor groove of the acceptor stem once tRNA binds. Therefore, insertion 1 adopts multiple conformations, and its flexibility is also indicated by the high B-factor of this domain, averaging 74.3 Å² in the apoWT enzyme structure compared with an average B-factor of 55.0 Å² for the entire WT protein. Similarly, insertion 3 is flexible, and it is in a more elongated or open state when tRNA is bound (Fig. 1, B and D). Its role may be to provide support for tRNA binding through backbone interactions during catalysis. Aminoacylation assays of E71GSF lacking either domain show that deletion of insertion 1 has a significant impact on activity, whereas deletion of insertion 3 has less impact (supplemental Fig. S8). The effects of insertion 1 on charging efficiency may be due to its capacity to bind both ATP/glycine and tRNA. Therefore, both insertions contribute to aminoacylation by promoting local substrate binding but to different extents.
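The B-factor comparison above (a domain average against the whole-protein average) can be illustrated with a minimal Python sketch. It assumes standard fixed-column PDB ATOM records; the helper names (`make_atom_line`, `parse_b_factors`, `mean_b_factor`), the residue window, and the demo values are hypothetical and are not taken from the deposited structures or from the procedure used in this study.

```python
from statistics import mean


def make_atom_line(serial, name, res_name, chain, res_num, x, y, z, b):
    """Build a synthetic fixed-column PDB ATOM record (hypothetical demo helper)."""
    return (
        f"ATOM  {serial:>5} {name:<4} {res_name:>3} {chain}{res_num:>4}    "
        f"{x:>8.3f}{y:>8.3f}{z:>8.3f}{1.00:>6.2f}{b:>6.2f}"
    )


def parse_b_factors(pdb_lines):
    """Yield (chain, residue number, B-factor) for each ATOM record."""
    for line in pdb_lines:
        if line.startswith("ATOM"):
            chain = line[21]                # column 22: chain identifier
            res_num = int(line[22:26])      # columns 23-26: residue number
            b_factor = float(line[60:66])   # columns 61-66: temperature factor
            yield chain, res_num, b_factor


def mean_b_factor(pdb_lines, chain=None, start=None, end=None):
    """Average B-factor over atoms in an optional chain and residue window."""
    values = [
        b
        for c, r, b in parse_b_factors(pdb_lines)
        if (chain is None or c == chain)
        and (start is None or r >= start)
        and (end is None or r <= end)
    ]
    if not values:
        raise ValueError("no atoms matched the selection")
    return mean(values)


# Synthetic records with made-up B-factors (NOT data from this study).
demo_lines = [
    make_atom_line(1, "CA", "GLY", "A", 10, 0.0, 0.0, 0.0, 50.0),
    make_atom_line(2, "CA", "ALA", "A", 11, 1.0, 0.0, 0.0, 60.0),
    make_atom_line(3, "CA", "LEU", "A", 130, 2.0, 0.0, 0.0, 70.0),
    make_atom_line(4, "CA", "ASP", "A", 131, 3.0, 0.0, 0.0, 80.0),
]

# Hypothetical residue window, chosen only to illustrate a domain selection.
domain_mean = mean_b_factor(demo_lines, chain="A", start=129, end=161)
overall_mean = mean_b_factor(demo_lines, chain="A")
print(f"domain mean B = {domain_mean:.1f}, overall mean B = {overall_mean:.1f}")
```

A domain average well above the whole-chain average, as in this toy input, is the kind of signal the text interprets as elevated flexibility. For real structures the window would have to match the actual domain boundaries.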
GlyRS belongs to the class IIA enzymes. The superposition of GlyRS and threonyl-tRNA synthetase in their tRNA-bound forms reveals that these two enzymes utilize similar strategies when binding to tRNAs. The catalytic domains of both enzymes access the acceptor stem from the major groove, whereas the C-terminal anticodon binding domains approach the anticodon loops from the major groove (Figs. 1B and 3C). tRNAs are bound in a cross-subunit fashion with the dimer sitting on a crystallographic 2-fold axis. One subunit forms a large portion of the contacts with tRNA, and a small fraction of the contacts are from the other subunit. Additionally, none of the cross-subunit interactions is base-specific. The structures of class IIA enzymes are relatively conserved in three-dimensional space, especially in the catalytic domains (Fig. 3A). In addition to the similar binding mode, both proteins contain aaRS-specific domains that may play important roles in catalysis. A characteristic N2 domain for editing in the threonyl-tRNA synthetase complex makes contacts from one side of the acceptor stem, whereas the catalytic domain approaches the opposite side. The two domains clamp the acceptor stem of tRNA and hold it in place. In hGlyRS, two insertion domains, insertions 1 and 3, might serve the same purpose as the N2 domain, strengthening the interactions with tRNA in the acceptor stem and the variable region, respectively. This cross-subunit binding mode may also apply to other class IIA synthetases: the catalytic and the anticodon binding domains make major contacts with tRNA and serve as the binding scaffold (subclass specific); insertion domains of the catalyzing subunit or helper subunit (cross subunit) may assist in local binding and fine-tune the interactions.
Based on the structural analysis and previous biochemical data, a pathway involving multiple conformational states for hGlyRS catalysis is proposed (Fig. 4). Initially, insertion 1 is partially closed, whereas insertion 3 is largely flexible. When ATP and glycine are bound, insertion 1 assumes a more closed state by moving toward α9 (Ala-324-Ser-329) (24). Upon the synthesis of adenylate, insertion 3 opens, and the tRNA molecule binds by interacting specifically with the anticodon binding domain and nonspecifically with insertion 3′ across the subunit. The anticodon loop rotates, and the bases are flipped out to maximize the contacts. The two subunits of the protein also associate with each other more tightly, generating a larger dimer interface. These interactions help the tRNA to orient its 3′-CCA end toward the active site, and insertion 1 reopens to allow tRNA to place base A76 precisely in the ideal position of the active site. Both the enzyme and tRNA undergo remarkable conformational changes to fit each other at this stage. Once the reaction is completed, the two insertions move again to release the product Gly-tRNA Gly, and the enzyme is regenerated for the next round of synthesis. This working model is consistent with current structural and biochemical data, but complete understanding of the pathway requires more structures of the intermediates as well as complementary activity assays.
Implications for Neurological Diseases-We solved the structures of both the GlyRSSF- and E71GSF-tRNA complexes but did not detect a significant structural perturbation caused by this mutation. This is consistent with the previous finding that the CMT mutations generate very subtle effects that may induce alternative functions of aaRS (24). Minor local structural alterations in the presence of the tRNA substrate suggest that the diseases occur through a distinct mechanism that may be completely separate from the aminoacylation function of hGlyRS. Mapping of the CMT-2D/dHMN-V mutations demonstrated that they are concentrated at the dimeric interface and are likely to influence dimer formation. Two newly reported dHMN-V mutations, S265F and D200N (equivalent to Ser-211 and Asp-146 in our structure), are located in insertion 1, which is also far from the interface (22). Therefore, the role of hGlyRS in diseases becomes more intriguing and warrants additional investigation. He et al. (44) discovered that conformational opening of hGlyRS could result from multiple CMT-causing mutations and proposed that the relatively stable neomorphic structure may be associated with certain pathological functions. In this study we also observed large conformational changes of insertions 1 and 3 during catalysis. Coincidentally, insertion 1 partially overlaps the "hot spot 2 (Leu-129-Asp-161)" region (44). CMT mutations may cause these regions to be more solvent-exposed than in the WT protein. The large opening in the presence of the tRNA substrate suggested by the cocrystal structures may also act as a unique binding surface for potential novel protein-protein interactions, and the functions potentially "gained" by these conformationally labile domains are worth investigating.