Analysis of a Critical Interaction within the Archaeal Box C/D Small Ribonucleoprotein Complex

In archaea and eukarya, box C/D ribonucleoprotein (RNP) complexes are responsible for 2′-O-methylation of tRNAs and rRNAs. The archaeal box C/D small RNP complex requires a small RNA component (sRNA) possessing Watson-Crick complementarity to the target RNA along with three proteins: L7Ae, Nop5p, and fibrillarin. Transfer of a methyl group from S-adenosylmethionine to the target RNA is performed by fibrillarin, which by itself has no affinity for the sRNA-target duplex. Instead, it is targeted to the site of methylation through association with Nop5p, which in turn binds to the L7Ae-sRNA complex. To understand how Nop5p serves as a bridge between the targeting and catalytic functions of the box C/D small RNP complex, we have employed alanine scanning to evaluate the interaction between the Pyrococcus horikoshii Nop5p domain and an L7Ae box C/D RNA complex. From these data, we were able to construct an isolated RNA-binding domain (Nop-RBD) that folds correctly as demonstrated by x-ray crystallography and binds to the L7Ae box C/D RNA complex with near wild type affinity. These data demonstrate that the Nop-RBD is an autonomously folding and functional module important for protein assembly in a number of complexes centered on the L7Ae-kink-turn RNP.

Many biological RNAs require extensive modification to attain full functionality in the cell (1). Currently there are over 100 known RNA modification types ranging from small functional group substitutions to the addition of large multi-cyclic ring structures (2). Transfer RNA, one of many functional RNAs targeted for modification (3-6), possesses the greatest diversity of modification types, many of which are important for proper biological function (7). Ribosomal RNA, on the other hand, contains predominantly two types of modified nucleotides: pseudouridine and 2′-O-methylribose (8). The crystal structures of the ribosome suggest that these modifications are important for proper folding (9,10) and structural stabilization (11) in vivo as evidenced by their strong tendency to localize to regions associated with function (8,12,13). These roles have been verified biochemically in a number of cases (14), whereas newly emerging functional modifications are continually being investigated.
Box C/D ribonucleoprotein (RNP) complexes serve as RNA-guided site-specific 2′-O-methyltransferases in both archaea and eukaryotes (15,16), where they are referred to as small RNP complexes and small nucleolar RNPs, respectively. Target RNA pairs with the sRNA guide sequence and is methylated at the 2′-hydroxyl group of the nucleotide five bases upstream of either the D or D′ box motif of the sRNA (Fig. 1, star) (17,18). In archaea, the internal C′ and D′ motifs generally conform to a box C/D consensus sequence (19), and each sRNA contains two guide regions ~12 nucleotides in length (20). The bipartite architecture of the RNP potentially enables the complex to methylate two distinct RNA targets (21) and has been shown to be essential for site-specific methylation (22).
In addition to the sRNA, the archaeal box C/D complex requires three proteins for activity (23): the ribosomal protein L7Ae (24,25), fibrillarin, and the Nop56/Nop58 homolog Nop5p (Fig. 1). L7Ae binds to both box C/D and the C′/D′ motifs (26), which respectively comprise kink-turn (27) or k-loop structures (28), to initiate the assembly of the RNP (29,30). Fibrillarin performs the methyl group transfer from the cofactor S-adenosylmethionine to the target RNA (31-33). For this to occur, the active site of fibrillarin must be positioned precisely over the specific 2′-hydroxyl group to be methylated. Although fibrillarin methylates this functional group in the context of a Watson-Crick base-paired helix (guide/target), it has little to no binding affinity for double-stranded RNA or for the L7Ae-sRNA complex (22,26,33,34). Nop5p serves as an intermediary protein bringing fibrillarin to the complex through its association with both the L7Ae-sRNA complex and fibrillarin (22). Along with its role as an intermediary between fibrillarin and the L7Ae-sRNA complex, Nop5p possesses other functions not yet fully understood. For example, Nop5p self-dimerizes through a coiled-coil domain (35) that in most archaeal and eukaryotic homologs includes a small insertion sequence of unknown function (36,37). However, dimerization and fibrillarin binding have been shown to be mutually exclusive in Methanocaldococcus jannaschii Nop5p, potentially because of the presence of this insertion sequence (36). Thus, whether Nop5p is a monomer or a dimer in the active RNP is still under debate.
In this study, we focus our attention on the Nop5p protein to investigate its interaction with an L7Ae box C/D RNA complex because both the fibrillarin-Nop5p and the L7Ae box C/D RNA interfaces are known from crystal structures (29,35,38). Individual residues on the surface of a monomeric form of Nop5p (referred to as mNop5p) (22) were mutated to alanine, and the effect on binding affinity for an L7Ae box C/D motif RNA complex was assessed through the use of electrophoretic mobility shift assays. These data reveal that residues important for binding cluster within the highly conserved NOP domain (39,40). To demonstrate that this domain is solely responsible for the affinity of Nop5p for the preassembled L7Ae box C/D RNA complex, we expressed and purified it in isolation from the full Nop5p protein. The isolated Nop-RBD binds to the L7Ae box C/D RNA complex with nearly wild type affinity, demonstrating that the Nop-RBD is truly an autonomously folding and functional module. Comparison of our data with the crystal structure of the homologous spliceosomal hPrp31-15.5K protein-U4 snRNA complex (41) suggests the adoption of a similar mode of binding, further supporting a crucial role for the NOP domain in RNP complex assembly.

EXPERIMENTAL PROCEDURES
Site-directed Mutagenesis, Expression, and Purification-Pyrococcus horikoshii aFib (GI 14590005) and L7Ae (GI 74407908) were cloned, expressed, and purified essentially as previously described (22). Small changes in the protocol for L7Ae purification were made to improve the removal of contaminating nucleic acids. The supernatant was incubated at 80°C for 30 min and centrifuged, and the resulting supernatant was adjusted to contain 6 M guanidinium hydrochloride, 25 mM sodium phosphate (pH 8.0), and 250 mM sodium chloride. This solution was loaded onto a nickel-nitrilotriacetic acid-agarose (Qiagen) column equilibrated in the same buffer. The protein was refolded on the column by washing the column with five column volumes of 50 mM Tris-Cl (pH 7.5), 300 mM sodium chloride, 10% glycerol, 5 mM β-mercaptoethanol, and 5 mM imidazole. L7Ae was eluted with five column volumes of 50 mM Tris-Cl (pH 7.5), 300 mM sodium chloride, 20 mM Na-EDTA, 10% glycerol, 5 mM β-mercaptoethanol, and 400 mM imidazole. L7Ae-containing fractions were subjected to TEV protease cleavage overnight to remove the hexahistidine tag followed by a second round of purification using nickel affinity chromatography.
P. horikoshii Nop5p (GI 14590006) and the mNop5p construct that contains a deletion of the coiled-coil were cloned, expressed, and purified as previously described (22). Alanine scanning was performed with the mNop5p protein, which was previously shown to bind the L7Ae box C/D RNA complex with near wild type affinity (22) and has improved solution behavior. mNop5p alanine mutations were produced using site-directed PCR mutagenesis with the QuikChange site-directed mutagenesis kit (Stratagene) using a staggered primer design to avoid the occurrence of primer dimers (42). The conserved NOP domain construct (referred to as Nop-RBD) consists of residues 119-390 of P. horikoshii Nop5p, with the coiled-coil domain (residues 139-242) replaced with a Gly-Ala-Gly-Gly linker to eliminate the possibility of dimerization. This protein was cloned into a pET30 vector such that it is fused to an N-terminal hexahistidine-tagged maltose-binding protein with a TEV protease cleavage site between the fusion and NOP proteins. TEV cleavage results in a single N-terminal glycine residue not present in the native sequence. Nop-RBD was purified using a protocol identical to that for mNop5p and Nop5p.
RNA Synthesis and Labeling-RNA was transcribed, purified, and labeled with fluorescein as previously described (22). The sequence of the box C/D motif RNA used in binding experiments is: 5′-GGCACUGACCUCGAAAGAGGAAUGAUGAUU-3′.
Electrophoretic Mobility Shift Assays-Assays to measure the apparent equilibrium dissociation constant (KD) of mNop5p alanine mutants and the Nop-RBD for the L7Ae box C/D RNA complex were performed with 12% acrylamide gels (acrylamide:bis-acrylamide, 29:1) as described previously (22). Binding reactions for both mNop5p and Nop-RBD were conducted at 25°C in a buffer containing 16 mM K+-HEPES, pH 7.5, 80 mM KCl, 0.01% Nonidet P-40, 25 μg/ml tRNA, and 2 nM labeled RNA. All of the gel shifts were performed in triplicate, and associated error was reported as one standard deviation from the mean. All of the alanine mutants of Nop5p reported in this work yielded an electrophoretic gel shift in the presence of the L7Ae box C/D RNA complex consistent with that of the wild type protein, indicating that each mutant was properly folded. Also, in calculating the KD for each mutant, we assumed that each protein was 100% active.
Crystallization and Data Collection-Crystals of Nop-RBD were grown using the hanging drop vapor diffusion method. Crystallization was achieved by adding 2 μl of a 14 mg/ml solution of Nop-RBD in 10 mM Na-MES, pH 6.0, to 2 μl of mother liquor containing 500 mM potassium iodide, 2% polyethylene glycol 3350 and incubating at 17°C. Diffraction quality crystals grew to maximal size (~400 μm in each dimension) within a week. The crystals were cryoprotected in mother liquor containing 30% ethylene glycol for at most 10 min and flash frozen in liquid nitrogen. A 360° data set was collected with Cu-Kα x-ray radiation using an R-AXIS IV++ detector on an RU-200/Confocal blue optic source (Rigaku MSC).
FIGURE 2. The residues in mNop5p important for L7Ae box C/D RNA binding localize to one region. The backside of Nop5p (A) is virtually devoid of residues important for the interaction. However, the front side (B) exhibits a well defined cluster of residues important for binding. Compared with wild type mNop5p binding affinity, alanine mutations that result in less than a 4-fold change are shown in green, and mutations resulting in a greater than 4-fold change in binding affinity are shown in red. Residues with a greater than 2-fold change in binding affinity upon mutation to alanine are labeled. Several mutated residues listed in Table 1 are not visible in the A. fulgidus structure used for surface residue mapping. These residues either occur in disordered regions (Arg299 and Lys306) or lack an equivalent in the A. fulgidus structure (Lys32, Glu44, Lys47, and Glu100).
Data Processing and Refinement of Structure-The diffraction images were indexed and integrated using MOSFLM and SCALA as part of the CCP4 suite (43). Analysis of the self-rotation function with MOLREP showed peaks at 0, 45, and 90° for χ = 180° and a single peak at 0.0° at χ = 90°, suggesting a P422 point group. Analysis of the integrated intensities with POINTLESS indicated that the space group was either P4₃2₁2 or its enantiomer P4₁2₁2. The intensities were then merged to the P422 point group using SCALA. PHENIX (44) XTRIAGE was used to analyze the intensity statistics. Anomalous signal from the iodide was found to 3.2 Å, and there was no evidence of significant pseudotranslation via the native translational Patterson function or of twinning via the multivariate Z score L test. Because anomalous signal was found in the data set, PHENIX AUTOSOL was used to determine the heavy atom substructure and calculate single-wavelength anomalous dispersion experimental phases. Six refined iodide sites were found with a figure of merit of 0.28 and 0.48 (pre- and post-statistical density modification, respectively). Inspection of the experimental density map showed clear features corresponding to Nop-RBD secondary structure, and the correct enantiomorphic space group of P4₁2₁2 was determined from the maps. Initial model building using PHENIX AUTOBUILD resulted in a model containing 50 sequence assigned residues and Rwork and Rfree values of 44 and 47%, respectively. Iterative model building in COOT (45) and phased maximum likelihood refinement in PHENIX resulted in a final model containing 115 of the 169 residues included in the construct and final Rwork and Rfree values of 31 and 34% at 2.5-Å resolution. Despite no additional unmodeled density in the |2Fo| − |Fc| maps greater than 2σ, the R factors remained unacceptably high for this resolution. The data were then expanded into the P4₁ space group. Intensity statistics analysis by the Yeates Twinning server (46) suggested perfect merohedral twinning. The model built in the P422 point group was used for molecular replacement, generating two molecules in the asymmetric unit related by noncrystallographic symmetry. Manual model editing in COOT and coordinate, group atomic displacement factor, and twin fraction refinement with PHENIX led to a final model with an Rwork value of 25.2%, an Rfree value of 28.6%, and a refined twinning fraction of 50%. Residues at both the N and C termini and the 15-amino acid linker region remained disordered in both the experimental and final model maps. Analysis of the final model in PROCHECK (47) showed 85.5% of the residues in the favored region, 13.5% of residues in the allowed region, and 1.0% of the residues in the outlier region. The model and structure factors have been deposited in the RCSB Protein Data Bank (3GQU and 3GQX).
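For readers less familiar with the R factors quoted above, the following minimal Python sketch shows the conventional crystallographic residual, R = Σ||Fo| − |Fc|| / Σ|Fo|. It assumes the calculated amplitudes are already on the same scale as the observed ones, and the amplitude values in the example are invented for illustration only. The work and free residuals use the same expression, evaluated over the reflections used in refinement and over a held-out test set, respectively. This is not the twin-aware target that a refinement program applies to twinned data.

```python
def r_factor(f_obs, f_calc):
    """Conventional residual between observed and calculated amplitudes.

    Assumes f_calc has already been scaled to f_obs (illustrative sketch).
    """
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator

# Invented amplitudes, used only to exercise the function.
example_obs = [100.0, 50.0, 25.0]
example_calc = [90.0, 55.0, 25.0]
print(f"R = {r_factor(example_obs, example_calc):.3f}")
```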

RESULTS
Mapping the Binding Surface of Nop5p by Mutational Analysis Reveals a Distinct Region Responsible for Binding to the L7Ae Box C/D RNA Complex-To identify the region of the surface of Nop5p responsible for high affinity interaction with the L7Ae box C/D complex, we employed a comprehensive mutational analysis approach in which likely surface residues were converted to alanine and changes in binding affinity were determined. The sequence and crystal structure of the Archaeoglobus fulgidus Nop5p-fibrillarin dimer (35) were used to determine which amino acids most likely reside on the surface by sequence alignment with our P. horikoshii variant. Alanine scanning was performed with the mNop5p construct containing an internal deletion that prevents its self-dimerization but still allows for it to bind the L7Ae box C/D RNA complex with near wild type affinity (22). The apparent equilibrium dissociation constant (KD) for each mutant was determined by titrating the preformed L7Ae box C/D RNA complex with the mNop5p-fibrillarin heterodimer and quantifying the binding using an electrophoretic mobility shift assay.
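As an illustration of how an apparent dissociation constant can be extracted from such a titration, the minimal Python sketch below fits a single-site binding isotherm by least squares and summarizes triplicate fits as mean and one standard deviation. It is not the fitting procedure used in this work (that is described in Ref. 22). The sketch assumes that protein is in large excess over the labeled RNA, so that free protein approximates total protein, and the function names, concentration series, and replicate values are hypothetical synthetic inputs, not measurements.

```python
import math
import statistics

def fraction_bound(protein_nM, kd_nM):
    """Single-site isotherm; assumes free protein ~ total protein."""
    return protein_nM / (kd_nM + protein_nM)

def fit_kd(protein_nM, bound, lo_nM=0.01, hi_nM=1e5, iterations=200):
    """Least-squares estimate of KD by golden-section search over log10(KD)."""
    def sse(log_kd):
        kd = 10.0 ** log_kd
        return sum((fraction_bound(p, kd) - f) ** 2
                   for p, f in zip(protein_nM, bound))

    a, b = math.log10(lo_nM), math.log10(hi_nM)
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iterations):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if sse(c) < sse(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 10.0 ** ((a + b) / 2.0)

# Synthetic titrations (NOT measured data): three replicates generated from
# hypothetical KD values of 40, 50, and 60 nM.
concentrations_nM = [1, 3, 10, 30, 100, 300, 1000, 3000]
replicate_kds = []
for true_kd in (40.0, 50.0, 60.0):
    observed = [fraction_bound(p, true_kd) for p in concentrations_nM]
    replicate_kds.append(fit_kd(concentrations_nM, observed))

mean_kd = statistics.mean(replicate_kds)
sd_kd = statistics.stdev(replicate_kds)  # sample standard deviation
print(f"KD = {mean_kd:.0f} +/- {sd_kd:.0f} nM")
```

The search is carried out on log10(KD) because titration points are usually spaced logarithmically. With noisy gel data the squared-error curve may not have a single minimum, and a dedicated nonlinear fitting routine would be a more robust choice.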
Our mutational strategy involved changing conserved charged residues to alanine to find residues that potentially interact with the negatively charged phosphate backbone of the box C/D RNA. Residues were initially chosen to maximize surface coverage, but as the region responsible for binding to the L7Ae box C/D RNA complex emerged, our mutations became increasingly biased toward this part of the surface of the protein. In total, 31 mutations (Table 1) were made, including two residues (Arg299 and Lys306) that reside within a 20-amino acid region disordered in the Nop5p-fibrillarin crystal structure (35). Neither of these residues contributed to L7Ae box C/D RNA complex binding. The region in direct contact with fibrillarin in the A. fulgidus structure was not explored except for one residue (Phe347), because the function of this portion of the surface of Nop5p has been clearly established (35).
The mutational analysis revealed that the L7Ae box C/D RNA-binding surface is localized to a limited region of Nop5p (Fig. 2). One face of Nop5p does not contribute to the L7Ae box C/D RNA interaction, as evidenced by the observation that all alanine mutations associated with this region of the protein have no more than a 2-3-fold effect on binding affinity (Lys32, Glu44, Lys47, Glu100, Glu104, Asn112, Asp253, Lys265, Asn320, Arg327, Glu350, Glu354, Glu361, Lys365, and Lys368; Fig. 2A and Table 1). On the opposite face of Nop5p, there is a clearly defined region where alanine mutations produce a pronounced reduction in L7Ae box C/D RNA complex binding affinity (Ser286, Thr287, Gln289, Gln326, Arg332, Lys337, Arg363, Phe347, and Glu366; Fig. 2B and Table 1). Residues peripheral to this region of Nop5p have little effect on binding (Phe107, Lys131, Glu279, Trp324, and Lys329). A previous study identified a single residue (Arg224 in A. fulgidus Nop5p) that abolished binding upon mutation to alanine (35). The equivalent residue in P. horikoshii Nop5p (Arg332) is located in the region important for binding, and mutation of this residue results in a significant decrease in binding affinity, indicating that the two Nop5p variants behave similarly with respect to L7Ae box C/D RNA recognition.
This analysis revealed that all of the residues important for L7Ae box C/D RNA binding reside within the NOP domain (39,40) that has been previously implicated in L7Ae-sRNA (35) and 15.5K protein-U4 snRNA binding (48). These data suggest the possibility that the conserved NOP domain of Nop5p is a modular, independently functional domain in the absence of both the coiled-coil and the fibrillarin-binding regions. Therefore, we expressed a protein corresponding exclusively to this region (referred to as Nop-RBD) for further analysis.
FIGURE 5. A, ribbon diagram of the crystal structure of the isolated Nop-RBD solved to 2.5 Å resolution. The protein chain was colored as a gradient from blue (N terminus, residue 243) to red (C terminus, residue 373), and the α-helices were labeled using the nomenclature for the intact Nop5p protein (35). Two perspectives are shown with the right representing a 90° rotation of the structure on the left. B, the isolated Nop-RBD structure superimposes upon the P. furiosus Nop5p-fibrillarin structure (Protein Data Bank code 2NNW) with an RMSD of 0.77 Å over all amino acids common between the two structures. Fibrillarin and the fibrillarin-binding domain of Nop5p have been omitted for clarity. C, the isolated Nop-RBD structure superimposes upon the hPrp31 structure (Protein Data Bank code 2OZB) with an RMSD of 1.6 Å over all amino acids common between the two structures. Prp31 is shown in complex with 15.5K protein (an L7Ae homolog) and the 5′-stem-loop of the kink-turn containing U4 snRNA. D, model of the Nop-RBD interaction with the L7Ae box C/D RNA complex (Protein Data Bank code 1RLG) (35) using the hPrp31-15.5K-U4 snRNA complex (Protein Data Bank code 2OZB) (41) as a guide for orienting the box C/D components. L7Ae box C/D RNA complex (green) is docked against a surface representation of the Nop-RBD (yellow). Residues in mNop5p whose mutation to alanine strongly affects binding (Krel > 4) are highlighted in red, underscoring the consistency between the biochemical and structural data.
The NOP Domain of Nop5p Exhibits near Wild Type Affinity for the L7Ae Box C/D RNA Complex-Nop-RBD was purified by the same protocol used for mNop5p and full Nop5p (22). This protocol requires high salt concentrations (1 M NaCl) to enhance the solubility of the mNop5p and full Nop5p proteins. Moreover, mNop5p requires very high salt concentrations to achieve activity in the context of the box C/D RNP (22). Removal of the fibrillarin-binding domain of mNop5p to create Nop-RBD alleviated this requirement for high salt, resulting in a stable, well behaved protein under low ionic strength conditions. The NOP domain binds to the L7Ae box C/D motif RNA complex with an affinity nearly identical to that of mNop5p (Fig. 3); mNop5p binds the complex with an apparent KD of 76 ± 6 nM, whereas the NOP domain binds with an apparent KD of 106 ± 4 nM. These data clearly show that residues in the fibrillarin-binding region of Nop5p (residues 1-118) are not contributing to the L7Ae box C/D RNA interaction. As expected, Nop-RBD has no detectable affinity for fibrillarin, as demonstrated through a gel mobility shift assay where fibrillarin is titrated against the L7Ae-Nop-RBD box C/D RNA ternary complex (data not shown).
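To put the two apparent dissociation constants reported above on an energetic scale, the short Python sketch below converts a ratio of KD values into an apparent change in binding free energy, ΔΔG = RT ln(KD,variant/KD,reference), at the 25°C assay temperature. This conversion is an added illustration rather than an analysis performed in this work. It assumes that the apparent KD values approximate true equilibrium constants and that Krel denotes the mutant-to-wild-type KD ratio. On these assumptions the difference between Nop-RBD and mNop5p corresponds to roughly 0.2 kcal/mol, and the 4-fold cutoff used to flag important residues corresponds to about 0.8 kcal/mol.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal / (mol K)

def ddg_kcal_per_mol(kd_variant_nM, kd_reference_nM, temp_c=25.0):
    """Apparent change in binding free energy, RT * ln(KD_variant / KD_reference)."""
    temp_k = temp_c + 273.15
    return GAS_CONSTANT_KCAL * temp_k * math.log(kd_variant_nM / kd_reference_nM)

# Apparent KD values quoted in the text: Nop-RBD 106 nM, mNop5p 76 nM.
print(round(ddg_kcal_per_mol(106.0, 76.0), 2))

# The 4-fold cutoff (Krel > 4, assumed here to be KD,mutant / KD,wild type).
print(round(ddg_kcal_per_mol(4.0, 1.0), 2))
```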
The Conserved NOP Domain Is an Independently Folded Domain-The improvement in solution behavior of the P. horikoshii Nop5p NOP domain makes it an excellent target for crystallization both in isolation and in a variety of subcomplexes of the box C/D RNP complex. To validate this protein as a crystallization target, we crystallized and solved the structure of the Nop-RBD. We were able to obtain crystals that diffracted x-ray radiation to ~2.5-Å resolution and collected data of sufficient quality to yield a model of the NOP domain (Figs. 4 and 5A and Table 2). Notably, several regions of the protein including the first 25 residues of the construct (consisting of an N-terminal glycine, residues 119-138, and the Gly-Gly-Ala-Gly linker replacing the coiled-coil domain), an internal region consisting of 16 amino acids (294-309), and the 17 C-terminal residues (374-390) were not observed in the resulting electron density map. Thus, approximately one-third of the protein mass could not be accounted for by the model, directly resulting in higher final Rwork and Rfree values for a structure at this resolution.
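The "approximately one-third" estimate can be checked with simple residue bookkeeping, sketched below in Python. The unmodeled segments are taken from the ranges listed above. The modeled segments (243-293 and 310-373) are an inference from the chain termini given in the Fig. 5 legend together with the disordered internal region, not a statement made explicitly in the text, and the dictionary keys are illustrative labels.

```python
def span(first, last):
    """Number of residues in an inclusive residue range."""
    return last - first + 1

# Segments not observed in the electron density map, as listed in the text.
unmodeled = {
    "N-terminal Gly + residues 119-138 + 4-residue linker": 1 + span(119, 138) + 4,
    "internal region 294-309": span(294, 309),
    "C-terminal residues 374-390": span(374, 390),
}

# Modeled residues (assumed ranges, inferred from the Fig. 5 legend).
modeled = span(243, 293) + span(310, 373)

total_unmodeled = sum(unmodeled.values())
fraction_unmodeled = total_unmodeled / (total_unmodeled + modeled)

print(unmodeled)
print(f"modeled = {modeled}, unmodeled = {total_unmodeled}, "
      f"fraction unmodeled = {fraction_unmodeled:.2f}")
```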
The resulting model when superimposed with the crystal structures of the full Nop5p protein from A. fulgidus (35) (Protein Data Bank code 1NT2) and P. furiosus (49) (Protein Data Bank code 2NNW) reveals an identical fold with RMSDs of 1.06 and 0.77 Å, respectively (Fig. 5B). Likewise, the P. horikoshii Nop-RBD superimposes well with the homologous region of the human Prp31 protein of the spliceosome (Protein Data Bank code 2OZB) with an RMSD of 1.61 Å (Fig. 5C). This structure thus reinforces the idea that the NOP domain is an independently folding and active RNP binding module (41).

DISCUSSION
In many RNP enzymes, there is a division of labor between the protein and RNA components. In box C/D, box H/ACA, telomerase, and the RNA-induced silencing complex, RNA serves to direct the enzyme to the appropriate target and thus dictates the specificity of the enzyme (37,50,51). On the other hand, the protein components of each of these enzymes perform the catalytic role by acting on the guide sRNA-target complex; in the box C/D RNPs this is the role of the S-adenosylmethionine-dependent methyltransferase fibrillarin (32-34). Thus, a central feature of these enzymes is the coordination of targeting and specificity determination by RNA and catalysis by proteins to achieve both high specificity and activity in the cell. In the case of the box C/D RNPs, this coordinating role appears to be played by the conserved protein Nop5p in archaea and the Nop56/58 proteins in eukaryotes (35,37,52,53).
Nop5p serves as the intermediary between specificity and catalytic activity by virtue of its ability to bind to both fibrillarin and the L7Ae box C/D RNP complex. Its interaction with the L7Ae-sRNA RNP is particularly notable as the kink-turn motif, formed by the box C/D sequence motif, is a fundamental building block of tertiary architecture within many biological RNAs (24,25,38,54). Despite its essential role in the function of the box C/D RNP (55-57), the basic details of how Nop5p serves as an intermediary between the box C/D motif of the sRNA and the active site of fibrillarin have only just begun to be revealed. In this study we have sought to improve our understanding of this interaction by an extensive mutational analysis aimed at identifying the region of Nop5p necessary for interaction with the L7Ae box C/D region of the archaeal box C/D small RNP complex.
The region of Nop5p important for binding to the L7Ae box C/D RNA complex has been suspected to occur within a highly conserved region called the NOP conserved domain that exists in other RNP complexes (35,40,48,55,58). These complexes exhibit diverse functionality but have in common an L7Ae homolog-kink-turn complex necessary for nucleation of complex assembly. In the spliceosome, for example, the conserved region of Nop5p is present in the hPrp31 (48) and 61K (58) proteins that recognize the 15.5K protein-U4 snRNA complex. Evidence for assigning an L7Ae box C/D RNA binding function to the conserved region of Nop5p is the presence of a large patch of surface residues whose identities are important for maintaining the affinity of Nop5p binding to the L7Ae box C/D RNP complex. Mutation of a single residue in this region (Arg332 in this study) has a strongly deleterious effect on the ability of the Nop5p protein to interact with the L7Ae-kink-turn complex (35). Furthermore, in the spliceosome, a peptide within this conserved region cross-links to the kink-turn RNA, demonstrating direct interaction between the 61K protein and the 15.5K protein-U4 snRNA complex (48).
Our mutational analysis of the surface of Nop5p revealed that all residues shown to be important for high affinity and
A crystal structure of a ternary spliceosomal subcomplex composed of hPrp31 (Nop5p homolog), the 15.5K protein (L7Ae homolog), and a kink-turn containing fragment of the U4 snRNA (Protein Data Bank code 2OZB) (41) has been solved. The conserved NOP domain of hPrp31 shares considerable sequence identity with the box C/D Nop5p protein examined in this study (Fig. 6). Superposition of the Nop-RBD structure solved in this study with the ternary spliceosomal RNP via the conserved NOP domain yields a very reasonable hypothetical model of the ternary interaction in the archaeal box C/D rRNP (Fig. 5D). Residues found important for the binding of Nop5p to the box C/D RNA complex correlate well with residues in hPrp31 directly in contact with the U4 snRNA. The precise details of the interaction, specifically how Nop5p discriminates between the U4 stem II and the box C/D stem, assuming an overall similar interface through the conserved NOP RNA-binding domain, must await further structural studies. Nevertheless, our data clearly support a composite protein-protein and protein-RNA interface similar to that observed in the homologous complex of the U4 snRNP (58).