Structure and Assembly of the RNA Binding Domain of Bluetongue Virus Non-structural Protein 2*

Bluetongue virus non-structural protein 2 belongs to a class of highly conserved proteins found in orbiviruses of the Reoviridae family. Non-structural protein 2 forms large multimeric complexes and localizes to cytoplasmic inclusions in infected cells. It is able to bind single-stranded RNA non-specifically, and it has been suggested that the protein is involved in the selection and condensation of the Bluetongue virus RNA segments prior to genome encapsidation. We have determined the x-ray structure of the N-terminal domain (sufficient for the RNA binding ability of non-structural protein 2) to 2.4 Å resolution using anomalous scattering methods. Crystals of this apparently insoluble domain were obtained by in situ proteolysis of a soluble construct. The asymmetric unit shows two monomers related by non-crystallographic symmetry, with each monomer folded as a β sandwich with a unique topology. The crystal structure reveals extensive monomer-monomer interactions, which explain the ability of the protein to self-assemble into large homomultimeric complexes. Of the entire surface area of the monomer, one-third is used to create the interfaces of the curved multimeric assembly observed in the x-ray structure. The structure reported here shows how the N-terminal domain would be able to bind single-stranded RNA non-specifically protecting the bound regions in a heterogeneous multimeric but not polymeric complex.

Bluetongue virus (BTV)¹ is a representative member of the Orbivirus genus within the Reoviridae family and has a ten-segment double-stranded RNA (dsRNA) genome enclosed within a double capsid. Seven of the segments code for the capsid and viral core proteins (VP1-VP7). The remaining three segments encode non-structural proteins (NS1, NS2, and NS3/NS3A) that are produced in the host cell at different stages of the infectious cycle and are presumed to be involved in the various steps of virus morphogenesis.
Bluetongue virus RNA segment 8 encodes non-structural protein 2 (molecular mass ~41 kDa), which is phosphorylated (1) and able to self-associate to form multimeric assemblies in complex with single-stranded RNA (ssRNA) (2). The protein is synthesized in large amounts throughout the replication cycle of Bluetongue virus and is associated with large dense perinuclear structures called viral inclusion bodies in BTV-infected mammalian cells (3,4). It is believed that BTV assembly occurs at the perimeter of these viral inclusion bodies (3). In insect cells, recombinant baculovirus-expressed NS2 has been shown to accumulate as multimers forming inclusion bodies, not unlike those observed in BTV-infected mammalian cells, even when synthesized independently of any other BTV-encoded proteins or viral RNA (5). Deletions at the C terminus of up to 130 amino acids do not affect oligomerization of the protein (6). Among the BTV proteins, NS2 appears unique in its ability to bind ssRNA (but not dsRNA). The protein is of particular interest because of its possible involvement in genome recognition (a preference for viral ssRNA over cellular ssRNA), which is generally known to be a crucial early step in virus assembly. Two deletion mutagenesis studies (6,7) have indicated that the N-terminal half of the protein, which bears most of the homology within the NS2 family, is an RNA binding domain. Although the isolated N-terminal half of the protein can be functional, other RNA binding sites are required for high affinity RNA binding (7). An analysis of the primary structure of NS2 does not reveal any sequence similarity to well known RNA binding motifs such as the ribonucleotide motif (8), the K homology (KH) domain originally described in the heterogeneous nuclear ribonucleoprotein (hnRNP) K protein (8), the arginine-rich motif (9), or the RGG and RS motifs (10).
NS2 shares a number of features with other non-structural proteins, for example NSP2 of rotavirus (11) and σNS of reovirus (12), both of which have been implicated in genome packaging. It has been shown that cells infected with temperature-sensitive mutants of NSP2 contain few replication assembly factories (viroplasms) and produce virus particles that are mostly empty (13).
To obtain insight into how NS2 and ssRNA interact and function, we have determined the crystal structure of the RNA binding domain of NS2 and examined it for potential binding surfaces as well as defined flexible protein regions that might become ordered upon RNA binding.

Protein Expression Studies of the N-terminal Domain Using Recombinant Baculovirus-infected Insect Cells and Escherichia coli-The S8 gene from BTV serotype 10 (encoding full-length NS2) contained in the recombinant plasmid pAcBTV-10.8 (5) was used to generate the NS2 1-177 construct. DNA coding for residues 1-177 was produced by PCR amplification on the NS2 1-354 sequence with the primers A (5′-CGCGGATCCCATATGGAGCAAAAGCAACGTAG-3′) and B (5′-TGCGCTCGAGCGGCCGCTTACGGCCGCGCCACGCTATGAACTTGAAG-3′), inserting BamHI and EagI sites before cloning the PCR product into a modified pVL1393 plasmid (Invitrogen) with a C-terminal His6 tag. The plasmid for the untagged NS2 1-177 protein was produced using the BamHI and NotI sites.
* This work was supported in part by the European Union Extension of Capabilities for Multiple Wavelength Anomalous Diffraction Project HPRI-CT-1999-50015. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
To produce C-terminal glutathione S-transferase-tagged protein, primer C (5′-ATATACGGCCGGGATCTGAAAACCTGTACTTCCAGGGCCATGGACATATGCATCACCATCACCAC-3′) and the reverse-strand primer D (5′-CCGCTCGAGAGATCTTTATCCATGGGATCCGCCCTGAAAATAAAGATTCTC-3′) were used with a modified pVL1393 plasmid (Invitrogen) with a His6 glutathione S-transferase tag and a tobacco etch virus (TEV) site (plasmid and sequence available upon request) as a PCR template. The product was cut with EagI and ligated with the C-terminal His6-tagged pVL1393NS2 1-177 vector also cut with EagI. The constructs were checked by sequencing.
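The restriction sites carried by the cloning primers can be checked directly from their sequences. The sketch below is a minimal illustration, assuming primers A and B are written 5′ to 3′ without any hyphenation; the recognition sequences are the standard ones for the enzymes named in the text, and the helper name `find_sites` is hypothetical.

```python
# Standard recognition sequences of the enzymes named in the text
# (all are palindromic, so the strand does not matter).
SITES = {
    "BamHI": "GGATCC",
    "NdeI": "CATATG",
    "XhoI": "CTCGAG",
    "NotI": "GCGGCCGC",
    "EagI": "CGGCCG",
}

# Primers A and B as quoted in the text (5'->3', assumed unhyphenated).
PRIMER_A = "CGCGGATCCCATATGGAGCAAAAGCAACGTAG"
PRIMER_B = "TGCGCTCGAGCGGCCGCTTACGGCCGCGCCACGCTATGAACTTGAAG"


def find_sites(primer):
    """Return {enzyme: [0-based start positions]} for every site found."""
    hits = {}
    for enzyme, site in SITES.items():
        positions = [
            i for i in range(len(primer) - len(site) + 1)
            if primer.startswith(site, i)
        ]
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    for name, primer in (("A", PRIMER_A), ("B", PRIMER_B)):
        print(name, find_sites(primer))
```

Under these assumptions primer A carries BamHI and NdeI sites upstream of the ATG start codon, and primer B carries XhoI, NotI, and EagI sites, which matches the enzyme pairs used for the different vectors.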
The NS2 1-177 constructs were expressed at 28°C in High 5 insect cells (Invitrogen) by means of infection with recombinant baculovirus at a multiplicity of infection of 10 plaque-forming units/cell. The recombinant baculoviruses for the expression of the desired protein were produced in Spodoptera frugiperda 9 cells. At 65 h postinfection, the cells were harvested by centrifugation, and the protein samples were prepared to test for the solubility of NS2 1-177. The expressed protein was found mainly in the pellet at 30 μg/5 × 10⁷ cells.
The coding region for the 177-residue domain was excised from the pVL1393 vector using the BamHI and XhoI sites and ligated into the same sites of the N-terminal His6-tagged pBAT4 cloning vector (European Molecular Biology Laboratory, Heidelberg, Germany). The same 177-residue coding region was excised using the restriction enzymes NdeI and EagI to produce the appropriate restriction sites and was inserted into the C-terminal His6-tagged pET-21a vector (Novagen). The expression vectors were transformed into the BL21(DE3)pLys strain of E. coli. The expression levels of the NS2 1-177 construct were low (less than 1 mg/liter of culture), and the protein was found only in the pellet after the first centrifugation step, indicating that it was insoluble. Lowering the temperature during expression, using other salts during extraction, or adding up to 10% glycerol did not improve the solubility.
Expression and Purification of the TEV Insert Construct-The full-length NS2 from Bluetongue virus serotype 10 was engineered with a TEV protease cleavage site between residues 182 and 183. The insert resulted in the sequence NS2(1-182)-Glu-Asn-Leu-Tyr-Phe-Gln-Gly-NS2(183-354), with the TEV cleavage occurring between Gln and Gly. The PCR amplification was performed in two steps. In the first step, fragments were generated using primers A and E (5′-GCCCTGAAAATAAAGATTCTCCGACTCCTCCCTTGGCGC-3′) for the first fragment and F (5′-GAGAATCTTTATTTTCAGGGCCGCTGGATGGATGATGATGAG-3′) and G (5′-TGCGCTCGAGCGGCCGCTTACGGCCGAACGCCGACCGGCAATATG-3′) for the second fragment. The second PCR step used the two generated fragments together with primers A and G. The PCR product was ligated into expression vector pET-22b(+) (Novagen) at the NdeI and EagI sites. The plasmid for untagged NS2 1-354 protein was produced using the NdeI and XhoI sites. The resulting C-terminal His6-tagged plasmid was transformed into E. coli BL21(DE3)pLys (Novagen), and 1-liter expression cultures were grown in Luria-Bertani medium containing ampicillin at a concentration of 50 μg/ml and chloramphenicol at 34 μg/ml. The cells were grown at 37°C to an A600 of ~0.6, and expression was induced at 25°C with 0.4 mM isopropyl-β-D-thiogalactopyranoside. After 5-6 h of induction, the cells were harvested by centrifugation and lysed by sonication in 100 mM sodium phosphate buffer, pH 8.0, 1 M sodium chloride, 10 mM dithiothreitol in the presence of protease inhibitor mixture (Roche Applied Science). After the sonication step, the supernatant was clarified by ultracentrifugation (Sorvall SS34 rotor, 20,000 rpm, 4°C, 1 h), filtered through a 0.22-μm membrane, and loaded on a Q-Sepharose column (Amersham Biosciences) equilibrated with 100 mM sodium chloride in 100 mM sodium phosphate buffer at pH 8.0. The protein eluted at a concentration of ~400 mM sodium chloride. Purity of the protein was monitored by SDS-PAGE. The protein fractions at the absorption peak were pooled and diluted by a factor of two prior to loading onto a HiTrap Heparin column (5 ml) (Amersham Biosciences) equilibrated with 200 mM sodium chloride in 10 mM sodium phosphate buffer at pH 8.0. After a 4-column volume wash, a two-step elution with 300 mM and then 600 mM sodium chloride was applied. The C-terminal His6-tagged fusion protein eluted at a concentration of 600 mM sodium chloride. To decrease the salt concentration, the eluted protein solution was dialyzed overnight at 4°C against the storage buffer consisting of 10 mM sodium phosphate buffer, pH 8.0, 500 mM sodium chloride, 2 mM dithiothreitol. The selenomethionine-labeled protein was expressed in a methionine-auxotroph cell host, the E. coli B834(DE3)pLys strain (Novagen), and purified in the same manner as the native protein. To avoid oxidation of selenomethionine, all buffers were flushed with nitrogen and supplemented with 10 mM dithiothreitol.
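The design of the TEV insert can be illustrated with a short sketch: translating primer F with the standard genetic code shows that it opens with the seven-residue TEV site, and cutting a construct between Gln and Gly leaves six extra residues on the N-terminal fragment. This is a sketch under stated assumptions rather than the authors' procedure: primer F is taken as quoted (unhyphenated), the NS2 residues are replaced by the placeholder letter `X` because the protein sequence is not given in the text, and the function names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

TEV_SITE = "ENLYFQG"  # Glu-Asn-Leu-Tyr-Phe-Gln-Gly, cut between Gln and Gly
PRIMER_F = "GAGAATCTTTATTTTCAGGGCCGCTGGATGGATGATGATGAG"  # as quoted, 5'->3'


def translate(dna):
    """Translate complete codons of a coding-strand DNA sequence."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def tev_cleave(protein):
    """Split a protein at the first TEV site, between Gln and Gly."""
    start = protein.find(TEV_SITE)
    if start < 0:
        return [protein]
    cut = start + len(TEV_SITE) - 1  # index of the Gly after the scissile bond
    return [protein[:cut], protein[cut:]]


# Placeholder construct: 'X' stands in for the real NS2 residues.
construct = "X" * 182 + TEV_SITE + "X" * (354 - 182)
n_fragment, c_fragment = tev_cleave(construct)

if __name__ == "__main__":
    print(translate(PRIMER_F))
    print(len(n_fragment), len(c_fragment))
```

With these inputs the N-terminal fragment is 188 residues long (182 NS2 residues plus the six residues Glu-Asn-Leu-Tyr-Phe-Gln), and the C-terminal fragment starts with the Gly of the site.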
Crystallization and Data Collection-Diffraction-quality crystals of the N-terminal domain of NS2 protein were obtained at 20°C by the hanging drop vapor diffusion technique. The crystallization drops consisted of 1 μl of protein solution (8-15 mg/ml) mixed with TEV protease (50 μg/ml) and 1 μl of reservoir solution. Each drop was equilibrated against 1 ml of the reservoir solution, and conditions of 10 mM sodium phosphate buffer plus 0.35-0.65 M sodium chloride at pH 7.5 and 10-25% (v/v) Jeffamine M-600 gave the best results. Selenomethionine derivative crystals could be grown by the hanging drop vapor diffusion method at 20°C from 10 mM sodium phosphate buffer, pH 7.5, sodium chloride between 0.4 M and 1.2 M, 20-25% (v/v) Jeffamine M-600, plus 10 mM dithiothreitol. Needle-like crystals with dimensions of 0.7 × 0.01 × 0.01 mm³ were observed for the native protein. The crystals, belonging to space group P6₅, diffracted to 2.4 Å resolution with cell dimensions of a = b = 102.29 Å and c = 77.91 Å, and the value of the Matthews coefficient (14) suggests that there are two molecules in the asymmetric unit (V_M = 2.8 Å³/Da with a solvent content of 54%). The morphology and the size of selenomethionine derivative crystals were similar to those of the native crystals, and they diffracted to a maximum resolution of 2.9 Å. The native data set was collected at the European Molecular Biology Laboratory beamline X13 (Deutsches Elektronen-Synchrotron, Hamburg, Germany), and selenomethionine derivative data sets were collected at the European Molecular Biology Laboratory beamlines BW7A and X11 as well as at the European Synchrotron Radiation Facility beamline BM14 (Grenoble, France). A selenomethionine-containing crystal soaked with ssRNA, as described under "ssRNA Soaking Experiments," was used for data collection on the European Molecular Biology Laboratory X11 beamline. Data were reduced with the program DENZO and merged and scaled with SCALEPACK (15).
Details of all crystallographic data collections are given in Table I.
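The Matthews coefficient argument can be reproduced with a few lines of arithmetic. The sketch below is an illustration only: it uses the native cell dimensions and the six general positions of space group P6₅, but the molecular mass of the crystallized fragment (`ASSUMED_MASS_DA`, about 21 kDa) is an assumption, since the text does not state it, and the solvent fraction uses the common approximation 1 - 1.23/V_M, which need not reproduce the quoted 54% exactly.

```python
import math

A_AXIS = 102.29            # Å, a = b (from the text)
C_AXIS = 77.91             # Å (from the text)
SYMMETRY_OPERATORS = 6     # general positions in P6_5
ASSUMED_MASS_DA = 21000.0  # ASSUMPTION: approximate mass of the cleaved fragment


def hexagonal_cell_volume(a, c):
    """Unit cell volume for a hexagonal cell (gamma = 120 degrees)."""
    return a * a * c * math.sin(math.radians(120.0))


def matthews_coefficient(volume, n_ops, n_per_asu, mass_da):
    """V_M in Å^3/Da: cell volume divided by total protein mass in the cell."""
    return volume / (n_ops * n_per_asu * mass_da)


def solvent_fraction(v_m):
    """Common approximation; assumes a typical protein partial specific volume."""
    return 1.0 - 1.23 / v_m


if __name__ == "__main__":
    volume = hexagonal_cell_volume(A_AXIS, C_AXIS)
    for n in (1, 2, 3):
        v_m = matthews_coefficient(volume, SYMMETRY_OPERATORS, n, ASSUMED_MASS_DA)
        print(n, round(v_m, 2), round(solvent_fraction(v_m), 2))
```

Under the assumed mass, only two molecules per asymmetric unit give a V_M close to the quoted 2.8 Å³/Da; one or three molecules fall well outside the range usually seen for protein crystals.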
ssRNA Soaking Experiments-A synthetic oligonucleotide having the rotavirus 5′ consensus sequence 5′-GGCUUUAAAAG-3′ was cleaved from a solid support, and all of the protecting groups were removed according to the supplier's protocol (CRUACHEM, UK). The concentration of the RNA solution was 42 μM. Previous binding experiments showed that a suitable RNA binding buffer was 10 mM HEPES, pH 7.8, 40 mM potassium chloride, 1 mM EDTA, 1 mM dithiothreitol, and 5% glycerol.² Crystals of selenomethionine-containing protein were grown from 10 mM HEPES buffer, pH 7-8, 0.7-0.9 M potassium chloride, 10 mM dithiothreitol, and 20-25% (v/v) Jeffamine M-600, and soaking experiments were performed by adding between 2 and 4 μl of ssRNA solution to the crystal-containing drops.
Structure Determination and Refinement-The crystal structure of the RNA binding domain of NS2 (Fig. 1) was solved by the SIRAS method using the 4.0 Å data set collected at the selenium absorption edge (0.973 Å). Fourteen of sixteen possible selenium sites were located. Eleven sites were identified using the program SnB (16), and three other selenium positions were found after analyzing the residual maps produced by SHARP (17). Positions and occupancies were initially refined with the program MLPHARE (18) and further in the program SHARP, which was also used to generate SIRAS phases. The final SIRAS phases had a figure of merit of 0.51 for the entire resolution range (18-4.0 Å). Density modification procedures, including solvent flattening and two-fold non-crystallographic symmetry averaging of the experimental map, were used to extend the phases to 2.4 Å in the program RESOLVE (19). The non-crystallographic symmetry operator was determined with RESOLVE from the identified selenium atom positions.
To obtain experimental phases to a higher resolution than 4.0 Å (thus improving the starting point of the phase extension procedure and consequently obtaining a superior map), a three-wavelength anomalous dispersion data set was collected to 2.9 Å resolution, and phases were determined using SOLVE (20). The initial phases had a figure of merit of 0.65 in the resolution range from 20.0 to 3.0 Å. Further phase improvement and extension to 2.4 Å using solvent flattening and two-fold non-crystallographic symmetry averaging were carried out in the program RESOLVE. Two molecules in the asymmetric unit were built in the electron density map obtained from the three-wavelength anomalous dispersion data using the program O (21). Restrained refinement of coordinates and temperature factors was carried out using the program REFMAC5 (22). Bulk solvent correction, different types of non-crystallographic symmetry restraints between parts of the two monomers, as well as translation, libration, and screw-rotation refinement were used during the refinement calculations with REFMAC5. Waters were added to the model in an automated manner using the protocol implemented in the ARP/wARP software (23) and manually verified in O. The quality of the model was checked using the program PROCHECK (24). The figures were generated using the programs MolScript (25), ALSCRIPT (26), and PyMOL (pymol.sf.net).

RESULTS
Domain Identification-The full-length protein was initially expressed using recombinant baculovirus according to the procedures described by Thomas et al. (5) and was purified to homogeneity as described under "Experimental Procedures." In the presence of protease inhibitors, crystallization trials failed either because the domain structure of the protein introduced conformational heterogeneity or, more probably, because the protein was heterogeneously aggregated. When protease inhibitors were not used, needle-like or chunky crystals were obtained and diffracted to a maximum resolution of 3.7 Å. Data (not shown) were collected to low resolution and analyzed, and they revealed a hexagonal space group with unit cell dimensions a = b = 104 Å and c = 81.7 Å. SDS-PAGE showed that different crystals contained varying amounts of full-length protein and proteolytic products, but that despite this, the diffraction pattern was the same. This suggested that only the major proteolytic product was ordered in the crystals. We also observed that proteolysis could be accelerated by adding trypsin. Mass spectrometry analysis of the crystals indicated that the first 177 residues formed the major proteolytic product, which was also necessary and sufficient for RNA binding. Constructs of residues 1-177 were expressed with His6 or glutathione S-transferase tags at the C terminus using recombinant baculovirus-infected insect cells or expressed in E. coli with a His6 tag at the N and C termini for the purpose of crystallization. Although expression of the 178-354-residue domain resulted in soluble material,² we could not obtain soluble protein for the 1-177-residue domain. NS2 is a phosphoprotein, and we have shown by ion exchange chromatography the presence of two species, possibly in different phosphorylation states, when expressed using recombinant baculovirus-infected insect cells. Separation and crystallization of the individual species did not improve the crystal quality.
As it was not possible to obtain soluble protein for the 1-177-residue construct, we decided that crystallization under proteolytic conditions would be the method of choice. Crystals of the N-terminal domain (NS2 N) diffracting to higher resolution were obtained after engineering a TEV protease cleavage site between residues 182 and 183 of the full-length protein (Fig. 2A). Because the polypeptide region beyond residue 177 was predicted to be unstructured, we decided to use the next five residues of the sequence as a linker to the engineered TEV cleavage site. The insertion of this cleavage site resulted in a better control of the proteolytic cleavage process during crystallization, perhaps because TEV is a highly specific and only moderately active protease. The construct was expressed in E. coli to avoid previously identified heterogeneities due to variable phosphorylation. It was shown that TEV protease cleaves this construct and that small crystals could be obtained when this was done.
FIG. 2. ... Completely conserved residues are shown within a box, identical residues in BTV strains are colored light gray, and similar residues are dark gray. The first three sequences correspond to proteins from different Bluetongue virus serotypes, and the last two correspond to other orbiviruses, namely Epizootic Hemorrhagic Disease virus 2 (EHDV2) and African Horsesickness virus 9 (AHSV9). The secondary structure elements shown above the alignment are obtained from the x-ray structure. Residues that are likely to be involved in RNA binding are highlighted with an asterisk.
The Overall Structure-The structure of NS2 N was determined by anomalous scattering methods using crystals of selenomethionine-containing protein. The model presented here was refined to a resolution of 2.4 Å with an R-factor of 21.1% and a free R value of 26.7%. The domain crystallized with a dimer in the crystallographic asymmetric unit. The refined model contained residues 8-160 of both molecules and 103 water molecules. Residues 161-182, the six additional residues of the TEV cleavage site insertion, and residues 1-7 at the N terminus were not seen in the electron density map and were presumed to be disordered in the crystal. The refinement statistics as well as an evaluation of the quality of the current model are shown in Table I.
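For readers less familiar with the refinement statistics, the conventional crystallographic R-factor is the summed absolute difference between observed and calculated structure factor amplitudes divided by the summed observed amplitudes; the free R value is the same quantity evaluated on a set of reflections held out of refinement. A minimal sketch of this definition follows. The amplitudes are hypothetical toy values chosen only to exercise the function (they are not data from this study), and the calculated amplitudes are assumed to be already on the scale of the observed ones.

```python
def r_factor(f_obs, f_calc):
    """Conventional R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|).

    Assumes f_calc has already been scaled to f_obs.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("amplitude lists must have the same length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# HYPOTHETICAL toy amplitudes, for illustration only.
toy_obs = [100.0, 50.0, 25.0]
toy_calc = [90.0, 55.0, 20.0]
toy_r = r_factor(toy_obs, toy_calc)

if __name__ == "__main__":
    print(round(toy_r, 4))
```

Applying the same function separately to the working and held-out reflection sets would give the two values that are conventionally reported together, as in the R-factor and free R value quoted above.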
Each monomer exhibits a mixed β sandwich structure (Fig. 1A). One layer consists of β1, β2, β3, β4, β8, and β9. A short helical region (3₁₀ and α1) connects β1 with β2. Almost matching the length of one dimension of the monomer, the β8 strand intertwines with the β9 strand, and both are twisted. β8 forms a small antiparallel β sheet with β10 from the C terminus. The other layer consists of β5, β6, and β7. The β4 strand links the two β sheets forming a highly stable structural core (Fig. 1A). The C terminus of the construct (Met 134-Arg 160) extends out ~22 Å from the body of the monomer making contacts with an adjacent subunit in the crystal. In this respect, NS2 N is similar to another RNA-binding protein, the ribosomal protein S15 from Bacillus stearothermophilus (28), which consists of a core and an N-terminal segment that protrudes 36 Å from the core of the protein and interacts with a neighboring monomer. No other proteins were found to exhibit a similar topology (Fig. 1B) when searching for structural similarity in the Protein Data Bank using the DALI server (29).
Structure of the Dimers and Dimer Interfaces-The two molecules in the asymmetric unit are related by a non-crystallographic two-fold axis and are superimposable with a root mean square deviation of 0.41 Å over all Cα atoms of residues 8-160. There are two possible descriptions for the homodimeric configuration: monomers interact 1) through their C termini, giving rise to an interface that buries almost 20% of the total monomer surface (1831 Å²) (Fig. 3A), and 2) through a continuation of β sheet structure (Fig. 3B), which results in a buried surface area of 1075 Å² per monomer.
The formation of the largest buried surface (Fig. 3A) involves an extension of the C terminus (residues 134-160) folding over the adjacent molecule. The homodimer is maintained by hydrogen bonding interactions, hydrophobic packing, and a symmetrical salt bridge interaction. Asn 134, which is absolutely conserved throughout the NS2 family, is located on β10 and interacts with Thr′ 102 located on β′8 of the second monomer (Fig. 4A). Furthermore, the interaction between Thr′ 102 and Tyr′ 121 (β′9), which is mediated through a water molecule, places the C terminus of one monomer in interaction with the structural core of the adjacent monomer. The main chain of Pro 145 makes a hydrogen bond with the hydroxyl group of conserved Tyr′ 34 (β′2) from the neighboring monomer (Fig. 4A). Met 148 lies between 3₁₀ and α2 and interacts through main chain hydrogen bonds with Arg′ 107, which lies at the end of β′8 of the second monomer (Fig. 4B). In addition, the side chain of Ser 150 is engaged in a hydrogen bonding interaction with the side chain of Asp′ 116, located on β′9 (Fig. 4B). The orientation of the conserved Arg 158 appears to be constrained by a salt bridge interaction with conserved Glu′ 118 (β′9) (Fig. 4B). A number of hydrophobic interactions are involved in homodimerization. For example, conserved residue Leu 144 and conservatively replaced residue Val 147 face conserved residues Met′ 104 (β′8), Val′ 57 (β′4), and Ile′ 119 (β′9). Of 27 residues involved in the interface formation, 11 are conserved, and 4 are conservatively replaced within the members of the NS2 family (Fig. 2B). A large pocket harboring water molecules is found at the 1831 Å² homodimeric interface. The network of hydrogen bonds between the water molecules and oxygens and nitrogens of residues pointing toward this interface strongly stabilizes the interaction between monomers.
The second homodimeric interface (Fig. 3B), the driving force for propagating the NS2 N assembly, is mediated by formation of continuous β sheets. Upon dimerization, extended β sheets formed by eight (5 + 3) strands are observed on both sides of the homodimer. Although the monomer area buried at this interface is less than the buried area at the first interface, the strength of the interactions at the second homodimeric interface is substantial. The homodimerization determinants reside within the hydrogen bonding interactions between the main chains of residues located on β3 and β7 of one monomer with the main chains of residues located on β′7 and β′3 of the second subunit, resulting in a very stable structure (Fig. 4, C and D).
Oligomeric State-According to recent sedimentation experiments (2), the full-length protein in solution exists as an 8-10 S multimer with a molecular mass between 140 and 250 kDa and assembles from 6 ± 2 subunits. Our preliminary small angle scattering experiments (data not shown) on the full-length protein suggest a molecular mass corresponding to about a decamer. In the crystal of the N-terminal domain, repetition of the two monomer-monomer interfaces, described in the previous section under "Structure of the Dimers and Dimer Interfaces," gives rise to a helical structure in the crystal with a pitch equal to the length of the c axis (77.91 Å). The helical arrangement is generated through the application of the crystallographic 6₅ symmetry operator on two non-crystallographic symmetry-related monomers that form the asymmetric unit. Of the entire surface area of a single subunit, which is ~9150 Å², 31% is used to create the interfaces that bring the subunits together. The helical structure in the crystal is infinite and has a central channel with a diameter of 76.7 Å (Fig. 3, C and D). Assuming that the volume occupied by the C-terminal domain is about equal to the volume of the present N-terminal construct, we find that the interior channel of the helical structure has insufficient volume to accommodate the C-terminal domains of each of the monomers, let alone the C-terminal domains plus some RNA. It could, however, accommodate some C-terminal domains, explaining why we observed the N-terminal domain with varying amounts of full-length protein on the SDS-PAGE of our original crystals. It seems likely that the N-terminal domain of NS2 alone drives the oligomerization, a conclusion further substantiated by the observation that the C-terminal domain is monomeric in solution.²
FIG. 5. Delineation of the conserved solvent-exposed residues on the three-dimensional surface representation of the NS2 N dimer. A, representation in which conserved solvent-exposed residues are mapped on the concave surface of the NS2 N dimer. B, another orientation of the NS2 N dimer identical to that shown in Fig. 3A, which highlights the conserved residues.
A notable feature of the helical structure is the presence on its surface of a high number of conserved solvent-accessible residues. Only 30% of the absolutely conserved residues are involved in the hydrophobic core of the protein. Large clusters of proximally located conserved residues are observed on the exposed area of a single subunit and on the interior surface of the helical assembly. One area of conserved residues, located on the interior surface of the monomer and projecting into the solvent channel, consists of residues Lys 38, Pro 53, Lys 54, Tyr 56, Asp 70, Gly 71, Asp 73, Glu 118, and Arg 120 as well as residues that have been implicated in RNA binding, namely Phe 8, Thr 9, and conservatively replaced Lys 10. Positively charged RNA binding residues Arg′ 155, Arg′ 158, and Lys′ 160 from a neighboring monomer are located nearby (Fig. 5A). Another conserved patch includes acidic residues Glu 85, Glu 92, and Glu 93 but also Thr 87, Arg 90, Pro 130, Tyr 131, Asn 134, Ile 37, Ile 39, and Arg 41 as well as two solvent-exposed tryptophan side chains, Trp 91 and Trp 94 (Fig. 5B). Asn 134 bridges two water molecules infiltrated at the interface, interacting with Trp 94 as well as with the conserved but not solvent-exposed Arg 67. It is frequently reported that conserved and solvent-accessible residues are involved in interactions with ligands or other proteins. NS2 may interact with other BTV proteins, namely with the proteins from the transcriptase complex (VP1, VP4, VP6) (30) of the virus before genome encapsidation, and it is tempting to speculate that the solvent-exposed conserved residues will mediate those interactions.
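The buried-surface figures quoted for the two interfaces can be cross-checked with simple arithmetic. The sketch below uses only the areas given in the text (1831 Å² and 1075 Å² per monomer, and a total monomer surface of about 9150 Å²); the function name is hypothetical, and small differences from the quoted percentages presumably reflect rounding of the underlying areas.

```python
TOTAL_MONOMER_SURFACE = 9150.0  # Å^2, approximate value from the text
C_TERMINAL_INTERFACE = 1831.0   # Å^2 buried per monomer (Fig. 3A interface)
BETA_SHEET_INTERFACE = 1075.0   # Å^2 buried per monomer (Fig. 3B interface)


def buried_fraction(*areas, total=TOTAL_MONOMER_SURFACE):
    """Fraction of the monomer surface buried by the given interfaces."""
    return sum(areas) / total


first_fraction = buried_fraction(C_TERMINAL_INTERFACE)
both_fraction = buried_fraction(C_TERMINAL_INTERFACE, BETA_SHEET_INTERFACE)

if __name__ == "__main__":
    print(f"C-terminal interface: {first_fraction:.1%}")
    print(f"both interfaces:      {both_fraction:.1%}")
```

The first interface alone accounts for about one-fifth of the monomer surface, and the two together for a little under one-third, in line with the "almost 20%" and "31%" statements.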
RNA Interaction Surface-NS2 is reported to bind BTV ssRNAs as well as other ssRNA species in a sequence-independent manner, indicating that the binding motif of the ssRNA is non-specific (7, 31), although not necessarily independent of the RNA structure (30). It has been demonstrated that multiple recombinant NS2 molecules bind to transcripts of rotavirus gene 8 ssRNA, suggesting that the RNA binding may require homomultimer formation (2).
In the crystal structure, both the N and C termini of the construct extend into the center of the helical structure, projecting into the central solvent channel (Fig. 3D). Although these termini are disordered, they contain residues that have been identified by mutational analysis as important for RNA binding. The absence of electron density for residues 1-7 and 161-182 implies that these residues have conformational flexibility. Such conformational flexibility would be in agreement with other data suggesting that 11 residues (2-11) (6) at the beginning of the construct and 14 residues (153-166) (7) at the C terminus of the construct are important for the RNA binding function of the protein. We therefore propose that RNA binds to the inner concave side of the helical assembly, as all the missing residues important for the RNA binding are exposed in this area (Fig. 3D). They could conceivably become more ordered upon RNA binding. The RNA soaking experiment of a crystal does not show increased order, but it does show a substantial change (Table I, 5 Å) in the pitch of the helix, suggesting that there is a degree of flexibility and consequent relative domain movement in the solid state. The region determining the relative positions of molecules within the helical structure is around residues 134 -138 and is located before the largest homodimerization interface. These findings strengthen the previous observation that the largest homodimerization interface does not provide the same rigidity as the extended ␤ sheet interface.
Guided by the distribution of the residues that are likely to be involved in RNA binding, the putative RNA interaction surface on a homodimer can be delimited into two symmetrical regions made up of the N-terminal part of β1, the C-terminal part of β2, the loop connecting β2 with β3 (where completely conserved Arg 41 is placed), and two residues from β9, α′2, and the C′-terminal extension (Arg′ 158-Lys′ 160) from the adjacent monomer (Fig. 6). It is conceivable that an extended region of ssRNA will non-specifically interact with the two sites symmetrically placed on the homodimer and will flexibly fit into the curved area between the two sites.
Mutagenesis of Glu 2-Lys 11 points to the importance of this region for RNA binding (6). Two arginines, Arg 6 and Arg 7, when mutated to leucines, result in a reduction in the affinity of the protein toward ssRNA, whereas mutation of Lys 4 to leucine results in a total abrogation of the ssRNA binding (6). The glutamic acid Glu 2 was not required for the ssRNA binding (6). Analysis of sequence conservation in this RNA binding region among the members of the NS2 family (Fig. 2B) revealed that, of the four residues, only Lys 4 was invariant. Arg 6, Arg 7, and Glu 2 were conserved in four members of the family with the single exception of NS2 from AHSV9, where they were replaced by two glutamines (Gln 6 and Gln 7) and a valine (Val 2). It is also important to note that in the RNA binding region between residues 153 and 166, the basic residues Arg 155, Arg 158, Lys 160, and Arg 165 are absolutely conserved in all the members of the NS2 family. Arg 162 is conservatively replaced, except in BTV1s where it is replaced by a glutamine. As most of the conserved residues involved in RNA binding are positively charged, it appears that the interaction with the nucleotide bases of the RNA is unimportant relative to the interaction with the sugar-phosphate backbone.
FIG. 6. Stereo ribbon diagram indicating secondary structure elements with residues likely to contribute to ssRNA binding. Some of the amino acids known to be essential for ssRNA binding are drawn in ball-and-stick representation on β1 (Thr 9-Asn 11) and the C′-terminal tail (Arg′ 158-Gln′ 159). Putative ssRNA interaction residues are located on β2 (Arg 38), the β2/β3 loop (Arg 41), and β9 (Glu 118, Lys 120). The salt bridge between Glu 118 (β9) and Arg′ 158 brings the N terminus of one monomer in interaction with the C′ terminus of the neighboring monomer.

DISCUSSION
Recent crystallographic studies (32,33) have provided insight into the structural organization of proteins and the dsRNA genome within Bluetongue virions. The relationship between the assembly process, the viral structure, and the genome replication and packaging is, however, not yet understood. A critical question that cannot currently be answered is how viral RNAs are selectively packaged so that each progeny virion acquires the correct set (one copy of each segment) of the dsRNA genome.
We have determined here the crystal structure of the RNA binding domain of Bluetongue virus non-structural protein 2, which is presumed to be involved in genome replication and packaging. The presented work demonstrates that this domain of NS2 forms homomultimers, revealing the molecular basis for NS2 oligomerization. In the solid state, molecules of NS2 N form infinite helical structures, the helical repeat of which determines the crystallographic c axis. For the full-length protein, sedimentation and small angle x-ray scattering experiments showed that the extent of the oligomerization is limited. This may be explained by the requirement for limited proteolysis in crystal formation: in the full-length protein, the size of the C-terminal domain appears to prevent the helix from forming a complete turn. The intermolecular interfaces revealed by the crystal structure, coupled with this information, suggest that the protein can exist as a stable homomultimer consisting maximally of 11 subunits.
To date, relatively few structural studies have reported protein interactions with single-stranded RNA, because proteins generally tend to interact with more structured RNA elements such as loops, bulges, and branched helices formed by internal base pairing. Examples are the interactions of glutaminyl-tRNA synthetase (34) and aspartyl-tRNA synthetase (35) with the anticodon loops of tRNA, and the recognition of the extra loop of tRNASer by seryl-tRNA synthetase (36). In single-stranded RNA structures, the bases are not paired but fully accessible for interactions with the protein side chains, and consequently the base recognition strategies are more diverse than in the case of double-stranded nucleic acid structures.
A detailed description of the way in which single-stranded RNA binds to NS2 N cannot be given because crystals of a complex with ordered RNA have not yet been obtained. However, from both the biochemical and the structural evidence presented here, it seems reasonable to hypothesize that the oligonucleotide binding site lies on the inner concave surface that is formed upon oligomerization. We suggest that homomultimers are necessary for binding to ssRNA. The concave surface of the NS2 homomultimers, which we expect to be the region for RNA interaction, is reminiscent of the concave surface of the human Puf helical repeat protein (37); in the Puf protein, however, each of the eight repeats makes contact with the bases of its ssRNA. Archaeoglobus fulgidus Sm1 (38) (AF-Sm1), in a similar manner to NS2, uses multiple copies to bind ssRNA. AF-Sm1 forms a heptameric ring with the RNA bases bound via stacking and specific hydrogen bonding contacts inside the central cavity, whereas the phosphates remain solvent-accessible. Unlike these proteins, in which multiple protein copies recognize RNA and create multiple RNA binding surfaces, NSP3 (39) from rotavirus uses an unusual RNA recognition strategy. Asymmetric homodimerization of NSP3 creates a single RNA binding surface with a hexanucleotide enclosed in an RNA binding tunnel. In none of these cases does it seem likely that the mode of ssRNA binding is similar to that of NS2.
This structural study increases the number of known ssRNA binding scaffolds and contributes to the characterization of the ssRNA-binding proteins. Known ssRNA binding folds include the ribonucleoprotein fold, observed in the sex-lethal protein (40) and the human poly(A)-binding protein (41), and the oligonucleotide and oligosaccharide binding fold, observed in the transcription factor Rho (27). All of these proteins have a common mode of ssRNA recognition, namely through β sheet binding, where one or multiple β sheets form binding pockets for the nucleotide bases.
Sequence comparison among the members of the NS2 family indicates that the first half of the protein (residues 1-182) is very well conserved. Conservation of NS2 N within different orbiviruses (EHDV2, AHSV9, BTV) and different serotypes of BTV (BTV10, BTV17, BTV1S, BRV1X) might be a reflection of the constraints on the protein to maintain the NS2 N function. It is highly likely that the other members of the NS2 family have three-dimensional structures similar to that reported here.
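The conservation pattern described for the N-terminal RNA binding region (Lys4 invariant; Glu2, Arg6, and Arg7 replaced by Val2, Gln6, and Gln7 in AHSV9) can be tabulated with a short script. The following is only a minimal sketch in Python, not the procedure used in this work. The residues are those stated in the text, whereas the labels member_A to member_D are placeholders for the four family members in which the residues are conserved, and the function names are introduced here for illustration.

```python
# Residues at the four mutagenized positions (2, 4, 6, 7), as stated in the text.
# ASSUMPTION: "member_A".."member_D" are placeholder labels for the four NS2
# family members in which Glu2, Arg6, and Arg7 are conserved; only the AHSV9
# substitutions (Val2, Gln6, Gln7) are named explicitly in the text.
residues_by_member = {
    "member_A": {2: "E", 4: "K", 6: "R", 7: "R"},
    "member_B": {2: "E", 4: "K", 6: "R", 7: "R"},
    "member_C": {2: "E", 4: "K", 6: "R", 7: "R"},
    "member_D": {2: "E", 4: "K", 6: "R", 7: "R"},
    "AHSV9": {2: "V", 4: "K", 6: "Q", 7: "Q"},
}


def invariant_positions(table):
    """Return the positions at which every member carries the same residue."""
    members = list(table.values())
    positions = sorted(members[0])
    return [pos for pos in positions if len({m[pos] for m in members}) == 1]


def variable_positions(table):
    """Return the positions at which at least two different residues occur."""
    conserved = set(invariant_positions(table))
    positions = sorted(next(iter(table.values())))
    return [pos for pos in positions if pos not in conserved]


if __name__ == "__main__":
    print("invariant:", invariant_positions(residues_by_member))
    print("variable:", variable_positions(residues_by_member))
```

With a full alignment of the NS2 family, the same per-position comparison could be extended to the conserved first half of the protein (residues 1-182).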