A Map of Protein-rRNA Distribution in the 70 S Escherichia coli Ribosome*

Neutron scattering exploits the enormous scattering difference between protons and deuterons. A set of 42 x-ray and neutron solution scattering curves from hybrid Escherichia coli ribosomes was obtained, where the proteins and rRNA moieties in the subunits were either protonated or deuterated in all possible combinations. This extensive data set is analyzed using a novel method. The volume defined by the cryoelectron microscopic model of Frank and co-workers (Frank, J., Zhu, J., Penczek, P., Li, Y. H., Srivastava, S., Verschoor, A., Radermacher, M., Grassucci, R., Lata, R. K., and Agrawal, R. K. (1995) Nature 376, 441–444) is divided into 7890 densely packed spheres of radius 0.5 nm. Simulated annealing is employed to assign each sphere to solvent, protein, or rRNA moieties to simultaneously fit all scattering curves. Twelve independent reconstructions starting from random approximations yielded reproducible results. The resulting model at a resolution of 3 nm represents the volumes occupied by rRNA and protein moieties at 95% probability threshold and displays 15 and 20 protein subvolumes in the 30 S and 50 S, respectively, connected by rRNA. 17 proteins with known atomic structure can be tentatively positioned into the protein subvolumes within the ribosome in agreement with

Ribosomes are supramolecular complexes responsible for protein synthesis in all organisms. Each of the two unequal ribosomal subunits is a complicated assembly of proteins and nucleic acids. Thus, the prokaryotic 70 S ribosome from Escherichia coli has a total molecular mass of about 2.3 × 10⁶ Da and consists of a 30 S and 50 S subunit (1). The 30 S subunit contains 21 individual proteins and a single 16 S rRNA molecule, and the 50 S subunit, 33 proteins and two rRNA molecules (5 S rRNA + 23 S rRNA). In both subunits the rRNA moieties account for about two-thirds of the mass. A better description of the interactions between proteins and rRNA in the ribosome is of paramount importance for the understanding of the structural mechanism of protein synthesis.
The structure of the ribosome has been extensively studied for decades using various methods. In recent years, cryoelectron microscopy (cryo-EM) has yielded models of the overall shape of the ribosome at a resolution of 1.5 nm (2-5). Recent tremendous progress in x-ray crystallography provided electron density maps down to 0.78 nm for ribosomal complexes containing tRNAs (6) and to 0.45 nm for the individual ribosomal subunits (7-10). Despite these remarkable achievements, limited information has so far been obtained about the spatial distribution of rRNA and proteins in the ribosome. Using the x-ray maps, six ribosomal proteins were positioned in the small (7,10) and four in the large subunit (8). The low contrast between ribosomal proteins and rRNA for both electrons and x-rays makes it difficult to distinguish between the two components even at relatively high resolution. Moreover, disordered or flexible domains do not show up in the electron density maps obtained by x-ray crystal analysis (6,8).
Neutron scattering in solution, although yielding only low-resolution information (11), is a powerful tool to study complexes of proteins and nucleic acids. Scattering from a particle or its component is proportional to the squared contrast (difference between the scattering density of the particle/component and that of the solvent), and measurements at different contrasts provide additional information about the object (12). As the neutron scattering lengths of hydrogen and deuterium differ drastically, isotopic H/D substitution is widely employed for the contrast variation. In earlier studies using H₂O/D₂O mixtures (13,14), integral parameters of the protein and rRNA moieties were established. Yet more information is provided by selective deuteration of the ribosomal components. Triangulation of labeled protein pairs in the ribosomal subunits so far provided the most comprehensive information about the protein positions (15,16).
In a recent study (17,18), 42 solution scattering curves from the hybrid E. coli ribosomes were measured, where the proteins and rRNA moieties in the subunits were selectively deuterated. This is probably the most extensive set of consistent x-ray and neutron contrast-variation data collected on a single object. The curves permitted validation of cryo-EM models of the ribosome (2, 4), and they were further interpreted in terms of a solid body model constructed from the envelopes of the subunits and those of the rRNA moieties. Here, we exploit the data set further using a new analysis method (19), where an overall shape of the object provided by cryo-EM is filled by densely packed small spheres. The algorithm assigns each sphere to the rRNA or protein moieties (or to the solvent) to simultaneously fit the available scattering data. Its efficiency was illustrated (19) by an ab initio restoration of a model structure representing the envelope of the 30 S ribosomal subunit with several embedded proteins. The application of the new method significantly improves the resolution of the neutron-scattering maps of the ribosomal structure. A three-dimensional map of the protein-rRNA distribution is established which, in particular, reveals the likely positions of individual ribosomal proteins or protein complexes in the 70 S E. coli ribosome.

EXPERIMENTAL PROCEDURES
Experimental-The sample preparation and scattering experiments are described in detail elsewhere (17,18). The 70 S samples of the E. coli ribosome were made by in vitro association of selectively deuterated ribosomal subunits and checked for structural integrity and biological activity. The neutron experiments were performed at the Risø Laboratory (Roskilde, Denmark (20)) and at the GKSS Research Center (Geesthacht, Germany (21,22)). Using two sample-detector distances ("low angle" and "high angle" setting), the range of momentum transfer 0.09 < s < 2.3 nm⁻¹ was covered (s = (4π/λ) sin θ, where λ is the wavelength and 2θ is the scattering angle). The x-ray synchrotron radiation scattering data were collected on the EMBL beamline (23-25) at HASYLAB (Hamburg, Germany) in the range 0.08 < s < 1.6 nm⁻¹. The full list of samples is presented in Table I.
Dummy Atoms Model-The 70 S ribosome consists of four components representing the total proteins (TP) and the rRNA moieties in the 30 S and 50 S subunits (denoted below as TP30, TP50, RNA30, and RNA50, respectively). The search volume was defined by drawing a regular grid of points in space corresponding to dense hexagonal packing of spheres of radius r_0 = 0.5 nm inside a parallelepiped determined by the outer dimensions of the cryo-EM reconstruction of the 70 S E. coli ribosome (2). If the distance between a point and the closest pixel in the EM reconstruction did not exceed 1.5 r_0, a dummy atom was placed at this point. This procedure expands the original model and thus reduces the bias and allows for deviations from the EM shape. The structure of the four component dummy atoms model (DAM) is defined by assigning a number (component index) 0 ≤ X_j ≤ 4 to each dummy atom. As the subunits are well resolved in the cryo-EM model, the 30 S and 50 S subvolumes are defined where the dummy atoms may have the component index 0, 1, 2 (solvent, TP30, or RNA30) and 0, 3, 4 (solvent, TP50, or RNA50), respectively. The atoms at the subunit interface can belong to any component. The DAM in Fig. 1 contains 2644 atoms in the 30 S subvolume, 5020 atoms in the 50 S subvolume, 196 atoms at the interface, and encloses a volume of 5560 nm³ (that of the original cryo-EM model is 3920 nm³).
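The construction of the search volume can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the program used in the study: the function names `hcp_centers` and `select_near`, the index ranges, and the brute-force distance test are hypothetical, and a real run would pass the cryo-EM pixel coordinates as `em_points`.

```python
import math

def hcp_centers(r0, ni, nj, nk):
    """Centers of hexagonally close-packed spheres of radius r0.

    ni, nj, nk are illustrative index ranges spanning the bounding box.
    """
    pts = []
    for k in range(nk):
        for j in range(nj):
            for i in range(ni):
                x = r0 * (2 * i + (j + k) % 2)
                y = r0 * math.sqrt(3) * (j + (k % 2) / 3)
                z = r0 * (2 * math.sqrt(6) / 3) * k
                pts.append((x, y, z))
    return pts

def select_near(points, em_points, r0, factor=1.5):
    """Keep grid points whose closest EM point lies within factor * r0."""
    cutoff = factor * r0
    return [p for p in points
            if min(math.dist(p, q) for q in em_points) <= cutoff]

r0 = 0.5                                   # nm, sphere radius from the text
grid = hcp_centers(r0, 5, 5, 5)
center = grid[2 * 25 + 2 * 5 + 2]          # interior site i = j = k = 2
# Each interior site has 12 neighbors at a distance of 2 * r0.
n_neighbors = sum(1 for p in grid if abs(math.dist(p, center) - 2 * r0) < 1e-9)
```

With a single toy "EM" point placed on a lattice site, only that site survives the 1.5 r_0 cutoff, because the nearest neighbors lie at 2 r_0.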
In keeping with the low resolution of the solution scattering data, the model must be constrained to have low resolution with respect to r_0. For this, a list of neighbors (i.e. atoms at an offset of 2r_0) is defined for each dummy atom. Looseness (degree of isolation) of a non-solvent atom is calculated as P(N_e) = exp(−0.5N_e) − exp(−0.5N_c), where N_e is the number of neighbors having the same index and N_c = 12 is the coordination number for hexagonal packing. Looseness of the configuration X (i.e. its non-compactness) is characterized by the average value P(X) = <P(N_e)> over all non-solvent atoms. Another condition requires connectivity, i.e. a possibility to connect two arbitrarily selected atoms belonging to a component by successively connecting neighboring atoms belonging to this component. The measure of connectivity of the k-th component is computed as G_k(X) = ln(N_k/M_k) ≥ 0, where N_k and M_k are the numbers of dummy atoms in the entire component and in the longest connected fragment, respectively.
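The two penalty terms defined above can be written compactly. The sketch below is an illustration only: the data layout (a list of component indices plus a neighbor list per atom) and the function names are assumptions, and the "longest connected fragment" is interpreted here as the fragment with the most atoms.

```python
import math
from collections import deque

N_C = 12  # coordination number for hexagonal packing

def looseness(assign, neighbors):
    """Average of P(N_e) = exp(-0.5 N_e) - exp(-0.5 N_c) over non-solvent atoms.

    assign[j] is the component index of atom j (0 = solvent);
    neighbors[j] lists the indices of the atoms adjacent to atom j.
    """
    vals = []
    for j, comp in enumerate(assign):
        if comp == 0:
            continue
        n_e = sum(1 for m in neighbors[j] if assign[m] == comp)
        vals.append(math.exp(-0.5 * n_e) - math.exp(-0.5 * N_C))
    return sum(vals) / len(vals) if vals else 0.0

def connectivity(assign, neighbors, k):
    """G_k = ln(N_k / M_k); M_k is the size of the largest connected fragment."""
    members = {j for j, comp in enumerate(assign) if comp == k}
    if not members:
        return 0.0
    seen, largest = set(), 0
    for start in members:
        if start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:                      # breadth-first search over one fragment
            j = queue.popleft()
            size += 1
            for m in neighbors[j]:
                if m in members and m not in seen:
                    seen.add(m)
                    queue.append(m)
        largest = max(largest, size)
    return math.log(len(members) / largest)

# Toy example: a chain of three atoms plus one isolated atom, all in component 1.
toy_neighbors = [[1], [0, 2], [1], []]
toy_assign = [1, 1, 1, 1]
g1 = connectivity(toy_assign, toy_neighbors, 1)   # ln(4/3)
p = looseness(toy_assign, toy_neighbors)
```

A fully connected component gives G_k = 0, and an atom surrounded by 12 neighbors of its own kind contributes zero looseness, so both terms vanish for a compact, connected body.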
Using the multipole expansion, the scattering intensity from a four component ribosomal DAM in solution is given by Equation 1 (17,18,26), where Δρ_k and A_lm^(k)(s) denote the contrast and the partial amplitudes of the k-th component, respectively. The contrasts of the "dry" protonated and deuterated ribosomal components in different solvents (Table I) were computed from their chemical composition as described elsewhere (17,18). The partial amplitudes are expressed by Equation 2 (19), where the sum runs over the atoms of the k-th component, (r_j, ω_j) = r_j are their polar coordinates, v_a = (4πr_0³/3)/0.74 is the displaced volume per dummy atom, and j_l(x) and Y_lm(ω) denote the spherical Bessel function and the spherical harmonics, respectively. Equations 1 and 2 permit one to rapidly compute the scattering curves from the DAM for an arbitrary configuration X under the given contrasts of the components.
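As a quick arithmetic check on the displaced volume per dummy atom, the following sketch evaluates v_a for r_0 = 0.5 nm and multiplies it by the atom counts quoted for the DAM. The variable and function names are illustrative; the value 0.74 is the packing fraction that appears in the expression for v_a.

```python
import math

R0 = 0.5         # nm, dummy atom radius
PACKING = 0.74   # packing fraction appearing in the expression for v_a

def displaced_volume(r0=R0, packing=PACKING):
    """Volume per dummy atom: sphere volume divided by the packing fraction."""
    return (4.0 * math.pi * r0 ** 3 / 3.0) / packing

n_atoms = 2644 + 5020 + 196          # 30 S, 50 S, and interface atoms
v_a = displaced_volume()             # about 0.708 nm^3
total_volume = n_atoms * v_a         # about 5561 nm^3
print(f"v_a = {v_a:.3f} nm^3, enclosed volume = {total_volume:.0f} nm^3")
```

The result, about 5561 nm³, agrees with the enclosed volume of 5560 nm³ quoted for the DAM.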
Minimization Procedure-To fit the data with a low resolution model one should find a configuration X minimizing f(X) = χ² + αP(X), where χ is the overall discrepancy between the experimental and calculated curves, M = 42 is the number of experimental curves, N(i) is the number of points in the i-th curve, I_exp(s), I_calc(s), and σ(s) are the experimental and calculated intensities and the experimental errors, respectively, and α > 0 is the weight of the looseness penalty. The experimental data were scaled to the total dry volume of the ribosome, 2350 nm³, expected from its molecular weight and fitted in the range up to s_max = 2.0 nm⁻¹. Series (1) over spherical harmonics was truncated at l = 14, and the calculated intensities were appropriately smeared to account for instrumental effects (17,18). The minimization was performed using simulated annealing (SA (27-29)). Starting from a random configuration X_0, the assignment of a single atom is changed randomly (a move from X to X′). If Δ = f(X′) − f(X) < 0, the move is accepted; otherwise, it is accepted with a probability exp(−Δ/T), where T is the annealing temperature. The latter is held constant for 70N moves or 7N accepted moves, whichever occurs first, then it is decreased (T′ = 0.9T). The procedure starts at a sufficiently high temperature T ≈ 10¹ and runs down to T ≈ 10⁻³ until no further reduction in f(X) is observed. A complete run takes about 3 weeks of CPU time on a 180 MHz SGI machine. After preliminary calculations, the penalty weight α ≈ 20 was selected so that the penalty term amounted to about 5-10% of f(X) at the end of minimization (independent runs yielded a discrepancy χ ≈ 4.1 and looseness P(X) ≈ 0.08).
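The annealing schedule described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: N is taken here to be the number of dummy atoms (an assumption, since the text does not define it), the run stops at a fixed end temperature instead of testing for stagnation of f(X), and the toy target function simply counts mismatches against a hidden configuration in place of χ² + αP(X).

```python
import math
import random

def anneal(f, x0, n_states, t_start=10.0, t_end=1e-3, cooling=0.9,
           moves_per_atom=70, accepted_per_atom=7, seed=0):
    """Simulated annealing over discrete component assignments.

    f        : target function of the configuration (list of ints)
    x0       : starting configuration
    n_states : number of allowed component indices per atom
    """
    rng = random.Random(seed)
    x = list(x0)
    fx = f(x)
    n = len(x)
    t = t_start
    while t > t_end:
        moves = accepted = 0
        # Hold T for 70N moves or 7N accepted moves, whichever comes first.
        while moves < moves_per_atom * n and accepted < accepted_per_atom * n:
            j = rng.randrange(n)
            old = x[j]
            x[j] = rng.choice([s for s in range(n_states) if s != old])
            f_new = f(x)
            delta = f_new - fx
            if delta < 0 or rng.random() < math.exp(-delta / t):
                fx = f_new           # accept the move
                accepted += 1
            else:
                x[j] = old           # reject and restore the old assignment
            moves += 1
        t *= cooling                 # T' = 0.9 T
    return x, fx

# Toy problem: recover a hidden assignment of 10 atoms among 3 components.
hidden = [1, 2, 0, 1, 2, 0, 1, 2, 0, 1]
mismatch = lambda x: sum(a != b for a, b in zip(x, hidden))
best, f_best = anneal(mismatch, [0] * len(hidden), n_states=3, seed=1)
```

In the real problem each evaluation of f requires recomputing the scattering curves, which is why updating only the partial amplitudes affected by the single changed atom matters for the run time.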

RESULTS AND DISCUSSION
Establishing a Model of Protein-rRNA Distribution-Each of the dummy atoms filling the search volume in Fig. 1 was assigned to a specific component (solvent, TP30, TP50, RNA30, or RNA50) by simultaneously fitting the available scattering curves in Table I. Even preliminary computations without the connectivity constraint yielded well separated volumes for the protein and rRNA moieties. For all independent restorations starting from random initial approximations, proteins in both subunits converged to several isolated subvolumes, whereas the distribution of the dummy atoms belonging to the rRNAs was more uniform and featureless (see example in Fig. 2, top). To further constrain the model, the connectivity requirement was imposed on the rRNA moieties in both subunits and also on the entire subunit (see details under "Experimental Procedures"). Corresponding connectivity factors G_k(X) were added to the looseness penalty P(X) during minimization, and 12 independent restorations yielded very consistent results. The positions of the individual protein subvolumes differ slightly in different restorations (typical deviations are illustrated in Fig. 2, bottom). Given that the restorations provide the same overall discrepancy χ ≈ 4.1 (a typical fit to the experimental data is presented in Fig. 3), this uncertainty has to be attributed to the low resolution of the solution scattering data. A similar ambiguity has been observed in model calculations on a ribosome-like structure (19).

FIG. 1. Yellow pixels, cryo-EM model of Ref. 2 (pixel size corresponds to the original cryo-EM reconstruction); red and blue circles, dummy atoms belonging to the 30 S and 50 S subunits, respectively; green circles, dummy atoms that may belong to either subunit. The coordinates given in the inset correspond to the top left orientation. Top right and bottom orientations are rotated counterclockwise by 45° around y and by 90° around x, respectively. All three-dimensional images here and below are prepared on a SUN Workstation using the program ASSA (60).
The reliability of the results is assessed by the analysis of the 12 independently obtained configurations. For each dummy atom, the number of repetitions (i.e. the number of times a given position was attributed to the same specific component) was counted. The frequency histogram (numbers of atoms having the given number of repetitions) is presented in Fig. 4 along with the histogram computed from 12 randomly generated DAMs. The two distributions are fundamentally different, and that obtained from the SA has a much higher proportion of atoms with high repetition rates. In particular, the random generations indicate that the probability for a dummy atom to be assigned by chance to one and the same component 12 times out of 12 is about 2 × 10⁻⁶. The mere fact that the frequency histogram contains 1755 such atoms underlines the statistical significance and reproducibility of the results. The uncertainty illustrated in Fig. 2 (bottom) indicates that the assignment of the dummy atoms close to the protein-rRNA interface or to the subunit border may differ in different reconstructions. Given also that closely positioned proteins may be joined in single globules at low resolution, the method does not yield exact positions and shapes of all individual proteins.

TABLE I
Samples measured in this study and the contrasts of ribosomal components
Sample abbreviations: H, protonated; D, deuterated; first letter, TP, second letter, rRNA (e.g. DH30 + HH50 denotes hybrid ribosome with the TP30 deuterated, the rest protonated). For neutron scattering experiments, D₂O concentration in the solvent Y (conventional scattering) or sample polarization P (spin dependent scattering) is indicated. The contrasts are given in units of 10¹⁰ cm⁻². Contrasts for the spin dependent scattering and for the x-ray scattering are scaled to those of the conventional contrast variation data as described (17,18).
Nevertheless, the analysis of the independent reconstructions permits one to distinguish between the volumes occupied by proteins and rRNA in the ribosomal subunits with a reasonable probability. For each of the four ribosomal components (TP30, RNA30, TP50, and RNA50) we build a volume from the atoms that were ascribed to a specific component at least once in the 12 reconstructions. It is conceivable that these 4 volumes enclose the true volumes of the corresponding ribosomal components. In each of these volumes we now discard the atoms that appeared only once and keep those atoms ascribed to the given component twice or more. Simulations with randomly generated structures show that the remaining volumes would enclose the true volumes with a probability of 95%. The rRNA and protein moieties thus obtained are well separated in both subunits: the overlap (relative number of dummy atoms found in the two moieties simultaneously) is less than 17 and 20% in the 30 S and 50 S subunit, respectively. The overlapping positions are resolved by selecting the component with the higher number of repetitions. The final map of the protein-rRNA distribution in the ribosome at a resolution of about 2π/s_max ≈ 3 nm is presented in Fig. 5.
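The merging of independent reconstructions described above amounts to simple counting. The sketch below is illustrative only: the function names and data layout are assumptions, the repetition number of an atom is taken as the count of its most frequent assignment, and ties between components (which the text does not address) are broken in favor of the lower component index.

```python
from collections import Counter

def repetition_histogram(runs):
    """Histogram of repetition numbers over all atoms.

    runs is a list of configurations, each a list of component indices
    (0 = solvent), one entry per dummy atom.
    """
    hist = Counter()
    for j in range(len(runs[0])):
        hist[max(Counter(run[j] for run in runs).values())] += 1
    return hist

def consensus_map(runs, min_count=2):
    """Keep an atom in a component if it was ascribed to it at least
    min_count times; resolve overlaps by the higher repetition number."""
    result = []
    for j in range(len(runs[0])):
        counts = Counter(run[j] for run in runs if run[j] != 0)
        kept = {c: n for c, n in counts.items() if n >= min_count}
        if not kept:
            result.append(0)             # atom stays solvent
        else:
            result.append(max(kept, key=lambda c: (kept[c], -c)))
    return result

# Toy data: 12 runs over three atoms.
toy_runs = [[1, 2, 0]] * 11 + [[1, 1, 1]]
final = consensus_map(toy_runs)          # single occurrences are discarded
hist = repetition_histogram(toy_runs)
```

In this toy example the first atom is reproduced in all 12 runs, while the isolated assignments of the second and third atoms to component 1 are discarded by the "twice or more" rule.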
General Features of the Model-Given that no distinction is made between proteins and rRNA at the first stage (see Fig. 2, top), the mere fact that the proteins reproducibly converge to separated volumes connected by the rRNA moiety underlines the soundness of the restoration. The integral parameters of the ribosomal components in Table II are in good agreement with those reported in Refs. 17 and 18. It should be noted that the solid body model used in Refs. 17 and 18 could not fully account for intermingling of the ribosomal components, and an artificial term from internal density fluctuations had to be added to the calculated scattering curves. The representation using the DAM is free from this limitation. The solutions are rather stable to the changes in the weight of the looseness penalty and also to the changes in the packing radius (restorations with r_0 = 0.6 and 0.7 nm yielded very similar results).
The cryo-EM model (2) has a nominal resolution of about 2.5 nm. As the construction of the search volume involved expansion of this model and (implicitly) smoothing of its surface, the use of the higher resolution reconstruction published later (3) would not result in a significant difference. An attempt to use the search volume built from the cryo-EM model (4) yielded a spatial distribution of proteins similar to that presented in Fig. 5, but the agreement with the experimental data was worse (χ ≈ 4.5 for all independent runs). This corroborates previous conclusions (17,18) that the model (4) is less consistent with the neutron scattering data.
In each of the independent SA reconstructions (e.g. those displayed in Fig. 2), the enclosed volumes of the four ribosomal components are equal to those computed from the chemical composition. As the final model in Fig. 5 is obtained by merging a dozen independent restorations, the enclosed volumes of the components should be higher, and the increase of the volume provides an estimate of the uncertainty in the spatial distribution of the component. As seen from Table II, the protein moieties in both subunits are better defined by the scattering data than the rRNA moieties. This should be expected as both TP30 and TP50 are organized in separate globules and are located closer to the periphery of the subunits. The map in Fig. 5 displays 15 and 20 protein islands in the 30 S and 50 S subunits, respectively, i.e. about 60 to 70% of the total number of ribosomal proteins.
Small Subunit-For the 30 S subunit, the comparison with the triangulation map of the protein centers (15) in Fig. 6 (top row) is of special interest. The radius of gyration of the TP30 evaluated from the triangulation map (6.36 nm) has been the object of a long standing controversy with significantly higher values obtained in numerous contrast variation studies (14,17,18,30). Positioning of the triangulation map within the models of the 30 S (31-33) predicted no proteins in the bottom of the subunit. The TP30 in the SA model is indeed concentrated more at the top of the subunit, whereas the 16 S rRNA predominantly fills its bottom (Fig. 5; this is also in agreement with the recent crystallographic map (10)). The overlap in Fig. 6 displays a reasonable agreement between the triangulation map and the SA distribution: most of the protein globules in the latter could be assigned to a number (or numbers) from the triangulation map. However, the SA map displays at least two protein globules at the bottom of the subunit. One of them could be S20 that has been detected (34) at the bottom of the 30 S subunit (see also next paragraph). It follows that, in agreement with the results reported in Refs. 17 and 18, a complete absence of proteins from the bottom of the 30 S subunit is not compatible with the neutron scattering data. The separation between the centers of mass of TP30 and RNA30 was reported in the earlier contrast variation studies not to exceed 2 nm (in the present model a value of 5.5 nm is found, see Table II). It should be noted that most of the earlier studies were performed on protonated samples. The natural contrast between rRNA and proteins was demonstrated (35) to be insufficient to reliably estimate the separation. Experiments on specifically labeled samples, as applied here, are required.

FIG. 3. Typical fit of the neutron (A) and x-ray (B) scattering data by the SA models. Successive curves are displaced up by one logarithmic unit corresponding to the distance between the ordinate tick marks (in panel A, also by Δs = 0.05 nm⁻¹ along the abscissa) for better visualization. For the neutron scattering data in panel A, fits to the low angle and high angle settings are displayed separately, and the sequence of curves, from bottom to top, corresponds to that in Table I.
The atomic models of several ribosomal proteins deposited in the Protein Data Bank (36) were interactively positioned in the SA map as shown in Fig. 6. From comparison with the triangulation map (15) in Fig. 6, top row, likely positions of S4 (37), S8 (PDB code 1SEI (38)), S15 (1AB3 (39)), and S17 (38) are easily established (Fig. 6, middle row). The globules accommodating S5 (1PKP (38)), S6 (1RIS (40)), and S7 (1RSS (41)) are selected in agreement with the relative positions and orientations of the proteins in the crystallographic map (10) and with the data (42). The form and orientation of the protein globule just below S17 is very similar to the tentative model of S20 as displayed in Ref. 10. Moreover, as in the latter map, the intersubunit interface of the 30 S is relatively protein-free, and most of the proteins are concentrated in the head of the subunit.
Large Subunit-The prominent ribosomal features, e.g. the arms of the 50 S subunit containing the L1 and L7/L12 proteins, respectively, are adequately represented in the SA map. The appearance of the central protuberance of the 50 S subunit containing three protein globules and an rRNA fragment loosely connected with the rest of the rRNA moiety is worth noting. The shape of this rRNA fragment resembles that of the models of the free 5 S rRNA (43,44). It is also well known that the 5 S rRNA forms stable complexes with the three ribosomal proteins L18, L25, and L5 (45). This suggests that the fragment in the head of the 50 S subunit represents the 5 S rRNA (possibly in a more compact form than the free molecule), whereas the main rRNA moiety in the body of the subunit is the 23 S rRNA molecule. A possible tunnel in the body of the 50 S subunit (reported e.g. by Refs. 3, 8, 9, and 46) would just be a higher resolution feature in the central part of the 23 S rRNA moiety. This feature has little impact on the scattering pattern without specific labeling, and it would thus be unrealistic to search for it in Fig. 5.
A comprehensive immunoelectron microscopy model of the protein distribution in the large subunit (47) was used to place the available atomic models (Fig. 6, bottom row). The most obvious position is that of L1 (1AD2 (48)). The semi-concave shape of the globule formed by dummy atoms yields an orientation of the protein similar to that reported (3). A tentative tetramer of the C-terminal domain of the L7/L12 protein (1CTF (49)) is well accommodated within the upper globule in the L7/L12 stalk. The lower globule should in this case contain the N-terminal domains of L7/L12, and the distance between the two globules corresponds approximately to the length of the flexible hinge region between the N- and C-domains of L7/L12 (50,51). The position of the RNA-binding domain of L2 (1RL2 (52)) near the peptidyl transferase center, as well as those of L6 (1RL6 (53)), L11 (1ACI (54)), and L14 (1WHI (55)) proximal to the GTPase associated center were selected following the x-ray map (8). The highly anisometric proteins, L9 (1DIV (56)), L11 (1ACI (54)), and L30 (1BXY (57)), can be appropriately positioned to agree with the map (47).
The three proteins bound to the 5 S rRNA in the head of the subunit are of special interest. The atomic model of L25 (58) is neatly fitted into the lower globule in the head in agreement with Ref. 47. The two upper globules should correspond to L5 and L18; the larger size of the globule on the cytoplasmic side suggests that the latter is L5, whereas the L18 contacts the 30 S subunit. Among the proteins positioned in the large subunit, L2, L9, L25, and L30 are on the subunit interface, while L6, L14, and L22 are closer to the cytoplasmic side.

FIG. 6. Top, protein map in the 30 S subunit (semi-transparent spheres) superimposed with the triangulation map of Ref. 15 (numbered asterisks). Middle and bottom, tentative positions of the atomic models of proteins presented as colored Cα traces. Dummy atoms representing protein components are shown as semi-transparent spheres. Middle, 30 S subunit; bottom, 50 S subunit. For both subunits, interface orientation is displayed. For clarity, the shapes of the subunits in this orientation, taken from the cryo-EM reconstruction of Ref. 2, are shown (red, 30 S; blue, 50 S).

TABLE II
Integral parameters of the model of the 70 S ribosome
V_SA/V_dry is the ratio of the enclosed volume of the component in the SA model to its dry volume computed from the chemical composition; Δ is the distance between the centers of mass of TP and rRNA in the given subunit.
Conclusions-The new approach presented here significantly expands the potential of structural analysis of biological macromolecules by neutron scattering techniques. The method is not limited to ribosomes but can be used to elucidate low-resolution in situ structures of various complexes in solution, especially when the components can be selectively labeled. Its sensitivity is enhanced if the components are organized in separated globules like ribosomal proteins.
What is the significance of the results presented in view of the recent high resolution x-ray maps (6-10)? First, the SA map yields an independently obtained model of the entire biologically active 70 S ribosome in solution at nearly physiological conditions. Second, one can expect to find extensive flexible or disordered domains in the ribosome (in particular, on its periphery) that do not contribute to the crystal diffraction, whereas neutron scattering is especially sensitive to the peripheral structural elements and provides their average structure. This is clearly seen in the rather flexible L7/L12 stalk of the 50 S subunit. This prominent ribosomal feature yields no density in the x-ray maps of either the isolated 50 S subunit (8) or programmed 70 S ribosomes containing tRNAs (6), but is unequivocally identified as a protein globule in all 12 SA reconstructions. Furthermore, neutron scattering can be applied to ribosomes from virtually all species, in particular, to E. coli, where most of the genetic, biochemical, and structural data have been accumulated. In contrast, the crystallographic studies are obviously limited to those organisms that provide good crystals (the crystals of the E. coli ribosomes are not sufficiently good). Finally, the established protein islands inside the ribosome can further help to localize proteins in the crystallographic maps (examples are the two protein globules at the bottom of the 30 S subunit and the three globules in the central protuberance of the 50 S subunit).