Protein-Nucleic Acid Interactions and the Expanding Role of Mass Spectrometry

Mass spectrometry methods are being developed that enable detection of protein interactions with nucleic acids. Measuring the mass of an intact complex gives a direct determination of the stoichiometry of protein-nucleic acid interactions. For more complex assemblies, a different approach makes it possible to gain information about subcomplexes and even the spatial arrangement of proteins in macromolecular machines. To illustrate these different approaches we review progress and problems encountered in evaluating complexes from bimolecular interactions through to macromolecular machines such as ribosomes.

With the development of the gentle ionization techniques of electrospray ionization (ESI) and matrix-assisted laser desorption ionization (MALDI), MS is now well established for characterizing both proteins and nucleic acids (1,2). MALDI-MS of proteins produces predominantly singly charged ions and therefore is often combined with proteolytic digestion prior to analysis, resulting in a mass fingerprint for the identification of proteins in databases (1). In contrast, ESI-MS produces multiply charged ions, thereby bringing large molecular ions into the mass to charge (m/z) range of commercial mass spectrometers that are not able to detect their singly charged counterparts. In addition to the wide scale application of ESI and MALDI to proteomics (1), ESI is also emerging as a means of studying non-covalent complexes (3). When interactions are maintained during transit from solution to the gas phase, direct measurement of the mass of the complex reveals the stoichiometry of the interacting biomolecules. A schematic representation of the process is given in Fig. 1. A prerequisite for this type of MS is that during the phase transition complexes in aqueous droplets experience desolvation conditions such that water/buffer molecules are lost but interactions between proteins and nucleic acids are maintained. The development of nanoflow ESI (4), a miniaturized version of ESI, facilitates this desolvation process because the droplets generated are smaller than those of conventional ESI; consequently desolvation is effected under relatively mild conditions. Aqueous buffers (at physiological pH, containing millimolar quantities of volatile buffer) are typically employed such that complexes are introduced from native solution conditions. A few microliters of solution containing the protein-nucleic acid complex, typically at a concentration of 1-10 μM, are used per experiment.
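The effect of multiple charging on the observed m/z can be illustrated with a short calculation. The following Python sketch assumes positive-ion mode with protons as the only charge carriers; the 100 kDa mass and the 2000-5000 m/z window are hypothetical values chosen for illustration and are not taken from any instrument described here.

```python
PROTON_MASS = 1.007276  # Da, mass of one proton (charge carrier assumed here)


def mz(mass_da: float, z: int) -> float:
    """Return m/z for a species of neutral mass `mass_da` carrying z protons."""
    if z < 1:
        raise ValueError("charge must be a positive integer")
    return (mass_da + z * PROTON_MASS) / z


def charge_states_in_range(mass_da: float, mz_min: float, mz_max: float,
                           z_max: int = 200) -> list[int]:
    """List the charge states whose m/z falls inside an analyzer window."""
    return [z for z in range(1, z_max + 1) if mz_min <= mz(mass_da, z) <= mz_max]


if __name__ == "__main__":
    mass = 100_000.0  # hypothetical 100 kDa complex
    # The singly charged ion lies far above the hypothetical window ...
    print(f"z = 1  -> m/z {mz(mass, 1):.1f}")
    # ... whereas a multiply charged ion falls within it.
    print(f"z = 25 -> m/z {mz(mass, 25):.1f}")
    window = charge_states_in_range(mass, 2000.0, 5000.0)
    print(f"charge states inside 2000-5000: {window[0]} to {window[-1]}")
```

The calculation shows why a species that is undetectable as a singly charged ion becomes accessible once it carries several tens of charges.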
The introduction of nanoflow ESI together with improved transmission of macromolecular complexes in time of flight mass spectrometers has enabled spectra of protein complexes (5,6), intact viruses (7,8), and whole ribosomes to be recorded (9). Given that it is now possible to record mass spectra of complexes of megadalton molecular masses it is interesting to consider why only a small fraction (~2%) of the ~500 publications that describe intact protein complexes concern protein-nucleic acid interactions (for reviews see Refs. 10 and 11). This disparity in the number of publications employing MS may reflect the practical difficulties encountered when obtaining spectra of protein-nucleic acid complexes. These difficulties include the need for high salt concentrations in buffers, nonspecific cation binding to nucleic acids, and the inherent heterogeneity of many oligonucleotides. The combination of these effects broadens the peaks of molecular ions containing nucleic acids, often rendering their mass measurement more ambiguous than for complexes consisting of only protein molecules. In addition to these difficulties, other aspects common to all MS investigations of non-covalent interactions are exacerbated in the study of protein-nucleic acid interactions. Specifically, as the electrospray droplet diminishes in size (as a result of evaporation) the local concentration of species increases. This is particularly problematic for highly charged protein and nucleic acid molecules and can lead to nonspecific interactions. Consequently it is not possible to provide absolute binding constants for such interactions because of changes in solution concentration. Moreover complexes maintained by electrostatic interactions, common in protein-nucleic acid interactions, are more stable in the gas phase than those involving hydrophobic ones (12). As a consequence the intensity of mass spectral peaks assigned to complexes of comparable solution dissociation constant (Kd) would be greater for complexes involving predominantly electrostatic rather than hydrophobic interactions (12). These aspects (the changing concentration in the electrospray droplet and the nature of the interaction) represent the practical problems associated with the application of MS to protein-nucleic acid complexes.
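The consequence of rising concentration in a shrinking droplet can be made concrete with the standard expression for a 1:1 binding equilibrium. The sketch below is illustrative only: the Kd of 1 μM, the starting concentrations, and the 10-fold concentration factor are hypothetical assumptions, and the real droplet process is not an equilibrium titration.

```python
import math


def complex_concentration(p_total: float, l_total: float, kd: float) -> float:
    """Equilibrium concentration of a 1:1 complex PL (all values in mol/L).

    Solves Kd = [P][L]/[PL] with mass balance for both partners, taking the
    physically meaningful (smaller) root of the resulting quadratic.
    """
    b = p_total + l_total + kd
    return (b - math.sqrt(b * b - 4.0 * p_total * l_total)) / 2.0


def fraction_bound(p_total: float, l_total: float, kd: float) -> float:
    """Fraction of total protein present as complex."""
    return complex_concentration(p_total, l_total, kd) / p_total


if __name__ == "__main__":
    kd = 1e-6       # hypothetical Kd of 1 uM
    start = 1e-6    # hypothetical 1 uM protein and 1 uM nucleic acid
    for factor in (1, 10):  # 10-fold concentration is an assumed value
        c = start * factor
        print(f"{factor:>2}x concentration: "
              f"fraction bound = {fraction_bound(c, c, kd):.2f}")
```

Under these assumptions the fraction of complex rises appreciably on concentration, which is consistent with the difficulty, noted above, of extracting absolute binding constants from the spectra.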
The significant advantages of MS for studying protein-nucleic acid interactions include the speed of analysis (typically spectra are recorded within a matter of minutes), direct measure of the stoichiometry of the interacting components, tolerance to heterogeneity of different components, and information about conformational changes of proteins upon nucleic acid binding. Of particular focus in this review are applications that exploit the observation of noncovalent interactions in the mass spectrometer. We describe the information available from MS study of individual proteins and their cognate RNA-DNA complexes and review how such experiments have been used to tackle issues of specificity of binding and protein folding within the context of the complex. Addressing larger complexes such as those involved in degrading RNA as part of the degradosome, a multicomponent ribonucleolytic complex implicated in RNA degradation in Escherichia coli (13), illustrates the capacity of MS to define the stoichiometry of interacting proteins and oligonucleotide molecules (14). The ability to monitor conformational change in a supramolecular complex is demonstrated in studies of the ribosome, the macromolecular machine responsible for translation of RNA (15). Despite the fact that E. coli ribosomes consist of 54 proteins and three large RNA molecules it is possible to obtain precise information regarding changes in protein-RNA interactions from MS.
For each of the complexes described in this review, different levels of structural information are available. For bimolecular interactions between protein and nucleic acid, changes in affinity resulting from mutations in protein or DNA are interpreted within the context of structural information from x-ray crystallography. For the degradosome, detailed structural information of the components is not available. Unique opportunities therefore exist for MS to contribute to the elucidation of this complex structure. In the case of prokaryotic ribosomes atomic resolution images have led to intriguing insights into the mechanistic details of RNA translation and protein synthesis (16-18). Combining high resolution structural information with MS presents opportunities to follow the response of the supramolecular complex to binding of various cofactors and small molecule ligands.

Bimolecular Oligonucleotide Complexes
Since the introduction of ESI MS to study interactions in the early 1990s considerable effort has been directed toward demonstrating that this gas phase technique is capable of reproducing solution phase complexes. A powerful illustration that the specificity of interactions in solution can be effectively represented was provided recently by direction control in DNA binding to peptide nucleic acids (PNAs) (19). PNAs are used in many biological applications, including detection of point mutations and inhibition of transcription and translation (20). Using an electrospray approach, a chiral PNA (bearing three adjacent chiral monomers in the middle of the strand) was shown to discriminate between two oligonucleotide sequences, binding an antiparallel but not a parallel target. The results from this investigation therefore demonstrate that observing PNA binding to DNA in the mass spectrometer requires specific interactions in solution rather than the simple electrostatic interactions that might be anticipated from evaporation and the close proximity of highly charged molecules in the shrinking ESI droplet.
For proteins such as the catalytic domain of bacteriophage λ integrase, which catalyzes site-specific DNA recombination (21), it is possible to observe changes in conformation of the protein upon DNA binding. Solutions of free protein were examined in the absence of DNA. Species in the mass spectrum were assigned to unfolded, folded, and dimeric protein on the basis of the m/z range of various charge states. The most highly charged series are assigned to unfolded monomeric protein, having accumulated the most charge as a result of the electrospray process. Folded proteins are more compact and less highly charged than their fully unfolded counterparts because of preservation of salt bridges within the folded structure. Dimeric species contain intersubunit salt bridges and have the lowest charge states and the highest m/z values in the mass spectrum. Although these distinctions are largely qualitative, MS does allow observation of populations of different folded states within a single spectrum rather than producing an average of different folded states as is common in optical spectroscopic techniques. Addition of a cognate DNA to the protein-containing solution that gave rise to these three different folded forms resulted in peaks assigned only to folded monomeric protein and the protein-DNA complex. The absence of the unfolded conformation led the researchers to conclude that DNA binding stabilizes the global fold and that DNA binds stoichiometrically to monomeric λ-integrase.
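Assigning peaks to a charge state series rests on the standard relationship between adjacent ESI peaks. The following is a minimal Python sketch of that relationship and not the procedure of the cited study; the function name and the 20 kDa example are hypothetical, and positive-ion mode with proton charge carriers is assumed.

```python
PROTON_MASS = 1.007276  # Da, assumed charge carrier


def assign_charge_and_mass(mz_high: float, mz_low: float) -> tuple[int, float]:
    """Infer charge and neutral mass from two adjacent peaks of one species.

    `mz_high` is the peak with charge z and `mz_low` the neighbouring peak
    with charge z + 1, so mz_high must be greater than mz_low.
    """
    if mz_high <= mz_low:
        raise ValueError("mz_high must be greater than mz_low")
    # From M = z(mz_high - mp) = (z + 1)(mz_low - mp), solve for z.
    z = round((mz_low - PROTON_MASS) / (mz_high - mz_low))
    mass = z * (mz_high - PROTON_MASS)
    return z, mass


if __name__ == "__main__":
    # Hypothetical adjacent peaks of a 20 kDa protein (charges 10 and 11).
    z, mass = assign_charge_and_mass(2001.007276, 1819.189094)
    print(f"charge of higher m/z peak: {z}, neutral mass: {mass:.1f} Da")
```

Once the charge is known for each series, the m/z range that a series occupies can be related to the unfolded, folded, or dimeric species as described above.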
MS, unlike other biophysical techniques, can reveal the identities and relative abundances of different complexes by direct observation within a mass spectrum. This fact is exploited in an investigation of the complex formed by Tus protein binding to specific DNA sequences (Ter sequences) (22). Tus protein binds as a monomer to termination sequences on the E. coli chromosome, halting replication. The interaction of Tus with its recognition sequence, Ter B, has been well characterized using other biophysical methods, including x-ray crystallography and surface plasmon resonance. MS was used to compare the relative strengths of binding of native protein and single point mutants with six Ter sequences and a nonspecific sequence of DNA. The researchers examined the stability of 1:1 complexes with various DNA sequences and Tus proteins by comparing the stability of the gas phase complexes under different desolvation conditions. Because differences in binding were difficult to establish due to the stability of the gas phase complex, increasing concentrations of ammonium acetate were employed to decrease the electrostatic component such that relative binding affinities could be assessed. The results demonstrated the absence of binding to the nonspecific DNA sequence and preferential binding to DNA oligomers with higher solution binding affinities in the presence of DNA molecules of lower affinity. Competition experiments were also used to establish relative binding affinities of wild type Tus and mutational variants. The MS results showed oligonucleotide binding to Tus protein in preference to Tus mutants. Attempts to obtain absolute Kd values, using dilution to observe dissociation of the complex, were not successful because the complex remained intact at concentrations where solution Kd data would predict appreciable dissociation. This underlines the fact that changes in the concentration of the protein and DNA during the electrospray process, as well as the prevalence of electrostatic interactions in such complexes, have the effect of overemphasizing the proportion of complex in solution. The success of the competition experiments, however, lies in the fact that the differences in binding are readily observed by changes in relative abundance of the various components; consequently a series of relative binding affinities was deduced.

FIG. 1. Schematic representation of the nanoflow electrospray process (not to scale). Protein-RNA complexes at concentrations typically in the range of 5-20 μM are introduced from millimolar ammonium acetate buffer using a nanoflow capillary. A voltage of 1.5-2 kV is typically applied to the capillary, and backing pressure is often used to initiate flow. Droplet formation takes place at atmospheric pressure; each droplet is calculated to contain one protein-nucleic acid complex, provided the appropriate concentration and needle orifice are employed. With the aid of a countercurrent flow of gas, desolvation takes place such that the droplet shrinks from a few hundred to ~10 nm before yielding gas phase ions. For non-covalent interactions it is critical to employ sufficient desolvation, by acceleration of the ions in the electrospray ion source, to remove volatile buffer/solvent molecules. If these conditions are too harsh, however, dissociation of the complex will result. To maintain the non-covalent interactions between components in the macromolecular complex, it is necessary to employ collisional cooling in the various vacuum stages to reduce the translational energy and hence internal energy of the macromolecular ions (29).

FIG. 2. ... Charge states assigned to the protein tetramer are labeled. B, expansion of the spectrum over the m/z range 6300-9000. The lower spectrum (shown in red) was recorded in the presence of a 2-fold stoichiometric excess of a substrate RNA analogue. The additional charge states correspond in mass to a distribution of RNA molecules binding to the protein tetramer. The major series labeled with charge states corresponds to up to three molecules of RNA binding to the protein tetramer, but higher m/z species can be observed with up to four molecules on charge states B33-B31. The inset in B shows an expansion of the spectrum and overlay with the theoretical charge states calculated from the known isotopic composition of the protein tetramer with four RNA molecules. Adapted with permission from Ref. 14.

Minireview: Protein-Nucleic Acid Interactions 24908
A similar MS approach was applied recently in a comprehensive study of the binding affinity and stoichiometry of complexes formed between nucleocapsid protein and RNA stem-loop hairpins of the human immunodeficiency virus-1 recognition element (23). The 1:1 stoichiometry of protein:RNA was determined as the primary binding mode for three RNA hairpins, whereas further binding of the protein was found to occur with significantly lower affinity and without evidence of cooperative effects. The authors also carried out a series of competition experiments in which the three RNA ligands were added in equimolar amounts to increasing quantities of the protein. The rank order for binding that could be deduced from these experiments was the same as that determined by a variety of established techniques including NMR, isothermal titration calorimetry, and fluorescence methods. The authors of this study conclude that although the presence of an electrostatic component is likely to play a role in the observation of the complexes by ESI, the subtle variation of binding affinity revealed by the study is related to the differences in bonding that occur in solution. Specifically because the RNA constructs used contain the same number of nucleotides, it is anticipated that the same coulombic attraction between the negatively charged RNA and positively charged protein would operate providing similar contributions to the binding strength. Moreover the ionic strength used in this investigation is considered sufficient to reduce the electrostatic interactions such that hydrogen bonds and hydrophobic interactions contribute to the overall binding energy.
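The way a rank order of binding follows from a competition spectrum can be sketched in a few lines. The Python example below is an illustration under stated assumptions, not the analysis used in the cited studies: the ligand names and peak intensities are hypothetical, and it assumes that the summed intensity of each protein-ligand complex reflects its relative abundance, which the electrostatic and concentration effects discussed above may distort.

```python
def rank_by_relative_abundance(
    complex_intensities: dict[str, float],
) -> list[tuple[str, float]]:
    """Rank competing ligands by their share of the total complex signal.

    `complex_intensities` maps a ligand name to the summed peak intensity
    (over all charge states) of its complex with the protein.
    """
    total = sum(complex_intensities.values())
    if total <= 0:
        raise ValueError("no complex signal to rank")
    fractions = {name: value / total for name, value in complex_intensities.items()}
    # Highest fraction first: the ligand that competes most effectively.
    return sorted(fractions.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical intensities for three ligands added in equimolar amounts.
    observed = {"hairpin_1": 10.0, "hairpin_2": 30.0, "hairpin_3": 60.0}
    for name, fraction in rank_by_relative_abundance(observed):
        print(f"{name}: {fraction:.0%} of complex signal")
```

Such a ranking yields relative, not absolute, affinities, in keeping with the limitations described for the Tus-Ter experiments.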
These bimolecular interactions examined by MS demonstrate the basic principles of specificity and stoichiometry as well as folding in protein-nucleic acid complexes, namely direction control in a PNA-DNA duplex, folding in the presence of DNA for the integrase, and relative binding affinities for Tus-Ter and nucleocapsid protein-RNA complexes. These fundamental observations enable much larger systems to be investigated where the stoichiometry of interacting protein subunits and nucleotide is not yet established.

Multiprotein-RNA Complexes
One of the most basic advantages of studying RNA processing machinery by MS is the ability to determine, in a relatively straightforward manner, the stoichiometry of the protein that binds to RNA. The catalytic domain of RNase E was investigated by MS in conjunction with other biophysical tools, in this case to determine the stoichiometry of both protein and RNA binding in the complex (14). RNase E is part of the degradosome, and using MS it was possible to demonstrate that the protein exists exclusively as a tetramer under the solution conditions employed (Fig. 2). Moreover, upon addition of cognate RNA a shift in mass was observed corresponding to the binding of RNA molecules. Given the masses of the RNA molecule (3933 Da) and protein tetramer (247,537 Da) it was not possible to observe the splitting of the peaks that would be anticipated for the complex binding individual RNA molecules. To determine the number of RNA molecules in the complex, therefore, theoretical spectra were simulated and compared with the experimental data. The results of such a comparison are shown in Fig. 2 (inset). Coincidence of simulated and actual data reveals that four molecules of RNA are bound for charge states from m/z 7700 to 8400. For charge states below these m/z values, however, it was apparent that two and three molecules of RNA were bound to the protein tetramer. This led to the conclusion that up to four molecules of RNA were capable of binding to the tetramer. These MS results contributed to a model of RNase E and corroborated the proposed mechanism of interaction with RNA.

FIG. 3. ... fusidic acid (A) and thiostrepton (B). The two spectra are markedly different. The spectrum recorded in the presence of fusidic acid is similar to that observed for ribosomes in the absence of EF-G under these solution and MS conditions. By contrast the complex inhibited by thiostrepton demonstrates considerable reduction of the signals assigned to L7/L12 and the presence of additional proteins L5, L6, and L18. The structure of the 50 S subunit was produced using the coordinates from Thermus thermophilus at 5.5-Å resolution (Protein Data Bank file 1GIY) (18). The structure of EF-G (Protein Data Bank accession code 1FNM) was fitted according to the structure of Ban et al. (16). The proteins colored in the two structures represent those that are released from the two complexes. Reproduced with permission from Ref. 28.
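The comparison of theoretical and experimental peak positions for the RNase E tetramer can be sketched from the two masses quoted in the text (247,537 Da for the tetramer and 3933 Da for the RNA). The Python example below is a simplified illustration and not the simulation used in Ref. 14: it computes only average-mass peak centers, ignores adducts and peak widths, and the charge states 31-34 are an assumption chosen to span the m/z region discussed.

```python
TETRAMER_MASS = 247_537.0  # Da, protein tetramer mass quoted in the text
RNA_MASS = 3_933.0         # Da, RNA mass quoted in the text
PROTON_MASS = 1.007276     # Da, assumed charge carrier


def complex_mz(n_rna: int, z: int) -> float:
    """Theoretical m/z of the tetramer carrying n_rna RNA molecules at charge z."""
    mass = TETRAMER_MASS + n_rna * RNA_MASS
    return (mass + z * PROTON_MASS) / z


def theoretical_peaks(max_rna: int = 4,
                      charges: range = range(31, 35)) -> dict[tuple[int, int], float]:
    """Peak centers for 0..max_rna bound RNA molecules over assumed charge states."""
    return {(n, z): complex_mz(n, z)
            for n in range(max_rna + 1)
            for z in charges}


if __name__ == "__main__":
    peaks = theoretical_peaks()
    for z in range(31, 35):
        row = "  ".join(f"{peaks[(n, z)]:8.1f}" for n in range(5))
        print(f"z = {z}: {row}")
    # Shift in m/z produced by one additional RNA molecule at z = 33.
    print(f"shift per RNA at z = 33: {complex_mz(4, 33) - complex_mz(3, 33):.1f}")
```

Matching measured peak centers against such a table is one way to decide how many RNA molecules are bound when individual additions are not resolved as separate peaks.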
The above examples show that interactions between proteins and nucleic acids may be successfully examined in the gas phase. The complexes described above, however, employ synthetic oligonucleotides as opposed to full-length nucleic acid molecules. Many protein-nucleic acid interactions occur within complexes with molecular masses of the order of megadaltons, such as ribosomes. In the next section we review recent applications of MS to the study of protein components as well as protein-RNA interactions within these macromolecular complexes.
MS is already established as a powerful technique for cataloguing proteins in ribosomes from a variety of different sources. Impressive examples include the identification of the full complement of proteins present in both subunits of the human mitochondrial ribosome (24,25). Of the proteins that were reported, 14 small subunit and 17 large subunit proteins were found to be novel. The researchers also demonstrated that 15 and 20 proteins from the two subunits are specific to mitochondrial ribosomes with no apparent homologues in other higher eukaryotic ribosomes. This allowed the proposal that components of the mitochondrial protein biosynthetic system may also play pivotal roles in apoptosis and in mitochondrial diseases (24,25).
The protein components of the Saccharomyces cerevisiae cytoplasmic ribosomes were identified using a similar approach (26). The researchers were able to identify a novel protein component (YMR116p) from the small (40 S) subunit. Recently, liquid chromatography coupled with Fourier transform ion cyclotron resonance has been used for direct MS analysis of the yeast 60 S subunit (27) identifying 42 of the 43 core 60 S subunit proteins. This study also demonstrated that 58 of the 64 possible core large subunit isoforms could be identified in a single experiment. These results correlate well with data from the transcriptome and therefore appear to validate the model of transcriptional regulation of yeast ribosomal proteins.
In addition to the applications described above that investigate the composition of ribosomes it is also possible to maintain noncovalent interactions within these particles in the gas phase and to examine complexes by MS. Spectra of 70 S ribosomes from E. coli recorded in the presence of Mg2+ showed that ions from intact ribosomes have m/z values in excess of 20,000 (9). More recently it has been demonstrated that different sets of proteins are released in the gas phase from the E. coli ribosome in response to different solution conditions (28). From solutions at pH 7.0 only four proteins, largely those from the mobile stalk region, are observed in mass spectra (L7, L12, L10, and the protein at the base of the stalk L11). At pH 4.5 four additional proteins as well as 5 S rRNA are observed implying that 5 S RNA is destabilized under these solution conditions. When Mg2+ is replaced by Li+, a procedure that is known to weaken protein-RNA interactions in solution, a further nine proteins are observed. Despite considerable variation of solution and MS conditions it was not possible to release all 54 proteins from the three large RNA molecules present in ribosomes. Close examination of the x-ray analysis of crystals of prokaryotic ribosomes from other bacterial sources demonstrates, however, that rather than pI of the protein or the surface-accessible area governing the extent of dissociation there is a positive correlation between low surface area of interaction of protein with RNA and a propensity for release into the gas phase (28). The authors reasoned that this ability to track changes in protein-RNA interactions through monitoring protein dissociation could be used to address conformational changes in ribosomes.
To test this hypothesis complexes of the ribosome known to be in different conformations were formed with elongation factor G and inhibited by fusidic acid or thiostrepton, and their dissociation was probed by MS. Fusidic acid, which stabilizes the complex in a conformation close to that of the post-translational ribosome, did not change the mass spectrum significantly from that recorded for ribosomes at pH 7.0 in the absence of other factors. In contrast, gas phase dissociation from the thiostrepton-inhibited complex, a state thought to resemble closely the pretranslational state, revealed markedly different mass spectra. The peaks assigned to the stalk proteins (L7, L12, and L10, normally the dominant peaks in mass spectra under all conditions examined previously) were virtually absent (Fig. 3). Their considerable reduction in signal intensity strongly implies that these proteins are anchored to the ribosome, presumably through interaction with RNA. Moreover both L5 and L18 were released; these proteins were not observed in the absence of thiostrepton and elongation factor G. Because both L5 and L18 interact exclusively with the 5 S rRNA and given the absence of L7, L12, and L10 it was possible to conclude that the ribosome elongation factor G complex, inhibited by thiostrepton, involves destabilization of 5 S rRNA-protein contacts through long range interactions with the mobile stalk region. This study revealed the sensitivity of the dissociation to conformational changes within ribosomes as a result of perturbation of the protein-RNA interactions, a phenomenon likely to be applicable to other investigations of protein-RNA supramolecular complexes by MS.

Concluding Remarks
In summary, this brief review encompasses some of the highlights in recent years of the study of protein-nucleic acid interactions using MS. Although the enormous power of MS to catalogue proteins during different stages of the cell cycle continues to provide a wealth of valuable information, a small but emerging role of MS in probing protein-nucleic acid interactions is coming to the fore. Whether through mutational changes in the binding protein, base changes in nucleic acid, or changes in protein-RNA interactions, MS provides a rapid and sensitive means for detection. With recent advances in technology as well as the continued emergence of structural data for protein-RNA interactions it is likely that the full potential of this approach is yet to be realized.