Raman Spectroscopy, the Sleeping Giant in Structural Biology, Awakes*

Principally through the efforts of crystallographers, we are being presented with an ever expanding atomic view of the biological world. Although this brings into focus many questions regarding the mysteries of function, techniques are needed that facilitate the transition in our understanding from structure to function. Raman spectroscopy is one of these; because the Raman effect involves an intimate interplay between atomic positions, electron distribution, and intermolecular forces, it sits at the bridgehead between structure and function. Thus, the Raman technique can answer questions that lie at the heart of issues such as ligand-macromolecule recognition and enzymatic catalysis. Raman spectroscopy involves analyzing the scattered photons from a laser beam focused into the sample solution (1). The inelastically scattered photons (the Raman spectrum) provide information on molecular vibrations that, in turn, yield data on molecular conformation and environment. At its most effective, Raman spectroscopy can provide exquisite detail from an important site in a much larger macromolecular complex. Although Raman was first applied to the definition of biological molecules in the 1930s (2), the giant has remained drowsy due to the difficulties both in obtaining high quality data and in interpreting those data. Considerable advances have been made in these areas in the past few years, and the giant is stirring! A major goal of this review is to provide biochemists with enough information to determine whether the Raman technique could provide structural insights into their systems. The specific issues addressed are which types of systems are amenable to study and what information could be obtained. Practically, present-day sample requirements are for 20 µl of clear solution, where the target molecule is in the 100–300 µM range.
Because the number of vibrational modes of a molecule is 3n − 6, where n is the number of atoms, the complete Raman spectrum of a macromolecule is exceedingly complex. Thus, Raman is most suited to systems where it is possible to focus upon a small region of interest, e.g. a ligand-receptor or enzyme-substrate binding site. Historically, this condition was achieved by using resonance Raman spectroscopy (1) to obtain the intensity-enhanced spectra from chromophores at specific sites in macromolecules. Recent technical advances mean that similar information can now be gleaned from non-chromophoric systems, markedly broadening the application of the technique. The information obtained can be very detailed, exceeding the level of resolution found in x-ray or NMR analyses (3–5). In addition to providing structural data, the Raman spectrum can also reveal changes in the distribution of electrons in a bound ligand and details of active site-ligand interactions, such as hydrogen bonding strengths.
Raman spectroscopy is beginning to fulfill its potential to contribute to structural biology because the three roadblocks that impeded its application to biological systems have been all but removed. These were low sensitivity, interference from fluorescence background, and problems with data interpretation. Sensitivity has increased several orders of magnitude, with a corresponding decrease in concentration requirements, because of advances in optical filters and photon detectors (6). Fluorescence interference is now minimized by using deep-red excitation in the 650–800 nm range, made possible by the advent of photon detectors with high efficiency in this region (7). Problems with interpreting Raman spectra have receded with the availability of "friendly" software packages (8) and ever increasing computational power that enable us to calculate, ab initio, the Raman spectra of midsized molecules (of the size of many ligands or co-factors found at biological sites). Interpretation is further strengthened by a comparison of the calculated and experimental shifts in Raman peak positions when a molecule is substituted with stable isotopes. Recently, this approach has been used to characterize hydrogen bonding in a complex of adenosine deaminase with a transition state analogue (9) and to discriminate between different protonation states of dihydrofolate binding to dihydrofolate reductase (10).
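The size of an isotope shift can be estimated, very roughly, by treating a localized stretch as an isolated diatomic harmonic oscillator, for which the wavenumber scales as the inverse square root of the reduced mass. The sketch below is only that back-of-the-envelope estimate, not the ab initio approach described above; the function names and the 1700 cm⁻¹ starting value are illustrative assumptions, and real modes are coupled to the rest of the molecule, so observed shifts will differ.

```python
import math

# Isotopic masses in atomic mass units (standard values).
MASS = {"12C": 12.0, "13C": 13.003355, "16O": 15.994915, "18O": 17.999160}


def reduced_mass(m1, m2):
    """Reduced mass of a two-body oscillator."""
    return m1 * m2 / (m1 + m2)


def isotope_shifted_wavenumber(nu_cm1, pair, labeled_pair):
    """Scale a stretching wavenumber for isotopic substitution.

    Assumes an isolated diatomic harmonic oscillator with an unchanged
    force constant, so nu is proportional to 1 / sqrt(reduced mass).
    """
    mu = reduced_mass(MASS[pair[0]], MASS[pair[1]])
    mu_labeled = reduced_mass(MASS[labeled_pair[0]], MASS[labeled_pair[1]])
    return nu_cm1 * math.sqrt(mu / mu_labeled)


# Hypothetical carbonyl stretch at 1700 cm^-1 (illustrative value only).
nu_13c = isotope_shifted_wavenumber(1700.0, ("12C", "16O"), ("13C", "16O"))
nu_18o = isotope_shifted_wavenumber(1700.0, ("12C", "16O"), ("12C", "18O"))
print(f"13C=O estimate: {nu_13c:.1f} cm^-1")
print(f"C=18O estimate: {nu_18o:.1f} cm^-1")
```

In this idealized picture either substitution lowers the carbonyl wavenumber by a few tens of cm⁻¹, which is why isotope editing is a convenient way to assign a peak to a particular bond.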


What Is Raman and What Does It Tell You?
The Raman spectrum of 5-methyl thienylacrylic acid (5-MTA) is shown in Fig. 1; the 5-MTA entity has been used extensively as a probe of protease active sites (3,11,12). The spectrum was obtained by focusing a laser beam into a solution in methanol and by analyzing the scattered light at 90° to the direction of the beam using a Raman spectrometer. A small percentage of the scattered photons exchange energy with the vibrational energy levels (or, crudely, the "vibrations") of the molecules in solution. Thus, by analyzing the scattered photons, information on the vibrational motions of atoms in molecules is obtained. These motions are a function of molecular conformation, of the distribution of electrons in the chemical bonds, and of the molecular environment. Thus, interpretation of the Raman spectrum provides information on all these factors. This is the underlying principle behind using Raman spectroscopy for defining the detailed chemistry of molecules at biological sites (1).
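Raman peak positions are conventionally quoted as wavenumber shifts (cm⁻¹) from the laser line rather than as absolute wavelengths. As a minimal sketch of that bookkeeping, the helper functions below (names are illustrative, not from any instrument software) convert between an excitation wavelength, a Stokes shift, and the wavelength at which the scattered photon is detected. The 650 nm and 1600 cm⁻¹ inputs are example values only.

```python
def stokes_wavelength_nm(excitation_nm, shift_cm1):
    """Wavelength of a Stokes-scattered photon for a given Raman shift.

    The photon loses shift_cm1 of energy to a molecular vibration, so its
    absolute wavenumber is the excitation wavenumber minus the shift.
    """
    excitation_cm1 = 1.0e7 / excitation_nm  # 1 cm = 1e7 nm
    return 1.0e7 / (excitation_cm1 - shift_cm1)


def raman_shift_cm1(excitation_nm, scattered_nm):
    """Raman shift (cm^-1) from excitation and scattered wavelengths."""
    return 1.0e7 / excitation_nm - 1.0e7 / scattered_nm


# Example values only: red excitation and a shift in the double-bond region.
scattered = stokes_wavelength_nm(650.0, 1600.0)
print(f"Scattered wavelength: {scattered:.1f} nm")
print(f"Recovered shift: {raman_shift_cm1(650.0, scattered):.1f} cm^-1")
```

With red excitation, a shift of this size places the scattered light some tens of nanometers to the red of the laser line, which is why detector efficiency in the deep red matters for such experiments.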
With present day computational power, commercially available software packages allow us to undertake high level quantum mechanical calculations on molecules such as 5-MTA and to predict the stable conformational states for this molecule as well as the infrared and Raman active vibrations (13). Such calculations put interpretation of the data on a sure footing. They also reveal the complex nature of vibrational spectra; many peaks are due to vibrational motions that include contributions from many atoms in the molecule. However, some vibrations are more or less localized in molecular groupings, and three such are indicated in Fig. 1 (the C=O and C=C stretching vibrations and the breathing-type motion of the thienyl ring). We will see below how the carbonyl peak can provide detailed chemical information on the chemistry of this group in serine protease active sites. In addition, the C=C stretch and ring modes can be used to follow the redistribution of π-electrons for 5-MTA in cysteine protease active sites (11), and marker bands in the 1000–1200 cm⁻¹ region give the conformation, cis or trans, about the =C-C=O single bond (13,14).
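To give a sense of that complexity, the number of normal modes can be counted directly: a nonlinear molecule of n atoms has 3n − 6 vibrational modes, and a linear one has 3n − 5. The short sketch below applies this rule; the molecular formula used for 5-MTA (C8H8O2S) is an assumption worked out from the compound name and is not stated in the text.

```python
def vibrational_modes(atom_counts, linear=False):
    """Number of normal vibrational modes for a molecule.

    atom_counts maps element symbols to the number of atoms of each.
    Nonlinear molecules have 3n - 6 modes; linear ones have 3n - 5.
    """
    n = sum(atom_counts.values())
    if n < 2:
        raise ValueError("a molecule needs at least two atoms to vibrate")
    return 3 * n - (5 if linear else 6)


# Assumed formula for 5-methyl thienylacrylic acid: C8H8O2S (19 atoms).
five_mta = {"C": 8, "H": 8, "O": 2, "S": 1}
print(vibrational_modes(five_mta))

# A linear triatomic such as CO2, for comparison.
print(vibrational_modes({"C": 1, "O": 2}, linear=True))
```

Even a ligand of this size has dozens of modes, which is why localized marker bands such as the C=O and C=C stretches are so valuable as probes.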
The absorption spectrum of the 5-MTA chromophore shows a maximum in the near UV at 324 nm. If a laser wavelength far from this electronic transition, e.g. near 650 nm, is used to generate the Raman spectrum, the resulting spectrum is relatively weak. If, however, we use an excitation wavelength near 330 nm, the coincidence between this and the absorption band leads to a large (10³ or more) intensity enhancement. This is the resonance Raman (RR) effect, and because of its high intensity it has been the method most used for obtaining vibrational spectra at biological sites. RR plays a key role in obtaining data from natural chromophoric sites such as occur in heme (15–17) and metalloproteins (18). Moreover, time-resolved RR is a powerful means of following changes at a chromophoric site in a rapidly evolving system such as cytochrome oxidase (19) or a peptide in the early stages of folding (20). However, for stable or slowly evolving systems, increases in sensitivity now allow us to obtain Raman difference spectra from specific sites under non-resonance conditions.
* This minireview will be reprinted in the 1999 Minireview Compendium, which will be available in December, 1999.
The RR approach remains the only method available to probe short-lived species (with half-lives of less than a second down to the picosecond range), and RR studies on some reactive acyl enzymes exemplify the detailed structural information that can be obtained. In the late 1970s and 1980s α,β-unsaturated acyl enzymes of the form R-C=C-C(=O)-O-chymotrypsin (one of these acyl groups is 5-MTA, seen in Fig. 1) were good candidates for RR analysis (3, 21–23). These acyl enzymes are models for the natural acyl enzymes formed during peptide hydrolysis. The former have absorption maxima near 350 nm, and using near UV laser sources, RR spectra could be generated of the acyl groups in the active sites. Moreover, it was possible to obtain spectral data from the unstable acyl enzymes, prior to deacylation, at high pH in a rapid mixing rapid flow system. The focus of the work was the RR feature due to the acyl's C=O group, which could be used to monitor this group prior to nucleophilic attack in the active site. Three findings emerged from these studies. 1) As pH was varied, changes in the C=O stretch region occurred with the same pKa as that for the deacylation kinetics, leading to the conclusion that the pKa of neighboring His-57 was being probed (24). 2) At high pH, a linear relationship was found between the position of the C=O stretch and the log of the deacylation rate constant (25). The relationship extends over a change in rate constant of 17,000-fold. Moreover, an empirical relationship between C=O frequency and bond length could be used to follow C=O bond length changes of the order of 0.001–0.01 Å. This approach relies on setting up accurate structure-spectra correlations on a series of "small" model compounds (in this case generated by IR and x-ray studies on crystals of cyclic and heterocyclic organic compounds (26)) and then using these to interpret the changes seen in the (resonance) Raman spectra in terms of exquisitely accurate structural definition. 3) Shifts in the C=O stretch were postulated to arise from changes in the strength of hydrogen bonding to the C=O group in the active site, and this effect, too, may be quantitated. By undertaking H-bonding studies, e.g. involving the ester of the compound seen in Fig. 1 in CCl4 with a number of hydrogen bond donors, it is possible to derive a relationship between the shift in the C=O stretch and the strength of the hydrogen bond(s) to it. Across the present series of acyl enzymes, the enthalpy of hydrogen bonding changes by 57 kJ mol⁻¹ (27, 28).

Evolving Technology Increases Applications in Enzymology
The work discussed above on acyl serine proteases utilized near UV lasers operating near 350 nm to generate RR spectra of the bound acyl groups. The spectra were recorded using double or triple monochromators (to separate the "Raman" from the interfering "Rayleigh" photons) and detected by a single photomultiplier or, later, an optical multichannel analyzer. Encouraged by the relationship cited above, attempts were made to extend the studies to acyl cysteine proteases, e.g. R-C=C-C(=O)-S-papain, and to other α,β-unsaturated thiol esters such as hexadienoyl-CoA binding to enoyl-CoA hydratase (29). Again these acyl groups have absorption features near 350 nm and are candidates for RR studies. However, satisfactory spectra could not be obtained from these systems because α,β-unsaturated thiol esters are highly photolabile, and the samples underwent uncontrolled photoisomerization and decomposition in the laser beam used to generate the RR data. Technical innovations came to the rescue; a combination of efficient holograph-based optical filters to block the Rayleigh photons and high quantum efficiency charge-coupled device photon detectors increased the sensitivity of Raman instrumentation dramatically (6). In practical terms this meant that it was no longer necessary to use the resonance condition. For example, high quality Raman data were obtainable using excitation near 500 or 650 nm. This is far removed from any absorption peak, and thus the samples are not photochemically modified by the laser beam (30).
To observe the Raman spectrum of the bound ligand, Raman difference spectroscopy was employed, where the spectrum of the enzyme is subtracted from the spectrum of the enzyme-ligand complex (31,32). The resultant difference spectrum contains features due to the bound ligand, with the possibility of some protein modes appearing if there are conformational changes occurring upon ligand binding. Ligands such as the 5-MTA seen in Fig. 1 are strong Raman scatterers (they give rise to relatively intense Raman signals even under non-resonance conditions) because of the extended and polarizable π-electron system. For this reason the ligand modes dominate the difference spectrum. When experiments involve ligands that scatter less intensely, protein features appear in the difference spectrum in both the positive and negative directions, and these are a source of information on changes in protein structure, although the interpretation of many of these features is still in its infancy (33,34).
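The subtraction itself can be sketched in a few lines. The example below is a hypothetical illustration, not the procedure of refs. 31 and 32: it assumes both spectra are sampled on the same wavenumber grid, estimates a scaling factor by least squares over channels assumed to contain only protein signal, and then subtracts the scaled enzyme spectrum. All names and the synthetic intensities are invented for the example.

```python
def scale_factor(complex_counts, enzyme_counts, reference_channels):
    """Least-squares factor k minimizing sum((complex - k * enzyme)^2).

    reference_channels are indices assumed to contain protein signal only
    (no ligand bands); choosing them is a judgment call in practice.
    """
    num = sum(complex_counts[i] * enzyme_counts[i] for i in reference_channels)
    den = sum(enzyme_counts[i] ** 2 for i in reference_channels)
    return num / den


def difference_spectrum(complex_counts, enzyme_counts, k=1.0):
    """Subtract the scaled enzyme spectrum from the complex spectrum."""
    if len(complex_counts) != len(enzyme_counts):
        raise ValueError("spectra must share the same wavenumber grid")
    return [c - k * e for c, e in zip(complex_counts, enzyme_counts)]


# Synthetic data: the complex is twice the enzyme signal plus one ligand
# band in channel 2.
enzyme = [10.0, 20.0, 30.0, 20.0, 10.0]
complex_ = [20.0, 40.0, 65.0, 40.0, 20.0]

k = scale_factor(complex_, enzyme, reference_channels=[0, 1, 3, 4])
diff = difference_spectrum(complex_, enzyme, k)
print(k, diff)
```

In real data the residual would also carry noise and any protein bands perturbed by binding, which appear as positive or negative features as described above.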
The Raman difference spectra for acyl cysteine proteases, supported by absorption spectral data, provided different insights from those uncovered for the serine analogs (11,35). Principally, these involve so-called π-electron polarization; in the cysteine protease active sites there are strong electrostatic forces that bring about a major rearrangement of the π-electrons in the acyl group. An α-helix dipole, terminating at the active site cysteine, is likely one of the important factors causing polarization of the acyl group's electrons, and it was proposed that the dipole, with its positive pole pointing toward the acyl C=O, functions to stabilize negative charge build-up in the transition state (11).
Ideas on active site-induced electron polarization have been further refined by Raman difference studies on acyl enzymes involving the semi-synthetic enzymes, thiol and selenol subtilisins (36). The Raman data for these 5-MTA acyl enzymes showed that the acyl group experiences no polarization in the active sites. However, when the active sites are enlarged, e.g. in selenol subtilisin by replacing asparagine 155 by a glycine residue, a new conformational state is detected, and this second conformer is polarized. Thus, the acyl group is able to switch from a region of null electric field to one where strong electrostatic forces come into play. Using the x-ray-derived structure of selenol subtilisin combined with modeling studies (13,36), it was possible to provide a molecular explanation for the polarized and non-polarized forms of the acyl groups observed by spectroscopy. In essence, the additional room in the active site, created by making the Gly-155 mutant, allows the acyl group to "flip" about its =C-C=O single bond. Then, it goes from an environment where it experiences minimal electrostatic forces to one where there is a negatively charged side chain near the thienyl ring and strong electropositive effects at the carbonyl because of hydrogen bonds and an α-helix dipole. Thus, optimum polarization appears to be achieved by a combination of electron "push" and "pull," as shown in the schematic in Fig. 2.
The above discussion focuses on the use of the 5-MTA group to demonstrate many of the details on structure and chemistry that can be elicited from Raman data. This molecule was one of the chromophoric acyl groups developed as RR probes of active sites in early studies. It is not part of a natural substrate for the enzymes discussed. Now, however, with the latest generation of highly sensitive Raman spectrometers and the use of red or deep red excitation to avoid fluorescence interference, we are able to consider a plethora of "natural" enzyme-ligand systems. One such involves the enzyme 4-chlorobenzoate-CoA dehalogenase that carries out the transformation as shown in Reaction 1.
Initial attempts to undertake RR studies of dehalogenase-product complexes were thwarted again by the photolability of the CoA thiol esters. However, using a Raman difference spectrometer constructed in 1992 (6), red-excited Raman difference data for the product complex were obtained (37). These showed that the active site of dehalogenase can bring about a complete reorganization of the benzoyl group's electrons and, in turn, provided insight into how the difficult chemical step of replacing the -Cl atom on the ring by an -OH is achieved. The determinants for this strong polarization in the active site of dehalogenase (38) are very similar to those found for the cysteine proteases and shown in Fig. 2. At the benzoyl C=O there is strong electron pull exerted by two H-bonds and an α-helix dipole combined with electron push by an aspartate side chain near the benzoyl's para position.
Even with the 1992 vintage high throughput Raman spectrometer, studies on dehalogenase complexes were hampered by concentration requirements. The latter were close to 1 mM, and experiments were usually thwarted by the protein coming out of solution during concentration procedures. However, in 1997 we were able to modify a commercial spectrograph to optimize its performance for dilute aqueous solutions (7). With this instrument concentration requirements were lowered to 100–300 µM. Data collection for dehalogenase complexes reached a success rate of 100%, and high quality spectra were obtained for several dozen complexes with different substrate analogs and protein-engineered forms of the enzyme (39). One of these complexes had the surprising property of evolving with time (40). It involved the substrate binding to dehalogenase where the catalytically vital group Asp-145 had been mutated to asparagine. The enzyme was expected to be unreactive, and yet a series of spectroscopic changes were detected over a period of several minutes. Fig. 3 shows Raman difference spectra recorded approximately 1, 1.5, 2, and 4.5 min after adding the substrate to the Asn-145 variant. The data collection time for each spectrum was only 30 s, but the spectral fingerprints of several species can be identified. Initially (in the top trace) the peaks at 1586 and 1216 cm⁻¹ show that there is the expected population of the 4-chloro substrate bound to the active site of the Asn-145 enzyme. However, this population decays rapidly with time. There is also evidence at early times for peaks at 1543 and 1490 cm⁻¹, and detailed analysis (39) shows that these are due to the ionized (4-O⁻) form of the product bound to the Asn-145 enzyme. After 5 min the spectra cease evolving, and the signature (Fig. 3, bottom trace) is that of non-ionized product in the active site of the wild-type enzyme! The combined Raman and absorption data provide the following explanation for the spectral changes. Initially, the Asn-145 mutant enzyme contains a trace of the wild-type enzyme brought about by spontaneous deamidation of the Asn-145 side chain. The wild-type enzyme catalyzes the formation of product, which binds in its ionized form to the large population of Asn-145 molecules. Thus, early in the time sequence there are populations of substrate and product binding to the dominant Asn-145 population. However, the ionized product catalyzes the deamidation of D145N to give the wild-type enzyme. Thus, in a few minutes all the substrate is converted to product and all the mutant enzyme is converted to the wild type, and we see the interesting phenomenon of product catalyzing changes to the enzyme. In general these studies show: 1) that Raman difference spectra were able to identify species in a complex evolving reaction and to aid in identifying the reaction mechanism and 2) that transient species (with lifetimes of tens of seconds or longer) could be identified at the sub-100 µM level.

Limitations and Prospects
Although there are outstanding examples where the Raman approach provides important information on large macromolecular structures such as nucleic acid-protein complexes (41–43), it may find most use in defining small regions of large complexes. In that sense it complements techniques such as x-ray crystallography that provide the "big picture." Another limitation is that, occasionally, samples are encountered that have high background signals that are difficult to remove. This is an area of active investigation since the source of the background is not understood fully. The target molecules most suited for Raman difference spectroscopy are those that have relatively intense normal (non-resonance) Raman signals, which usually means that they have extended π-electron systems because these are polarizable and give rise to strong Raman scattering. Saturated systems such as carbohydrates are less amenable to Raman analysis.
Even with the above limitations the prospects are good that Raman spectroscopy will make an increasing impact in structural biology. In the author's own area of interest, involving enzyme complexes, the last few years have seen the Raman technique go from being applicable to a very few systems to being a means of addressing mechanistic questions for a wide array of enzymes. It is now possible to follow changes in co-factor chemistry in detail. For example, flavoproteins have been difficult subjects for Raman investigations because of their intrinsic fluorescence. However, using red or deep red laser excitation, high quality non-resonance Raman data can now be obtained from the isoalloxazine ring system of flavins (32), and these promise to reveal a wealth of information. There are as yet few studies reported involving receptor-ligand complexes; these should appear in the near future. Similarly, the potential for Raman to provide, fairly rapidly, molecular information for a family of ligands binding to a receptor or enzyme target has yet to be exploited. In another application, Raman microscopes permit us to obtain the Raman spectra of microscopic objects under controlled conditions (44). High quality data can be obtained for protein crystals, and "Raman crystallography" has the potential to provide detailed information on complexes in the crystalline as well as the liquid phases and can thus provide a bridge between the results of the crystallographers and solution studies. Improvements in theory and computational power are leading theorists to calculate increasingly sophisticated models of macromolecular binding sites (45,46), and Raman may have a role in providing benchmarks against which to test the results of calculations.
Finally, will Raman spectroscopy become an important tool in many biochemistry laboratories, as, for example, CD now is? Perhaps it will not be as widespread, but it may become an indispensable tool for scientists interested in the chemistry of many classes of small molecule-big molecule interactions. The tide is running in its favor: the cost of instrumentation is falling, experiments are becoming easier to undertake, and the results are becoming easier to interpret in a quantitative fashion.