A Unique Tool for Cellular Structural Biology: In-cell NMR

Conventional structural and chemical biology approaches are applied to macromolecules removed from their native context. When this is done, important structural and functional features of macromolecules, which depend on their native network of interactions within the cell, may be lost. In-cell nuclear magnetic resonance (NMR) is a branch of biomolecular NMR spectroscopy that allows macromolecules to be analyzed in living cells, at the atomic level. In-cell NMR can be applied to several cellular systems to obtain biologically relevant structural and functional information. Here we summarize the existing approaches and focus on the applications to protein folding, interactions, and post-translational modifications.


Overview of In-cell NMR Approaches
A cellular structural approach is needed if we want to characterize macromolecules, their interactions, and their functions while they retain their native, intracellular localization. In this respect, NMR spectroscopy is an ideal technique for studying macromolecules in living cells, as it is non-destructive and provides structural and biochemical details of macromolecules in solution, over a wide range of temperatures. The first in-cell NMR proof-of-principle was described by Serber et al. (1,2), where the name "in-cell NMR" was coined to describe high-resolution NMR applied to obtain structural information on a macromolecule, e.g. a protein, inside living cells, as opposed to "in vivo NMR," which often refers to the observation of small molecules in living organisms. In those works, the authors exploited recombinant protein expression in bacterial cells, an established sample preparation strategy for structural biology (Fig. 1a). With proper adaptations of the cell growth and induction protocol, globular soluble proteins (bacterial NmerA and human calmodulin) were overexpressed and isotopically labeled in Escherichia coli cells to a sufficient level to be detected by NMR above the other cellular components. Some important parameters were assessed, such as cell viability during the experiments, timing and type of isotopic labeling, and NMR line broadening. Uniform 15N labeling was found to be the ideal choice in most cases, whereas uniform 13C labeling is often unsuitable for this type of experiment, due to the high natural abundance of 13C (1.1%) and to the large number of carbon atoms in biological molecules. [methyl-13C]Methionine labeling was shown to be a viable strategy to detect side chain carbon atoms with good selectivity against the cellular 13C background (3); other amino acid type-selective labeling strategies were also examined (2).
Labeling of the protein of interest with non-natural amino acids containing 19F was also demonstrated to be a useful approach to investigate protein dynamics in the cellular environment (4,5). Although 19F-containing amino acids tend to have large spectral overlap, the in-cell NMR spectra are virtually background-free. 19F allows for slower-tumbling molecules to be observed and is potentially useful for studying protein-drug interactions.
Further development of NMR in E. coli cells capitalized on the knowledge of bacterial protein expression systems. Burz et al. (6,7) combined different systems for controlling expression of heterologous proteins, so that two or more proteins are overexpressed at different times in the same cells, and isotopic enrichment of a single protein can be performed. This approach, called STINT-NMR, allows protein-protein interactions to be investigated directly in bacterial cells, and was applied to characterize the interaction between ubiquitin and two ligands, a ubiquitin-binding peptide and the signal transducing adaptor protein STAM2 (6,8). The same authors developed a similar approach, SMILI-NMR, which allows characterizing interactions of proteins with small molecules (9). The technique relies on STINT-NMR to produce a protein complex inside the cells, with only one partner isotopically labeled. The cells are then screened against a library of small molecules and monitored by in-cell NMR to detect changes in the complex due to protein-small molecule interactions occurring within the cells.
On the spectroscopy side, the application of advanced NMR methods increases the possibilities offered by in-cell NMR approaches. Structural investigation by NMR heavily relies on high-dimensionality (i.e. three-dimensional or more) heteronuclear experiments. Such experiments are necessary to perform a complete resonance assignment, which in turn serves as a starting point to solve the three-dimensional structure of a protein. However, they suffer from low sensitivity and require long experimental times, which are prohibitive considering the limited durability of living cell samples. Reduced sampling schemes decrease the time required for the acquisition of multidimensional NMR experiments (10), whereas fast-pulsing schemes decrease the interscan delay and increase the sensitivity by allowing the acquisition of more scans per unit of time (11). These approaches were shown to be beneficial when applied on bacterial cells (12) and were eventually used to determine a protein structure de novo, exclusively relying on in-cell NMR data (13). Direct 13C-detected NMR experiments were also proven to be useful to investigate intrinsically disordered proteins in cells (14,15). Spectra of intrinsically disordered proteins have low dispersion of the 1H signals, whereas heteronuclear 13C-detected experiments benefit from improved resolution and relaxation properties over standard 1H-detected experiments (16).
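The sensitivity argument behind fast-pulsing schemes can be illustrated with a minimal sketch. It assumes 90-degree excitation pulses, a single effective longitudinal relaxation time T1, and an acquisition time that is negligible compared with the recycle time; the function names and the T1 values are hypothetical and are not taken from the cited studies. Under these assumptions the steady-state signal per scan scales as 1 - exp(-t_rec/T1), while the noise grows with the square root of the number of scans, so the sensitivity per unit time scales as (1 - exp(-t_rec/T1)) / sqrt(t_rec).

```python
import math


def sensitivity_per_unit_time(t_rec, t1):
    """Relative S/N per unit measurement time for repeated 90-degree pulses.

    t_rec: recycle time between scans (s); t1: effective longitudinal
    relaxation time (s). Both are illustrative inputs, not measured values.
    """
    return (1.0 - math.exp(-t_rec / t1)) / math.sqrt(t_rec)


def optimal_recycle_time(t1, n_grid=20000):
    """Grid search (0 < t_rec <= 5*T1) for the most sensitive recycle time."""
    grid = [t1 * 5.0 * (i + 1) / n_grid for i in range(n_grid)]
    return max(grid, key=lambda t: sensitivity_per_unit_time(t, t1))


if __name__ == "__main__":
    # Hypothetical effective T1 values: a conventional experiment versus a
    # fast-pulsing experiment in which the effective T1 is shortened.
    for t1 in (1.0, 0.3):
        t_opt = optimal_recycle_time(t1)
        best = sensitivity_per_unit_time(t_opt, t1)
        print(f"T1 = {t1:.2f} s: optimal recycle time {t_opt:.3f} s, "
              f"relative sensitivity {best:.3f}")
```

In this simplified model the optimal recycle time is proportional to T1 and the best achievable sensitivity scales as 1/sqrt(T1), which is why shortening the effective T1 allows more scans per unit time without a proportional loss of signal per scan.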
Since in-cell NMR was established in bacterial cells, it was clear that it could potentially bridge the gap between structural and cellular biology. However, the technique would only unleash its full potential when applied to eukaryotic cells, and eventually mammalian cells. NMR in eukaryotic cells was first successfully performed on Xenopus laevis oocytes (17-19). Thanks to the large size of these cells, proteins, first produced by recombinant expression, can then be inserted into the oocytes by microinjection (Fig. 1g). This approach provides high labeling selectivity (almost no cellular background); however, it requires the protein to be highly concentrated before injection, to avoid excessive dilution in the oocyte cytoplasm. Notably, injection in X. laevis oocytes was also successfully applied to observe nucleic acids by in-cell NMR (20,21). A further step forward was made by Inomata et al. (22), who reported the first example of in-cell NMR spectroscopy in cultured human cells. The authors devised a way, conceptually similar to microinjection but on the molecular scale, to insert a purified protein inside the cells. Their approach relied on a fusion between the protein of interest and a cell-penetrating peptide (CPP) derived from the HIV-1 Tat protein (Fig. 1d). Alternatively, the CPP sequence could be covalently linked to the protein in vitro through a disulfide bond, which was then reduced in the cytoplasm, thus releasing the peptide-free protein. This approach was later applied to observe the human protein copper, zinc superoxide dismutase 1 (Cu,Zn-SOD1), in human cells. However, the protein had to be heavily modified to be efficiently imported (23). Ogino et al. (24) reported an alternative approach to translocate labeled proteins into cultured human cells, which relied on the use of pore-forming toxins (streptolysin O) to permeabilize the plasma membrane and allow the entrance of a sufficient amount of protein (Fig. 1e).
By treating the cells with Ca2+, the plasma membrane could be resealed, thereby preventing cell death. More recently, protein electroporation was reported as an efficient method for inserting proteins into mammalian cells (25). Electroporation reversibly permeabilizes the plasma membrane, thereby letting a protein into the cells by passive diffusion (Fig. 1f). Conceptually different from protein insertion, intracellular protein expression was also shown to be achievable in cultured human cells by our research group (Fig. 1c) (26). The expression approach was made possible by exploiting recent advancements in mammalian protein production systems (27,28). This approach has the advantage of investigating proteins directly in the cells where they are synthesized and does not require any import into other types of cells. It is therefore especially useful to study protein folding and maturation processes occurring immediately after protein synthesis in the cytoplasm (26,29). In-cell NMR on proteins expressed directly in the cells can also be performed in yeast, as shown by Bertrand et al. (30) (Fig. 1b), and in insect cells, as shown by Hamatsu et al. (31) (Fig. 1c). By inducing protein expression in yeast supplemented with different nutrients, the expressed protein is localized in different cellular compartments, and the effect of different subcellular environments on the protein can be investigated (30).
One intrinsic limitation of the technique is the limited cell lifetime inside the NMR tube, which often prevents the acquisition of experiments longer than a few hours. Efforts to increase sample viability have been made, and resulted in the design of bioreactors that can be fitted inside an NMR spectrometer. Such bioreactors have been reported for both bacteria (32) and human cells (33). In both designs, a flow is applied that replaces the spent medium with fresh medium, providing nutrients and stabilizing the external pH. To reduce mechanical stress, the cells are encapsulated within a hydrogel, where they can still exchange nutrients and by-products, thus improving sample lifetime.

Effects of Crowding on Folding and Weak Interactions
In-cell NMR allows observation of structural and functional features of proteins within intact living cells at room temperature. Therefore, it is ideally applied to understand how the thermodynamics of protein folding are affected by the intracellular environment. Bacterial cells can be considered a suitable model system when studying proteins (either bacterial or eukaryotic) that are natively localized in the cytoplasm, as the physicochemical properties of the cytoplasm are comparable among different organisms, if functional interactions are not to be taken into account. A consequence of the high concentration of macromolecules (mainly proteins and nucleic acids) in the cytoplasm is macromolecular crowding. Its main effect on solute molecules is the excluded volume; due to steric repulsion between macromolecules, a large fraction of the total volume of the cytoplasm would not be accessible to other solute macromolecules, such as proteins. Therefore, the effective concentration of the latter, as well as their thermodynamic activity, would increase accordingly. As a consequence of this, the folding equilibrium of a protein is expected to shift toward the folded conformation, which minimizes the occupied volume. The crowding effect of the cytoplasm on protein folding was first observed through in-cell NMR by the Pielak group (34). In their work, Dedmon et al. (34) showed that FlgM, an intrinsically disordered protein from Salmonella enterica serovar Typhimurium, adopted a partially folded conformation in the cytoplasm of E. coli. As a consequence of folding, the amide crosspeaks of the C-terminal part of the intracellular protein were not detected in the NMR spectra, whereas those arising from the N terminus, which remained unfolded, could still be detected and remained unchanged when compared with the protein in solution.
The resulting NMR spectrum was similar to that of FlgM interacting with the transcription factor σ28 in vitro, in which the C terminus of FlgM is known to be in exchange with a folded conformation bound to σ28. Recently, Smith et al. (35) measured the amide hydrogen exchange rates of intrinsically disordered proteins (α-synuclein and FlgM) in bacterial cells and showed that protein disorder still persisted in the crowded cellular environment.
When proteins act as crowding agents, additional phenomena can affect the protein folding landscape. Weak attractive interactions can occur between the crowding agents and a polypeptide, which stabilize more exposed conformations. This enthalpic contribution would therefore counterbalance the excluded volume, which is a purely entropic effect. The extent to which these two contributions affect the folding landscape of a protein depends on the intrinsic properties of the latter. Schlesinger et al. (36) proved this concept by showing a striking example of a protein that fails to reach the folded conformation in the bacterial cytoplasm. In the study, a variant of protein L from Peptostreptococcus magnus was investigated, which has a marginally stable folded conformation in vitro in the presence of K+ ions. In the bacterial cytoplasm, under similar concentration of K+ ions, the protein was mostly unfolded, indicating that interactions with other cytoplasmic components can overcome the excluded-volume effect, preventing folding of natively unstable globular proteins.
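The competition between excluded volume and weak attractive interactions can be made concrete with a minimal two-state sketch. It assumes a simple equilibrium between an unfolded and a folded state, and it treats the in-cell folding free energy as the in-buffer value plus two additive terms: a stabilizing (negative) excluded-volume term and a destabilizing (positive) weak-interaction term. The additivity, the function names, and all numerical values are illustrative assumptions and are not taken from the cited studies.

```python
import math

R_KJ = 8.314462618e-3  # molar gas constant in kJ mol^-1 K^-1


def fraction_folded(dg_fold_kj, temp_k=298.15):
    """Folded population of a two-state protein.

    dg_fold_kj is G(folded) - G(unfolded) in kJ/mol; negative values
    favor the folded state.
    """
    k_fold = math.exp(-dg_fold_kj / (R_KJ * temp_k))
    return k_fold / (1.0 + k_fold)


def in_cell_dg(dg_buffer_kj, ddg_excluded_volume_kj, ddg_weak_interactions_kj):
    """Assumed additive model for the folding free energy inside the cell."""
    return dg_buffer_kj + ddg_excluded_volume_kj + ddg_weak_interactions_kj


if __name__ == "__main__":
    # Hypothetical, marginally stable protein (values chosen for illustration).
    dg_buffer = -2.0          # kJ/mol in dilute buffer
    ddg_excluded = -2.0       # entropic stabilization from crowding
    ddg_interactions = 6.0    # destabilization from weak attractive contacts
    dg_cell = in_cell_dg(dg_buffer, ddg_excluded, ddg_interactions)
    print(f"folded fraction in buffer: {fraction_folded(dg_buffer):.2f}")
    print(f"folded fraction in cell:   {fraction_folded(dg_cell):.2f}")
```

For a marginally stable protein, a shift of only a few kJ/mol in either direction is enough to move the population from mostly folded to mostly unfolded, which is consistent with the idea that the net outcome depends on the intrinsic properties of the protein.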
A common effect of the cellular environment on the in-cell NMR spectra of soluble proteins, which has been observed since the beginning, is the general broadening of the NMR resonances. The signal broadening is, in the simplest cases, due to a slower tumbling of the protein in the cytoplasm with respect to aqueous buffers. Intriguingly, globular proteins of similar size were shown to have very different NMR relaxation properties, and many of them tumble so slowly that they cannot be observed at all by in-cell NMR. Although line broadening is often observed in folded proteins, it is less likely to occur in disordered fragments. An example of such different behavior was shown in E. coli by Barnes et al. (37). A fusion construct was used consisting of ubiquitin (a folded protein) attached to α-synuclein (an intrinsically unfolded protein) with a flexible linker. In the bacterial cytoplasm, only the backbone resonances of α-synuclein were detected, whereas ubiquitin could only be detected upon lysing the cells. Wang et al. (38) characterized by in-cell NMR the rotational diffusion of three globular proteins in the bacterial cytoplasm: the B1 domain of staphylococcal protein G (GB1), a fusion of two GB1 domains, and the metal-binding domain of mercuric ion reductase (NmerA). By comparing the linewidth increase in cells, in lysates, and in a series of buffer solutions of increasing viscosity, they showed that the tumbling rate decreased differentially for each protein and did not correlate with the increased viscosity of the cytosol, nor with the molecular weight (Fig. 2a). Thus, the line broadening effect could not be attributed only to increased viscosity or excluded volume effects, and is mainly due to weak interactions between the soluble protein and the other cytoplasmic components.
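The viscosity-only baseline against which such linewidth comparisons are made can be sketched as follows. The sketch assumes a rigid spherical protein obeying the Stokes-Einstein-Debye relation, tau_c = eta * V / (k_B * T), and it assumes that the linewidth of a folded protein scales linearly with tau_c; the function names, radius, viscosities, and broadening factor are hypothetical and are not taken from the cited work.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def rotational_correlation_time(viscosity_pa_s, radius_m, temp_k=298.15):
    """Stokes-Einstein-Debye correlation time (s) of a rigid sphere."""
    volume = 4.0 / 3.0 * math.pi * radius_m ** 3
    return viscosity_pa_s * volume / (K_B * temp_k)


def viscosity_only_broadening(viscosity_cell, viscosity_buffer):
    """Linewidth ratio (cell/buffer) expected if only viscosity changed.

    Assumes linewidth is proportional to tau_c, so the ratio is independent
    of protein size in this simplified model.
    """
    return viscosity_cell / viscosity_buffer


def excess_broadening(observed_ratio, viscosity_cell, viscosity_buffer):
    """Factor by which observed broadening exceeds the viscosity-only value."""
    return observed_ratio / viscosity_only_broadening(viscosity_cell,
                                                      viscosity_buffer)


if __name__ == "__main__":
    # Hypothetical inputs: 1.5 nm hydrodynamic radius, buffer viscosity of
    # 1 mPa*s, an assumed twofold higher cytosolic viscosity, and an assumed
    # sixfold observed linewidth increase in cells.
    tau_buffer = rotational_correlation_time(1.0e-3, 1.5e-9)
    tau_cell = rotational_correlation_time(2.0e-3, 1.5e-9)
    print(f"tau_c in buffer: {tau_buffer * 1e9:.2f} ns")
    print(f"tau_c in cell (viscosity only): {tau_cell * 1e9:.2f} ns")
    print(f"excess broadening: {excess_broadening(6.0, 2.0e-3, 1.0e-3):.1f}x")
```

In this model every protein would broaden by the same factor in a given medium. Protein-specific broadening beyond that factor is therefore a sign of contributions other than viscosity, such as the weak interactions discussed above.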
The extent of weak interactions in the presence of protein crowders was extensively investigated in vitro by the Pielak group. By measuring NMR relaxation properties of proteins in the presence of different crowding agents, they showed that proteins as crowding agents differ from synthetic polymers in that they exert weak, nonspecific interactions that impact protein translational and rotational diffusion (39) as well as folding stability (40,41). More recently, the same research group devised a way to obtain quantitative, residue-level information on the folding thermodynamics of intracellular GB1. The authors let amide hydrogen-deuterium exchange proceed in the cells for different times, quenched it by acidification, and then measured the extent of exchange by NMR on the cell lysates (42). With this approach, they showed that the cytoplasm stabilizes the folded state of GB1, whereas protein crowding agents in vitro have an opposite effect, thus highlighting the intrinsic complexity of the intracellular milieu.
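A common way to turn amide exchange data into residue-level stabilities is sketched below. It assumes single-exponential decay of each amide signal and the EX2 limit of hydrogen exchange, in which the opening free energy is -RT ln(k_obs/k_int), with k_int the intrinsic exchange rate of the unprotected amide. Whether the cited study used exactly this treatment is not stated here, and the function names and numbers are hypothetical.

```python
import math

R_KJ = 8.314462618e-3  # molar gas constant in kJ mol^-1 K^-1


def fit_exchange_rate(times_s, intensities):
    """Observed exchange rate (s^-1) from a log-linear least-squares fit.

    Assumes intensity(t) = I0 * exp(-k_obs * t) for one amide resonance.
    """
    n = len(times_s)
    logs = [math.log(value) for value in intensities]
    t_mean = sum(times_s) / n
    log_mean = sum(logs) / n
    num = sum((t - t_mean) * (y - log_mean) for t, y in zip(times_s, logs))
    den = sum((t - t_mean) ** 2 for t in times_s)
    return -num / den


def opening_free_energy(k_obs, k_int, temp_k=298.15):
    """Free energy of opening (kJ/mol), assuming the EX2 exchange limit."""
    return -R_KJ * temp_k * math.log(k_obs / k_int)


if __name__ == "__main__":
    # Synthetic, noise-free decay for one amide (illustrative values only).
    times = [0.0, 600.0, 1200.0, 2400.0, 4800.0]
    k_true = 5.0e-4
    signal = [math.exp(-k_true * t) for t in times]
    k_obs = fit_exchange_rate(times, signal)
    k_int = 1.0  # hypothetical intrinsic rate of the unprotected amide
    print(f"k_obs = {k_obs:.2e} s^-1")
    print(f"dG_op = {opening_free_energy(k_obs, k_int):.1f} kJ/mol")
```

If the intrinsic rate is the same in both conditions, comparing the opening free energies obtained in cells and in buffer gives a per-residue estimate of how much the environment stabilizes or destabilizes the folded state.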
For some proteins, nonspecific interactions with the intracellular environment are strong enough to prevent NMR analysis even after diluting the cytoplasm upon cell lysis. Crowley et al. (43) probed such interactions by performing size-exclusion chromatography on cell lysates. They showed that in the bacterial lysate, cytochrome c (13 kDa) eluted with an apparent molecular mass of >150 MDa, due to interactions with cytosolic proteins. The effect was abolished in charge-inverted mutants, or with a high concentration of NaCl in the elution buffer, suggesting interactions of electrostatic nature. These charge interactions were also analyzed by observing a synthetic construct (ΔTat-GB1) in a bacterial lysate by size-exclusion chromatography and NMR (44).
In eukaryotes, post-translational modifications can alter the surface properties of a protein. In X. laevis oocytes and extracts, Luh et al. (45) showed that a peptidyl-prolyl isomerase (Pin1) interacts nonspecifically with the environment through the N-terminal Trp-Trp-binding module (WW) domain. Interestingly, upon substrate recognition, the nonspecific interactions between the WW domain and the environment are lost, and both specific and nonspecific interactions are abrogated when Pin1 contains a mutation that mimics phosphorylation of the WW domain (Fig. 2b).
Together, these works highlight the diversity of the possible protein-environment interactions and hint at the existence of a hidden layer of weak functional interactions between intracellular proteins, referred to as quinary interactions, which were predicted long ago but have since remained mostly uncharacterized (46).

Protein Interactions, Maturation, and Compartmentalization
The results discussed above demonstrate that the cellular environment can have an important role in fine-tuning the structural and dynamic properties of biological macromolecules. Therefore, in-cell NMR also stands as an ideal approach to determine how the functional properties of biomolecules, which in turn depend on their structure and dynamics, are affected by the environment. Of particular interest are the mechanisms that regulate the fate of proteins within the cells: from the initial folding, to co-factor binding or post-translational modifications, translocation in the relevant cellular compartment, and eventually, degradation. All these processes are strictly dependent on interactions with specific partners, which in turn are finely regulated inside the cell (e.g. they depend on partner and co-factor availability, properties of the environment, cellular localization).
By applying the STINT-NMR method, Burz and Shekhtman (8) showed in E. coli cells how a ubiquitin-partner interaction changed upon phosphorylation of the partners. The authors monitored the interaction of ubiquitin (Ubq) with two components of the receptor tyrosine kinase endocytic sorting machinery (STAM2 and Hrs) separately and together (STAM2-Hrs heterodimer). In a separate set of experiments, tyrosine-kinase Fyn was also expressed, and the two interacting proteins were phosphorylated in-cell. Isotopically labeled Ubq was then expressed, and then monitored by in-cell NMR. Phosphorylation was found to weaken the interaction between Ubq and the interacting partners, whereas the mode of interaction between Ubq and either of its partners did not change between the binary and the ternary complexes. An advantage of this approach is that interactions involving proteins with a short lifetime, which are not easily purified, can be studied. The same approach was then applied to investigate how the mycobacterial proteasome regulates the degradation of the prokaryotic ubiquitin-like protein (Pup) (47). Pup-GGQ (precursor of the active Pup-GGE form) is a small and unstructured protein that exerts in prokaryotes a similar function to that of eukaryotic ubiquitin, by targeting proteins to the proteasomal machinery. In that work, Pup-GGQ was shown to interact weakly with the proteasomal ATPase (Mpa) through residues in its N and C terminus (Fig. 3a). Conversely, in the presence of the full proteasomal particle (Mpa-proteasome core particle complex), Pup-GGQ was extensively bound to Mpa through most of its residues, suggesting that it binds to both the mouth and the central cavity of Mpa hexamer.
Protein-drug interactions can also be monitored by in-cell NMR, as shown both in E. coli (9) and in human cells (22), by observing binding of immunosuppressant drugs to the FK506-binding protein. More recently, the interaction between cisplatin and the human copper chaperone Atox1 was characterized by NMR in bacterial cells by Arnesano et al. (48). Upon treatment of cells with cisplatin, the formation of a previously uncharacterized 1:1 {Pt(NH3)2}-Atox1 complex was observed first, followed by loss of the ammines and formation of a platinum-bridged Atox1 dimer. Thus, in-cell NMR shows great potential for drug design applications, as the effects of the cell environment on drug import, localization, and binding selectivity can be investigated.
Functional interactions regulating eukaryotic proteins should ideally be studied in the correct cellular environment. For example, protein phosphorylation has a fundamental role in regulating protein activity and transducing information throughout the eukaryotic cell. Selenko et al. (49) investigated a sequence of phosphorylation events occurring to a substrate of CK2 by monitoring real-time phosphorylation events in vitro, in X. laevis egg extracts, and in intact live oocytes. They analyzed the regulatory region of the viral SV40 large T antigen, focusing on adjacent CK2 phosphorylation sites, and found that a stepwise sequence of phosphorylation events occurred, which required the substrate to detach from CK2 in the intermediate step. After in vitro characterization, the authors could observe the same mechanism in both X. laevis egg extracts and live oocytes, by the action of endogenous CK2 (Fig. 3b). The same approach was used by Amata et al. (50) to study the phosphorylation of a disordered domain of c-Src in oocytes and egg extracts, where differential phosphorylation patterns pointed to indirect effects of the kinase-phosphatase networks.
Many eukaryotic proteins require a series of maturation events to take place to reach their final, functional form. To investigate such processes by in-cell NMR, the protein should ideally be synthesized within the eukaryotic cell. Together with members of our research group, we developed and applied a method to observe, by NMR, maturation of proteins directly expressed in cultured human cells. We first applied this method to the maturation events of the human Cu,Zn-SOD1 metalloprotein, a conserved enzyme mainly localized in the cytoplasm, which protects the cell from oxidative damage (26,51). We characterized by in-cell NMR the conformation of the intermediate maturation states of SOD1 by varying the abundance of the metal co-factors in the cell culture. In human cells, by enhancing the expression of the specific metallochaperone (CCS), we observed the CCS-dependent copper transfer and the formation of an internal disulfide bond, and revealed that CCS could promote disulfide formation in a previously unknown copper-independent mechanism (Fig. 3c). Mutations in the human SOD1 gene are related to the familial variant of amyotrophic lateral sclerosis (fALS), a fatal neurodegenerative disease. We recently analyzed the impaired folding mechanism of a set of fALS-linked SOD1 mutants in human cells by NMR (52). Although the mutations studied do not alter the structural properties of the metal-binding sites of SOD1, some of the mutants could not bind intracellular zinc, unlike the wild type protein, and were shown to irreversibly accumulate in an unstructured apo conformation, which was previously uncharacterized at residue level. Co-expression of CCS in the presence of copper rescued the maturation process of the mutant proteins, allowing them to reach the mature, folded form (Fig. 4).
Eukaryotic cells heavily rely on compartmentalization to separate different cellular processes. The vast majority of proteins are encoded by nuclear DNA and are synthesized in the cytoplasm. These proteins are then actively sorted and targeted toward the relevant organelles. In-cell NMR can be applied to understand how protein translocation mechanisms are regulated and how different subcellular environments affect protein function. Bertrand et al. (30) showed in yeast cells how the dynamic properties of ubiquitin change upon localization within different cellular compartments. By changing the growth medium composition, overexpressed ubiquitin could be either localized in the cytosol or targeted to protein storage bodies, where it had slower rotational diffusion. Mitochondrial protein import is another example of how the environment regulates protein function in each compartment. Our group showed by in-cell NMR that the import of Mia40, an oxidoreductase responsible for the oxidative folding of proteins in the mitochondrial intermembrane space, is regulated in the cytoplasm by a redox-controlled folding (29). Mia40 expressed in the cytoplasm of human cells adopts, through the formation of two structural disulfide bonds, a folded conformation that cannot be imported into mitochondria. Co-expression of one of the two main thiol-disulfide-regulating proteins of the cytoplasm (glutaredoxin 1 or thioredoxin 1) caused Mia40 to remain unfolded, in an import-competent state. Finally, we recently showed that NMR can be performed on the intact mitochondria extracted from human cells (53). In-mitochondria NMR can provide valuable structural information on mitochondrial proteins, and the approach can in principle be extended to other organelles.
Figure caption (fragment): In the absence of supplemented copper, partial disulfide formation occurs (left); when copper is supplemented, Cu-CCS-dependent disulfide formation is fully observed. Reprinted by permission from Macmillan Publishers Ltd., Nat. Chem. Biol. (26). Copyright (2013).

Initial Developments in Solid-state In-cell NMR
An intrinsic limitation of solution NMR is the need for fast rotational diffusion of the observed molecules. Although this already poses an upper limit to the molecular size, even small proteins, as discussed above, may interact with the intracellular environment up to a point where they cannot be detected by solution in-cell NMR. Additionally, proteins associated with membranes, DNA, or the cytoskeleton are not suitable targets for in-cell solution NMR. Magic angle spinning solid-state NMR (MAS NMR) can overcome such limitations. To circumvent the molecular tumbling rate limit, the sample needs to be spun at high speed within the magnetic field. Fast spinning can cause high mechanical stress to cellular samples and is therefore a major obstacle for proper in-(living) cell NMR experiments. Bacterial cells offer moderate resistance to mechanical forces and can tolerate the experimental conditions required by MAS NMR. Renault et al. (54,55) applied MAS NMR to study the conformation of a small trans-membrane protein (the bacterial outer membrane protein OmpA) inside intact E. coli cells and isolated membranes. By freezing bacterial cells, Reckel et al. (56) showed that signals from soluble proteins, which form high molecular weight transient complexes with other cellular components, can be detected by MAS NMR. In both cases, complex isotopic labeling strategies and tailored SS-NMR experiments had to be devised to improve spectral quality and allow data interpretation. For the application of MAS NMR to mammalian cell samples, complex isotopic labeling and improved resistance to mechanical stress represent future challenges.

Future Directions
The ability to investigate macromolecules within living cells at an atomic level is an important asset for structural biology. In-cell NMR is a relatively young approach, and many of the methods reviewed here have not yet been applied to scenarios other than those described in the original articles. In other instances, the methods have already found applications, and novel biological findings have been reported. In fact, in-cell NMR has largely proven to be up to the task by providing information on protein structure, dynamics, interaction, and ultimately, function in multiple cellular environments. We believe that the applications described in this review show the potential of the approach. E. coli cells provide a good model environment for the cytoplasm, and bacterial in-cell NMR is relatively cost-effective and easy to implement. However, future developments of the technique should, and will, be directed toward improved eukaryotic model systems. In particular, the availability of a general approach to allow in-cell NMR in many types of mammalian cell lines would push the boundaries of the technique, as it would allow unique atomic-level studies of disease-related protein alterations, such as aggregation phenomena, as well as other types of cellular stress, in cells with the meaningful phenotype. Development of solid-state NMR in eukaryotic cells would greatly increase the possible applications by overcoming the tumbling rate problem and by extending the method to proteins stably associated with cellular structures. As a closing remark, we believe that in-cell NMR should be combined with other cellular techniques (e.g. super-resolution microscopy and electron cryo-microscopy) to develop new, integrated approaches for structural cellular biology.
As an example, in a recent work, we have combined in-cell NMR with synchrotron radiation x-ray fluorescence microscopy and optical microscopy on immobilized cell samples to correlate the metallation state(s) of human SOD1 (monitored by in-cell NMR) with its expression levels and metal distribution within the cells. We observed that the increase in total cellular zinc was correlated with the SOD1 metal binding event, thus showing that the approach can potentially be applied to investigate protein-metal interactions at the subcellular level (57).