MD simulations reveal the basis for dynamic assembly of Hfq–RNA complexes

The conserved protein Hfq is a key factor in the RNA-mediated control of gene expression in most known bacteria. The transient intermediates Hfq forms with RNA support intricate and robust regulatory networks. In Pseudomonas, Hfq recognizes repeats of adenine–purine–any nucleotide (ARN) in target mRNAs via its distal binding side, and together with the catabolite repression control (Crc) protein, assembles into a translation–repression complex. Earlier experiments yielded static, ensemble-averaged structures of the complex, but details of its interface dynamics and assembly pathway remained elusive. Using explicit solvent atomistic molecular dynamics simulations, we modeled the extensive dynamics of the Hfq–RNA interface and found implications for the assembly of the complex. We predict that syn/anti flips of the adenine nucleotides in each ARN repeat contribute to a dynamic recognition mechanism between the Hfq distal side and mRNA targets. We identify a previously unknown binding pocket that can accept any nucleotide and propose that it may serve as a ‘status quo’ staging point, providing nonspecific binding affinity, until Crc engages the Hfq–RNA binary complex. The dynamical components of the Hfq–RNA recognition can speed up screening of the pool of the surrounding RNAs, participate in rapid accommodation of the RNA on the protein surface, and facilitate competition among different RNAs. The register of Crc in the ternary assembly could be defined by the recognition of a guanine-specific base–phosphate interaction between the first and last ARN repeats of the bound RNA. This dynamic substrate recognition provides structural rationale for the stepwise assembly of multicomponent ribonucleoprotein complexes nucleated by Hfq–RNA binding.

Hfq is a conserved RNA-binding protein and a pleiotropic regulator of translation and RNA stability in diverse bacteria. Some of its best-studied roles are to suppress translation of target mRNAs by annealing them with small regulatory noncoding RNA molecules (sRNAs) (1–4) or by directly binding an A-rich sequence in the 5′-untranslated region of mRNAs (5,6). Six Hfq protomers assemble to form a hexameric, ring-like chaperone (Fig. 1A). The hexamer can bind RNAs via three surfaces (7,8), commonly termed the proximal, distal, and rim faces or sides (9,10). Furthermore, the intrinsically disordered C-terminal regions can also interact with RNA and autoregulate the activity of Hfq (11,12). The homo-oligomeric nature of Hfq favors recognition of nucleotide repeats in target RNAs, such as the ARN-triplet repeat motif (where A is an adenine and R and N are a purine and any nucleotide, respectively), which binds on the distal side (Fig. 1B) (13).
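The ARN-triplet definition above (A, then a purine, then any nucleotide) maps directly onto a regular expression. The following is a minimal sketch for locating runs of consecutive ARN triplets in a sequence; the function name, the `min_repeats` parameter, and the example sequence are hypothetical illustrations and are not taken from this study.

```python
import re

# A = adenine, R = purine (A or G), N = any nucleotide.
ARN_TRIPLET = "A[AG][ACGU]"


def find_arn_tracts(rna, min_repeats=2):
    """Return (start, end, n_repeats) for each run of consecutive ARN triplets.

    Positions are 0-based and end-exclusive. The scan is greedy and runs
    from left to right, so each run is reported in the first reading frame
    in which it matches.
    """
    pattern = re.compile(f"(?:{ARN_TRIPLET}){{{min_repeats},}}")
    sequence = rna.upper().replace("T", "U")  # accept DNA-style input
    tracts = []
    for match in pattern.finditer(sequence):
        n_repeats = (match.end() - match.start()) // 3
        tracts.append((match.start(), match.end(), n_repeats))
    return tracts


# Hypothetical 18-mer with six ARN triplets, flanked by non-matching bases.
example = "CC" + "AAAAGCAAUAACAAGAGG" + "CU"
print(find_arn_tracts(example))
```

Because the scan is greedy from the left, a tract that is longer in a shifted reading frame may be reported in a shorter form; a frame-aware search would be needed for genome-scale screening.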
In the Gram-negative bacterium Pseudomonas aeruginosa, Hfq was identified as a versatile contributor to metabolic regulation (14) whose influences on different pathways are facilitated via interactions with other proteins (15). One such partner is the catabolite repression control (Crc) protein (16). In Pseudomonas, Crc is responsible for directing the metabolic pathways toward preferring succinate over other potential carbon sources (17). The mechanism of Hfq and Crc cooperation involves binding of both proteins to ARN repeats in the 5′-untranslated region of target mRNAs, thus repressing expression of enzymes involved in alternative metabolic pathways (14). When succinate is depleted, the sRNA CrcZ is expressed (18) and proceeds to sequester Hfq and Crc from their mRNA targets, allowing the mRNAs of alternative metabolic genes, such as amiE, to be translated (6).
Crc has no intrinsic RNA-binding or Hfq-binding capabilities of its own (16), yet it can bind to Hfq-RNA complexes and strengthen Hfq interactions with target RNAs (14). The structural basis of this cooperative action has been unraveled by cryo-EM, which showed Hfq, Crc, and a short segment of amiE mRNA forming a quaternary complex (19). The structure of this complex encompasses two Hfq hexamers, each complexed with an RNA octadecamer containing six ARN repeats, the amiE 6ARN fragment. At a minimum, there is a homodimer of two Crc proteins positioned between the two Hfq-RNA complexes. We henceforth refer to this structure as the quaternary complex (Fig. 1C). Depending on availability, up to two additional Crc proteins can be recruited into the quaternary complex (19).
The cryo-EM structures of the quaternary complex reveal a recognition motif seen in earlier X-ray structures of Hfq complexed with polyadenine RNAs (8,20). In the crystal structures, the A and R nucleotides of ARN repeats are specifically bound by the Hfq, whereas the N nucleotides are bulged away from Hfq and interact with the neighboring crystallographic cells. The same RNA recognition pattern is present in the quaternary complex except that the N nucleotides are instead engaged in nonspecific interactions with the Crc proteins (19).
In this study, we used atomistic molecular dynamics (MD) simulations to explore the conformational variation of the Hfq-RNA binary complex and the higher order quaternary complexes formed with Crc. Our MD simulations utilize a set of carefully calibrated molecular mechanics models (21,22) that have been applied with predictive power in many studies of protein-RNA complexes (21,23–28). MD allows the study of atomic movements at spatiotemporal resolutions inaccessible to any currently available experimental method and can help rationalize experimental observations (21). Although simulation timescales are generally short, well-executed simulations can provide insights into biomolecular dynamics that are not apparent from static models obtained by structural experiments, for which the data are typically time- and ensemble-averaged (23,29). Biomolecular dynamics can be invaluable for understanding the nature of complex intermolecular interfaces, such as those of protein-RNA complexes, where it underpins binding affinities, specificities, and formation rates. The interface dynamics between biomolecules can involve competing local conformational substates rather than a fixed geometry (30), resulting in dynamic recognition. The substates associated with the dynamical ensemble may be important for the detailed mechanisms of the process of binding and unbinding (31). Dynamic recognition can be biologically significant as it could facilitate highly specific recognition of RNAs by a protein and would provide a mechanism by which a large pool of cellular RNAs can be interrogated with speed, specificity, and high affinity for target sequences. Different interaction intermediates can be preferred by different binding partners in quaternary complexes.
We propose that Hfq must utilize a form of dynamic recognition because its in vivo RNA cycling at both proximal and distal sides was shown to be disproportionally fast relative to its low-nanomolar RNA-binding affinity, as measured by in vitro experiments (32,33).
Our results suggest that extensive equilibrium local dynamics indeed occur at the distal side interface of the Hfq-RNA complex and offer a concrete example of a conformational switch that can influence rates of translocation along a length of RNA. Namely, the first nucleotide in each ARN repeat (i.e., the adenines) can undergo frequent syn/anti flips, before Crc binding. The anti and syn conformations are supported by Hfq via unique adenine-specific interactions in both positions. The frequency of the flips is lowered upon formation of the quaternary complex, that is, after Crc is bound, and the syn conformation becomes less favored. We also identify a previously unknown binding pocket at the distal side of Hfq which can weakly bind the N nucleotides of the ARN repeats in the absence of Crc. Finally, we suggest a potential assembly pathway in which the Crc initially recognizes an intramolecular RNA interaction in the amiE 6ARN.
Figure 1. Hfq architecture and assembly into higher order ribonucleoprotein particles. A, the structure of the Pseudomonas aeruginosa Hfq hexamer with differently colored monomeric units and its RNA-binding distal side highlighted in pink. B, a tilted view into the distal side with the amiE 6ARN mRNA engaged. The A, R, and N nucleotides are colored in yellow, red, and orange, respectively. The RNA sequence is specified. The Hfq protein is colored in dark gray, and the RNA backbone is in pink. C, the quaternary complex formed by Hfq, Crc, and amiE 6ARN. From left to right, the quaternary complex can contain two, three, or four Crc units (light gray), respectively. Crc, catabolite repression control.

Design, stability, and reproducibility of the MD simulations
The simulations of Hfq-RNA and quaternary complexes showed no loss of structural compactness and integrity, which indicates good performance of the force field (21) and sufficient quality of the experimental structures (8,19) used as the starting states for the simulations. For such large systems, we did not expect to achieve full thermodynamic convergence within affordable computational time (34). In fact, such convergence is not fully achieved even in longer MD simulations of much smaller systems such as RNA tetraloops and tetranucleotides (35). Nevertheless, even without achieving full quantitative convergence, the MD simulations can provide a wealth of information about the RNA systems (21) that is inaccessible to experimental methods. The simulations presented here can be considered qualitatively converged in the sense that the same simulation trends were observed in multiple independent parallel trajectories of the individual systems (Table 1). In addition, we extended selected simulations up to 5, 10, or 15 μs (Table 1), observing the same trends even on these longer timescales. The analyses presented in the main text are based mainly on these extended trajectories. For the remaining systems, the analyses were performed on a combined simulation ensemble and are described in Supporting information.
The bound RNA in the quaternary complex consists of six ARN trinucleotide repeats while the Hfq itself is a hexamer (Fig. 1B). All ARN motif nucleotides in the positions "A" (includes A 1, A 4, A 7, A 10, A 13, and A 16) and "R" (includes A 2, A 5, A 8, A 11, A 14, and G 17), respectively, are bound in identically organized binding pockets, and their positions are clearly defined by H-bonds, base stacking, and van der Waals (vdW) interactions (Fig. 2A). The pockets exhibited highly similar behavior in simulations, and their protein-RNA interactions were all maintained with only reversible fluctuations (Tables S1 and S2). To simplify descriptions, we will collectively refer to the first and second nucleotides of the repeats as either A A or A R /G R, with the lower index signifying the nucleotide's position within the ARN repeat. For example, the notation Q33(N)-A A (N7) describes the backbone amide nitrogens of residue Q33 in the individual Hfq chains forming H-bonds with N7 of RNA nucleotides A 1, A 4, A 7, A 10, A 13, or A 16. Wherever the behavior significantly differed among the repeats, we refer directly to the specific nucleotides by their residue numbering (Fig. 1). To help distinguish Crc amino acids, they are labeled with an accent (e.g., R162′).
Footnotes to Table 1.
a PDB ID of the experimental structure which was utilized as the initial structure. In some simulations, only selected parts of the experimental structure were used (see Selection of initial structures and Fig. S1).
b All bases of the "A" nucleotides within the ARN repeats were modified to be in syn conformation before the simulation start.
c Crc1 and Crc2 refer to the Crc proteins that bind near and away from G 18, respectively (Fig. S1).
d Simulations were done without the HBfix (see System building and simulation protocol and Supporting information).
e In "G18C", "circ", "ext", or "I30A" simulations, the G 18 nucleotide of the RNA was replaced with C 18, the RNA chain was circularized by covalently connecting the 5′- and 3′-nucleotides via a newly modeled phosphate, the RNA was extended from its 3′-end by adding nucleotides U 19 and G 20, or I30 was mutated into alanine, respectively.
f All the "R" nucleotides within the ARN repeats were modified to be guanosines.
g A circular polyadenine RNA octadecamer was bound to Hfq.
h HBfix with 2 kcal/mol penalty was utilized (see System building and simulation protocol and Supporting information).
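The positional notation used for the ARN repeats follows from simple modular arithmetic on the nucleotide number. The helper below is a minimal illustrative sketch; the function names are hypothetical and the default length of 18 corresponds to the six-repeat octadecamer described in the text.

```python
def arn_position(nucleotide_number):
    """Map a 1-based nucleotide number to (repeat index, pocket label).

    The label is 'A', 'R', or 'N' according to the position of the
    nucleotide within its ARN triplet; repeat indices start at 1.
    """
    if nucleotide_number < 1:
        raise ValueError("nucleotide numbering starts at 1")
    repeat = (nucleotide_number - 1) // 3 + 1
    label = "ARN"[(nucleotide_number - 1) % 3]
    return repeat, label


def group_by_pocket(length=18):
    """Group nucleotide numbers 1..length by their position in the triplet."""
    groups = {"A": [], "R": [], "N": []}
    for number in range(1, length + 1):
        groups[arn_position(number)[1]].append(number)
    return groups


print(group_by_pocket())
```

For an 18-mer, the "A" and "R" groups reproduce the nucleotide numbers listed in the text, and nucleotide 18 is assigned to the N position of the sixth repeat.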

Simulations of isolated Hfq-RNA reveal syn/anti flips of the A A nucleotides
The most significant conformational change observed in the simulations of Hfq-RNA complexes, in the absence of Crc, was the dynamic equilibrium between anti and syn conformations of the A A nucleobases. Most strikingly, the anti-A A and syn-A A nucleotides established interactions with the same amino acids. Structurally, this was possible by the N7 atom replacing the N1 atom as the H-bond acceptor and vice versa, whereas the second hydrogen of the N6 amino group was utilized as an H-bond donor (Fig. 2). There is also a single water bridge between A A and A R, which is formed solely with A A in anti (Fig. 3B). This water bridge is present in the X-ray structure of poly-A RNA bound to Hfq (8) and was regularly observed in all our MD simulations as long as A A was in anti, while being abolished in syn. Finally, there is a close, presumably repulsive, atomic contact between the O4′ atom of the A A ribose and the backbone carbonyl of K31 associated with the anti conformation of A A (Fig. 3B). In MD simulations, the K31(O)-A A (O4′) distance increased when A A flipped into syn, thus relieving the repulsion. The average simulation time between the flips differed significantly among the individual simulations and A A nucleotides, ranging from tens to hundreds of nanoseconds (Fig. 3; Fig. S2). The transition intermediate of these flips (Fig. 2B) was stabilized by a temporary formation of an intranucleotide H-bond, and the transition time of the flips was in the range of tens of picoseconds. In each simulation, several back-and-forth syn/anti flips were observed for every A A nucleotide with few exceptions (Table 2; Table S3). We note the relatively large variability of syn/anti populations among individual A A nucleotides. After comparing multiple simulation trajectories (Table S3), we conclude that it reflects randomness of simulation sampling and the fact that the syn/anti dynamics are not synchronized across the six ARN repeats.
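The syn/anti bookkeeping behind such flip counts can be illustrated with a short script. The sketch below uses the criteria stated in the footnotes to Table 2 (syn for χ between −30° and 150°, anti for χ between 150° and 330°, and visits shorter than 300 ps disregarded). It is a minimal illustration under those assumptions, not the analysis code used in this study; the function names, the frame spacing `dt_ps`, and the choice to absorb short visits into the preceding state are all hypothetical.

```python
from itertools import groupby


def classify_chi(chi_deg):
    """Return 'syn' for chi in [-30, 150) degrees, otherwise 'anti'."""
    wrapped = (chi_deg + 30.0) % 360.0  # shift so the syn window starts at 0
    return "syn" if wrapped < 180.0 else "anti"


def count_flips(chi_series, dt_ps, min_dwell_ps=300.0):
    """Count syn/anti transitions in a chi time series.

    chi_series: chi dihedral angles in degrees, one per saved frame.
    dt_ps: time between frames in picoseconds (assumed constant).
    Visits shorter than min_dwell_ps are absorbed into the preceding state.
    Returns (number of transitions, raw fraction of frames in syn).
    """
    states = [classify_chi(chi) for chi in chi_series]
    if not states:
        raise ValueError("empty chi series")
    # Run-length encode the frame-by-frame state assignment.
    runs = [(state, len(list(frames))) for state, frames in groupby(states)]
    accepted = []  # list of [state, n_frames] after dwell-time filtering
    for state, n_frames in runs:
        if accepted and accepted[-1][0] == state:
            accepted[-1][1] += n_frames  # rejoin after an absorbed visit
        elif accepted and n_frames * dt_ps < min_dwell_ps:
            accepted[-1][1] += n_frames  # visit too short, disregard it
        else:
            accepted.append([state, n_frames])
    syn_fraction = states.count("syn") / len(states)
    return len(accepted) - 1, syn_fraction


# Synthetic example with 100 ps frame spacing (hypothetical values).
chi = [200.0] * 5 + [60.0] * 1 + [200.0] * 5 + [60.0] * 4 + [200.0] * 3
print(count_flips(chi, dt_ps=100.0))
```

In this synthetic series the single-frame visit to syn (100 ps) is disregarded, whereas the 400 ps visit counts as one flip into syn and one flip back into anti.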

Crc attenuates syn/anti flips of the A A nucleotides
The syn/anti flips were strongly reduced when Crc was bound to the Hfq-RNA intermediate (Table 2 and Fig. 3) for the unmodified experimental structure (19), in which all A A nucleotides are in anti. For nucleotides A 1 and A 13, this can be explained by the Crc proteins forming vdW contacts and H-bonds that sterically block the flips and stabilize the anti conformation, respectively (Fig. 4A) (19). For the rest of the A A positions, Crc forms extensive nonspecific contacts with the phosphate groups immediately downstream (Fig. S3). The syn conformation of A A, but not the anti conformation, is associated with dihedral angle transitions of this backbone suite (36) (Fig. S4). This indirectly promotes the anti conformation as the nonspecific contacts with Crc restrict the available conformational space for such dihedral angle transitions. The overall atomic fluctuations of this backbone suite are also generally lower with Crc (Fig. S5). For both syn and anti conformations, the K31(O)-A A (O4′) distance was shorter in the presence of Crc than in isolated Hfq-RNA (Fig. 4C), that is, Crc pushes the A A nucleotide deeper into the A pocket. This could affect the syn/anti balance as a tighter geometry is associated with the anti conformation. Flipping of the A A base into syn to relieve the K31(O)-A A (O4′) repulsion then becomes less favorable when Crc is bound. On the other hand, the penalty for the loss of the water bridge (Fig. 4B) upon flipping into syn remains the same in both systems. In conclusion, rather than simply blocking the flips, Crc could also promote the anti conformation by subtly shifting the free-energy balance among multiple interactions within and around the A A binding pocket.

Prior flipping of A A nucleotides into syn alters protein-RNA interface with Crc
The experimental structures used in this study have all the A A nucleotides in the anti conformation. In a quaternary complex where we flipped all the A A bases into syn before the simulation start (2Hfq_4Crc_2RNA_A A -syn simulations; see Table 1), we observed some A A bases returning to anti on the simulation timescale (Table 2 and Fig. 3). This strongly contrasts with the simulations where the A A bases were in anti from the start (see "Crc attenuates syn/anti flips of the A A nucleotides") and subsequently showed no signs of flipping into syn. However, the A A bases in the 2Hfq_4Crc_2RNA_A A -syn simulations that flipped into anti would often flip into syn again in time, seemingly contradicting the idea that Crc promotes anti or suppresses the flips. For many of the more short-lived anti states (Fig. 3), this was because not all of the interactions with Hfq (Fig. 2) had properly formed after A A flipped into anti. This never occurred in simulations without Crc and suggests that Hfq's ability to seamlessly accommodate the spontaneous flips (Fig. 2) is limited by Crc. We also observed lower stability and alterations of the Crc-RNA interactions in the 2Hfq_4Crc_2RNA_A A -syn simulations (Table S4). Even for repeats where these interactions were not lost, the flips back into syn were often preceded by temporary disruption of the local Crc-RNA interactions (Fig. S6). This suggests that a priori flipping of A A into syn perturbs the interface with Crc and that this disturbance is not fully relaxed in the course of our simulations. This allows subsequent syn/anti flips in both directions similar to those observed in simulations without Crc. We also suspect there could be some degree of synchronization between the flips of individual A A nucleotides originating from general destabilization of the Crc-RNA interface, although this could not be decisively established from our simulations.

Dynamics of the A A /G R binding pocket differs compared with the A A /A R consensus
The second nucleotide of the ARN repeats can be either adenosine or guanosine, collectively referred to as A R or G R . The G R nucleotides form interactions with the same Hfq residues as the A R nucleotides (Fig. 2), both in the experimental structure and in simulations. However, the simulations reveal a striking difference in dynamics of the binding pocket and the associated syn/anti flips of the preceding A A nucleotide when G R is present instead of the A R . Namely, in simulations of the Hfq-RNA complex where we replaced all the A R nucleotides with G R (Table 1), the syn population of the A A nucleotides was significantly increased (Table 2 and Fig. 3). We suggest the reason for this is formation of an interaction with the Q52 side chain that can simultaneously interact with A A (N6), A A (N7), and G R (O6) atoms only when A A is in syn (Fig. 5).

The simulations predict a third binding pocket for the N nucleotides on the distal side of Hfq
In the cryo-EM structures of the quaternary complex (19), the N nucleotides of all ARN repeats, except G 18, have their bases turned away from the distal side of the Hfq to interact with Crc. The N nucleotides are positioned similarly in the X-ray structure of isolated Hfq bound to poly-A RNA, where they participate in crystal packing (8). In contrast, all N nucleotides bend toward Hfq in our simulations of isolated Hfq-RNA complexes. The nucleotides formed vdW interactions with I30, as well as H-bonding or ion-bridge interactions with the N28(O) atoms of the individual Hfq subunits (Fig. 6 and Table S5). This consistently occurred in all simulations, either with the amiE 6ARN mRNA or poly-A RNA sequence bound (Table 1), and was universally observed for all N nucleotides except G 18. The simulations thus predict a third binding pocket at the distal side of the Hfq for nondiscriminatory binding of N nucleotides of the ARN repeats (Fig. 6). We henceforth refer to it as the N pocket, in analogy to the previously described A pockets and R pockets, which bind A and R nucleotides, respectively (8). The existence of N pocket binding would be in agreement with the previous report that Hfq mutants lacking I30 have a reduced affinity for RNA sequences that are bound to the distal side (9). Indeed, in simulations of a system where we replaced I30 in every Hfq chain with alanine, the N-pocket binding was either reduced or abolished (Table S6).
In the cryo-EM structures of quaternary complexes (19), as well as in our simulations, the N nucleotides form nonspecific interactions with the Crc partners. These interactions are described in detail in the Supporting information, except for the 3′-terminal G 18, which is described in the section "The G 18 -A 3 4BPh base-phosphate interaction is stabilized by Crc". In summary, the novel, putative N pockets could serve as nondiscriminatory, transient 'status quo' binding pockets for the N nucleotides until a partner molecule, such as Crc, engages the Hfq-RNA intermediate.
Footnotes to Table 2.
The two lines in the "2RNA" simulations each describe one of the two RNA molecules contained in these systems. Crc proteins sterically obstruct flips of A 1 in 2Crc and of A 1 and A 13 in 4Crc systems, respectively.
b Number of syn/anti transitions (in any direction) and the populations of the two states with one and zero corresponding to all-syn and all-anti, respectively. The A A nucleotide was considered to be in syn and anti when its χ dihedral angle was −30° to 150° and 150° to 330°, respectively. We disregarded transitions lasting less than 300 ps.
c Simulation time in which the first syn/anti transition (in either direction) occurred. The "-" symbol indicates that no transition occurred.
d Average simulation time that the A A nucleotide remained in syn and anti, respectively, before flipping. The lifetimes are not stated when there was only a single or no transition observed.

The G 18 -A 3 4BPh base-phosphate interaction is stabilized by Crc
The 3′-terminal G 18 is different from the other N nucleotides in the quaternary complex. First, the G 18 is not flipped away from the distal side of Hfq in the cryo-EM structure (19). Instead, it forms a vdW interaction with the I30 side chain in a manner similar to the N-pocket binding predicted for the other N nucleotides in isolated Hfq-RNA systems (Fig. 6). Second, the G 18 forms a guanine-specific type-4 base-phosphate (4BPh) interaction (37) with the phosphate of A 3. Third, in the experimental structure, the G 18 is close enough to potentially interact with Crc, namely with the K135′, R138′, and K139′ side chains. These interactions were subsequently formed in MD simulations (Fig. 7 and Table S7). Notably, the K139′ interaction was sequence-specific and seemed to stabilize the intramolecular 4BPh interaction by compensating for the repulsion between the guanine's O6 atom and the A 2 phosphate (Fig. 7). This is supported by simulations of the isolated Hfq-RNA system (Table 1), in which the G 18 -A 3 4BPh interaction was visibly fluctuating and then lost early in all simulations, along with the G 18 base vdW interaction toward I30 (Table S7). For more details, see Supporting information.

amiE 6ARN G 18 as a putative anchor point for Crc binding to the Hfq-RNA complex
Crc does not dimerize spontaneously (19), which is in agreement with our MD simulations (Supporting information). Yet, a minimum of two Crc proteins is required for full assembly of a quaternary complex with Hfq-RNA (Fig. 1) (19). Therefore, a transiently formed structure with one Crc protein bound to a single Hfq-RNA intermediate could potentially exist at early stages of the quaternary complex formation. We thus simulated two Hfq-RNA-Crc systems that contained either the Crc1 (bound near the G 18) or the Crc2 protein (Table 1; Fig. S1). We observed reduced fluctuations in relation to the Hfq-RNA for Crc1 compared with Crc2 (Fig. S7). Interestingly, binding of Crc1 alone was sufficient to stabilize the 4BPh base-phosphate interaction and to establish the base-specific interaction with the K139′ formed by the G 18 nucleobase (Fig. 7 and Table S7).
Next, we explored the dependency of Crc binding on the sequence identity and position of the 3′-terminal nucleotide within the RNA chain. G 18 is the only N nucleotide recognized specifically by Crc (Fig. 7) and could potentially act as a register-defining marker for its binding. To explore this, we first prepared a system with circularized amiE 6ARN mRNA (Table 1). Building the covalent bond between the two RNA termini necessarily involved disruption of the 4BPh interaction as the G 18 had to be shifted to make the bond. The G 18 4BPh interaction was never reformed in subsequent simulations, and no specific interactions between Crc and the G 18 occurred. The K139′ side chain formed nonspecific interactions with the newly modeled phosphate (Fig. S8). Therefore, the discontinuity in the bound RNA may be essential for proper assembly of the quaternary complex. Next, we prepared systems where we replaced G 18 with C 18 (Table 1). There, the C 18 neither formed base-phosphate interactions nor made any other specific interactions with Crc. It was consequently unstable in its initial position and interacted with the solvent or flipped over to stack with G 15. Finally, we tested whether the 4BPh interaction could be stable without the G 18 being a 3′-terminal nucleotide, but still the last nucleotide bound by Hfq, as it would be in a full-length amiE mRNA. Thus, we prepared systems where we extended the amiE 6ARN mRNA by two nucleotides, introducing U 19 and G 20. The newly modeled nucleotides were positioned so as to avoid any clashes with the rest of the system. The G 18 4BPh interaction remained entirely stable in these simulations.

Discussion
Hfq is an RNA chaperone involved in numerous regulatory networks in many bacteria. Its diverse functional roles are underpinned by multiple RNA-binding surfaces, each preferring different sequences (9,10). Here, we examined the structure and dynamics of RNA binding at the distal side of Hfq, which prefers ARN repeats. In 90 μs of MD simulations, we observed significant equilibrium dynamics occurring at the A pocket of the Hfq-RNA interface, which are attenuated by Crc binding (Fig. 1). In addition, a putative new binding pocket on the Hfq distal side was discovered. Finally, a possible folding pathway of the quaternary complex involving sequence-specific recognition of a 3′-terminal bound guanosine by the Crc is proposed.

Syn/anti flipping in A pockets may contribute to Hfq's binding strategy
Our simulations revealed syn/anti transitions of the A-pocket adenosines (Fig. 2) on a submicrosecond timescale. These transitions were seen for the amiE 6ARN mRNA and for circular poly-A RNA. The adenosines interacted with the same A-pocket amino acids in both anti and syn conformations (Fig. 2). This strongly suggests that there may be a dynamic equilibrium of syn and anti A-pocket-bound adenosines existing within the solution structure of the Hfq-RNA complex. There have been previous reports of proteins utilizing syn/anti conformational differences to recognize multiple bases (38) or specifically promoting one of the conformations over the other upon binding (39). However, to our knowledge, there are no studies of protein-RNA complexes where a protein could recognize syn/anti conformations of a single base equally well via a fixed set of amino acids. Importantly, the syn/anti flips do not violate the experimentally known specificity of A pockets for adenosines (8) because adenine-specific interactions are formed in both conformations.
The two available X-ray structures of P. aeruginosa Hfq (PDB ID: 3gib, 5new) (8,40) with poly-A RNA bound at its distal side have the A-pocket-bound adenosines solely in the anti conformation. In the structure of S. aureus Hfq bound to an RNA A tract via its distal side (PDB ID: 3qsu), some of the adenines are in the syn conformation, and the study also revealed evidence of syn/anti flips (41). However, Hfq's distal sides in S. aureus and P. aeruginosa are quite different. It should be noted that the assignment of syn/anti nucleobase conformers in X-ray structures is often challenging and mistakes can occur even in high-quality structures because of phase errors, resolution limitations, and the time-averaged and ensemble-averaged nature of the data collection (42–44). The crystal lattice or the cryo temperatures could also enforce the anti conformation exclusively (45). Our visual inspection of the electron density maps (8,40) is consistent with the assigned anti conformation, although this does not exclude the possibility of phase error.
Populations of the syn/anti states predicted by MD simulations may be influenced by potential mild force field imbalances, and thus, the results may not reach quantitative accuracy. However, we suggest that the simulations unambiguously predict that the Hfq A pocket can readily host A in syn conformation. The syn conformation, even if overpopulated by MD, could represent a transient (higher energy) binding pattern (46) involved in the process of prebinding of RNA to Hfq or in substrate cycling. Such a binding pattern would be undetectable in ground-state experimental structures because of its low population. The syn conformation of A-pocket adenosines was significantly more populated when the succeeding R-pocket nucleotide was guanosine instead of adenosine (Fig. 5), suggesting a degree of structural communication between the two pockets. There might also be Hfq cofactors that profit from accessibility of the syn orientation or are able to capture the transition conformation (Fig. 2B) for binding.
Figure caption (fragment): The black and purple dashed lines indicate H-bonds and putative H-bonds, respectively. The putative H-bonds between Crc and G 18 were fully realized in all MD simulations (right). The K139′ side chain formed a base-specific interaction with the G 18 and screened repulsion between the base and the RNA backbone. Crc, catabolite repression control.

Crc prefers anti conformation of A-pocket nucleotides
Specific protein-RNA interactions between A_A nucleobases and Crc that directly promote the anti conformation are formed only for A1 and A13 (Fig. 4). Hence, at first sight, Crc should be able to tolerate syn in the other A-pocket nucleotides (i.e., A4, A7, A10, and A16). Despite this, our simulations of the quaternary complex revealed very few flips toward syn for these nucleotides when starting from the experimental structure, in which they all adopt the anti conformation (Table 2 and Fig. 3). When we manually flipped all A-pocket adenosines into syn before starting the simulations, the A_A nucleobases subsequently tended to flip back into anti (Table 2 and Fig. 3), despite all the Hfq-RNA syn interactions having been established, suggesting that the syn conformation is not supported in the presence of Crc. Furthermore, some of the Crc-RNA interactions are destabilized by the syn conformation (Table S4). Some flips back into syn were observed when the Hfq interactions did not properly form after flipping into anti or during disruptions of some local Crc-RNA interactions (Fig. S6). In other words, the simulation timescale may be insufficient to fully relax the Crc-RNA interface after the initial introduction of the syn conformation. We suggest that Crc may promote the anti conformation via nonspecific interactions it forms with the phosphates downstream of the A_A nucleotide (Fig. S3) and by sterically restricting the ability of the A pocket to accommodate the flips, which shifts the balance in favor of anti (Fig. 4). This observation illustrates the delicate balance of interactions at the Hfq-RNA interface and its potential utilization by various cofactors, which may prefer different conformations of A_A nucleobases.
We acknowledge that the proposed mechanism affecting the syn/anti balance could include additional components which were not sampled on the timescale of our simulations, such as the partial or complete unbinding of Crc from the quaternary complex.
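The syn/anti states discussed above are conventionally distinguished by the glycosidic torsion χ (O4′-C1′-N9-C4 for purines). The following is a minimal Python sketch of how such flips could be counted from a χ time series. It assumes the IUPAC-style ranges (syn for χ within 0 ± 90°, anti otherwise); the function names and the threshold are illustrative assumptions and are not taken from the analysis scripts used in this study.

```python
import numpy as np

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) defined by four Cartesian points."""
    p0, p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p0, p1, p2, p3))
    b0 = p0 - p1
    b1 = p2 - p1
    b2 = p3 - p2
    b1 = b1 / np.linalg.norm(b1)
    # Components of b0 and b2 perpendicular to the central bond
    v = b0 - np.dot(b0, b1) * b1
    w = b2 - np.dot(b2, b1) * b1
    x = np.dot(v, w)
    y = np.dot(np.cross(b1, v), w)
    return float(np.degrees(np.arctan2(y, x)))

def classify_chi(chi_deg):
    """Assumed convention: syn if chi lies within 0 +/- 90 degrees, else anti."""
    chi = (chi_deg + 180.0) % 360.0 - 180.0  # wrap into [-180, 180)
    return "syn" if abs(chi) <= 90.0 else "anti"

def count_flips(chi_series):
    """Number of syn<->anti transitions along a series of chi values."""
    states = [classify_chi(c) for c in chi_series]
    return sum(a != b for a, b in zip(states, states[1:]))
```

In practice, a hard 90° boundary can register spurious flips when χ fluctuates near the threshold, so smoothing or a dwell-time criterion may be preferable before counting.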

The dynamic recognition of RNA could be important for substrate cycling by Hfq
The avid binding affinity of Hfq for various RNA targets appears incompatible with its fast biological binding turnover and cellular response (32,33,47). The K_D values measured in vitro are in the subnanomolar range, which would indicate binding half-lives of well over 1 h. In contrast, the cellular responses facilitated by Hfq are on a timescale of 1 to 2 min, suggesting that nascent RNAs are rapidly cycled through the cellular Hfq pool. To reconcile these two observations, it has been suggested that RNA bound by Hfq can be displaced by competitors from the cellular pool in a stepwise process, which was termed active cycling (32,33). We note that the extensive dynamics observed in our simulations for RNA bound to the distal side of Hfq could provide an entry point for its displacement by a competing RNA. Importantly, displacement could occur one ARN repeat at a time or even a single nucleotide at a time, as envisaged by the active cycling model (32,33). Crc and possibly other Hfq-binding proteins could have coevolved to actively suppress the intrinsic dynamics of the Hfq-RNA interface to slow down RNA cycling. However, as our study involved only RNA bound to the distal side of Hfq, we do not claim this interaction fully accounts for the affinity-response discrepancy (32,33). Moreover, it is likely that we capture only part of the dynamics immediately pertinent to the dominantly bound state because of the 1 to 15 μs timescales of the simulations.
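The link between a subnanomolar dissociation constant and an hour-scale half-life can be illustrated with a simple two-state binding model, in which k_off = K_D × k_on and t_1/2 = ln 2 / k_off. The sketch below is only an order-of-magnitude illustration: the values K_D = 0.1 nM and k_on = 10^6 M^-1 s^-1 are assumptions chosen for the example and are not measurements reported here.

```python
import math

def binding_half_life_s(kd_molar, kon_per_molar_s):
    """Half-life (s) of a complex in a two-state model: t1/2 = ln2 / (K_D * k_on)."""
    koff = kd_molar * kon_per_molar_s  # dissociation rate constant, 1/s
    return math.log(2) / koff

# Assumed illustrative values (not from the text)
t_half = binding_half_life_s(kd_molar=1.0e-10, kon_per_molar_s=1.0e6)
print(f"{t_half:.0f} s = {t_half / 3600:.1f} h")
```

With these assumed inputs the half-life is close to 2 h, compared with the 1 to 2 min cellular response time. The estimate scales inversely with the assumed k_on, so a slower association rate would lengthen it further.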

Hfq provides a weak binding pocket for the N nucleotides
In the quaternary complex (19), the N nucleotides are flipped away from Hfq (Fig. 6) and form many nonspecific interactions with Crc (Supporting information). The exception is G18, which forms an intramolecular 4BPh interaction with the A3 phosphate (Fig. 7). The N nucleotides are also flipped away in the two X-ray structures of the isolated Hfq-RNA complexes (8,40), where they form extensive crystal packing interactions.
In our simulations of the isolated Hfq-RNA complexes, all the N nucleotides quickly flipped toward Hfq and formed vdW interactions with the I30 side chain and H-bonding or ion bridging with the protein backbone of N28 (Fig. 6). Based on our simulations, we suggest that the I30 and N28 residues might constitute a third binding pocket (N-pocket) at the distal side of Hfq. The N-pocket likely contributes less to the overall binding affinity than the known A and R pockets (8), given the smaller number of intermolecular interactions that define it. Indeed, the bound nucleotides were highly dynamic in our simulations, and we regularly observed a drift of the base along the molecular interface formed by vdW interaction with the I30 side chain. Such movement became especially pronounced during temporary disruptions of the interaction with N28. We suggest this type of dynamic binding is realistic, as it was reported to be in agreement with NMR data in other protein-RNA systems (48). In addition, we suggest that vdW interactions between nucleobase aromatic faces and other molecules are generally very well described by the utilized AMBER force fields with electrostatic potential-derived charges. This is true even for stacking interactions among nucleobases and with aromatic amino acids. Although these interactions are sometimes described as "π-π" in the literature, rigorous quantum chemical studies (49-52) showed that AMBER-type force fields provide a good description of these interactions. There are no substantial "π-π" orbital effects neglected by the force fields. Rather, these interactions primarily involve electrostatic interaction, London dispersion attraction, and short-range repulsion, all of which can be approximated by the current molecular mechanics model. This does not rule out overstabilization or understabilization of vdW interactions of nucleobases in MD simulations. However, such an imbalance would more likely stem from the description of solvation than from inaccuracy in the description of the direct (intrinsic) vdW contact (21).
The flexible behavior and weak binding explain why the N-pocket RNA recognition is supplanted by crystal packing interactions in the two X-ray structures of the isolated Hfq-RNA complexes (8,40). However, such disordered-like binding could be useful for dynamic recognition of molecules interacting with the Hfq-RNA surface and could serve as a 'status quo' state of the Hfq-RNA intermediate. For example, Crc recognizes the N nucleotides in a nonspecific manner, whereas other proteins might be able to directly read out their sequence. Owing to their dynamic binding to Hfq, the N-pocket nucleotides are readily available to rapidly establish contacts with other partners. The N-pocket binding observed in our simulations is similar to what is seen in the X-ray structure of E. coli Hfq bound to the A-rich linker from OxyS sRNA (PDB ID: 4qvc) (53). However, this structure also shows deviations from the ARN repeat consensus and has multiple repulsive and crystal packing interactions affecting its protein-RNA interface. In addition, only three nucleotides could be resolved. Therefore, we opted not to use it for MD simulations.
In summary, we suggest that the "disordered" N-pockets weakly contribute to binding affinity of RNAs to the Hfq in a sequence-independent manner. The dynamical N-nucleotides are at the same time able to quickly establish interactions with other partners, which can already be sequence dependent.

Crc might sample the ARN repeats during quaternary complex formation, searching for 3′-terminal guanosine
The cryo-EM structure of the quaternary complex (19) shows extensive but base-nonspecific interactions between the N nucleotides and Crc, suggesting that Crc may interact equally well with any of the ARN repeats. It is therefore puzzling how Crc selects its binding register during the quaternary complex formation or indeed if such selection occurs. The fact that it was possible to resolve a high-resolution structure of the quaternary complex by cryo-EM (19), which relies on averaging of many individual images of the complex, suggests that a specific binding register has been selected.
Our simulations indicated that the terminal G18 nucleotide and its 4BPh interaction toward the A3 phosphate might be the marker which, at least in the case of the amiE 6ARN mRNA, provided a specific binding site for Crc. The guanine-specific 4BPh interaction is the strongest base-phosphate interaction occurring in folded RNAs (37). The simulations additionally predict a guanine-specific protein-RNA interaction between G18 and the K139′ side chain (Fig. 7). To establish both of these interactions, the guanosine does not need to be the 3′-terminal nucleotide, but merely the last nucleotide bound by Hfq, after which the RNA chain exits the distal side of Hfq. This would be the case in, for example, a full-length amiE mRNA, or during binding of multiple shorter RNA segments rich in ARN repeats, such as the CrcZ (Fig. 8) (14). In either case, recognition of these 3′-terminal guanine-specific interactions by Crc would be enough to overcome binding register degeneracy in the quaternary complex, as the binding sites of the next Crc would then be predefined by the first. It is, however, unclear whether preventing the degenerate binding of Crc could have effects on Hfq/RNA/Crc recognition in vivo or if it is just a system-specific coincidence which may have helped the resolution of the cryo-EM structures (19). We do not expect that the free-energy gain associated with the 3′-terminal guanosine recognition is large enough to completely abolish other Crc-binding patterns. However, even small increases in affinity could play a role in balancing the overall complex kinetic and thermodynamic networks of interactions in which Hfq is involved. In addition, it is possible that binding of some other proteins could be weakened by the same marker.

Concluding remarks
In this study, we explore the dynamics of RNA recognition by a conserved and pleiotropic RNA chaperone, Hfq. We used state-of-the-art atomistic MD simulations to obtain high-resolution insight into local dynamics and substates associated with RNA binding to the Hfq distal site. We show that Hfq partly utilizes dynamic recognition of RNA substrates, a type of molecular recognition that is difficult to fully resolve in structural experiments. However, we suggest that dynamic recognition is likely an important contribution to the structural mechanism by which Hfq engages RNA targets with adequate specificity while maintaining high RNA turnover rates in the cell. Upon presentation of a target RNA to effector molecules, such as Crc, this turnover is slowed down significantly, allowing for a downstream cellular response.

Selection of initial structures
We have used cryo-EM structures of Hfq-Crc-RNA quaternary complexes with molecular composition ratios of 2:2:2, 2:3:2, and 2:4:2 (Protein Data Bank [PDB] IDs: 6o1k, 6o1l, and 6o1m) (19) as starting structures for MD simulations. Models of isolated Hfq-RNA, partially assembled quaternary complexes, and the Crc dimer and tetramer were prepared by removal of subunits from the experimental 2:2:2 structure. The simulations of the Crc dimer were also started based on X-ray structure of the isolated Crc protein (PDB ID: 4jg3) (16) with its dimer structure obtained via crystallographic symmetry. We have also used the X-ray structure of Escherichia coli Hfq bound to RNA poly-A octadecamer for simulations of the isolated Hfq-RNA system (8). Starting structures for simulations with modified, extended, or circularized RNA sequences or modified nucleobase conformations were prepared by molecular modeling of the experimental structures. A complete list of simulated systems is presented in Table 1 and visualized in Fig. S1.

System building and simulation protocol
The starting files for MD simulations were prepared in the tLeap module of AMBER 18 (54). We have used the bsc0χOL3 (i.e., OL3) (55) and ff12SB (56) force fields for description of RNA and protein, respectively; the preference for ff12SB rather than ff14SB in protein-RNA simulations is explained elsewhere (57).
A mild 1 kcal/mol stabilizing HBfix potential (58) was applied to the native H-bond interactions K31(N)-A_A(OP2) and G29(O)-A_R(O2′). Both of these H-bonds are present in all available experimental structures of P. aeruginosa Hfq bound to RNA containing ARN repeats (8,19,40). The HBfix was applied to reduce the likelihood of spurious departures of the trajectories from the experimental geometry in longer simulations due to force-field imperfections. It allows us to more efficiently examine, within the affordable simulation time, the dynamics corresponding to the Hfq-RNA bound state as indicated by the experiments. We emphasize that unlike restrained or targeted explicit solvent MD (or even Gō potentials in coarse-grained modeling (59)), HBfix introduces a force between two atoms only within the narrow region corresponding to the H-bonding distance. Therefore, although it somewhat bolsters these H-bonds by increasing their lifetime, it still allows the system to explore other geometries and should not bias the results derived in our study. Nevertheless, to verify this, we performed control simulations without any HBfix and simulations where we used an increased 2 kcal/mol potential. These simulations confirmed that the results presented below do not depend on the use of the HBfix. Further details of the HBfix and of the control simulations are extensively described in Supporting information, together with an explanation of why the HBfix does not bias the results of the study.
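The key property described above, a bias that acts only within a narrow distance window and is constant elsewhere, can be illustrated with a simplified stand-in potential. The sketch below uses a linear ramp between two bounds; this is not the published HBfix functional form (see Ref. (58) and Supporting information for that), and the bounds r_lo = 2.0 Å and r_hi = 3.0 Å are hypothetical values chosen only for illustration.

```python
import numpy as np

def hbfix_like_energy(r, eta=1.0, r_lo=2.0, r_hi=3.0):
    """Energy (kcal/mol): 0 below r_lo, eta above r_hi, linear ramp in between.

    Simplified stand-in; r_lo and r_hi are hypothetical illustration values.
    """
    r = np.asarray(r, dtype=float)
    frac = np.clip((r - r_lo) / (r_hi - r_lo), 0.0, 1.0)
    return eta * frac

def hbfix_like_force(r, eta=1.0, r_lo=2.0, r_hi=3.0):
    """Force along r (kcal/mol/A): nonzero (attractive) only inside the window."""
    r = np.asarray(r, dtype=float)
    inside = (r > r_lo) & (r < r_hi)
    return np.where(inside, -eta / (r_hi - r_lo), 0.0)
```

Unlike a harmonic restraint, whose restoring force grows with displacement, this kind of potential exerts no force once the two atoms are separated beyond the window, so a broken interaction is free to explore other geometries at a fixed energetic cost of eta.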
In all simulations, the biomolecular systems were placed in a truncated octahedral box of SPC/E water molecules (60) with a minimal distance of 13 Å between the solute and the box border. The systems were neutralized, and a salt concentration of 0.15 M was established by addition of K+ and Cl− ions (61).
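As a rough guide to how a target salt concentration translates into ion counts, the number of ion pairs can be estimated from the box volume as N = c × N_A × V, with neutralizing counter-ions added on top. The following Python sketch assumes this simple volume-based estimate; the box volume and net solute charge in the example are hypothetical, and using the total box volume rather than the solvent volume is a simplification.

```python
AVOGADRO = 6.02214076e23      # 1/mol
LITERS_PER_A3 = 1.0e-27       # 1 cubic angstrom in liters

def salt_ion_counts(box_volume_a3, conc_molar=0.15, net_charge=0):
    """Return (n_K, n_Cl) for a target salt concentration plus neutralization.

    box_volume_a3: box volume in cubic angstroms (hypothetical input).
    net_charge: net charge of the solute in units of e.
    """
    n_pairs = round(conc_molar * AVOGADRO * box_volume_a3 * LITERS_PER_A3)
    n_k = n_pairs + max(-net_charge, 0)   # extra K+ neutralize a negative solute
    n_cl = n_pairs + max(net_charge, 0)   # extra Cl- neutralize a positive solute
    return n_k, n_cl
```

For example, a hypothetical box of 10^6 Å^3 containing a solute with a net charge of −20 would receive about 90 ion pairs plus 20 neutralizing K+ ions.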
The systems built in tLeap were minimized and equilibrated according to the protocol extensively described in Ref. (62) utilizing the pmemd.MPI module (54). Afterward, the production simulations were carried out with the pmemd.cuda module (63). The typical simulation timescale was 1 μs with selected simulations further extended afterward. We have used the SHAKE protocol along with HMR (hydrogen mass repartitioning) to allow a 4-fs integration step (64,65). Particle mesh Ewald (66) and periodic boundary conditions were used to handle long-range electrostatics and to prevent the box-border bias. The cut-off distance for Lennard-Jones interactions was set to 9 Å. The Langevin thermostat and Monte Carlo barostat (54) were used to keep the systems at temperature and pressure of 300 K and 1 bar, respectively.
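The production settings listed above map onto standard AMBER `&cntrl` keywords. The Python sketch below assembles an illustrative input file from them; it is a hedged reconstruction and not the input actually used in this study. The Langevin collision frequency (`gamma_ln`) and the output frequencies are assumptions, hydrogen mass repartitioning is applied to the topology rather than in this file, and in AMBER the `cut` keyword sets the direct-space cutoff for both Lennard-Jones and electrostatic interactions.

```python
def n_steps(length_us, dt_fs):
    """Number of MD steps needed to cover length_us microseconds at dt_fs."""
    return round(length_us * 1e9 / dt_fs)

def production_mdin(length_us=1.0, dt_fs=4.0, gamma_ln=2.0, write_every=2500):
    """Build an illustrative pmemd input; gamma_ln and write_every are assumed."""
    lines = [
        "Production NPT run (illustrative sketch)",
        " &cntrl",
        "  imin=0, irest=1, ntx=5,",                        # restart from equilibration
        f"  nstlim={n_steps(length_us, dt_fs)}, dt={dt_fs / 1000:.3f},",
        "  ntc=2, ntf=2,",                                  # SHAKE on bonds to hydrogen
        "  cut=9.0,",                                       # 9 A direct-space cutoff
        "  ntb=2, ntp=1, barostat=2, pres0=1.0,",           # Monte Carlo barostat, 1 bar
        f"  ntt=3, gamma_ln={gamma_ln}, temp0=300.0, ig=-1,",  # Langevin, 300 K
        f"  ntpr={write_every}, ntwx={write_every},",
        " /",
    ]
    return "\n".join(lines) + "\n"

print(production_mdin())
```

With a 4-fs step, a 1-μs trajectory corresponds to 250 million integration steps, which is why the repartitioned hydrogen masses that permit the longer step matter for reaching these timescales.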

Analyses
Cpptraj (67) was used to perform analyses of all simulation trajectories, and Visual Molecular Dynamics (VMD) (68) was used for their visual inspection. Raster3D (69) and POV-Ray were used for preparation of figures. LibreOffice and Inkscape were used to prepare graphs and schemes, respectively.
The presence of H-bonds was evaluated based on the donor-acceptor distance and donor-hydrogen-acceptor angle, with 3.5 Å and 120° cutoffs, respectively. For selected simulations, principal component analysis (67) was used to evaluate RNA backbone dynamics and global interdomain movements between the Hfq and Crc. Every fifth frame of the trajectories was used to calculate the coordinate covariance matrix, which was then diagonalized and used to obtain the first ten eigenvectors (principal components). The principal components of motion were visualized by projecting them along the utilized simulation frames.
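The two analyses described above can be sketched in a few lines of NumPy. This is a minimal stand-in for the cpptraj workflow, not the scripts used in this study: the function names are illustrative, and the PCA routine assumes the input frames have already been superposed on a common reference.

```python
import numpy as np

def is_hbond(donor, hydrogen, acceptor, max_dist=3.5, min_angle=120.0):
    """Geometric H-bond test: donor-acceptor distance and D-H...A angle cutoffs."""
    d, h, a = (np.asarray(p, dtype=float) for p in (donor, hydrogen, acceptor))
    dist = np.linalg.norm(a - d)
    v1, v2 = d - h, a - h
    cosang = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    angle = np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0)))
    return bool(dist <= max_dist and angle >= min_angle)

def principal_components(coords, stride=5, n_components=10):
    """PCA of Cartesian coordinates.

    coords: array (n_frames, n_atoms, 3), assumed pre-aligned.
    Returns eigenvalues, eigenvectors (as columns), and per-frame projections.
    """
    coords = np.asarray(coords, dtype=float)
    x = coords[::stride].reshape(-1, coords.shape[1] * 3)
    x = x - x.mean(axis=0)                     # remove the average structure
    cov = x.T @ x / (x.shape[0] - 1)           # coordinate covariance matrix
    vals, vecs = np.linalg.eigh(cov)           # symmetric eigenproblem
    order = np.argsort(vals)[::-1][:n_components]
    vals, vecs = vals[order], vecs[:, order]
    return vals, vecs, x @ vecs
```

The projections returned by `principal_components` give, for each retained frame, the displacement along each principal component, which is the quantity typically plotted or animated to visualize the dominant motions.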

Data availability
The authors declare that the data supporting the findings of this study are available within the article and its Supporting information. The raw MD simulation trajectories can be obtained from the corresponding author (Miroslav Krepl) upon reasonable request.