Translation Initiation Rate Determines the Impact of Ribosome Stalling on Bacterial Protein Synthesis

Background: Elongation factor P (EF-P) rescues ribosomes stalled at consecutive prolines; however, not all proteins with polyprolines show EF-P-dependent expression.
Results: A correlation between translation initiation rate and EF-P dependence is demonstrated.
Conclusion: Ribosome stalls lower protein levels only when they are more rate-limiting than initiation.
Significance: Explanation for why stall motifs do not necessarily affect protein abundance.

ABSTRACT
Ribosome stalling during translation can be caused by a number of characterized mechanisms. However, the impact of elongation stalls on protein levels is variable and the reasons for this are often unclear. To investigate this relationship we examined the bacterial translation elongation factor P (EF-P), which plays a critical role in rescuing ribosomes stalled at specific amino acid sequences including polyproline motifs. In Salmonella [...] for efficient expression in vivo. The α- and β-subunits of ATP synthase, AtpA and AtpD, are translated from the same mRNA transcript and both contain a PPG motif; however, proteomic analysis revealed that AtpD levels are strongly dependent on EF-P whereas AtpA levels are independent of EF-P. Using these model proteins we systematically determined that EF-P dependence is strongly influenced by elements in the 5' untranslated region of the mRNA. By mutating either the Shine-Dalgarno sequence or the start codon, we find that EF-P dependence correlates directly with the rate of translation initiation, with strongly expressed proteins showing the greatest dependence on EF-P. Our findings demonstrate that polyproline-induced stalls limit translation [...]


INTRODUCTION
Cellular protein levels are the sum of transcription, translation and degradation rates and all of these incorporate their own regulatory mechanisms. During translation, one protein is produced for each ribosome that successfully recognizes and assembles on an mRNA transcript (initiation), constructs a polypeptide (elongation) and releases the full-length amino acid chain (termination). Decreased efficiency of any of these three phases reduces the number of ribosomes able to complete translation and, hence, lowers total protein production. To maximize the efficient use of resources, the rate-limiting step usually occurs at the point of initiation and extensive work has made it clear that the strength of ribosome binding greatly affects the amount of protein produced from an mRNA transcript (1).
By altering elements of the ribosome binding site (RBS), such as the Shine-Dalgarno (SD) sequence of canonical eubacterial mRNA, translation rates can be controlled to fine-tune gene expression at the post-transcriptional level (2-4). Other elements of the 5' untranslated region (UTR) can also play a role in translation initiation, including mRNA secondary structure, sRNA binding sites, the start codon, and translation enhancers (AT-rich regions in proximity to the SD sequence that improve small subunit binding) (5-7).
Though initiation is typically the primary regulatory step in translation, slower progression through elongation can also influence protein synthesis. This has been studied extensively for rare codons that slow the elongation rate due to lower cellular tRNA concentration, and for starvation conditions that restrict the amount of aminoacylated tRNA (8-12). In these instances ribosomes must pause until a properly charged tRNA accurately decodes at the A-site, and the impact of this delay on protein levels can be influenced by rare codon clustering, repetition, or position within an open reading frame (ORF) (13-17). Other elongation stalls have been shown to regulate protein production, such as during translation of secM, wherein ribosomes pause at a particular amino acid sequence until the N-terminal peptide is recognized by the SecA secretion apparatus, thus directing secM expression (18,19). Interaction between the nascent polypeptide and the ribosome exit tunnel has been implicated in mediating this and other stalls in elongation or termination (20-23). Though these stall motifs bear little sequence similarity, many involve prolines, presumably because of their unique characteristics and relatively poor peptide bonding capability (24,25).
Consecutive prolines within a protein sequence have been shown to have a negative effect on elongation during translation. The universally conserved translation factor elongation factor P (EF-P) and its eukaryotic homolog, eIF5A, play a critical role in alleviating stalls at polyproline-containing sequences including, but not limited to, PPP, APP and PPG motifs (26-30). EF-P bears remarkable similarity in size and shape to a tRNA and has been shown to bind the ribosome between the P- and E-sites (31,32). When bound, EF-P projects a unique, highly conserved, modified lysyl-(R)-β-lysine residue into the peptidyl transferase center, presumably to re-initiate peptide bond formation (33-37). Both in vitro and in vivo assays have demonstrated that some of these polyproline motif-containing proteins display dramatically reduced levels in an efp mutant of E. coli or Salmonella enterica serovar Typhimurium (hereafter referred to as Salmonella) (26,27,29,30,38). However, proteomic analysis by stable isotope labeling by amino acids in cell culture (SILAC) in both of these species has also demonstrated that a large percentage of polyproline-containing proteins are unaffected by the absence of EF-P (29,30). This divergent impact of specific stall sequences on protein levels establishes polyproline motifs as an excellent model for examining the interplay between elongation stalls and cellular protein synthesis.
A particularly striking example is found in the ATP synthase complex, which contains two components, AtpD (β-subunit) and AtpA (α-subunit), that both contain a PPG motif. These two proteins are similar in sequence and structure, are present in the functional complex in equal stoichiometries, and are both encoded on the same mRNA transcript; yet despite these similarities, AtpD consistently exhibits lower levels in efp mutants whereas AtpA levels are unaffected by the loss of EF-P (39,40). In a recent study, we employed ribosome profiling to map the occurrence of translational stalls in an E. coli efp mutant and investigated the role of amino acids immediately upstream of the PPG motifs of AtpA and AtpD in EF-P-dependent stalling (41). This work indicated that although AtpA protein levels were not altered in previous SILAC analyses, some translational stalling occurs at its PPG motif, albeit to a lesser extent than in AtpD. Swapping even a single amino acid two residues upstream of the PPG motif between AtpA and AtpD resulted in a partial but incomplete reversal of EF-P dependence. This suggests that although the region proximal to the PPG motif does play a role in stalling, other factors beyond the immediate vicinity of the pause site also influence EF-P dependence.
In the work presented here, we used atpA and atpD as model systems to investigate additional factors affecting the impact of stalling on protein levels. Systematic swapping of regions of these two genes revealed that the component most critical for the difference in EF-P dependence was contained within the 5' UTR. Further mutagenesis and modeling studies demonstrate a relationship between the translation initiation strength of a transcript and the impact of an EF-P dependent stall on protein production. Taken together our data indicate that not all polyproline motifs induce stalling to equal degrees and that EF-P dependence is only observed when a given polyproline sequence impacts the rate of synthesis more than translation initiation.

EXPERIMENTAL PROCEDURES
Bacterial strains and plasmids - Salmonella 'wild-type' was Salmonella enterica serovar Typhimurium strain 14028S (WN150); generation of an isogenic efp mutant (WN1405), which deletes most of the efp gene (base pairs 145-424) while maintaining the yjeK promoter, has been described previously (29). The pXG10sf translational fusion reporter system was used to measure EF-P-dependent expression in Salmonella strains and has been described previously (29, 42-44). The plasmid employs a tightly regulated low-copy-number origin of replication (pSC101) and a constitutively active promoter (PLtetO-1) to minimize variations in transcription.
Salmonella atpA or atpD genes were inserted into the NsiI and NheI sites of pXG10sf immediately after the promoter and lacking a stop codon for C-terminal translational fusion to 'super-folder' GFP.
Since the native atpIBEFHAGDC operon is transcribed as a single mRNA, the 75 bp (atpA) or 74 bp (atpD) upstream of the start codon were included as a 5' UTR. Successive directed mutations were introduced into these constructs by around-the-world PCR, with primers containing the mutated sequence as overlapping overhangs, followed by Gibson isothermal assembly cloning to circularize (45). All constructs were confirmed by Sanger sequencing. Specific primers and constructs are listed in Table S1.
E. coli wild-type (BW25113) and the efp mutant were from the Keio knockout collection, and kanamycin cassettes were removed using FLP recombinase (46,47). For EF-P-dependent expression assays in E. coli, the pBAD30XS translation fusion plasmid was used as described previously (29,41). The plasmid contains an arabinose-inducible tandem fluorescence cassette consisting of GFP followed immediately by the mCherry SD sequence and mCherry. Putative EF-P-dependent motifs were inserted at the amino terminus of GFP, between codons 3 and 4. To vary the efficiency of translation initiation, different start codons (GTG, TTG, ATC, CTG) were inserted into the pBAD30XS plasmids by ds-oligo hybridization between the EcoRI and XhoI cut sites (48). Specific primers and constructs are listed in Table S1.
GFP fluorescence assay in Salmonella - As described previously, Salmonella wild-type and Δefp mutant strains carrying derivatives of pXG10sf were grown at 37°C with shaking for 16 h in a TECAN Infinite M200 microplate reader in MOPS minimal media supplemented with 0.2% glucose and 20 μg/mL chloramphenicol (29,41). Optical density (OD600) and GFP fluorescence (excitation at 475 nm, emission at 511 nm) were measured every 15 minutes. The background value from no-cell controls was subtracted from all readings, and values at 10 h post-inoculation are shown.
GFP fluorescence assay in E. coli - As described previously, overnight cultures of E. coli harboring pBAD30XS constructs were used to inoculate M9 media to an OD600 of 0.05 (29,41). The M9 media was supplemented with 0.2% glycerol, 0.5 g/l tryptone, 5 µg/l thiamine, 0.2% arabinose and 100 µg/ml ampicillin. Cultures were grown at 37°C with shaking, and fluorescence was measured at 8 h using a spectrofluorometer (Horiba) with excitation at 481 nm and emission at 507 nm for GFP, or excitation at 587 nm and emission at 610 nm for mCherry. Blank medium measurements were subtracted as background.
Immunoblotting - Salmonella strains containing pXG10sf plasmids were grown to mid-log phase in MOPS minimal media supplemented with 0.2% glucose and 20 µg/mL chloramphenicol. Cells were then washed twice (1 mM Tris pH 8.0, 5 mM magnesium acetate) and lysed by sonication in lysis buffer (9.32 M urea, 2.67 M thiourea, 40 mM Tris, 86.78 mM CHAPS, pH 8.5). Total cell lysate (10 μg) was added to 2x SDS loading buffer and boiled for 10 min at 95°C, and proteins were separated by SDS-PAGE followed by semidry transfer to a nitrocellulose membrane. Following 1 h of blocking at room temperature in 5% milk in TBST (1x Tris-buffered saline, 0.05% Tween 20), immunoblotting was conducted overnight at 4°C in TBST + 5% milk using a mouse anti-DnaK antibody (1:50,000; Enzo Life Sciences) and a mouse monoclonal antibody specific for GFP (1:1000; Santa Cruz Biotechnology). Blots were washed with TBST and subsequently incubated for 1 h at room temperature with HRP-fused goat anti-mouse antibody (1:10,000 in TBST + 5% milk) for ECL imaging (Thermo Scientific).
Curve Fitting - Fluorescence data were fit to a '1-exp' curve of the form $y = A(1 - e^{-Bx})$ by the minimum sum of chi-squares method. For each observed wild-type fluorescence value (x-coordinate), a chi-square value was generated by comparing the corresponding observed ∆efp fluorescence to the y-value calculated by the equation. The chi-squares for all constructs with the same ORF were summed, and the values of A and B yielding the minimal sum of chi-squares were solved using Microsoft Excel's 'Solver' function. The coefficient of determination (R²) comparing observed ∆efp data and y-values predicted by the solved 1-exp equation was calculated using Excel's 'RSQ' function.
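As an illustration only, the fitting procedure can be sketched in Python instead of Excel. The sketch assumes a Pearson-style chi-square, (observed - expected)² / expected, because the exact statistic passed to Solver is not stated here. The function names (fit_one_minus_exp, rsq), the search bounds for B, and the grid scan with golden-section refinement are hypothetical stand-ins for Solver's own optimizer. Under the assumed statistic, the best A for any fixed B has a closed form, so only B has to be searched.

```python
import math

def model(x, A, B):
    """1-exp curve: y = A * (1 - exp(-B * x))."""
    return -A * math.expm1(-B * x)

def chi_square(xs, ys, A, B):
    """Summed chi-square (assumed Pearson form) of observed efp values vs. the curve."""
    return sum((y - model(x, A, B)) ** 2 / model(x, A, B) for x, y in zip(xs, ys))

def best_A(xs, ys, B):
    """For a fixed B, the A that minimises the chi-square above (closed form)."""
    f = [-math.expm1(-B * x) for x in xs]
    return math.sqrt(sum(y * y / fi for y, fi in zip(ys, f)) / sum(f))

def fit_one_minus_exp(xs, ys, b_lo=1e-6, b_hi=1.0, n_grid=200, n_refine=100):
    """Return (A, B) minimising the summed chi-square; xs and ys must be positive."""
    def cost(B):
        return chi_square(xs, ys, best_A(xs, ys, B), B)

    # Coarse scan of B on a logarithmic grid (the bounds are an assumption).
    step = (b_hi / b_lo) ** (1.0 / (n_grid - 1))
    grid = [b_lo * step ** i for i in range(n_grid)]
    i = min(range(n_grid), key=lambda k: cost(grid[k]))
    lo, hi = grid[max(i - 1, 0)], grid[min(i + 1, n_grid - 1)]

    # Golden-section refinement inside the bracket around the best grid point.
    g = (math.sqrt(5) - 1) / 2
    for _ in range(n_refine):
        c, d = hi - g * (hi - lo), lo + g * (hi - lo)
        if cost(c) < cost(d):
            hi = d
        else:
            lo = c
    B = (lo + hi) / 2
    return best_A(xs, ys, B), B

def rsq(observed, predicted):
    """Square of the Pearson correlation, which is what Excel's RSQ returns."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    sxy = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    sxx = sum((o - mo) ** 2 for o in observed)
    syy = sum((p - mp) ** 2 for p in predicted)
    return sxy * sxy / (sxx * syy)
```

Here xs would hold the wild-type fluorescence and ys the ∆efp fluorescence of all constructs sharing one ORF. The default bounds on B suit fluorescence values in the hundreds to thousands and would need adjusting for other scales.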
Translation kinetics modeling - Translation was modeled in silico using a 150-codon ORF. At each cycle the program randomly decides whether a ribosome will initiate by generating a random number between 1 and I; if the number is 1, a ribosome binds at the start codon provided it is unoccupied (initiation frequency = 1 / I). Similarly, elongation frequency was set to 1 / 10, and progression occurred only if the next codon was vacant. Movement from position 99 to 100 was defined as the 'stall site', where progression was determined by the 'pause frequency' of 1 / S. The program was run for 10,000 cycles on 2,000 replicate mRNAs each time, and the values shown in figures represent the number of ribosomes able to terminate translation (frequency of termination at codon 150 was set equal to the elongation frequency). Additionally, the program saves the endpoint positions of the ribosomes on the mRNA to reflect ribosome densities at each codon. The in silico model was written in Perl 5.
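The model described above can be sketched as follows. This is a Python re-expression written for illustration, not the original Perl 5 program. It assumes that each ribosome occupies a single codon and that, within a cycle, termination is attempted first, codons are then updated from the 3' end toward the 5' end, and initiation is attempted last; the update order is not specified in the text. The function name simulate and its argument names are hypothetical.

```python
import random

ORF_LENGTH = 150       # codons in the modeled ORF
STALL_FROM = 99        # the step from codon 99 to 100 is the stall site
ELONGATION_ODDS = 10   # elongation (and termination) frequency = 1 / 10

def simulate(I, S, cycles=10000, n_mrna=2000, seed=None):
    """Return (terminated, density) for initiation odds 1/I and stall odds 1/S.

    terminated: ribosomes that completed translation, summed over all mRNAs.
    density[c]: number of mRNAs with a ribosome on codon c at the final cycle
                (index 0 is unused).
    """
    rng = random.Random(seed)
    terminated = 0
    density = [0] * (ORF_LENGTH + 1)
    for _ in range(n_mrna):
        occupied = [False] * (ORF_LENGTH + 1)
        for _ in range(cycles):
            # Termination at the last codon, same frequency as elongation.
            if occupied[ORF_LENGTH] and rng.randint(1, ELONGATION_ODDS) == 1:
                occupied[ORF_LENGTH] = False
                terminated += 1
            # Elongation, swept 3' to 5' so no ribosome moves twice per cycle
            # (assumed order); a step needs the next codon to be vacant.
            for pos in range(ORF_LENGTH - 1, 0, -1):
                if occupied[pos] and not occupied[pos + 1]:
                    odds = S if pos == STALL_FROM else ELONGATION_ODDS
                    if rng.randint(1, odds) == 1:
                        occupied[pos] = False
                        occupied[pos + 1] = True
            # Initiation: draw 1..I, bind only on a 1 and a free start codon.
            if rng.randint(1, I) == 1 and not occupied[1]:
                occupied[1] = True
        for pos in range(1, ORF_LENGTH + 1):
            density[pos] += occupied[pos]
    return terminated, density
```

Scanning I at a fixed S and plotting the terminated counts against those of a run without a stall (S equal to 10) would give an in silico counterpart of the wild-type versus efp mutant comparison. Pure Python is slow at the full 10,000 cycles and 2,000 mRNAs, so smaller values are advisable for a quick look.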

RESULTS
The 5' UTR plays a significant role in EF-P dependence - In our recent paper, we employed translational fusion constructs and found that AtpD, but not AtpA, was dependent on EF-P for efficient translation, and that this holds throughout the growth curve (Figure 1 and reference 41). Furthermore, we observed that swapping the amino acids immediately upstream of the PPG motif of AtpA and AtpD partially reversed their EF-P dependence (41). However, this reversal was incomplete when compared to a negative control (LacZ), to a PPG::PLG mutation in AtpD, or to an RPPG::PPPG mutation in AtpA. Complete reversal was not achieved even when swapping up to 40 codons upstream of the PPG motif, a distance that accounts for more amino acids than can be accommodated in the ribosomal exit tunnel (21,23,49). This suggested that factors outside of the ribosome were negatively affecting AtpD translation in the efp mutant.
To investigate additional regions impacting EF-P dependence, we systematically swapped regions of atpA and atpD using the previously employed pXG10sf plasmid, which carries a constitutively active promoter and a C-terminally fused "super-folder" green fluorescent protein (GFP) as a reporter (29, 42-44). We generated serial swaps of increasing length from the 5' end of the mRNA transcript to the PPG motif, or in the reverse direction (Figure 2). This analysis demonstrated that swapping the coding region beyond 12 codons upstream of the PPG motif produced no additional reversal of EF-P dependence, even when the entire N-terminus from the ATG start codon to the polyproline motif was swapped. In contrast, there was a stark change in EF-P dependence when the 5' UTR was included in the swapped region or when only the 5' UTR was swapped. Specifically, the atpD 5' UTR confers strong EF-P dependence onto the AtpA protein, whereas the 5' UTR of atpA appears to significantly reduce the EF-P dependence displayed by AtpD. Combined replacement of the 5' UTR and the 12 codons upstream of the PPG motif has a cumulative effect, suggesting that the two regions affect expression independently and that complete switching of EF-P dependence requires both elements. Indeed, the atpD construct with both the atpA 5' UTR and upstream PPG flanking region demonstrates higher fluorescence in the efp mutant than in wild-type Salmonella; this resembles the lacZ negative control and may be due to decreased levels of protease in the efp mutant (29).

The 5' UTR regions that include the SD sequence and a potential translation enhancer play a significant role in EF-P dependence - Though native ATP synthase proteins are translated from a single mRNA transcript, both atpA and atpD possess their own SD sequence. Furthermore, their start codons are spaced 12 and 26 bases, respectively, after the preceding gene's stop codon, suggesting that the translation of each is independent and not coupled to that of the preceding gene (NCBI accession: NC_016856.1). For the plasmid-based translational fusion assay employed here, the 5' UTRs of the atpA and atpD constructs consisted of the 75 or 74 bases (respectively) upstream of the start codon.
To further dissect the region within the 5' UTR involved in EF-P dependence, 15 bp stretches were systematically swapped between atpA and atpD, and expression of the resulting constructs was measured by GFP fluorescence. EF-P dependence was reversed most significantly when regions -15 to -1 or -30 to -16 relative to the start codon were swapped (Figure 3). Notably, these regions are both predicted to be heavily involved in translation initiation: the -15 to -1 region includes the SD sequence, while the -30 to -16 region of atpD is particularly AT-rich (74%) compared to that of atpA (47%), suggesting that it may act as a translation enhancer (5,7). Moreover, for all of the serial swap constructs generated there appeared to be a correlation between EF-P dependence and the degree of overall expression (Figure 3). Specifically, constructs that were poorly expressed exhibited little difference in fluorescence levels between wild-type and efp mutant strains, whereas constructs with high levels of expression in the wild-type background were the most reduced in the efp mutant.
To ensure that this apparent loss of EF-P dependence was not due to an inability to accurately measure differences at low fluorescence levels, western blotting of select AtpA and AtpD constructs was performed. The patterns of EF-P dependence observed by western blotting were similar to those found by fluorescence measurements (Figure 4). For a few constructs where fluorescence levels are less than two-fold greater than no-plasmid controls, protein levels were very low or non-detectable by western blot in both the wild-type and efp mutant strains. These constructs, including the atpD construct with the whole 5' UTR of atpA, were omitted from downstream analyses.
Mutations that affect translation initiation influence EF-P dependence - To further assess whether the 5' UTR influences EF-P dependence through its role in translation initiation, we first confirmed that it exerts its effect via the polyproline stall rather than by a separate, unrelated mechanism. Indeed, mutation of the PPG motif to PLG restored GFP fluorescence in the efp mutant even at the high expression levels obtained by 5' UTR swapping (Figure 5). We next addressed translation initiation by mutating individual nucleotides within the SD sequence and the start codon of atpD and atpA. The atpA SD sequence (AGGGGA) was improved by mutating it to the consensus sequence (AGGAGG). For atpD, we weakened ribosome binding by mutating the ATG start codon or by altering the wild-type SD sequence (AGAGG) with a G>C mutation (AGAGC). We also strengthened the atpD SD sequence by a G insertion yielding the consensus sequence (AGGAGG). Consistent with our hypothesis, the constructs displaying the highest levels of expression in wild-type cells were the most affected by the loss of EF-P (Figure 6). Specifically, strengthening the SD sequence of either atpA or atpD yielded increased dependence on EF-P, while weakening the SD or the start codon of atpD reduced its dependence. Since these point mutations target regions critical for ribosome binding, these data suggest that the 5' UTR influences EF-P dependence via its role in translation initiation.

Initiation rate and stall strength correlate with EF-P dependence - To gain insight into the relation between initiation rate and EF-P dependence, we plotted fluorescence in wild-type versus efp mutant Salmonella for the full set of 5' UTR mutant constructs (Figure 7A). We grouped constructs by ORF such that, within each group, all constructs differed only in their 5' UTR. Plotting in this manner allowed visualization of changes in expression in the efp mutant (subjected to stalls in elongation) as initiation rate (and thereby wild-type expression) increases.
Strikingly, for both atpD and atpA, we find that as expression increases in wild-type, expression in the efp mutant follows a distinct curve that eventually reaches a maximum. Indeed, the observed data could be fit well to a '1-exp' curve of the form $y = A(1 - e^{-Bx})$, where x and y represent fluorescence in the wild-type and efp mutant strains respectively, and A and B are constants particular to the ORF being analyzed. Specifically, A is the asymptote of the curve and can be interpreted as the theoretical maximum expression in the efp mutant for that particular ORF, at which point progression through the stall motif has overtaken initiation as the rate-limiting step for all transcripts. Similarly, B is inversely related to the slope of the curve and how quickly it reaches the asymptote. Interestingly, the curve for the AtpD ORF was steeper than that for AtpA and reached a lower asymptote. This emphasizes that the stall in AtpD is more EF-P dependent than that of AtpA, resulting in a lower maximum expression in the efp mutant and a significant hindrance on translation even at lower initiation rates.
Previously we found that the residues upstream of the PPG motif are partially responsible for the difference in EF-P dependence of AtpD and AtpA (41). To examine the additive effects of the upstream region and translation initiation rate, we applied our plotting analysis to constructs in which the 12 amino acids upstream of the atpA or atpD PPG motifs were swapped. Interestingly, the data for the 12aa swap constructs resemble those of the protein from which the upstream region originated, suggesting that the upstream residues govern stall strength and thereby the arc and maximum of the expression curve (Figure 7A).
To investigate the relation between translation initiation and EF-P dependence in another bacterial species and with proteins other than AtpD and AtpA, we employed the Keio collection E. coli efp mutant (46). We conducted a similar plasmid-based translational fusion assay wherein specific polyproline motifs were inserted directly into the 4th codon position of GFP. We changed the efficient AUG initiation codon to GUG or UUG (Class I codons that support efficient translation) or to AUC or CUG (Class IIA codons that support translation at levels only 1-3% that of AUG) (48). Mutation of the start codons of these constructs revealed a similar correlation between expression in wild-type cells (initiation strength) and EF-P dependence for multiple different polyproline motifs (Figure 7B). Fluorescence plotting and curve fitting show a trend towards a maximum expression in the efp mutant and, interestingly, the strongest polyproline motif tested (six consecutive prolines) appeared to be rate-limiting (reached maximum expression in the efp mutant) even with the weakest start codon tested, CUG. This supports the conclusion that the relation between initiation and EF-P dependence is not restricted to AtpD and AtpA in Salmonella, but is generally applicable to other EF-P-dependent motifs and in other species.
Modeling the interplay between initiation rates and stall clearance - Our observation of a maximum expression in the efp mutant as initiation rate increases prompted us to expand upon the relation between EF-P-dependent stalls and initiation rate. Using high-throughput datasets from published works, we attempted to correlate measures of EF-P dependence, such as efp mutant SILAC (29,30) or ribosome profiling data (41), with proxies for translation initiation rate including ribosome profiling reads (41), protein abundance (29,30), or protein per mRNA (50) in wild-type cells. None of these analyses yielded a significant correlation (data not shown). This may be due to significant variation in the strength of different stalling sequences. EF-P dependence may be observed for some weakly expressed genes if they have a strong stall motif, and similarly some highly expressed genes encoding a polyproline motif may not show EF-P dependence if the motif triggers weak stalling. This heterogeneity may mask the correlation between translation rate and EF-P dependence in these high-throughput datasets.
To further explore the interplay between initiation rate and the stall strength of not just EF-P dependent but also other elongation pauses, we generated a computational model simulating translation of a hypothetical 150 codon transcript using parameters (I and S) such that we could modulate the initiation rate and elongation stall strength.
As described in Experimental Procedures, initiation was set to occur with a chance of 1 / I. Ribosomes then proceeded forward one codon with a chance of 1 / 10 and progressed through a mid-ORF stall with a chance of 1 / S. This allowed us to vary the values of I and S while assessing the number of terminating ribosomes as an analog for protein production.
The computational modeling results follow a curve similar to our observed data for EF-P-dependent pauses, wherein the impact of the elongation stall increases with initiation rate (Figure 7C). Furthermore, for any given stall strength the model predicts a maximum output, at which point stall progression is the rate-limiting step for all mRNAs. Similarly, with increasing stall clearance time (S), the slope of the curve and the protein output decrease for any given initiation rate. The maximum expression level is also lower for stronger stalls. This resembles our findings for atpD and atpA, where the more EF-P-dependent stall in AtpD results in a sharper curve reaching an earlier maximum than AtpA. Of note, when observing final ribosome positions at the end of the modeling run, a ribosome occupancy peak was observed at the stall site even for settings where initiation was clearly rate-limiting (I >> S), but total ribosome density on the transcript was similar pre- and post-stall (Figure 8A, B). This suggests that stalling occurs when initiation rates are low but does not affect the number of terminating ribosomes, because translation resumes before the stalled ribosome impedes the progression of upstream ribosomes. This was reflected in our previous E. coli ribosome profiling data for atpA and atpD, where ribosome occupancy peaked at the PPG motif in both ORFs, but atpD showed a greater peak height relative to wild-type and a more significant decrease in ribosome density post-stall (Figure 8C, D) (41).
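The comparison of the stall-site peak with the pre- and post-stall ribosome density can be made concrete with a small helper. The sketch below is not part of the published analysis. The function name stall_profile, the choice to ignore five codons on either side of the stall site when averaging, and the toy occupancy profiles in the usage note are all assumptions made for illustration.

```python
def stall_profile(density, stall_site=99, flank=5):
    """Summarise per-codon ribosome occupancy around a stall site.

    density: list indexed by codon number (index 0 unused), e.g. the count of
             mRNAs with a ribosome on each codon at the end of a model run.
    flank:   codons on each side of the stall site left out of the averages,
             so that the local queue does not dominate them (an assumption).
    """
    pre = density[1:stall_site - flank]        # codons 1 .. stall_site-flank-1
    post = density[stall_site + flank + 1:]    # codons stall_site+flank+1 .. end
    mean_pre = sum(pre) / len(pre)
    mean_post = sum(post) / len(post)
    return {
        "peak": density[stall_site],
        "mean_pre": mean_pre,
        "mean_post": mean_post,
        # A ratio near 1 means similar density before and after the stall.
        "post_over_pre": mean_post / mean_pre if mean_pre else float("nan"),
    }
```

With a toy profile that is flat at 2 except for a value of 20 at codon 99, the helper reports a peak of 20 and a post/pre ratio of 1, the pattern described for initiation-limited settings. With a toy profile of 8 upstream and 1 downstream, the ratio drops to 0.125, the pattern expected when the stall is rate-limiting.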

Fate of stalled peptides in vivo - A caveat of our model is that it assumes that ribosomes do not undergo premature release from the transcript during an extended stall at an EF-P-dependent motif. To assess whether this is a frequent occurrence in a Salmonella efp mutant, we reanalyzed peptide data from our previous SILAC investigation (29). In this experiment proteins were fragmented into tryptic peptides that were quantified by mass spectrometry. Proteins observed during SILAC that contain an APP, PPG or PPP motif were identified, and individual peptides observed in mass spectrometry were binned as either 'pre-' or 'post-motif'. Premature disengagement from the ribosome would be predicted to release truncated peptides, which would manifest in the efp mutant strain as a relative decrease in peptides observed downstream of the polyproline motif compared to peptides before the motif. Such a decrease should not be observed in the wild-type strain, where the entire protein should be synthesized to completion.
We constrained our analysis to proteins with at least two peptides conclusively identified both before and after the motif, allowing for statistical comparison of pre- and post-motif peptides by t-test. Out of the 40 proteins for which sufficient peptide data were available, only one demonstrated a significant difference in relative expression pre- and post-motif (Figure 9). Furthermore, SILAC ratios pre- and post-motif were linearly correlated with an R² value of 0.851. Though it remains possible that truncated peptides are being degraded more rapidly than they can be detected, these data suggest that ribosomes do not prematurely disengage from the transcript at most polyproline motifs in the absence of EF-P, but rather eventually resume synthesis and complete the full-length protein.
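The peptide binning and comparison can be sketched as below. This is a hedged illustration rather than the pipeline actually used: it assumes that the first APP, PPG or PPP match defines the motif, that peptides overlapping the motif are discarded, and that a Welch (unequal-variance) statistic is computed, since the t-test variant is not specified. The names bin_peptides, enough_data and welch_t are hypothetical, and converting the statistic to a p-value would additionally require a t-distribution, which is not included here.

```python
import math
import re
from statistics import mean, variance

MOTIF = re.compile(r"APP|PPG|PPP")

def bin_peptides(protein, peptides):
    """Split peptide SILAC ratios into pre- and post-motif bins.

    protein:  amino acid sequence (one-letter code).
    peptides: iterable of (peptide_sequence, silac_ratio) pairs.
    Returns (pre, post), or None if the protein carries no motif.
    """
    motif = MOTIF.search(protein)          # first motif only (assumption)
    if motif is None:
        return None
    pre, post = [], []
    for seq, ratio in peptides:
        start = protein.find(seq)
        if start < 0:
            continue                       # peptide not found in this protein
        end = start + len(seq)
        if end <= motif.start():
            pre.append(ratio)
        elif start >= motif.end():
            post.append(ratio)
        # Peptides overlapping the motif fall in neither bin.
    return pre, post

def enough_data(pre, post, minimum=2):
    """Apply the 'at least two peptides before and after the motif' filter."""
    return len(pre) >= minimum and len(post) >= minimum

def welch_t(a, b):
    """Welch t statistic and degrees of freedom for two samples."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df
```

A protein would be passed to welch_t only when enough_data returns True. Under this scheme, truncation at the motif would show up as post-motif ratios that are consistently lower than pre-motif ratios in the efp mutant.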

DISCUSSION
In this work we employed a plasmid-based translational fusion assay and the PPG-containing proteins AtpD and AtpA to explore the underlying determinants of EF-P dependence beyond polyproline motifs and as a model for other elongation stalls. We found that the 5' UTR plays a crucial role in governing EF-P dependence, and all data suggest this is through its effect on translation initiation rate. Simply put, the data suggest that many proteins containing polyproline motifs are unaffected by the loss of EF-P because poor initiation limits the rate of synthesis more than the stall itself. It follows that the impact of polyproline motifs on protein expression could be superseded not only by the rate of translation initiation, but also by the rate of elongation through alternate stalls including slow codons or starvation-sensitive regions (11,14,16,51,52). That is to say, an EF-P-dependent stall will only have an observable effect on protein level if it impacts the rate of synthesis more than initiation and any other step in translation, including elongation or termination. We believe this is a parsimonious explanation for the many instances in which a polyproline-containing protein does not demonstrate EF-P dependence in vivo (29,30,37,38,41).
Experimentally it is possible to observe an influence of both initiation and the stall sequence on translation. This can occur in situations where initiation and stalling impose similar constraints on the rate of protein synthesis and is due to heterogeneity in the population of transcripts. In a given population of mRNAs some transcripts will successfully initiate several rounds of translation while other transcripts will fail to initiate at all. Furthermore, the EF-P-dependent stalls on some transcripts will stochastically resolve faster than others. When transcripts have high overall initiation rates, most transcripts in the population will have engaged with more ribosomes than can be cleared through the polyproline-induced stall in the absence of EF-P, resulting in a buildup of blocked ribosomes upstream of the stall. For messages where initiation rates are low, ribosome binding is sufficiently rare that most stalled ribosomes will resume elongation before the next ribosome reaches the stall site. In this case the polyproline motif will have no observable effect on protein levels. The net output of protein depends on how the population of transcripts is distributed among these states (Figure 10).
The model above is supported by our experimental results. For 5' UTR mutants of atpA and atpD, as GFP fluorescence increased in wild-type cells, expression in the efp mutant appeared to approach an asymptote where stall progression becomes the predominant rate-limiting step of translation. This effect was dependent on an intact PPG motif and implies a calculable maximum expression in the efp mutant that is related to the rate of stall clearance. This stall strength is particular to individual ORFs and allows for a comparison of EF-P dependence across all initiation rates. Interestingly, the slope of the atpD curve declines much more rapidly than that of atpA and reaches a significantly lower maximum expression, supporting the idea that the EF-P-relieved stall of AtpD is more difficult to bypass than that of AtpA. This agrees with our previous work, where we found that the region upstream of the PPG motif plays a significant role in the degree of EF-P dependence (41). Indeed, when twelve codons upstream of the PPG motifs were swapped, expression at multiple different initiation rates resembled that of the protein from which the twelve codons originated. Taken together, these results indicate that the region upstream of the PPG motif influences the strength of the stall and thereby the steepness and maximum of the expression curve as a function of initiation rate. Our in silico translational model supports this concept.
Our results also suggest that polyproline-dependent stalls in the absence of EF-P are eventually resolved and that protein synthesis proceeds to completion in most cases. If ribosomes instead disengaged from polyproline-induced stalls, we would expect proteins translated at low rates of initiation to show a degree of EF-P dependence similar to that of proteins derived from transcripts with much higher initiation rates. That stalled ribosomes generally complete synthesis is also supported by our SILAC data, in which we do not observe an increase in peptide counts upstream of the stall sequence compared to sequences downstream of the stall.
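A comparison of peptide abundance upstream and downstream of a stall motif can be sketched as follows. The example computes Welch's t statistic and its approximate degrees of freedom for two groups of peptide ratios. The choice of Welch's variant is an assumption, the function name is hypothetical, and the peptide ratios are invented for illustration and are not data from this study. Converting the statistic to a p-value requires the t distribution, for example through a statistics package.

```python
import math
from statistics import mean, variance


def welch_t(pre, post):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    for peptide ratios before (`pre`) and after (`post`) a motif."""
    n1, n2 = len(pre), len(post)
    if n1 < 2 or n2 < 2:
        raise ValueError("need at least two peptides on each side of the motif")
    a = variance(pre) / n1   # squared standard error of the pre-motif mean
    b = variance(post) / n2  # squared standard error of the post-motif mean
    if a + b == 0.0:
        raise ValueError("both groups have zero variance")
    t = (mean(pre) - mean(post)) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a * a / (n1 - 1) + b * b / (n2 - 1))
    return t, df


if __name__ == "__main__":
    # Hypothetical WT/mutant peptide ratios on either side of a motif.
    pre_motif = [1.9, 2.1, 2.0, 2.2]
    post_motif = [2.0, 2.3, 1.8, 2.1]
    t, df = welch_t(pre_motif, post_motif)
    print(f"t = {t:.3f}, df = {df:.1f}")
```

If ribosomes frequently abandoned the message at the stall, peptides upstream of the motif would be expected to be over-represented relative to downstream peptides, which would appear as a large positive t statistic in a comparison of this kind.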
It is unlikely that the balance between translation initiation rate and impaired elongation is limited to EF-P dependent stalls. Indeed, our computational modelling is not restricted to EF-P dependent pauses and could be applied to any stall in translation elongation regardless of mechanism. The correlation between codon usage and gene expression is well described: highly expressed genes have a strong bias toward codons that correspond to the most abundant tRNAs (8,10,12,53,54). Chu et al. observed that the impact of slowly translated codons on protein synthesis could be modulated by altering the 5' UTRs of reporter plasmids in yeast (15). Other studies have described a translation 'ramp' wherein slow, non-preferred codons are more abundant at the 5' end of a coding region, and this is supported by ribosome profiling data in which increased ribosome occupancy was observed at the 5' end of ORFs (55,56). It was also observed that this slow ramp was significantly longer for transcripts with high ribosome densities, suggesting evolutionary pressure on highly expressed genes to prevent downstream ribosome jamming (15,55-57). For various transcripts, such a ramp could also allow time for protein folding or binding of cofactors to the N-terminal peptide, or enable rare codons to affect translation rate by blocking new ribosomes from binding the mRNA (14,58-61). Cumulatively, these works support the interplay described in the present study between translation initiation rate and elongation stalls. This concept, demonstrated herein for EF-P dependent pauses, can be readily modeled in silico and can be further applied to other forms of ribosome stalls.
Western blot probing for GFP expressed from pXG10sf constructs in wild-type (WT) and ∆efp Salmonella grown to mid-log phase under conditions similar to those used for fluorescence measurements. GFPcontrol is wild-type cells expressing pXG10sf-AtpD but with a FLAG tag replacing GFP.
Unless otherwise indicated, 10 µg protein lysate was loaded for each sample. DnaK was included as a loading control. The experiment was conducted in triplicate and one representative replicate is shown. For the 'AtpD-AtpA UTR' and 'AtpD AGAGG::AGACG' constructs, where fluorescence levels were less than two-fold greater than those of no-plasmid controls, protein levels were very low or non-detectable in both the WT and efp mutant strains. The 'AtpA -AtpD UTR from -60-46', 'AtpA -AtpD from ATG -PPG', 'AtpD -AtpA UTR & 12aa', and 'AtpD -AtpA UTR & P214L' constructs had similarly low fluorescence levels (Figures 2 and 3). These constructs were omitted from all downstream analyses.
Figure 3A. Data show GFP fluorescence in arbitrary fluorescence units (AFU) normalized to OD600, comparing expression in WT (dark grey) to ∆efp (light grey) Salmonella. Numbers above the columns indicate the ∆efp/WT ratio. The unaltered (wt) construct is shown for comparison. (B) As in panel A but with mutations in the SD sequence of the pXG10sf-AtpA construct. The wild-type sequence (AGGGGA) is underlined in Figure 3A. Values are the mean of at least three biological replicates and error bars show one standard deviation.
Fluorescence data plotted to compare expression in WT (x-axis) and ∆efp (y-axis) Salmonella. The y-axis is expanded from 0 to 1 to clarify differences at low fluorescence levels. Each point represents fluorescence data for one pXG10sf construct. The specific construct for each data point is indicated by a numerical label referring to Table 1. Icon groups signify constructs that all have the same ORF (indicated in the key at bottom right) and differ from one another only in the 5' UTR. For the AtpD and AtpA ORF groups, the data points were fit to a '1-exp' curve shown in the inset at top left. P214L and P281L groups were connected linearly, and the dashed line indicates equal fluorescence in the WT and ∆efp mutant.
Data are in arbitrary fluorescence units (AFU) normalized to OD600 and are the mean of at least three biological replicates. Error bars showing one standard deviation are included for both x- and y-axes. 'AtpD -AtpA 12aa' indicates that 12 codons upstream of the atpA PPG motif were swapped into the AtpD construct; 'AtpA -AtpD 12aa' is the reciprocal. (B) Fluorescence data from E. coli containing the pBAD30XS plasmid with the indicated polyproline motifs inserted at the 4th codon of GFP. For each motif construct, four start codon mutants were generated; from highest to lowest expression in wild-type, the start codons were AUG (wt), GUG, UUG, AUC and CUG. Start codon mutant constructs with the same polyproline motif were plotted as a group with a corresponding '1-exp' curve of best fit. 'No motif' has no inserted motif and is included as an EF-P independent control. Shown is the fluorescence of GFP normalized to that of mCherry, which is expressed from the same mRNA but with its own ribosome binding site. The y-axis is expanded from 0 to 2 to clarify differences at low fluorescence levels. (C) Computational model comparing the effect of varying rates of translation initiation and stall clearance on protein synthesis (terminating ribosomes). Black circles indicate tested combinations of stall progression and initiation frequency and are connected linearly. Details of the model are described in Experimental Procedures.
Only proteins with an APP, PPG or PPP motif and at least two peptides conclusively identified both before and after the motif were analyzed (40 proteins). A t-test was used to calculate statistical significance comparing the average peptide abundance ratio pre- versus post-motif. Gene names and p-values are shown for proteins with a difference between pre- and post-motif of greater than two. AtpD and AtpA are also indicated. The only protein with p < 0.05 is highlighted as a black square (Lon). The linear regression is shown with the coefficient of determination (R²) indicated at right.
The dashed line indicates a 1:1 regression. (B) All Lon peptides detected in the previously conducted SILAC assay are shown, plotting peptide ratio (WT/∆efp) against their location within the full-length Lon protein. The dashed line indicates the location of the PPG motif. The averages of all peptide ratios before and after the PPG motif are shown as solid lines and error bars indicate one standard deviation.