Proteolytic dynamics of human 20S thymoproteasome

Efficient immunosurveillance by CD8+ T cells in the periphery depends on positive/negative selection of thymocytes and thus on the dynamics of antigen degradation and epitope production by thymoproteasome and immunoproteasome in the thymus. Although studies in mouse systems have shown how thymoproteasome activity differs from that of immunoproteasome and strongly impacts the T cell repertoire, the proteolytic dynamics and the regulation of human thymoproteasome are unknown. By combining biochemical and computational modeling approaches, we show here that human 20S thymoproteasome and immunoproteasome differ not only in the proteolytic activity of the catalytic sites but also in the peptide transport. These differences impinge upon the quantity of peptide products rather than where the substrates are cleaved. The comparison of the two human 20S proteasome isoforms depicts different processing of antigens that are associated with tumors and autoimmune diseases.

The immune system constantly patrols the human body to detect pathological situations. An important role in this is played by CD8+ T cells, which recognize and kill infected and aberrant cells. Because of their high specificity and cytotoxic activity, CD8+ T cells are intensely investigated as tools and/or targets of immunotherapies against infection, autoimmunity, and cancer (1–3).
CD8+ T cells recognize, via their T cell receptor coupled to a CD8 molecule, a specific epitope presented in the cleft of major histocompatibility complex class I (MHC-I) molecules (4). The cytotoxic CD8+ T cells are primed in lymph nodes. The naïve CD8+ T cells arrive in the lymph nodes from the thymus. In this latter organ, thymocytes, prior to becoming mature naïve CD8+ T cells, undergo a series of maturation/selection processes called central tolerance. According to one of the most accepted models, central tolerance can be summarized in two selection steps. Initially, thymocytes undergo a positive selection step, which takes place in the thymic cortex, leading to the survival and maturation of double-positive thymocytes that express T cell receptors with intermediate affinity and/or avidity for MHC-I-peptide complexes. Afterward, in the thymic medulla, thymocytes undergo the negative selection step, which leads to the elimination of thymocytes recognizing self-peptide-MHC complexes with a high affinity (5).
The large majority of peptides bound to MHC-I molecules and recognized by CD8+ T lymphocytes are generated by the proteasome, which is the final effector of the ubiquitin-proteasome system (6). This barrel-shaped protease can break proteins and release the peptide fragments or religate them, thereby forming new (spliced) peptides with sequences that do not recapitulate the parental protein (4). The proteasome is a multisubunit enzyme, which has a 20S proteasome as the core and various proteins bound at both sides of its gate, where they play a regulatory role (4). The 26S proteasome, comprising a 20S proteasome core coupled to a 19S regulatory complex, is often the most active form of proteasome, although an increasing amount of evidence suggests that the 20S proteasome is independently functional and both degrades and activates proteins in cells (4,7,8). The 20S proteasome is constituted of four rings, two α rings at the apexes and two β rings forming the central chamber. Each ring has seven distinct subunits. Each β ring carries three catalytic (i.e. β1, β2, and β5) subunits, which have distinct preferences for peptide sequence motifs (9). Human cells can express different isoforms of catalytic subunits, which are incorporated in distinct proteasome isoforms. Standard proteasome (s-proteasome) contains β1, β2, and β5 subunits. Immunoproteasome (i-proteasome) contains β1i, β2i, and β5i subunits and is present in immune cells (constitutively) as well as in cells exposed to inflammatory milieu. The majority of cells express a mixed-type proteasome population where both s- and i-proteasome subunits are present in various amounts (4). A decade ago, Murata et al. (10) identified the so-called thymoproteasome (t-proteasome), which carries the β1i, β2i, and β5t subunits and has so far only been detected in the thymus.
To investigate the dynamics of peptide hydrolysis by the 20S proteasome and its regulation, and to elucidate differences between s- and i-proteasomes, we previously developed a computational mechanistic model of proteasome peptide degradation (11) (Fig. 1A). The model was constructed and compared with many competing models in a Bayesian model selection framework (12,13) and finally challenged with further experimentation.
Specifically, a series of models was constructed with increasing complexity, starting from the simplest Michaelis-Menten model and ignoring the structural properties of the proteasome. Kinetic time-course data tracking the degradation of short fluorogenic peptides by 20S proteasomes were used to test whether the constructed models were able to produce the experimentally observed kinetics. The latter data and further experimentation were used to yield insight into possible dynamics of the proteasome and guide the development of competing mechanistic models with increasing complexity. These models can be seen as representing competing hypotheses, allowing selection of the hypothesis that best justifies our experimental data. In contrast to many hypothesis-testing techniques, Bayesian model selection allows us to not only reject a likely wrong hypothesis but also rank competing models given experimental data (14,15). Model development and model selection have two aspects to consider: the model structure and the model parameters. Although the model structure can be seen as a map, the model parameters essentially identify where in that map the dynamics of the model can occur. It is therefore of interest to define both model structure and model parameters, or in Bayesian terms the posterior model distribution and the posterior parameter distribution (16,17). Our Bayesian model selection framework allows us to determine both the best model structure and the model parameters, each with prior knowledge, i.e. the original model distribution (the construction of a set of competing models and our initial confidence in them) and the original parameter distribution (the allowed values of a kinetic parameter). Both the suggested model structure and the suggested model parameters will contain uncertainty, which can be assessed from the posterior distribution (18,19).
Roughly speaking, the broader a posterior parameter distribution is, the less information the experimental data contains about this parameter and thus the higher the uncertainty. This parameter uncertainty will be carried into model predictions. Note that in some cases not all model parameters must be inferred with low uncertainty to make precise model predictions (20). This is because the system's dynamics may not be susceptible to alterations in those parameters. The applied Bayesian model selection framework allowed us to determine the best model of 10 constructed competing models; however, this does not guarantee the accuracy of the winning model. A crucial step during model development is model validation, whereby independent experimental data, which were not used for model development or model calibration, are used to challenge the chosen model. At this point, it is important to note that any model is a simplification of the true biochemical system and that in many cases a model is developed to explain certain aspects of the system but not all (21). Assumptions and simplifications often dictate under which conditions the system can be described by the developed model. For example, in our 20S proteasome model, we do not include proteasome activators such as PA28αβ and 19S complexes, which may significantly alter the observed dynamics and its regulation. Model validation can help to elucidate the limits and predictive potential of the model.
In most cases, the motivation behind model development is to generate a model to predict system behavior that cannot be observed experimentally. However, here we have created a model that tests different mechanistic hypotheses and derives kinetic parameters to in turn characterize different proteasome isoforms. As we showed that proteasome isoforms differ quantitatively but not qualitatively in the peptides they generate (22) (at least with the sensitivity allowed by the assays applied), we can therefore assume that the overall model structure is the same for different isoforms and that only the kinetic parameters differ. These differences can be acquired by comparing the marginal posterior parameter distributions in a practical rather than statistical manner. That is, differences are detected if the distributions to be compared overlap only slightly and their values with the highest densities clearly vary. On the contrary, if the parameter distributions to be compared cover the same range of possible parameter values, then either the data do not contain sufficient information to detect differences, or the parameters indeed do not differ.
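The practical comparison of marginal posterior distributions described above can be illustrated with a minimal sketch. It assumes that posterior samples are available as plain lists of numbers and quantifies overlap with a histogram-based overlap coefficient; the function name `overlap_coefficient`, the bin count, and all sample values are hypothetical and are not taken from the study.

```python
import random


def overlap_coefficient(samples_a, samples_b, bins=30):
    """Histogram-based overlap of two sample sets (1.0 = identical, 0.0 = disjoint)."""
    lo = min(min(samples_a), min(samples_b))
    hi = max(max(samples_a), max(samples_b))
    if hi == lo:
        return 1.0
    width = (hi - lo) / bins

    def histogram(samples):
        counts = [0] * bins
        for x in samples:
            idx = min(int((x - lo) / width), bins - 1)  # clamp the maximum into the last bin
            counts[idx] += 1
        return [c / len(samples) for c in counts]

    hist_a, hist_b = histogram(samples_a), histogram(samples_b)
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


rng = random.Random(0)
# Invented log10-scale posterior samples for two isoforms (illustration only).
kp_iso1 = [rng.gauss(1.0, 0.1) for _ in range(2000)]
kp_iso2 = [rng.gauss(0.4, 0.1) for _ in range(2000)]   # roughly 4-fold smaller
kin_iso1 = [rng.gauss(0.0, 0.5) for _ in range(2000)]
kin_iso2 = [rng.gauss(0.1, 0.5) for _ in range(2000)]  # broad and nearly identical

overlap_kp = overlap_coefficient(kp_iso1, kp_iso2)     # small: a practical difference
overlap_kin = overlap_coefficient(kin_iso1, kin_iso2)  # large: no detectable difference
```

A small overlap together with clearly separated modes would be read as a difference between isoforms, whereas a large overlap leaves open whether the parameters are equal or the data are simply uninformative.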
In our previous study, parameter inference and subsequent comparison of posterior distributions of the kinetic model parameters (Fig. 1B) showed that 20S s- and i-proteasomes differ in the activity of their catalytic sites, in peptide transport along their inner channels, and in the transport regulation dynamics. Because the peptide transport often seems to be the rate-limiting step of the overall peptide-bond hydrolysis (11), such differences can impinge upon the degradation rate of specific proteins and the generation efficiency of specific antigenic peptides. Quantitative differences in antigenic peptide production, after the downstream antigen presentation steps, can result in an impaired or enhanced CD8+ T cell response in vivo (22–25).

Figure 1. Overview of the computational modeling approach describing the proteasome proteolytic dynamics. In A, the schematic of the compartmentalized proteasome model proposed by Liepe et al. (11) is shown. The model describes all relevant steps involved in substrate degradation. These include (i) peptide transport steps (peptide binding close to the outer site of the gate, peptide influx into the chamber, peptide translocation inside the chamber, and peptide efflux out of the chamber), (ii) substrate hydrolysis steps (peptide binding to the active site and subsequent hydrolysis and peptide binding to the noncatalytic inhibitor site), and (iii) transport regulation (peptide binding to the noncatalytic enhancer site, peptide binding to the noncatalytic inhibitor site, and resulting effects on the conformation of the proteasome gate). The gray chamber represents a simplification of the 20S proteasome catalytic chamber with openings to the outside. The substrate and product peptides (purple) can enter the 20S proteasome chamber upon binding to the outer face of the gate, interact with the regulatory and catalytic sites inside the chamber, and leave the proteasome chamber upon translocation to the proximity of the inner face of the gate. Gray arrows indicate the transport of substrate and product peptides. The orange arrow denotes the hydrolysis reaction where a substrate peptide is transformed into product peptide, thereby releasing the fluorophore. Enhancing regulatory sites inside the chamber are shown in blue with the dashed arrows indicating their effect (transport-enhancing gate conformation). The inhibiting regulatory site outside the chamber is shown in light blue with the dashed arrow indicating its effect (transport-inhibiting gate conformation). The catalytic site consists of an active site (light orange) and an inactive modifier site (dark orange). For details of the model equations, model setup, and model parameters, please refer to Liepe et al. (11). In B, the schematic of Bayesian inference is sketched. Computational models describing biological systems are often parameterized. These parameters can be abstract, or as is the case here, they can be kinetic parameters with a direct physical translation. To learn anything from the model, it is necessary to calibrate the model against experimental data. This can be done in many fashions; however, in the last decade Bayesian inference techniques proved to be powerful model calibration tools. One of the advantages of Bayesian inference is that it estimates not only the model parameters but also their uncertainty. In general, experimental data are collected and a computational model is formulated. Both are then used as input for the Bayesian inference algorithm (here approximate Bayesian computation). The basic concept of the algorithm is to test all sorts of combinations of parameters (through a defined sampling scheme) and simulate the model with those parameter combinations. If the model simulations correspond well to the experimental data, the corresponding parameter combination is accepted; otherwise the parameter combination is discarded. This is done repeatedly until a certain number of accepted parameter combinations is reached, which then construct the so-called posterior parameter distribution. This posterior distribution contains all information about the separate model parameters as well as their dependences among each other. The outputs of Bayesian inference are therefore the model fits of the experimental data and the posterior parameter distributions. Calibrating the same model to experimental data generated under different conditions (here different proteasome isoforms) allows us to compare the obtained posterior parameter distributions and detect which model parameters differ for the different conditions and which parameters are not influenced.
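The accept/reject logic described in the legend of Fig. 1B can be illustrated with a minimal sketch of approximate Bayesian computation in its simplest (rejection) form. This is not the compartmentalized proteasome model or the sampling scheme used in the study: the toy model is a plain Michaelis-Menten time course, and the priors, noise level, tolerance `epsilon`, and all function names are assumptions chosen for illustration.

```python
import math
import random


def simulate_product(vmax, km, s0=160.0, t_end=60.0, dt=0.5):
    """Euler integration of dP/dt = vmax * S / (km + S); returns P after each step."""
    substrate, product, trace = s0, 0.0, []
    for _ in range(int(t_end / dt)):
        converted = min(vmax * substrate / (km + substrate) * dt, substrate)
        substrate -= converted
        product += converted
        trace.append(product)
    return trace


def distance(simulated, observed):
    """Root-mean-square deviation between a simulated and an observed time course."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(simulated, observed)) / len(observed))


def abc_rejection(observed, n_accept, epsilon, rng):
    """Sample from the prior and keep parameter pairs whose simulation is close to the data."""
    accepted = []
    while len(accepted) < n_accept:
        vmax = 10 ** rng.uniform(-1, 1)  # assumed log-uniform prior, 0.1 to 10
        km = 10 ** rng.uniform(1, 3)     # assumed log-uniform prior, 10 to 1000
        if distance(simulate_product(vmax, km), observed) < epsilon:
            accepted.append((vmax, km))
    return accepted


rng = random.Random(1)
# Synthetic "experimental" data generated from invented parameter values plus noise.
data = [p + rng.gauss(0.0, 1.0) for p in simulate_product(2.0, 100.0)]
posterior = abc_rejection(data, n_accept=100, epsilon=3.0, rng=rng)
```

The list `posterior` plays the role of the posterior parameter distribution: its spread reflects parameter uncertainty, and the accepted pairs can be inspected jointly to see dependences between parameters.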
The efficiency of the antigenic peptide generation by i-proteasome plays a key role in the negative selection of thymocytes because medullary professional antigen-presenting cells mainly express this proteasome isoform (5). The cortical thymic epithelial cells, on the contrary, mainly express the t-proteasome, although s-proteasome catalytic subunits have also been detected (10,26). t-proteasome influences the CD8+ T cell repertoire and the response to infection in mice (10, 27–29). Some evidence hints toward a unique t-proteasome proteolytic activity, which, in mice, would lead to the generation of t-proteasome-specific antigenic peptides with peculiar features promoting the positive selection of thymocytes (5,28,30). As a consequence, the difference between the proteolytic dynamics of human i- and t-proteasomes is supposed to have a large impact on the central tolerance and, thus, on the T cell repertoire and the efficacy of CD8+ T cells to recognize infected or aberrant cells and eliminate them in the human body.
To study in which aspects of proteolytic dynamics the two human proteasome isoforms diverge, we have coupled biochemical experiments to bioinformatics analyses. We have made use of the previously developed computational model to infer the kinetic parameters of human 20S t-proteasomes compared with human 20S s- and i-proteasomes. We have taken advantage of Bayesian inference to obtain posterior parameter distributions that capture not only the most plausible parameter values but also the information and uncertainty carried by the experimental data. The latter is of particular interest when aiming to detect differences between the proteasome isoforms. Based on the inferred kinetic parameters, we have performed model simulations to identify the rate-limiting steps and peptide transport dynamics of human 20S t-proteasome.
In a second step of the study, we have investigated the quantitative differences between the three 20S proteasome isoforms in substrate cleavage-site preferences and epitope production. Because of the immunological implications that such differences can have, we have used synthetic polypeptide substrates derived from antigens associated with tumors and multiple sclerosis, which are two examples of diseases where the MHC-I-presented epitopes are therapeutically relevant.

Human 20S t-proteasome differs from s- and i-proteasomes in its proteolytic dynamics
Human β5t subunit has been previously detected in different forms of human thymoma (31,32), which are tumors originating from the epithelial cells of the thymus. We have tested whether the t-proteasome subunit was detectable in other cancer-derived cell lines. The mRNA of PSMB11, which is the gene encoding the human β5t proteasome subunit, is also detectable in several tumor-derived or immortalized cell lines by RT-PCR (Fig. 2A); however, its expression does not lead to a detectable quantity of β5t subunit through the use of a standard proteomics strategy (Fig. 2B). This result confirms that, in humans as well, the expression of the β5t proteasome subunit seems to be limited to the thymus. Therefore, not having access to enough human cortical thymic epithelial cells, we generated a cell line, C5.5. This cell line is derived from the human lymphoblastoid cell line T2. The C5.5 cell line expresses mainly the β1i, β2i, and β5t subunits, with a β5:β5t subunit ratio of 1:2.5 according to our quantitative proteomics analysis carried out with AQUA peptides (Fig. 2, C and D). This mixed-type 20S proteasome, here referred to as t-proteasome, purified from the C5.5 cell line has been compared in our study with either the 20S s-proteasome derived from parental T2 or the intermediate-type 20S proteasome purified from Epstein-Barr virus-immortalized lymphoblastoid cell lines (LCLs) that has a β5:β5i subunit ratio of 1:2.5 or larger (22) and has often been used as an example of i-proteasome.
To investigate the proteolytic dynamics of human 20S t-proteasome as compared with 20S s- and i-proteasomes, we have adopted an approach that integrates an extensive set of in vitro degradation kinetics with computational modeling (Fig. 1, A and B). We have first purified 20S proteasomes from T2, LCL, and C5.5 cells and used these proteasomes to perform in vitro degradation kinetics of the short fluorogenic peptides Suc-LLVY-MCA and Z-LLE-MCA, two substrates specific for the chymotrypsin-like and caspase-like activities of proteasomes, respectively. The MCA group of these substrates is released upon endopeptidase cleavage by proteasomes, and its fluorescence can be measured quantitatively. Murata et al. (10) showed that the degradation rate of Suc-LLVY-MCA, which is mainly carried out by the β5/β5i/β5t subunits, diverges between mouse i- and t-proteasomes. In contrast, the cleavage rate of Z-LLE-MCA, which is mainly carried out by the β1/β1i subunits, should not be different between human i- and t-proteasomes because they carry similar amounts of the β1i subunit (Fig. 2C). Furthermore, although these short fluorogenic substrates do not recapitulate the full substrate specificity of proteasomes (22), they have been successfully used to discriminate between 20S s- and i-proteasome dynamics by the development of a computational model and its calibration with time-course data (11) (Fig. 1). We have here applied the same computational modeling and model calibration approach on Z-LLE-MCA and Suc-LLVY-MCA digestion kinetics of 20S s-, i-, and t-proteasomes. Using Bayesian inference techniques, we have obtained model fits to the experimental data by estimating the model parameters, resulting in a posterior parameter distribution for each 20S proteasome isoform (Figs. S1 and S2). The posterior parameter distribution contains information on possible kinetic parameter values able to explain the experimental data sets and their relationship with each other.
The kinetic differences between 20S s- and i-proteasomes confirm our previous results (11) and therefore the correct setup of our study (Fig. 3, A and B).
Regarding the comparison of the two proteasome isoforms mainly involved in the positive/negative selection, no difference emerges between 20S t- and i-proteasomes when the estimated parameter distributions obtained from Z-LLE-MCA digestions are compared (Fig. S3). On the contrary, in the Suc-LLVY-MCA degradation kinetics, we have found differences related to the active-site parameters between these two 20S proteasome isoforms (Fig. 3A). Most apparent is the hydrolysis strength, k_p (for explanations, see Table 1), which appears to be ~4-fold smaller in 20S t-proteasome compared with 20S i-proteasome and therefore recapitulates the k_p observed in s-proteasome. Also, the dissociation constant of the peptide to the substrate-binding site (K_aS), which is slightly increased in 20S t-proteasome compared with 20S i-proteasome, results again in a similar K_aS as observed in 20S s-proteasome (Fig. 3A). In addition, the peptide transport and transport regulation parameters differ, indicating that the subunit exchange has not only local but also global effects on 20S proteasome dynamics (Fig. 3B).

Peptide transport dominates the substrate degradation in human t-proteasome
By studying the inferred posterior parameter distributions, we can determine the rate-limiting steps of the reaction. Previously, we showed that for both 20S s- and i-proteasomes, the gate conformation, which determines the peptide transport (influx and efflux), is often the rate-limiting step (11). By in silico simulations, we have now found that although several kinetic parameters significantly differ between 20S i- and t-proteasomes, for the Suc-LLVY-MCA substrate the rate-limiting step of 20S t- and i-proteasomes is the peptide transport (Fig. 3C), whereas for the Z-LLE-MCA substrate the rate-limiting steps are primarily the peptide-bond hydrolysis at the active site and secondarily the peptide transport (Fig. 3D).
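The idea behind such a rate-limiting step analysis can be illustrated with a minimal sketch: each step is sped up in turn, and the step whose acceleration changes product formation the most is the rate-limiting one. The two-step linear chain below (influx followed by hydrolysis) is a deliberately simplified stand-in for the full proteasome model, and the rate constants and function names are hypothetical.

```python
def product_after(k_transport, k_hydrolysis, s0=160.0, t_end=60.0, dt=0.01):
    """Euler integration of S_out -> S_in -> P with first-order steps; returns P at t_end."""
    outside, inside, product = s0, 0.0, 0.0
    for _ in range(round(t_end / dt)):
        influx = k_transport * outside * dt
        cleaved = k_hydrolysis * inside * dt
        outside -= influx
        inside += influx - cleaved
        product += cleaved
    return product


def fold_changes(base_rates, factor):
    """Fold change of product formation when each step is accelerated separately."""
    reference = product_after(**base_rates)
    result = {}
    for step in base_rates:
        scaled = dict(base_rates)
        scaled[step] *= factor
        result[step] = product_after(**scaled) / reference
    return result


# Invented rate constants (per minute): slow transport, fast hydrolysis.
base_rates = {"k_transport": 0.002, "k_hydrolysis": 0.5}
changes = fold_changes(base_rates, factor=2.0)
rate_limiting = max(changes, key=changes.get)
```

With these invented values, doubling the transport rate nearly doubles product formation, whereas doubling the hydrolysis rate has almost no effect, so transport is identified as rate-limiting in this toy setting.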
Peptide transport also regulates how much substrate and product are located inside the proteasome chamber over time (11). The local substrate concentration around the active site Thr1 inside the proteasome chamber then strongly influences peptide hydrolysis. For a fast substrate turnover, the peptide flux through the chamber should be large enough to allow sufficient supply of new substrate molecules and sufficient efflux of product molecules. Furthermore, the peptide flux should be such that the substrate concentration around the active site is high enough to obtain reaction velocities at a level of v_max but low enough to avoid substrate inhibition (11).
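Why an intermediate local substrate concentration is optimal can be illustrated with the textbook substrate-inhibition rate law, v = v_max * S / (K_m + S + S^2 / K_i). This is a generic form assumed here for illustration, not the rate expression of the proteasome model (11), and the constants below are invented.

```python
import math


def velocity(s, vmax, km, ki):
    """Textbook substrate-inhibition rate law (assumed form, not the study's model)."""
    return vmax * s / (km + s + s * s / ki)


vmax, km, ki = 1.0, 50.0, 200.0  # invented constants, arbitrary units
s_optimal = math.sqrt(km * ki)   # concentration at which the velocity peaks

v_low = velocity(0.5 * s_optimal, vmax, km, ki)   # too little substrate
v_peak = velocity(s_optimal, vmax, km, ki)        # optimal filling
v_high = velocity(2.0 * s_optimal, vmax, km, ki)  # substrate inhibition sets in
```

In this sketch the velocity rises with the local concentration up to `s_optimal` and declines beyond it, which is the qualitative behavior that makes the filling dynamics of the chamber relevant for the observed degradation rate.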
Taking this into account, differences in peptide transport between proteasome isoforms and thus in the filling dynamics of the proteasome should strongly influence the observed substrate degradation rates. To investigate this aspect, we have used the estimated posterior parameter distributions for 20S i- and t-proteasomes to calculate in silico the amount of substrate and products inside the proteasome chamber over time. In the case of Z-LLE-MCA, our simulations show only minor differences in the filling kinetics between 20S i- and t-proteasomes (Fig. 4A). In contrast, in the case of Suc-LLVY-MCA, our simulations suggest that the chamber of 20S t-proteasome is filled more slowly with substrate and product molecules than the 20S i-proteasome chamber (Fig. 4B). For both 20S proteasome isoforms, our simulations suggest that, in our experimental conditions, after 6 h of reaction an equal proportion of substrate and product molecules is present inside the chamber. However, this equal proportion is reached faster by 20S i-proteasome than 20S t-proteasome, indicating a stronger peptide-bond hydrolysis activity by 20S i-proteasome, which could be reflected by its higher k_p value (Fig. 3A). On the contrary, the peptide-bond hydrolysis of the Z-LLE-MCA substrate does not differ between these two isoforms, and it is much lower compared with that of the Suc-LLVY-MCA substrate. In summary, these analyses highlight that the overall substrate degradation differences between human 20S i- and t-proteasomes cannot be explained only by differences in the active-site subunits, but they can result from dynamical differences that regulate the peptide transport efficiencies.

20S proteasome isoforms quantitatively differ in substrate cleavage-site strength and the generation of self-epitopes
To introduce a further degree of complexity in our experimental approach, we have analyzed, by mass spectrometry (MS), digests obtained after incubation of 20S proteasomes with eight synthetic polypeptides, three derived from the melanoma-associated antigen gp100 PMEL17 and five from myelin sheath proteins, which are the main autoantigens attacked by CD8+ T cells in multiple sclerosis (Table S1). We have selected these substrates because of their immunogenicity toward CD8+ T cells associated with either tumor recognition or autoimmune response against oligodendrocytes in multiple sclerosis. In other words, epitopes derived from these antigenic sequences are known to be detected at the cell surface by auto-reactive CD8+ T cells. These lymphocytes specifically detect tumor-associated antigens, e.g. gp100, and multiple sclerosis-associated antigens, e.g. myelin oligodendrocyte glycoprotein (MOG) and myelin basic protein (MBP), which can be expressed in medullary thymic antigen-presenting cells. Nonetheless, they survived negative selection in the thymus and are present in the periphery as shown in various studies (33–37). By MS analysis of the peptide products generated by 20S s-, i-, and t-proteasomes, we have identified 510 nonspliced and 67 spliced peptide products. All spliced and nonspliced peptide products are quantifiable in the digestions. All 20S proteasome isoforms cleave the substrates between the same residues.

Figure 3 (legend, continued). Briefly, K_aS and K_aP are the dissociation constants of substrate (S) and product (P) to active site(s); k_p is the peptide-bond hydrolysis rate at active site(s); β is the factor by which k_p is multiplied upon binding to inhibitory site(s); n_a and n_i are the Hill coefficients for binding to the active site(s) and the inhibitor site(s); K_iS and K_iP are the dissociation constants of substrate (S) and product (P) to inhibitor site(s); v_in and v_out are the peptide influx and efflux rates, and v_in/v_out is their ratio; k_off/k_on is the ratio between the dissociation and the association rates to the gate; and R_off/R_on is the ratio between the unbinding and binding rates to the enhancing regulator site(s). The meaning of all parameters is depicted in Fig. 1 and Table 1. C and D, analysis of rate-limiting steps in 20S i- and t-proteasomes. Depicted is the -fold change of product formation (y axis) upon increase (by a factor of; x axis) of a specific reaction step for the degradation of Suc-LLVY-MCA (C) and Z-LLE-MCA (D) as simulated by our computational model of the 20S proteasome dynamics. The initial substrate concentration for this analysis is 160 µM, and the -fold change is determined after 60-min reaction relative to the experimentally measured proteasome kinetics (factor = 1). The mean of 1000 in silico predictions (colored lines) is plotted over time for the degradation of the substrates with the same initial substrate concentrations as in the experiments. The rate-limiting steps are those for which the increase leads to the largest -fold change.

Table 1. List of mathematical model parameters. The table and the meaning of the parameters have been previously published by Liepe et al. (11), and they correspond to the parameters indicated in Fig. 1A.

The fact that none of the three 20S proteasome isoforms uses a substrate cleavage site that is not also used by the other two does not exclude that substrate cleavage sites can be preferentially used by one of the 20S proteasome isoforms. We have verified this hypothesis through investigation of proteasome-mediated digestion kinetics of four substrates (i.e. gp100 201-230 , gp100  , and MBP 102-129 ). We have adopted a quantitative strategy in our MS analysis, using the quantification with minimal effort (QME) methodology. QME estimates the absolute content of spliced and nonspliced peptide products based on their MS peak area, measured in the digestion probe (38). QME can also estimate the frequency of use of each substrate cleavage site by proteasomes, i.e. the substrate cleavage-site strength (SCS).
In agreement with our hypothesis, we have observed quantitative differences among the three 20S proteasome isoforms in the substrate cleavage sites predominantly used (Fig. 5). There are no amino acids that are clearly preferred by one of the 20S proteasome isoforms rather than the others for peptide-bond hydrolysis, likely because the peptide sequence motifs (8–10 residues) surrounding the cleavage site influence the frequency of usage of that cleavage site. For instance, the gp100 Leu225 residue is seldom used by 20S t-proteasome (Fig. 5A), whereas the MBP Leu112 (Fig. 5B) and the MOG Leu193 residues are scarcely used by 20S i-proteasome (Fig. 5C), which conversely prefers the gp100 Leu39 residue (Fig. 5D). For a more systematic comparison of the SCSs of the 20S proteasome isoforms, we performed pairwise correlations between their SCSs. We have found significant pairwise correlations between SCSs of all three 20S proteasome isoforms, thereby indicating that their overall catalytic activity is comparable (Fig. 6). However, when we analyzed the SCS of specific substrates, differences emerged. These differences are due to specific substrate cleavage sites, which are used by all three 20S proteasome isoforms although with divergent frequencies (e.g. gp100 Phe215, Leu225, Asp226, Leu39, and Ala55).
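A pairwise comparison of SCS profiles of this kind can be illustrated with a minimal sketch. The text does not state which correlation coefficient was used, so Pearson correlation is an assumption here, and the SCS values below are invented numbers for five hypothetical cleavage sites, not data from the study.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)


# Invented SCS values (percent usage of each cleavage site) per isoform.
scs = {
    "s": [5.0, 20.0, 40.0, 10.0, 25.0],
    "i": [8.0, 15.0, 45.0, 4.0, 28.0],
    "t": [6.0, 22.0, 35.0, 12.0, 25.0],
}

# One correlation per isoform pair: (s, i), (s, t), (i, t).
correlations = {
    (a, b): pearson(scs[a], scs[b]) for a, b in combinations(scs, 2)
}

# Site with the largest difference in usage between i- and t-proteasome.
most_divergent_site = max(
    range(len(scs["i"])), key=lambda k: abs(scs["i"][k] - scs["t"][k])
)
```

High pairwise correlations combined with a few strongly divergent sites correspond to the pattern described above: broadly comparable cleavage-site usage overall, with isoform-specific frequencies at individual sites.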
These quantitative differences in SCSs are also reflected in the generation of specific peptide products, including some that have already been shown to be epitopes recognized by CD8+ T cells. For instance, the epitope MBP 111-119, which is an HLA-A*02:01-binding epitope recognized in multiple sclerosis patients and healthy donors (39–43), is better generated by 20S t-proteasome than 20S i-proteasome. However, this phenomenon is epitope-specific because the generation of the epitope MBP 107-115 (41) is not favored when carried out by human 20S t-proteasome as compared with 20S i-proteasome (Fig. 7).

Discussion
Diverging from what was shown for mouse 20S t-proteasome (30), human 20S t-proteasome does not seem to possess a unique proteolytic activity in processing the self-antigen substrates included in this study and in our experimental conditions. We show here, however, that human 20S t- and i-proteasomes differ in their catalytic activity, peptide transport, and transport regulation. The differences in the peptide transport are particularly relevant because the latter is often the rate-limiting step, at least in the degradation of short peptides, as also shown here for human 20S t-proteasome. Differences in the peptide transport imply alterations of the overall 20S proteasome dynamics and long-range effects over the entire proteasome chamber due to the incorporation of the β5t subunit. As a consequence, the quantity of peptides produced by the β1i and β2i subunits (present in both t- and i-proteasomes) can differ depending on whether they are incorporated into 20S t- or i-proteasome because the peptide transport, and therefore the concentration and dynamics of peptides in these proteolytic pockets, could be altered too.
These dissimilarities in catalytic activity, peptide transport, and transport regulation can explain the variable preferences of human proteasome isoforms for specific substrate-cleavage sites. The fact that we have not observed cleavage sites only used by either 20S t- or i-proteasome cannot be explained by the presence of standard catalytic subunits in our 20S t- and i-proteasome preparations. Indeed, if a substrate cleavage site were used only by 20S i-proteasome, for instance, we would not detect it in the 20S t-proteasome digestions and vice versa. Therefore, although in our experimental setup we cannot exclude the existence of substrate cleavage sites exclusively used by human 20S s-proteasome, we can exclude substrate cleavage sites exclusively used by either 20S i- or t-proteasome in the sequences analyzed in this study, as previously demonstrated already for human and mouse 20S i-proteasomes (22). This does not exclude the possibility that improving MS sensitivity could show a qualitative difference in the spliced and nonspliced products between 20S proteasome isoforms, and this also does not provide direct information about differences between 20S proteasome isoforms coupled to 19S and other regulatory complexes.
We have observed, however, strong variation from proteasome isoform to proteasome isoform in the quantity of peptides produced, including MHC-I-binding epitopes, as shown here for the epitope MBP111-119. These quantitative differences in peptide generation can lead to a negligible presentation of epitopes at the cell surface as demonstrated in previous studies where mouse 20S s- and i-proteasomes have been compared (22,24,25). Our observations, therefore, are compatible with the hypothesis that cells expressing human thymoproteasome (both 20S proteasomes and 20S proteasomes coupled to regulatory complexes) could present on the cell surface some specific "private" epitopes, which could be directly involved in the positive selection of thymocytes (5). These private epitopes would not be presented by cells expressing i-proteasome, even if they were produced in the intracellular space, because of the progressive reduction of the peptide amounts during the steps of the antigen presentation pathway (44). Different proteolytic dynamics between human i- and t-proteasomes could also result in significant variation in the turnover of specific antigens and therefore in the antigenic landscape of cells expressing either t- or i-proteasome, which would further favor the "private epitopes" hypothesis.
Our study also shows that 20S t-proteasomes can generate spliced peptides, which are thought to represent a large portion of the MHC-I immunopeptidome, i.e. the peptides bound to MHC-I molecules (45)(46)(47). The generation of spliced epitopes by t- and i-proteasomes in the thymus could strongly impinge upon our models of central tolerance and discrimination between self and nonself by our immune system (48).

Cell lines
To generate a cell line stably overexpressing the proteasome β5t subunit (protein, A5LHX3; gene, PSMB11), the thymic cDNA was transcribed from human thymic total RNA (Clontech) using a Transcriptor First Strand cDNA Synthesis kit (Roche Applied Science) according to the manufacturer's instructions. For amplification of the human β5t sequence (PSMB11) by PCR, the following primers were used: Fw, 5′-gggatggctctgcaggatgtgtgc, and Rev, 5′-ctcacaccgtctcagtccctgc. The PCR product was first inserted into pcDNA3.1(+) (Thermo Fisher Scientific) and then recloned into a pSG5 vector (Stratagene) via EcoRI/BamHI. T2 cells were cultured in RPMI 1640 medium supplemented with fetal bovine serum to a final concentration of 10% and 2 mM L-glutamine in a 5% CO2 atmosphere. Cells were transfected with pSG5/PSMB11, pSG5/PSMB9 (lab stock), and pSVneo in equal amounts using Amaxa Cell Line Nucleofector Kit C (Lonza). Stably transfected cells were selected with 1 mg/ml G418, and the expression was checked by PCR and Western blotting. The positive cells were isolated and cultured again. The clone C5.5, which was transfected with β5t (PSMB11) and β1i (PSMB9) subunits, was selected for the study.
To verify the endogenous expression of the β5t subunit, the following cell lines were grown in basal Iscove's medium supplemented with 10% fetal calf serum, 2 mM glutamine, 100 units/ml penicillin, and 0.1 mg/ml streptomycin in a 5% CO2 atmosphere: (i) T2 and C5.5 cell lines; (ii) LCLs; (iii) HeLa cells and

Western blotting and β5t subunit identification and quantification
Proteasome subunits have been revealed by Western blot assays as described previously (22).
LC-MS/MS analyses of peptides have been performed as follows. The sample has been concentrated for 4 min on a trap column (PepMap C18, 5 mm × 300 µm × 5 µm, 100 Å; Thermo Fisher Scientific) with 2:98 (v/v) acetonitrile/water containing 0.1% (v/v) TFA at a flow rate of 30 µl/min and then analyzed by nanoscale LC-MS/MS measurements using a Q Exactive Plus mass spectrometer coupled with an Ultimate 3000 RSLCnano (Thermo Fisher Scientific). The system comprised a 75-µm inner diameter × 250 mm nano-LC column (Acclaim PepMap C18, 2 µm, 100 Å; Thermo Fisher Scientific). Mobile phase A was 0.1% (v/v) formic acid in water, and mobile phase B was 80:20 (v/v) acetonitrile/water containing 0.1% (v/v) formic acid. The elution has been carried out using a gradient of 3-43% mobile phase B in 80 min with a flow rate of 300 nl/min. Full MS spectra (m/z 350-1,600) have been acquired at a resolution of 70,000 (full width at half-maximum) followed by a data-dependent MS/MS fragmentation of the top 10 precursor ions (resolution, 17,500; 1+ charge state excluded; isolation window of 1.6 m/z; normalized collision energy of 27%; dissociation method, higher-energy collisional dissociation). The maximum ion injection time for MS scans has been set to 50
β5/β5t subunit quantification of purified C5.5 20S proteasome reported in Fig. 2D has been performed after LC separation of tryptic peptides on a MALDI-TOF/TOF mass spectrometer. In the assay (n = 2), 5 µg of proteasome has been reduced, alkylated, and digested with trypsin as described elsewhere (51). Aliquots of the sample spiked with 500 fmol of each AQUA peptide have been analyzed by LC-MS/MS on a 4700 Proteomics Analyzer (AB SCIEX, Framingham, MA) off-line coupled with a Dionex UltiMate 3000 RSLC system and Probot fractionation device (Thermo Scientific, Idstein, Germany) as described previously (52).
The calculation of the absolute amount of the tryptic peptides has been performed by comparison of the MS peak areas with those of the corresponding AQUA peptides. Based on the absolute amounts, the relative ratio β5:β5t has been determined as follows: β5t/(β5 + β5t).
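The arithmetic behind this quantification can be sketched as follows. This is a minimal illustration that assumes a simple proportionality between MS peak area and peptide amount; the function names and the peak-area values are hypothetical and are not taken from the measurements reported in Fig. 2D. Only the 500-fmol spike and the ratio β5t/(β5 + β5t) come from the text.

```python
def absolute_amount_fmol(endogenous_area, aqua_area, aqua_spike_fmol=500.0):
    """Estimate the amount of a tryptic peptide from its peak area.

    Assumes (for illustration) that the peak-area ratio between the
    endogenous peptide and its AQUA standard equals their molar ratio.
    """
    if aqua_area <= 0:
        raise ValueError("AQUA peak area must be positive")
    return endogenous_area / aqua_area * aqua_spike_fmol


def beta5t_fraction(beta5_fmol, beta5t_fmol):
    """Relative content of beta5t, computed as beta5t / (beta5 + beta5t)."""
    total = beta5_fmol + beta5t_fmol
    if total <= 0:
        raise ValueError("total amount must be positive")
    return beta5t_fmol / total


# Hypothetical peak areas, chosen only to make the arithmetic easy to follow.
beta5_fmol = absolute_amount_fmol(endogenous_area=2.0e6, aqua_area=1.0e7)
beta5t_fmol = absolute_amount_fmol(endogenous_area=6.0e6, aqua_area=1.0e7)
fraction = beta5t_fraction(beta5_fmol, beta5t_fmol)
print(f"beta5 = {beta5_fmol:.0f} fmol, beta5t = {beta5t_fmol:.0f} fmol, "
      f"beta5t fraction = {fraction:.2f}")
```

With these invented areas, the β5 and β5t peptides correspond to 100 and 300 fmol, so β5t accounts for 0.75 of the β5-type subunits.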

RT-PCR and validation
Total RNA has been isolated from human cell lines using a High Pure RNA Isolation kit (Roche Applied Science) in the presence of DNase according to the manufacturer's protocol. The cDNA has been obtained from 1 µg of total RNA using a PrimeScript RT Reagent kit (TaKaRa Bio Inc.). The RT-PCR has been performed with 1 µl of cDNA for 25 cycles at an annealing temperature of 56°C. For the RT-PCR specific for the huPSMB11 gene, the primers huPSMB11 Fw (5′-gggatggctctgcaggatgtgtgc) and huPSMB11 Rev (5′-ctcacaccgtctcagtccctgc) have been used, thereby generating a product of 907 bp. To control the efficiency of RT-PCR, actin has been amplified from the cDNA by RT-PCR using the following actin-specific primers: actin Fw, 5′-ctcaccatggatgatatcg; and actin Rev, 5′-tcgtcatactcctgcttgctg. For the verification of the β5t subunit's RT-PCR products, the PCR products have been eluted from the agarose gel, inserted into pCR2.1 Topo (Thermo Fisher), and sequenced using the T7 primer by LGC Genomics GmbH (LGC Group). The latter step has confirmed the specificity of the amplified sequence marked in Fig. 2A.

Computational analysis of the proteasome dynamics
20S proteasome degradation dynamics have been assessed using an integrative modeling approach with Bayesian model calibration to the in vitro degradation of the short fluorogenic peptides Suc-LLVY-MCA and Z-LLE-MCA. In vitro degradation kinetics (n = 3-5) have been performed with different substrate concentrations (0-480 µM) in 100 µl of TEAD buffer (20 mM Tris, 1 mM EDTA, 1 mM NaN3, 1 mM DTT, pH 7.2) at 37°C as described previously (22). For the analysis, we have used the computational model of 20S proteasome activity published previously (11). In essence, by applying an approximate Bayesian computation to fit the model to the experimental data, we have obtained posterior parameter distributions for each proteasome isoform. The comparison of the marginal posterior parameter distributions has allowed us to detect differences in the kinetic parameters between the different isoforms. For the Bayesian inference, we have used the package ABC-SysBio (12) implemented in Python with GPU support (53). The prior distributions for all parameters were uniform as described previously (11). Furthermore, all other algorithm parameters have been kept as in Liepe et al. (11). The posterior analysis has been carried out in R (54). Computation of rate-limiting steps has been performed as described in Liepe et al. (11).
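The principle of approximate Bayesian computation with uniform priors can be illustrated with a deliberately simplified sketch. It is not the procedure used in this study: it applies plain ABC rejection sampling rather than the scheme implemented in ABC-SysBio, and it replaces the published 20S proteasome model (11) with a two-parameter Michaelis-Menten rate law as a stand-in. All function names, prior ranges, the tolerance, and the "observed" data are assumptions made for illustration; only the substrate range of 0-480 µM follows the text.

```python
import math
import random

# Substrate concentrations in µM, spanning the 0-480 µM range of the assays.
SUBSTRATE_UM = [0.0, 30.0, 60.0, 120.0, 240.0, 480.0]


def michaelis_menten(vmax, km, substrate):
    """Toy rate law standing in for the full proteasome model."""
    return vmax * substrate / (km + substrate)


def simulate(vmax, km):
    """Simulated initial rates at every substrate concentration."""
    return [michaelis_menten(vmax, km, s) for s in SUBSTRATE_UM]


def rmsd(simulated, observed):
    """Root-mean-square distance between simulated and observed rates."""
    squared = sum((x - y) ** 2 for x, y in zip(simulated, observed))
    return math.sqrt(squared / len(observed))


def abc_rejection(observed, n_draws, epsilon, rng):
    """Keep prior draws whose simulation lies within epsilon of the data."""
    accepted = []
    for _ in range(n_draws):
        vmax = rng.uniform(0.0, 20.0)   # uniform prior (assumed range)
        km = rng.uniform(1.0, 200.0)    # uniform prior (assumed range)
        if rmsd(simulate(vmax, km), observed) < epsilon:
            accepted.append((vmax, km))
    return accepted


rng = random.Random(0)
observed = simulate(10.0, 50.0)  # synthetic, noise-free "data"
posterior = abc_rejection(observed, n_draws=20000, epsilon=0.5, rng=rng)
mean_vmax = sum(p[0] for p in posterior) / len(posterior)
mean_km = sum(p[1] for p in posterior) / len(posterior)
print(len(posterior), round(mean_vmax, 2), round(mean_km, 1))
```

The accepted parameter pairs approximate the joint posterior. Running the same sampler on the data of each isoform and comparing the marginal distributions of each parameter mirrors, in a much reduced form, the comparison described above.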

20S proteasome purification
20S proteasomes have been purified from T2, C5.5, and LCLs as described previously (55). Proteasome concentration has been measured by Bradford assay and verified by Coomassie staining in an SDS-PAGE gel as shown elsewhere (56). The purity of the standardized proteasome preparations has been shown previously (22). LCL and C5.5 cell lines mainly express human i-proteasome (22,57) or t-proteasome, respectively (Fig. 2). LCL proteasome has been used as an example of i-proteasome in several previous studies (22,38,39,50,57-61). The T2 cell line expresses only s-proteasome (Fig. 2). The three proteasomes have been purified in parallel to minimize artifacts due to the purification or storage conditions.

Peptides and in vitro digestion of synthetic polypeptides
The sequence enumeration for the polypeptide substrates is reported in Table S1. All peptides have been synthesized using Fmoc solid-phase chemistry. Synthetic polypeptides (20-40 µM) have been digested by 1-3 µg of 20S proteasomes in 100 µl of TEAD buffer over time at 37°C as described previously (22).

Quantitative analysis of peptide products by QME and MS
Liquid chromatography-MS analyses of polypeptide digestion products have been performed as described previously (52) with the ESI-ion trap instrument DECA XP MAX (Thermo Fisher Scientific). Database searching has been performed using SpliceMet ProteaJ (52). Quantification of peptides produced in the experiments has been carried out by applying the QME method to the LC-MS analyses. QME estimates the absolute content of spliced and nonspliced peptide products based on their MS peak areas measured in the digestion sample. QME is an optimization tool that makes use of the law of mass conservation and MS instrument features. The QME algorithm parameters were empirically computed in our previous study (38) and have been applied here. By applying QME, we have also calculated the SCS, which describes the relative frequencies of proteasome cleavage after any given residue of the synthetic polypeptide substrate. SCS values shown in this study are the average of the SCS values measured in kinetic assays.
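One way to picture how a cleavage-frequency profile follows from quantified products is sketched below. This is an assumed, simplified formulation and not the published QME or SCS algorithm: it considers only nonspliced products, credits each product's amount to the cleavage sites at its two termini, and normalizes the totals to percentages. The function name, the product list, and the amounts are hypothetical.

```python
from collections import defaultdict


def cleavage_strength(products, substrate_length):
    """Relative cleavage frequency (%) after each substrate residue.

    products: iterable of (start, end, amount) with 1-based, inclusive
    positions of a nonspliced product within the substrate.
    Assumption: a product ending at residue `end` reports a cleavage after
    `end`, and a product starting at `start` reports one after `start - 1`;
    the substrate's own termini are not cleavage sites.
    """
    totals = defaultdict(float)
    for start, end, amount in products:
        if start > 1:
            totals[start - 1] += amount
        if end < substrate_length:
            totals[end] += amount
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {site: 100.0 * value / grand_total
            for site, value in sorted(totals.items())}


# Hypothetical digestion of a 10-residue substrate (amounts in pmol).
products = [(1, 4, 30.0), (5, 10, 30.0), (1, 7, 10.0), (8, 10, 10.0)]
scs = cleavage_strength(products, substrate_length=10)
print(scs)
```

In this invented example, cleavage after residue 4 accounts for 75% and cleavage after residue 7 for 25% of the observed cleavage events. Averaging such profiles over the time points of a kinetic assay would correspond to the averaging step mentioned above.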

Statistical analysis
SCS correlations have been computed using Pearson's correlation coefficient. Significance tests have been performed, testing for association between paired samples based on Pearson's correlation coefficient. All data analysis has been implemented in R (54). Sample size and number of replicates are disclosed in the figure and table captions.
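The correlation analysis can be illustrated with a short sketch. The analysis in this study has been carried out in R; the Python version below is only an illustration, and the two SCS profiles are invented numbers. It computes Pearson's r and the associated t statistic, t = r·sqrt(n − 2)/sqrt(1 − r²), which under the null hypothesis of no association follows a t distribution with n − 2 degrees of freedom; the p value itself is not computed here.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two paired samples."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two paired samples with at least 3 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero (n - 2 degrees of freedom)."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


# Hypothetical SCS values (%) at four cleavage sites for two isoforms.
scs_isoform_a = [10.0, 20.0, 30.0, 40.0]
scs_isoform_b = [12.0, 18.0, 33.0, 37.0]

r = pearson_r(scs_isoform_a, scs_isoform_b)
t = t_statistic(r, len(scs_isoform_a))
print(f"r = {r:.3f}, t = {t:.2f}, df = {len(scs_isoform_a) - 2}")
```

A high r between the SCS profiles of two isoforms indicates that they use the same cleavage sites with similar relative frequencies, which is the comparison this test is meant to support.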