Tolerance to amino acid variations in peptides binding to the major histocompatibility complex class I protein H-2Kb.

Major histocompatibility complex (MHC) class I molecules are cell-surface glycoproteins that bind peptides and present them to T cells. The formation of a peptide-MHC complex is the initial step in specific, T cell-mediated immune responses. But, unlike other receptor-ligand systems, peptides are essential for a stable conformation of the MHC proteins. To investigate the contribution of every amino acid of octapeptides to the stability and antigenic integrity of MHC proteins, complex octapeptide libraries with one defined amino acid and mixtures of 19 amino acids in the remaining seven positions were synthesized and tested for their capacity to stabilize the conformation of the mouse MHC class I molecule H-2Kb. Peptide transporter-deficient RMA-S cells were employed in this study. Amino acid preferences found for the eight sequence positions reveal constitutional, volumetric, and steric constraints that govern peptide selection by MHC molecules. The pattern of amino acid preferences indicates that the peptides behave as integral parts of the MHC proteins and follow rules established for the interrelationship of primary sequence and the conformation and stability of proteins in general.

T cell-mediated immunity is centered around molecules encoded by highly polymorphic genes of the major histocompatibility complex (MHC). These molecules bind fragments of protein antigens to form complexes that are the ligands for the specific antigen receptor of T cells (Barber and Parham, 1993). MHC class I (MHC-I) molecules are composed of two subunits, MHC-encoded heavy chains and light chains (β2-microglobulin), plus peptides of mostly 8 or 9 amino acids. The peptide-binding site of MHC-I molecules formed by the heavy chains is a groove framed by two α-helices that are positioned on top of a β-pleated sheet (Bjorkmann et al., 1987a; Fremont et al., 1992). The groove is lined with some of the most polymorphic amino acids of the MHC molecules (Bjorkmann et al., 1987b). The peptides adopt extended conformations. In the case of MHC-I molecules, their orientation inside the groove is defined by conserved MHC side chains that compensate the carboxyl- and amino-terminal charges (Madden et al., 1993). The peptide-binding domain is supported from beneath the β-sheet by β2-microglobulin. The proximal domain of the heavy chain and β2-microglobulin, in contrast to the peptide-binding region, fold like immunoglobulin domains.
At the physiological temperature of 37°C, MHC-I molecules are stable only when proper peptides are incorporated into their structures (Fahnestock et al., 1992). Naturally occurring MHC-associated peptides have been analyzed by pool sequencing (Falk et al., 1991) and high performance liquid chromatography combined with mass spectrometry (Jardetzky et al., 1991; Hunt et al., 1992) and were found to be mostly octa- or nonapeptides that conform to MHC allele-specific sequence motifs. Production and incorporation of peptides into the MHC-I structure inside cells involve complicated processing machineries. Proteasomes are believed to generate suitable peptides by proteolytic degradation of protein precursors, ABC transporters to deliver the peptides to the site of MHC biosynthesis in the endoplasmic reticulum, and molecular chaperonins to assist in the assembly of the trimeric complex (Germain and Margulies, 1993). Cells with a genetic defect in the genes coding for the peptide transporter proteins have drastically reduced surface MHC-I levels. MHC-I expression can be restored by the addition of synthetic peptides that exhibit the epitope sequence motif for the tested MHC-I allomorph (Townsend et al., 1989). Also, incubation at 26°C results in increased cell-surface expression of MHC-I proteins, which then appear to be free of peptides (Ljunggren et al., 1990). These "empty" MHC-I molecules can be loaded externally with synthetic peptides and are thereby stabilized. Unloaded MHC-I molecules denature when the cells are cultured at 37°C.
The first specific step toward T cell-mediated immune responses is the creation of stable MHC-I protein conformations by the incorporation of peptides into the MHC molecules. This implies a close relationship between peptide sequence and MHC molecule. The dominant allele-specific sequence motifs found for peptides eluted from MHC molecules give an indication for this relationship (Falk et al., 1991). In this study, we analyzed the contribution of every amino acid in every sequence position of octapeptides to the stability of MHC-I molecule structures. We employed complex peptide libraries (Dooley and Houghten, 1993; Jung and Beck-Sickinger, 1992) in which the effects of the individual amino acids in the different positions of octapeptides were specified by combining single defined with seven randomized sequence positions (Udaka et al., 1995). The synthesis of these libraries either with one defined amino acid or with a premixed set of 19 amino acids (all common proteinogenic amino acids except for cysteine) in the different positions of octapeptides was optimized to approach equimolar representation of the individual peptides in the libraries. The qualities of the preparations were confirmed by electrospray mass spectrometry, amino acid analysis, and pool sequencing. The completely random X8 peptide library (mixtures of 19 amino acids in all eight positions) and 152 OX7 sublibraries (one defined and seven randomized positions) were employed in MHC-I stabilization assays to elucidate the rules for amino acid preferences in peptides bound by the mouse MHC-I molecule H-2Kb.

EXPERIMENTAL PROCEDURES
Peptide Synthesis-The details of synthesis and analysis of the peptide libraries have been described elsewhere. Briefly, solid-phase peptide syntheses were carried out on Fmoc-L-amino acid-polystyrene resins loaded with a single amino acid or equimolar mixtures of 19 resins for all common proteinogenic amino acids without cysteine. The couplings were performed with prepared equimolar mixtures of Fmoc-amino acids equimolar to the coupling sites on the resins for the randomized positions or with a 5-fold molar excess with respect to the coupling sites of single amino acids for the defined positions. The dicyclohexylcarbodiimide/1-hydroxybenzotriazole method was used. The procedure optimized to yield equimolar peptide mixtures is characterized by extended coupling times, double coupling, the initial high content of the solvent dichloromethane, and open vessels to allow evaporation of the solvent and thereby concentration of the reagents during the course of the coupling cycles (Kienle et al., 1994). The peptides were cleaved off the resins, and the side chains were deprotected with trifluoroacetic acid/phenol/ethanedithiol/thioanisole (96:2:1:1; v/v/v/v). After the resins were filtered off, peptides were precipitated by adding cold n-heptane/diethyl ether (1:1; v/v), washed, and lyophilized from acetic acid/water/t-butyl alcohol (1:10:50; v/v/v). The peptide libraries were subjected to amino acid analysis, pool sequencing (Stevanovic and Jung, 1993), electrospray mass spectrometry, and high performance liquid chromatography-mass spectrometry analysis to establish their sequences, to define side products, and to evaluate the amino acid compositions in the X positions. Deviations from the projected equimolar representation of the amino acids in randomized positions were within the error limits of these methods and were estimated to be maximally 3% (Kienle et al., 1994).
For stabilization assays, peptides were dissolved in dimethyl sulfoxide at a concentration of 20 mg/ml, diluted to 200 μg/ml in water, and checked for correct concentrations with bicinchoninic acid protein assays. Stock solutions for stabilization assays were prepared in Dulbecco's modified Eagle's medium with 0.5% bovine serum albumin.
Stabilization Assays-The efficiency of peptide binding to H-2Kb was measured by MHC stabilization assays using the peptide transporter-deficient RMA-S cells (Udaka et al., 1995). The cells were maintained at 37°C in a humidified atmosphere with 8% CO2 in Dulbecco's modified Eagle's medium supplemented with heat-inactivated fetal calf serum (5%), penicillin (100 units/ml), streptomycin (100 μg/ml), glutamine (2 mM), HEPES (5 mM), and mercaptoethanol (30 μM). RMA-S cells were cultured at 26°C overnight prior to the assays to allow accumulation of peptide-free MHC-I molecules on their surfaces. In standard stabilization assays, cells were incubated at 26°C for 30 min with serial 3-fold dilutions of the peptides in a total volume of 200 μl of Dulbecco's modified Eagle's medium, 0.5% bovine serum albumin. The temperature was raised to 37°C for 45 min, and cells were collected and stained with the fluorescein isothiocyanate-labeled monoclonal antibody B8.24.3 (Köhler et al., 1981), which detects conformationally intact H-2Kb.
The levels of cell-surface expression were analyzed by flow cytometry using a FACScan apparatus (Becton Dickinson, Heidelberg, Germany). Samples were gated according to forward and side scatter, and fluorescence data were collected with a logarithmic mode setting. Data were transferred to a personal computer using the FAST488 system (JTES BioTec, Freienwill, Germany), transformed to linear fluorescence values, and averaged to obtain mean fluorescence intensities (MFI) with the help of the MFI software (E. Martz, University of Massachusetts, Amherst, MA). The titration curves were compared at the inflection points. The peptide concentrations required for half-maximal H-2Kb expression were calculated by employing formalisms of the occupancy concepts (Moyle et al., 1978) and linear regressions over plots of logit(p) = ln[p/(1 − p)] versus log[peptide], where p = (MFI_exp − MFI_min)/(MFI_max − MFI_min). Results are expressed as log[stabilization index], with the stabilization index (SI) being the concentration required to achieve 50% maximal effect with X8 divided by the corresponding concentration needed in the case of the indicated test peptides (Udaka et al., 1995). All peptide libraries were tested in duplicate and in two to three independent experiments.
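The calculation described above can be illustrated with a minimal Python sketch. It is not the software used in this study; the function names, the choice of base-10 logarithms, and the ordinary least-squares fit are assumptions made for illustration. The sketch normalizes each MFI reading to p, regresses logit(p) on log[peptide], and reads off the concentration at which logit(p) = 0, that is, p = 0.5.

```python
import math


def half_maximal_concentration(concentrations, mfi, mfi_min, mfi_max):
    """Estimate the peptide concentration giving half-maximal expression.

    Fits logit(p) = slope * log10(concentration) + intercept by ordinary
    least squares, where p = (MFI_exp - MFI_min) / (MFI_max - MFI_min),
    and returns the concentration at which logit(p) = 0 (p = 0.5).
    Hypothetical helper; base-10 logarithms are an assumption.
    """
    xs, ys = [], []
    for conc, value in zip(concentrations, mfi):
        p = (value - mfi_min) / (mfi_max - mfi_min)
        if 0.0 < p < 1.0:  # logit is undefined at p = 0 and p = 1
            xs.append(math.log10(conc))
            ys.append(math.log(p / (1.0 - p)))
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two points with 0 < p < 1")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("concentrations must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    if slope == 0.0:
        raise ValueError("flat response; half-maximal point is undefined")
    intercept = mean_y - slope * mean_x
    return 10.0 ** (-intercept / slope)


def log_stabilization_index(c50_x8, c50_test):
    """log10 of SI, with SI = C50(X8 library) / C50(test library)."""
    return math.log10(c50_x8 / c50_test)
```

A sublibrary that reaches half-maximal expression at a tenth of the X8 concentration would thus have log[SI] = 1, and one that needs ten times more would have log[SI] = -1.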

RESULTS
The X8 peptide library and all 152 OX7 sublibraries contain peptides that bind to and stabilize H-2Kb as indicated by an increased number of conformationally intact MHC-I molecules detectable with monoclonal antibody B8.24.3 (Udaka et al., 1995). With all the peptide libraries, the same maximal level of H-2Kb expression was obtained, and this level was identical to the levels achieved with defined H-2Kb-binding cytotoxic T lymphocyte epitopes like SIINFEKL or RGYVYQGL (data not shown; see also Udaka et al. (1995)). The dose-response curves, however, were shifted with respect to the X8 curve. Depending on whether a particular side chain has a positive or negative effect on binding of the peptides and on the stability of the resulting peptide-MHC complexes, lower or higher concentrations of the OX7 sublibraries were required. Stabilization assays done with a panel of defined peptides demonstrated that detection of H-2Kb by monoclonal antibody B8.24.3 is not influenced by the particular sequence of the peptide bound (data not shown). The peptide concentrations required for a half-maximal stabilization effect varied slightly with the experiments. These variations were caused by the complexity of the assay system. Differences in the fidelity of the cells and variations in MHC expression are the most likely sources. To compensate for these variations and to allow direct comparison of results from different experiments, stabilization indices were calculated as the reciprocal ratios of the concentrations of test library peptides required for half-maximal H-2Kb expression and the corresponding concentration of X8 library peptides tested in the same experiment. These stabilization indices are expressed in logarithmic form and are shown in Fig. 1 (a-h) for the sequence positions P1 through P8, respectively, in the order of decreasing stabilization efficiency of the amino acids.
The average of the absolute values of log[SI] for all 19 OX7 sublibraries of one sequence position gives a measure of the tolerance to amino acid variations in this position (Fig. 2). A position that exhibits absolute tolerance to amino acid variations would accept all amino acids equally well. This means that the concentrations required for half-maximal MHC-I expression would be the same for all 19 OX7 peptide sublibraries and for the completely random X8 peptide library, which includes all peptides of the OX7 sublibraries. The SI values would be 1, their logarithms 0, and the average of the absolute values of log[SI] also 0. Tolerance to amino acid variations can be restricted because the properties of particular side chains are favorable or because the properties of other side chains are unfavorable. The dose-response curves for the corresponding OX7 peptide libraries would be shifted with respect to the X8 curve to lower or higher peptide concentrations. The resulting SI values would deviate from 1, and their logarithms would be positive for preferred side chains with MHC-stabilizing effects and negative for destabilizing side chains. We used the average of the absolute values of log[SI] to quantitate these deviations from the theoretical situation of complete tolerance independent of the direction of the biases.
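As a hedged illustration of this tolerance measure, the short Python sketch below averages the absolute values of log[SI] over the sublibraries of one position. The function name and the use of base-10 logarithms are assumptions, and the numbers in the usage note are invented for illustration, not measured values.

```python
import math


def mean_abs_log_si(si_values):
    """Average of |log10(SI)| over the sublibraries of one position.

    A value of 0 corresponds to complete tolerance (all SI equal to 1);
    larger values indicate a more restricted position, regardless of
    whether the deviations are stabilizing or destabilizing.
    Hypothetical helper; base-10 logarithms are an assumption.
    """
    if not si_values:
        raise ValueError("at least one SI value is required")
    return sum(abs(math.log10(si)) for si in si_values) / len(si_values)
```

For a fully tolerant position, `mean_abs_log_si([1.0] * 19)` is 0. A position where one side chain is 10-fold stabilizing and another 10-fold destabilizing contributes equally in both directions, which is why the absolute value is taken before averaging.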
No absolutely tolerant position was found for H-2Kb-binding peptides. The analysis reveals three types of sequence positions differing in the degree of tolerance to structural variations. Positions 4, 7, and 6 are relatively tolerant, with tolerance decreasing in the indicated order. Positions 1-3 are significantly more restricted. Positions 5 and 8 are the most restricted positions. These different attributes of the eight sequence positions are also obvious from the SI panels in Fig. 1 (a-h). Moreover, these panels reveal four general features of the sequence positions of these octapeptides. First, all amino acids are permitted in all sequence positions. Second, the vast majority of the amino acids are destabilizing. The only exception to this tendency was found for position 7, where about half of the amino acids are stabilizing. Third, the more restricted positions (1-3, 5, and 8) are characterized by pronounced destabilizing effects. Fourth, strongly stabilizing amino acids were found for positions 5, 7, and 8. Position 4 and, similarly, position 6 show neither strongly stabilizing nor strongly destabilizing effects.
The amino acid selectivity exemplified with these stabilization measurements indicates preferences for particular side chains in different sequence positions of octapeptides and suggests that physical constraints (constitutional, volumetric, and steric) control peptide selection by H-2Kb. To illustrate the influence of constitutional constraints, hydrophobicity indices for the amino acids as compiled by Roseman (1988) were included in Fig. 2. There is a general preference for hydrophobic amino acids. The dominance of amino acids with hydrophobic side chains is unequivocal for positions 1, 3, 5, and 8. Conversely, neutral or positively charged hydrophilic side chains are preferred in position 7. Positions 2, 4, and 6 allow hydrophobic as well as hydrophilic amino acids and appear to be less constrained than other positions.
The influence of volumetric constraints is also easily detected. Side chain volumes were calculated from the volumes of amino acids taken from Zamyatnin (1972)

DISCUSSION
Structure, stability, and functional capacity of proteins are determined by the amino acid sequences of their polypeptide chains. In this respect, MHC molecules are exceptional as their conformational integrity is dependent on foreign peptides of mostly 8 or 9 amino acids that need to be incorporated into their protein structure. These peptides are derived from diverse sources and can vary from one MHC molecule to another. Thus, MHC-I proteins are heterotrimers of a monomorphic light chain (β2-microglobulin), a polymorphic heavy chain with a high degree of sequence diversity within a species, and an extremely variable peptide. Antigen presentation by MHC molecules can be regarded as protein-chemical duty dictated by the necessity to form a stable protein structure. Selection of suitable peptides by these molecules is therefore expected to be determined by rules that also govern the interrelationship of primary structure and protein conformation. Hydrophobic side chains of amino acids in proteins are the major constituents of protein cores and are crucial for stable conformation (Kauzmann, 1959; Tanford, 1962; Baldwin, 1986; Privalov and Gill, 1988; Murphy et al., 1990), but they are not necessarily buried inside the molecules (Richards, 1977; Miller et al., 1987). A large fraction of such amino acids are found at protein surfaces. Nevertheless, a strict preference for hydrophobic side chains in defined sequence positions of homologous monomeric proteins strongly indicates that these side chains are buried (Rose et al., 1985). Equally relevant for a stable conformation is the exclusion of hydrophilic side chains from the protein core. Hydrophilic amino acids are mostly found at protein surfaces. Buried hydrophilic side chains require partners for hydrogen bonds or salt bridges in precisely defined positions in order to be tolerable inside protein structures (Baker and Hubbard, 1984). As a consequence, little variability is allowed for such positions (Lesk and Chothia, 1980).
With the peptide libraries used in this study, all possible octapeptides were tested for their contribution to a stable MHC conformation. The combination of single defined sequence positions with randomized positions allowed specification of the effect of every amino acid in such heterogeneous mixtures. The restricted tolerance to amino acid variations found for all sequence positions of H-2Kb-binding octapeptides indicates the influence of physical constraints on peptide selection by MHC molecules (Bowie et al., 1990). Strong preferences for hydrophobic side chains in positions 1, 3, 5, and 8 are indicative of constitutional constraints. Apparently, a stable MHC conformation is achieved more readily when the side chains of the amino acids in these positions are buried inside the MHC molecule and thereby provide a large interaction area with MHC residues (Murphy et al., 1990). The strong preference for hydrophilic side chains in position 7 also reveals the influence of constitutional constraints. Side chains of amino acids in position 7 are likely to be exposed at the surface of the molecule.
Volumetric in addition to constitutional restrictions guide the amino acid preference for position 3 and, less pronounced, for positions 5 and 8. Positions 5 and 8 appear more restricted, with the former requiring aromatic and the latter aliphatic side chains. These two positions have been classified as anchor positions for the small number of different amino acids found by pool sequencing of peptides extracted from isolated H-2Kb molecules (Falk et al., 1991). Various observations including results from pool sequencing and stabilization assays with analogues of known epitopes point to position 3 as a secondary anchor (Falk et al., 1991; Jameson and Bevan, 1992). Results from crystal structure analyses of H-2Kb are in agreement with these interpretations. Pockets within the peptide-binding site were identified, which are deeper for peptide side chains in positions 5 and 8 and shallower for those in position 3. The pocket that holds the secondary anchor in H-2Kb is deeper in other MHC molecules like HLA-A2.1, where it accommodates one of the two dominant anchor residues (Madden et al., 1993). The volumetric flexibility of position 1 as compared with the other three positions that strongly prefer hydrophobic side chains implies a higher degree of freedom for the orientation of the amino acid side chains. Serine is the only hydrophilic amino acid in position 1 that has the capacity to stabilize H-2Kb. The hydroxy group of this amino acid, like the hydroxy group of threonine, can form hydrogen bonds back to the main chain of the peptide. Serine and threonine are occasionally found in core regions of proteins (Lim and Sauer, 1989). Serine in position 1 could be buried or exposed depending on influences from other amino acids in the peptides. Based on the analysis of crystal structures of various proteins, Bordo and Argos (1991) have suggested permissive amino acid substitutions that would have minimal impact on neighboring side chains. Their classification correlates well with our stabilization indices.
Steric constraints, in contrast to constitutional and volumetric constraints, result from the conformation of the entire protein or peptide and therefore are not as easily detected. However, the strong preference for hydrophilic side chains in position 7 seems to indicate such steric conditions. The ultimate amino acid is fixed in its position through, first, the side chain buried deeply inside the F pocket of the peptide-binding groove and pointing away from the surface of the peptide-MHC complex and, second, the α-carboxy group that is bound by conserved side chains of the MHC molecule (Madden et al., 1993). These structural conditions in the context of the extended conformation of MHC-bound peptides leave little freedom for the side chain orientation of the penultimate amino acid (Ramachandran and Sasisekharan, 1968). These side chains are bound to point out of the groove. Therefore, hydrophilic amino acids are preferred. Although there is no clear preference for the particular constitution or size of amino acids accepted in position 2, the strongly destabilizing effect of many amino acids indicates an anchoring function for this position. Positions 2, 4, and 6 are the least constrained, allowing a high degree of freedom for the choice of amino acids. Their side chains could be partially or completely exposed at the surface of the molecule, depending on the sequence context in the individual peptides.
The above interpretation of the stabilization measurements can be summarized to describe the structure of an ideal H-2Kb-binding octapeptide. Positions 1, 3, 5, and 8 should be occupied by an amino acid with a hydrophobic side chain that would be buried inside the groove. For positions 5 and 8, aromatic and aliphatic residues are preferred, respectively. Position 7 should harbor an amino acid with a hydrophilic side chain that would be exposed. Positions 2, 4, and 6 could accommodate different amino acids, and their side chains could be partially buried. In this model, positions 8 and 5 followed by positions 1-3 and finally 7 would contribute most to the stability of H-2Kb. Residues in positions 1-3, 5, and 8 should, if at all, only indirectly affect recognition of the complex by T cells (Chen et al., 1993; Falk et al., 1994). On the other hand, side chains in position 7 followed by positions 4, 6, and potentially 2 would contribute most to the interaction of the peptide-MHC complex with the complementary T cell receptor.
The structure of natural cytotoxic T lymphocyte epitopes may, however, deviate from this ideal epitope (Madden et al., 1993). In addition to the role of the side chains of the ligands, interactions between MHC-I side chains and the main chain of the peptides have been found to contribute substantially to the binding. Also, the interactions of the invariant terminal amino and carboxy groups of the peptides with conserved MHC residues bear a major share of the stabilization energy (Bouvier and Wiley, 1994). Moreover, a proper distribution of anchoring side chains has been shown to compensate for the destabilizing influences of single amino acids (Saito et al., 1993; Udaka et al., 1995). Nevertheless, it is expected that the more a peptide deviates from the basic structure described above, the weaker is its capacity to stabilize the H-2Kb conformation.
The stabilization indices presented in this report should help to predict the relative efficiency of peptides for binding to and for stabilizing MHC molecules. However, the capacity of every single amino acid side chain to contribute to a stable MHC conformation is influenced by other amino acids in the sequence. Mutual dependence of the contribution of the amino acids in epitope sequences is expected from the length of the peptides and from the extent of surface contact between peptide and MHC molecule. This interdependence precludes precise prediction of the efficiency of peptides for binding to MHC-I molecules (Udaka et al., 1995; Horovitz, 1987; Jencks, 1981). Nevertheless, an increasing number of amino acids with high stabilization indices will result in increased binding efficiencies. Thus, SI values should provide useful guidelines for heuristic approaches to epitope identification. The development of algorithms for the assessment of peptide efficiency for binding to MHC molecules is in progress. With the help of corresponding query systems that draw on SI values, protein sequence databases can be scanned to identify potential epitopes. However, an exact prediction of T cell epitopes will not be possible for three reasons. First, the impact of interdependence of the amino acids in MHC-binding peptides has not yet been elucidated. Second, peptide binding to MHC molecules occurs in a situation that is at best described as a steady state. Consequently, even weakly bound peptides can be presented and give rise to T cell responses if they are generated at sufficiently high rates. Third, selection of T cell epitopes is dependent on the T cell repertoire and thereby influenced by processes that select self-MHC-restricted and self-tolerant T cells.
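One way such a query system could look is sketched below in Python. This is only an illustration under strong assumptions, not the algorithm under development: it scores each octamer window of a protein sequence by summing position-specific log[SI] values, which treats the positions as independent even though the interdependence discussed above makes that a rough heuristic. The function names, the table layout (a list of eight dictionaries keyed by one-letter amino acid code), and the zero contribution assigned to residues missing from the table (for example cysteine, which was not part of the libraries) are all hypothetical choices.

```python
def score_octamer(peptide, log_si_table):
    """Sum position-specific log[SI] values for one octapeptide.

    log_si_table is a list of eight dicts, one per position P1..P8,
    mapping one-letter amino acid codes to log[SI]. Residues absent
    from a dict contribute 0 (an assumption of this sketch).
    """
    if len(peptide) != 8 or len(log_si_table) != 8:
        raise ValueError("peptide and table must both cover 8 positions")
    return sum(
        log_si_table[pos].get(residue, 0.0)
        for pos, residue in enumerate(peptide)
    )


def scan_protein(sequence, log_si_table, top_n=5):
    """Rank all octamer windows of a protein by additive log[SI] score.

    Returns up to top_n tuples of (score, 1-based start, octamer),
    highest score first. Additivity is a simplifying assumption.
    """
    hits = []
    for start in range(len(sequence) - 7):
        window = sequence[start:start + 8]
        hits.append((score_octamer(window, log_si_table), start + 1, window))
    hits.sort(key=lambda hit: hit[0], reverse=True)
    return hits[:top_n]
```

In use, the table would be filled with the measured log[SI] values for P1 through P8. Because the score is additive, it can only rank candidates for experimental testing; it cannot account for compensation between anchor residues or for the processing and T cell repertoire effects noted above.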
The rules that govern amino acid preferences for peptides bound by MHC molecules are reminiscent of rules for the structural basis of protein integrity and stability. Peptide-MHC complexes appear to be convenient model molecules for studies of the relationship of protein sequence and protein structure. The possibilities of multiple peptide synthesis and generation of defined peptide-MHC complexes would allow many questions in this field to be very efficiently addressed.