Direct identification of naturally processed autoantigen-derived peptides bound to HLA-DR15.

Biochemical analysis of HLA class II-associated peptides from antigen-pulsed cells is a potentially useful approach to the analysis of antigen processing and presentation because it examines directly which antigen-derived peptides are presented. This is especially advantageous in the analysis of self-antigen presentation where conventional approaches utilizing antigen-specific T cells may be biased by the presence of self-tolerance. However, successful biochemical analysis has been reported for only one exogenous antigen and no autoantigens. We have used a novel analytical approach coupling biochemical data with the reported properties of class II-associated peptides to characterize the peptides derived from a clinically relevant autoantigen presented on the disease-associated class II type. Incubating the target of autoimmune attack in patients with Goodpasture's disease, the 230-amino acid NC1 domain of the α3 chain of type IV collagen (Goodpasture antigen, α3(IV)NC1), with human B cells homozygous for HLA-DR15, the allele carried by 80% of patients, we find that α3(IV)NC1 is presented as at least two sets of three to five peptides centered on common core sequences (nested sets). Synthetic peptides containing these core sequences bind to HLA-DR15 with intermediate affinity (IC50, 1.1-6 μM).

Extracellular antigens are recognized by T helper cells in the form of processed peptides bound to HLA class II molecules on the surface of antigen presenting cells (APC). However, the mechanisms determining which antigen-derived peptides are displayed are poorly understood. Extracellular antigens are taken up by APC into endosomes where, in an acidic reducing environment, proteolytic cleavage generates antigen-derived peptides (processing). Those peptides able to form stable complexes with HLA class II molecules are protected from further proteolysis and transported to the cell surface for T cell recognition.
Which peptides are made available and in what relative proportion is believed to profoundly influence immune responses, but direct biochemical analysis of antigen-derived class II-associated peptides has been reported for only one antigen.
HLA class II-associated peptides have been purified and characterized from several in vitro propagated APC types bearing a variety of class II types (1-5) and cells extracted from spleen and thymus (6). They comprise complex mixtures of short (12-25 residues) peptides and derive mainly from membrane-associated and endosomal proteins. Peptides derived from exogenous antigens, which presumably are present at substantially lower concentrations within endosomes, comprise a small proportion of identified peptides (3,4,7). Nevertheless, Nelson et al. were able to identify peptides derived from hen egg lysozyme (HEL) among the I-Ak-associated peptides of a murine APC pulsed with high concentrations of HEL (8). The peptides contained a common core sequence and variable degrees of NH2- and COOH-terminal overhang, a nested set, now recognized as characteristic of class II-associated peptides and to have a structural basis in the "open at both ends" class II peptide binding groove (9). Importantly, this core sequence was the immunodominant epitope of HEL on I-Ak, showing that the approach can identify T cell epitopes and supporting the concept that immune responses to exogenous antigens are directed toward the predominant antigen-derived peptides available.
In contrast to exogenous antigens, the relationship between the presentation of self-antigens and the specificity of autoimmune responses has not been established in any experimental or human autoimmune disease. There may be important differences. Because some degree of tolerance to self-antigens is likely to exist and be most complete to the predominant self-antigen-derived peptides presented, autoreactivity may be directed at other epitopes (10-12). This may be the case in an animal model of autoimmunity in which autoreactivity was found to be directed at a peptide with undetectable binding affinity for class II (13). If the methods used to predict T cell epitopes within exogenous proteins (on the basis of predicted high binding affinity for class II (14,15)) are to be extended to the prediction of the peptide targets of autoimmunity (16), it will be important to understand the relationship between how an autoantigen is presented as peptides and the autoimmune response to it.
We are investigating Goodpasture's disease as a model of antigen presentation in human autoimmune nephritis. Here we describe a novel approach to the analysis of biochemical data that has enabled us to characterize antigen-derived peptides presented by APC pulsed with modest quantities of the Goodpasture antigen. Goodpasture's disease is an uncommon form of glomerulonephritis caused by autoimmunity to a component of glomerular (and certain other) basement membranes (17-20). It is a good model of autoimmunity because its pathogenesis is known, the target antigen (α3(IV)NC1) is the same in all patients (21), and over 80% of patients carry the same class II allele, HLA-DR15 (22,23). In this work we have sought to identify biochemically the predominant peptides derived from α3(IV)NC1 presented by the disease-associated class II molecule HLA-DR15.

EXPERIMENTAL PROCEDURES
Preparation of Recombinant α3(IV)NC1-cDNA for α3(IV)NC1 (19) was expressed in Escherichia coli as fusion proteins with either glutathione S-transferase (expression vector pGEX (24)) or a 6-histidine residue tag (pET14b from Novagen). In each case a linker containing a thrombin cleavage site joined the fusion partner to the NH2 terminus of α3(IV)NC1. The former fusion protein was purified by cation exchange chromatography before and after thrombin digestion. The purified material appeared as two equally stained bands on Coomassie-stained SDS-polyacrylamide gel electrophoresis corresponding to α3(IV)NC1 and undigested fusion protein; this was used in the first antigen pulsing experiment. The latter fusion protein was purified by metal chelation and cation exchange chromatography, appeared as a single band on Coomassie-stained SDS-polyacrylamide gel electrophoresis, and was used in the second experiment.
Numerical Analysis of Extra Masses-Putative antigen-derived class II-associated peptides identified by mass and retention time were matched to sequences within α3(IV)NC1 with matching calculated mass. A computer program searched the sequence of α3(IV)NC1, calculating the mass of all possible antigen-derived peptides assuming full reduction of disulfide bonds and absence of amino acid modifications. Peptides with calculated mass within the 95% confidence intervals of a measured extra mass were considered to match. In each experiment occurrences of each of the possible (226) 9-amino acid cores within peptides matched to extra masses were counted, and cores were ranked according to the number of different extra masses matched. Numbering of amino acid residues is relative to the sequence SPAT beginning the carboxyl-terminal noncollagenous domain of α3(IV). Random numbers (8000) in the range of the observed extra masses were similarly matched to α3(IV)NC1-derived peptides to assess the frequency with which nested sets would be observed in consecutive experiments due to chance alone.
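The mass-matching and core-ranking steps described above can be sketched as follows. This is a minimal illustration rather than the program used in the study: the function names, the peptide length limits (9-25 residues), the use of average rather than monoisotopic residue masses, and the relative tolerance are all assumptions, and the short sequence taken from the legend to Fig. 3 stands in for the full antigen sequence.

```python
from collections import defaultdict

# Average residue masses in Da (standard values). Assumption: the source does
# not state whether average or monoisotopic masses were used. Cysteine is the
# free thiol form, consistent with full reduction of disulfide bonds.
AVG_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # one water per peptide (free NH2 and COOH termini)


def peptide_mass(seq):
    """Mass of an unmodified linear peptide."""
    return sum(AVG_RESIDUE_MASS[aa] for aa in seq) + WATER


def candidate_peptides(protein, min_len=9, max_len=25):
    """Yield (start, subsequence) for every contiguous peptide in the length range."""
    for start in range(len(protein)):
        last = min(start + max_len, len(protein))
        for end in range(start + min_len, last + 1):
            yield start, protein[start:end]


def match_masses(protein, observed, rel_tol=0.0007):
    """Map each observed mass to the peptides whose calculated mass lies within rel_tol."""
    matches = {m: [] for m in observed}
    for start, pep in candidate_peptides(protein):
        mass = peptide_mass(pep)
        for m in observed:
            if abs(mass - m) <= rel_tol * m:
                matches[m].append((start, pep))
    return matches


def rank_cores(matches, core_len=9):
    """Rank 9-residue cores by the number of different observed masses they can explain."""
    masses_per_core = defaultdict(set)
    for m, peptides in matches.items():
        for _, pep in peptides:
            for i in range(len(pep) - core_len + 1):
                masses_per_core[pep[i:i + core_len]].add(m)
    ranked = [(core, len(ms)) for core, ms in masses_per_core.items()]
    return sorted(ranked, key=lambda item: (-item[1], item[0]))


# Toy input: three nested peptides drawn from the Fig. 3 sequence.
toy = "ALASPGSCLEEFRASPFLE"
observed = [peptide_mass(toy[3:17]), peptide_mass(toy[5:18]), peptide_mass(toy[6:19])]
ranking = rank_cores(match_masses(toy, observed))
```

In this toy case the three simulated masses all map to peptides enclosing LEEFRASPF, so that core is matched by all three masses; with real data each mass typically matches several unrelated sequences, and the ranking is what separates recurring cores from coincidental matches.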
Determination of a Peptide Binding Motif for HLA-DR15-Available motifs (15,25,26) for the gene products of HLA DRB1*1501 and HLA DRB5*0101, the two class II alleles expressed by cells carrying HLA-DR15, indicate that for both the primary anchor requirements are hydrophobic residues at positions P1 (though with different relative preferences for larger residues reflecting differences in position β86) and P4, with contrasting and less rigid requirements at positions P6-P9. A motif was derived to identify all sequences containing the minimal requirements for binding to either DR15 gene product.
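A minimal motif of this kind (a 9-residue core with hydrophobic residues at P1 and P4) can be applied to a sequence with a simple scan, sketched below. The set of residues treated as hydrophobic is an assumption made for illustration; the source does not list the residues accepted at each anchor position, and the function name is hypothetical.

```python
# Assumption: residues counted as hydrophobic anchors. The published motifs
# may accept a different set at P1 and at P4.
HYDROPHOBIC = set("VLIMFWY")


def scan_minimal_motif(seq, core_len=9, anchors=(0, 3)):
    """Return (start, core) for every core with hydrophobic residues at P1 and P4.

    `anchors` holds 0-based offsets within the core, so (0, 3) means P1 and P4.
    """
    hits = []
    for start in range(len(seq) - core_len + 1):
        core = seq[start:start + core_len]
        if all(core[pos] in HYDROPHOBIC for pos in anchors):
            hits.append((start, core))
    return hits


# The two synthetic peptide sequences given in the legend to Fig. 3.
hits_1 = scan_minimal_motif("ALASPGSCLEEFRASPFLE")
hits_2 = scan_minimal_motif("PFLFCNVNDVCNFASR")
```

Under this assumed residue set the first peptide yields the single core LEEFRASPF, and the second yields FCNVNDVCN together with one further candidate, which illustrates that a minimal motif is permissive and identifies candidates rather than confirmed binders.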
Synthetic Peptides-Synthetic peptides were purified by HPLC, and their composition was confirmed by mass spectrometry and amino acid analysis. Truncation analogues were generated by manual Edman degradation and partial carboxypeptidase digestion employing carboxypeptidases P and Y (Boehringer Mannheim). HPLC conditions were identical to those used for class II-associated peptide pools. Retention time predictions were made as described (27). The amino acid retention coefficients reported in Ref. 27 were used to seed regression analysis of the measured retention times of 26 synthetic peptides. Measured and predicted retention times exhibited good fit to a linear model (R² = 0.93). Retention time prediction is well validated for short peptides but may be less reliable when extrapolated to peptides longer than 12 residues because of secondary structure considerations. This source of error was minimized by attempting retention time prediction only for peptides of similar length and sequence to the peptides used to calculate the amino acid coefficients.
Peptide Binding Assays-The myelin basic protein peptide MBP

RESULTS
MGAR, a human Epstein-Barr virus transformed B cell homozygous for HLA-DR15, was incubated with (treated) or without (control) recombinant human α3(IV)NC1, and the purified HLA-DR15-associated peptide pools were separated by reverse phase HPLC. When the chromatograms were compared (Fig. 1a), minor but distinct differences suggested the presence of extra peptides in the peptide pool obtained from antigen-treated APC. In order to confirm the presence of extra peptides and to precisely measure their mass, each HPLC fraction was analyzed by matrix-assisted laser-desorption ionization time-of-flight mass spectrometry. Most fractions were found to contain complex mixtures of peptides with molecular masses between 1300 and 3000 Da, as is typical of HPLC-separated class II-associated peptides (3). Similar spectra were obtained for the majority of corresponding control and antigen-treated fractions, indicating that antigen treatment had not greatly perturbed baseline class II peptide presentation. In the first experiment 17 masses present in spectra from antigen-treated fractions could not be found in spectra obtained for any adjacent control fraction (Fig. 1b). The extra, putatively α3(IV)NC1-derived, masses ranged between 1389.8 and 2509.1 Da (95% confidence limits ± 0.07%) and in general occurred in fractions collected where the control and treated chromatograms differed. Attempts to further purify and obtain the sequences of the "extra masses" were unsuccessful because each was present at low levels (estimated to be 0.1-10 pmol) within complex peptide mixtures (comprising <5% of total peptide).
In order to identify which antigen-derived peptides could account for the extra masses, the sequence of α3(IV)NC1 was searched for peptides with matching calculated mass. Between 1 and 12 (median 5) peptides could be matched to each extra mass. Because HLA class II-associated peptides characteristically comprise nested sets enclosing common core sequences (1), we examined the sequences of the matched peptides for the recurring presence of a 9 (or more) amino acid core. Three putative nested sets each comprising peptides matched to six different extra masses were identified. Each set contained a 9-11-amino acid core sequence (Table I) and one (Table II, upper set) was centered on a core sequence containing a DR15 binding motif. The motif, deduced from published data, specifies a 9-amino acid core sequence with hydrophobic residues at positions 1 and 4 (Fig. 2). A further six putative nested sets each containing peptides matched to five extra masses were identified, one with a core sequence matching the DR15 motif (Table II, lower set).
In order to extend these observations a second experiment was performed with 4-fold more antigen. Seventeen extra masses were identified (1568.5-2449.13 Da ± 0.05%) and matched to 1-10 (median 4) α3(IV)NC1-derived sequences. The number of extra masses that could be matched by mass to peptides containing the core sequences identified in the first experiment is shown in Table I. Eleven of the 17 extra masses could again be matched to peptides containing the two previously identified DR15 motif-containing core sequences (Table III). The probability that this recurrent observation could arise by chance was investigated by matching 8,000 random numbers in the range of the observed extra masses to α3(IV)NC1-derived peptides. Random masses could be matched to peptides containing either (or both) core sequence in 2651/8000 (p = 0.33, σ = 0.00526). The expected number of chance matches is therefore 5.5-5.8, and the probability of observing 11/17 matches lies between 0.0058 and 0.0099 (binomial distribution using p ± 1.96σ). This result suggested that at least a proportion of the sequences assigned to 11 extra masses (shown in Table III) were likely to be correct.
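The chance-match calculation can be reproduced from the reported counts. The sketch below assumes that the quoted probabilities are upper-tail binomial probabilities (11 or more matches out of 17) evaluated at the two ends of the interval p ± 1.96 standard errors; the source does not state this explicitly, and the variable and function names are illustrative.

```python
import math


def binomial_upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


hits, trials = 2651, 8000          # random masses matched to either core sequence
n_masses, n_matched = 17, 11       # extra masses observed, and matched, in experiment 2

p_hat = hits / trials                                   # about 0.33
std_err = math.sqrt(p_hat * (1 - p_hat) / trials)       # binomial standard error of p_hat
p_low, p_high = p_hat - 1.96 * std_err, p_hat + 1.96 * std_err

expected_low, expected_high = n_masses * p_low, n_masses * p_high
tail_low = binomial_upper_tail(n_masses, n_matched, p_low)
tail_high = binomial_upper_tail(n_masses, n_matched, p_high)

print(f"p = {p_hat:.3f}, SE = {std_err:.5f}")
print(f"expected chance matches: {expected_low:.1f} to {expected_high:.1f}")
print(f"P(>= {n_matched}/{n_masses}): {tail_low:.4f} to {tail_high:.4f}")
```

Under these assumptions the standard error is about 0.0053, the expected number of chance matches falls between roughly 5.5 and 5.8, and the tail probabilities bracket the range of roughly 0.006 to 0.010 quoted in the text.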
Although the APC had been pulsed with a higher concentration of antigen, again insufficient peptide could be purified to obtain confirmatory sequence data. Instead the sequence assignments shown in Table III were tested by examining the properties of synthetic peptides with some of the proposed sequences. Three out of 6 (marked with asterisks in Table III) exhibited consistent HPLC retention times. Chromatograms for two of these are shown in Fig. 3. Five other extra masses were in fractions with retention times close to those predicted for their respective proposed sequences (marked with ‡ in Table III). Retention time prediction is likely to be acceptably accurate here because it is attempted only for peptides very similar to those used to calibrate the HPLC equipment; 17 of the 26 synthetic peptides used in the calibration were truncation variants of peptides containing one of the two 9-amino acid core sequences common to the sequences in Table III. Therefore analysis of the retention times of synthetic peptides with the sequences proposed in Table III supports sequence assignment in 8/11 and refutes assignment in 3/11. Because class II-associated peptides are known to be capable of binding to affinity-purified class II molecules (1), presumably by displacing other peptides, the binding capacity of synthetic peptides was examined in an inhibition binding assay (Fig. 4). Peptides containing either core sequence bound to affinity-purified HLA-DR15 with affinity (IC50, 1.1-6 μM) intermediate within the range reported for class II-associated peptides (1,29) and similar (less than 5-fold difference) to that measured for the well described (26) DR15-binding myelin basic protein peptide 86-98(98A). Similar results were obtained for longer (20 residue) peptides enclosing these core sequences, suggesting that all the peptides in Table III are likely to bind to HLA-DR15 with high affinity.
Taken together, the data suggest that of the 17 extra masses biochemically detectable among the DR15-associated peptides of α3(IV)NC1-pulsed human B cells (in the second experiment), five correspond to peptides enclosing the core sequence LEEFRASPF, three correspond to peptides enclosing FCNVNDV, and nine remain unidentified.

DISCUSSION
Class II-associated peptides have generally been studied by analyzing the responses of T cells to antigen or peptide-pulsed antigen presenting cells. Although this approach has the advantage of great sensitivity, it does not indicate the range and relative proportion of antigen-derived peptides presented nor necessarily the predominant peptides (30), because it can only detect peptides recognized by available T cells. These could be very different, especially in the case of autoantigens, both because of self-tolerance and because of practical limitations in the raising of panels of T cell clones with widely varying specificities and restrictions. In comparison, direct biochemical analysis of class II-associated peptides purified from antigen-pulsed cells potentially indicates both the range and relative proportion of antigen-derived peptides available for T cell recognition. This is advantageous because both parameters are believed important in determining the initiation of immune (and autoimmune) responses (8,31,32) and the development of self-tolerance (11,33).

a The retention times of extra masses in fraction n lie between n and n + 1 minutes. b Mean of 3-5 determinations of mass; 95% confidence limits ± 0.05%. c Symbols: *, consistent measured retention times; ‡, consistent predicted retention times; §, inconsistent retention times.
FIG. 3. HPLC separation of a mixture of two synthetic peptides with the sequences assigned to two of the extra masses identified in fraction 31. The chromatogram obtained for the DR15-associated peptide pool from antigen-pulsed cells is superimposed, and the black rectangle marks fraction 31. Peak 1 is the peptide ALASPGSCLEEFRASPFLE, peak 2 is the peptide PFLFCNVNDVCNFASR, and peak 3 is a disulfide-linked dimer of the peptide in peak 1.

The use of a biochemical approach in the characterization of antigen-derived major histocompatibility complex class II-associated peptides was first described by Nelson et al. (8). However, it is striking that the only reported successes with this approach have used HEL as antigen (5,8). This reflects the difficulty in analyzing the small quantity of antigen-derived peptides that can be isolated from even large numbers of antigen-pulsed cells (5,34,35) and perhaps the difficulty in preparing large quantities of antigen. Goodpasture's disease may be particularly suitable for this approach. Unlike many autoimmune diseases, the target antigen is known and can be made in large quantities as a recombinant molecule. Also α3(IV)NC1, like HEL, is both a small molecule and cationic, factors that may promote uptake into APC (36). However, success depended upon the recognition that α3(IV)NC1, like HEL, was likely to be presented in the form of nested sets of peptides. A strategy could then be devised by which candidate sequences for extra masses could be distinguished from other mass-matched sequences. However, the numerical analysis utilized can only identify nested sets with many members; 9 extra masses were not assigned sequences and may be from nested sets with a few dominant members, as a consequence of processing or class II binding constraints, or be derived from proteins induced by antigen-pulsing.
A deductive approach was necessary because insufficient α3(IV)NC1-derived peptide was obtained for sequence determination. Direct proof of the identity of the observed extra masses in peptide pools from antigen-treated cells requires sequence information. This was obtained by Nelson et al. (8), employing 70 μM HEL (70 times the quantity of antigen used in this study), but Vignali et al. (5), employing 10 μM HEL, and ourselves found only sub-picomole quantities of antigen-derived peptides, and other authors have found even mass spectrometry insufficiently sensitive to distinguish T cell stimulating and nonstimulatory HPLC fractions (35). The successful application of this approach to other antigens has not been reported. Because it appears that adequate APC uptake of exogenous antigen is limiting, it is likely that enhancing uptake would permit extension of this approach to a broader range of antigens. One approach would be to take advantage of the greater efficiency of receptor-mediated endocytosis, for example by conjugating antigen to transferrin.
Although it is striking that most autoimmune diseases exhibit strong HLA class II associations, the mechanism is unknown. If the association is to the HLA molecule and not to a linked gene, then current understanding of the function of HLA molecules would suggest that HLA-determined differences in antigen presentation are the mechanism. However, whereas there is evidence that the immune response to exogenous antigens is directed at the predominant antigen-derived peptides presented and that these are generally among those with highest class II binding affinity, data for autoantigens are lacking. Indeed there are theoretical arguments and experimental results that suggest that other peptides presented at lower levels are important. In particular it may be expected that self-tolerance will be most securely established to the predominant self-peptides presented (11). If the rapidly accumulating knowledge on the peptide binding characteristics of class II molecules is to be utilized in the prediction of autoimmune disease-associated epitopes, it is important to clarify the relation between self-antigen-derived peptide presentation, class II type, and autoreactivity. Our data identify for the first time some of the predominant peptides derived from a human autoantigen presented on a disease-associated class II type, laying a foundation for investigating this relationship in human autoimmunity.