Mass Spectrometric Analysis of the N Terminus of Translational Initiation Factor eIF4G-1 Reveals Novel Isoforms*

In eukaryotes, translation initiation factor 4G (eIF4G) acts as the central binding protein for an unusually large number of proteins involved in mRNA metabolism. Several gene products homologous to eIF4G have been described, the most studied being eIF4G-1. By its association with other initiation factors, eIF4G-1 effects mRNA cap and poly(A) recognition, unwinding of secondary structure, and binding to the 43S initiation complex. Multiple electrophoretic isoforms of eIF4G-1 are observed, and multiple cDNAs have been reported, yet the relationship between the two is not known. We report here a new cDNA for eIF4G-1, present as a previously unidentified human expressed sequence tag, that extends the long open reading frame, provides a new in-frame initiation codon, and predicts a longer form of eIF4G-1 than reported previously. eIF4G isoforms from human K562 cells were cleaved with recombinant Coxsackievirus 2A protease and the N-terminal domains purified by m7GTP-Sepharose chromatography and polyacrylamide gel electrophoresis. Proteins were digested with proteolytic enzymes and peptide masses determined by matrix-assisted laser desorption ionization-time of flight mass spectrometry. In selected cases, peptides were sequenced by electrospray-mass spectrometry fragmentation. This identified the N termini of the three most abundant eIF4G-1 isoforms, two of which had not previously been proposed. These proteins appear to have been initiated from three different AUG codons.

The three stages of protein synthesis are catalyzed by three groups of proteins: initiation, elongation, and termination factors (1). Initiation is characterized by formation of a series of initiation complexes, each catalyzed by a different subset of initiation factors. Recruitment of mRNA to the 43 S initiation complex to form the 48 S initiation complex involves eIF3,¹ PABP, and the eIF4 proteins. eIF3 is a 520-kDa multimer that is required for both Met-tRNAi and mRNA binding. PABP is a 70-kDa protein that specifically binds poly(A) and homo-oligomerizes. The eIF4 factors consist of: eIF4A, a 46-kDa RNA helicase; eIF4B, a 70-kDa RNA-binding and -annealing protein that stimulates eIF4A; eIF4E, a 25-kDa cap-binding protein; and eIF4G, a group of proteins of 154-180 kDa that form specific complexes with all of the other proteins known to be involved in mRNA recruitment.
The mRNA recruitment step is rate-limiting for initiation under normal cellular conditions and appears to be highly regulated (31,32). The best studied regulatory mechanism involving eIF4G is its cleavage by 2A protease of entero- and rhinoviruses and L protease of foot-and-mouth-disease virus. Upon infection of mammalian cells with these picornaviruses, most host protein synthesis is shut off coincident with the appearance of viral proteins (33). This is thought to be mediated by a switch from cap-dependent to cap-independent translation. Complexes containing eIF4G restore cap-dependent translation in lysates of poliovirus-infected cells (34-36). eIF4G was discovered as a result of its proteolysis coincident with the loss of cap-dependent initiation during poliovirus infection (37). eIF4G is separated into two functional domains, an N-terminal cleavage product (cpN) that binds eIF4E and PABP, and a C-terminal cleavage product (cpC) that binds eIF4A and eIF3 (14,38). Initiation of picornaviral mRNA translation is via an internal ribosome entry site (39,40). Cleavage of eIF4G drastically inhibits translation of capped mRNAs in vitro, whereas internal initiation and initiation of uncapped mRNAs are either unaffected or even stimulated (41-43).
Multiple isoforms of eIF4G-1 are observed by SDS-PAGE, but the origin of these is not known. However, an analysis using 2A protease cleavage, SDS-PAGE, and antibodies directed against different domains of eIF4G suggested that the heterogeneity was attributable to the ~50-kDa cpN domain (38). Digestion with L protease further delimited the source of heterogeneity to the N-terminal ~30-kDa cpN1 domain (14). Multiple cDNAs for human eIF4G-1 have also been reported. The initial cloning, based on two overlapping cDNAs from fetal and adult brain, predicted a protein of 154 kDa (2). Subsequent cloning revealed a cDNA corresponding to a protein with an additional 156 aa at the N terminus (44). An initiation codon was proposed based on alignment with eIF4G-3 and lack of any further upstream cDNA sequence. Further cloning revealed a cDNA that was 42 nt longer (45), indicating either that multiple mRNAs exist or that Imataka et al. (44) had not reached the 5′-end of the same mRNA. Finally, a fourth and fifth cDNA were reported (46). One of these independently confirmed some of the sequence reported by Johannes and Sarnow (45) but did not extend the cDNA. The other was collinear with the other cDNAs up to a point, upstream of which it deviated, suggesting a splice variant. None of these cDNAs provided a new in-frame AUG. Thus, the most upstream AUG codon for all five eIF4G-1 cDNAs reported to date is that originally proposed by Imataka et al. (44).
The present study was motivated by the fact that the relationship between the multiplicity of electrophoretic forms of the protein and the multiplicity of cDNAs is not known. Importantly, despite speculation based on cDNA sequences, the N terminus has not been established for any eIF4G-1 isoform. It is important to establish the actual protein structures, since eIF4G-1 isoforms differing at the N terminus may contain different binding sites for proteins involved in translational control. This study identifies an EST in the public databases as corresponding to a longer form of eIF4G-1 mRNA. Because it predicts a new upstream, in-frame AUG, it could encode an even longer isoform of eIF4G-1 than those predicted from previous cDNA sequences. Mass spectrometric analysis confirmed that this was the case and also established the structure of two other eIF4G-1 isoforms.

EXPERIMENTAL PROCEDURES
Materials-Porcine trypsin was obtained from Promega (catalog number V5111). Endoproteinase Lys-C (catalog number P3428) and bovine α-chymotrypsinogen A (catalog number C4879) were obtained from Sigma. Arg-C from Clostridium histolyticum was purchased from Worthington (catalog number LS001641). Asp-N (catalog number 1420488) and Complete, Mini, Protease Inhibitor Cocktail Tablets (catalog number 1836153) were obtained from Roche Molecular Biochemicals. m7GTP-Sepharose 4B was purchased from Amersham Biosciences.
Identification of a New EST for eIF4G-1-As noted above, Bushell et al. (46) reported a novel cDNA related to eIF4G-1, GenBank™ nucleotide accession number AF002815. The hypothetical protein encoded by this cDNA was assigned GenBank™ protein accession number AAC78442. We compared the cDNA sequence corresponding to aa residues −24 to +49 of this protein² to the est_human database using the program blastn found under the NCBI BLAST suite of programs. We obtained a new EST derived from an adult melanoma: GenBank™ nucleotide accession number AL120751. The 215-nt segment from nt 25-240 of AL120751 exactly matched the 215-nt segment from nt 4-219 of AF002815.
The sequence reported for AL120751 covered only the most 5′-terminal 627 nt, even though the complete eIF4G-1 cDNA is predicted to contain 5510 nt (2,44,45). We obtained the plasmid corresponding to this EST, termed Homo sapiens cDNA clone DKFZp762O191, from the German Genome Project and determined additional DNA sequence information using the DNA sequencing facility at Iowa State University. The 3′-end of AL120751 (nt 4647-5306, using a composite numbering system)³ was determined using an oligo(dT) 3′-A primer. A 3′-terminal poly(A) tract of 111 nt was observed using a primer corresponding to the SP6 promoter in the vector. A sequence covering nt 439-1155 was obtained using the sense primer 5′-AACACGCCTTCTCAGCCCCGC-3′. This corrected an entry of "N" at nt 456 to G in the GenBank record of AL120751. A sequence covering nt 710-1399 was obtained using the antisense primer 5′-GGGGCAAGCTGGGGGAGGAGC-3′. This sequence exactly matched nt 328-1018 of cDNA AF104913, which encodes protein AAC82471. Finally, sequences covering nt 1-300 and 97-793 were obtained using the sense primer 5′-CGCCACGGCCGAAGCAGCTAG-3′ and antisense primer 5′-AACACGCCTTCTCAGCCCCGC-3′, respectively. Overall, this extended the sequence information for cDNA clone DKFZp762O191 by 1544 nt.
Generation of K562 Cell Lysates-Human K562 cells were grown in RPMI medium (Invitrogen) containing 10% fetal bovine serum (Atlanta Biologicals, Norcross, GA). Cultures were grown to confluence in 175-cm² tissue culture flasks maintained in a humidified, 5% CO₂ environment at 37°C. A standard preparation of lysate was derived from 3.2 × 10⁹ cells (16 T-175 flasks). Cells were resuspended on ice for 30 min in an equal volume of Buffer A (1% Triton X-100, 10 mM Tris-HCl, pH 7.4, 150 mM NaCl, 5 mM EDTA, 1 tablet of protease inhibitor mixture per 20 ml buffer) and then centrifuged at 25,000 × g for 20 min. The supernatant was frozen in liquid N₂ and stored at −80°C.
Immunological Procedures-Two antibodies against different regions of eIF4G-1 were used.⁴ Anti-Peptide 7 antibodies were obtained as described previously (2). Anti-Peptide 10 antibodies were produced and affinity purified using the peptide CRAQPPSSAASR, which corresponds to aa residues 55-65 of the consensus eIF4G-1 sequence² with an added N-terminal Cys residue, as described previously (2). Immunoblotting was carried out as described previously (47). Incubation with the anti-Peptide 7 antibodies (1:1000) was carried out at room temperature for 1 h. Incubation with affinity-purified anti-Peptide 10 antibodies (1:50) was at 4°C overnight. Both antibodies were visualized with alkaline phosphatase-conjugated goat anti-rabbit IgG antibodies (Vector Laboratories, Burlingame, CA) at 1:1000.
Purification of cpN-Recombinant 2A protease from Coxsackievirus serotype B4 was prepared as described previously (41). K562 cell lysates were cleared after thawing by spinning at 25,000 × g for 20 min at 4°C. The supernatant was incubated with recombinant 2A protease at a final concentration of 50-100 μg/ml for 30 min on ice. (As 2A protease preparations differed somewhat in activity, the amount was adjusted to produce complete cleavage of eIF4G-1, based on western blotting with anti-Peptide 7 antibodies.) The 2A protease-treated lysate was then subjected to m7GTP-Sepharose affinity chromatography as described previously (48) but with the following modifications. One standard batch of K562 lysate was gently rotated with 2 ml of m7GTP-Sepharose for 1 h at 4°C. The slurry was then transferred to a column and the flow-through fraction was reapplied to the column. The column was washed first with 12 volumes of Buffer B55 (20 mM MOPS, pH 7.6, 10% (w/v) glycerol, 0.5 mM EDTA, 0.25 mM dithiothreitol, 25 mM NaF, 55 mM KCl) and then with 3 volumes of 100 μM GTP in Buffer B55. Proteins were eluted with 4 volumes of 200 μM m7GTP in Buffer B55. The eluate was frozen in liquid N₂ and stored at −80°C.
Electrophoretic Separation of cpN Isoforms-Linear polyacrylamide was prepared (49) and added to each cpN sample at a final concentration of 120 μg/ml as carrier. Ice-cold 100% trichloroacetic acid was added to a final concentration of 10% and the sample allowed to stand on ice for 30 min. The precipitate was collected by centrifugation at 25,000 × g for 20 min, washed four times with ice-cold 80% aqueous acetone, and dissolved in 1× SDS-loading buffer (adjusted to pH 8.2; Ref. 50). Alkylation of Cys residues was performed as described previously (50). The sample was then adjusted to pH 6.8 with HCl, and the cpN isoforms were separated by electrophoresis on a 16 × 20-cm gradient gel (8-15%; acrylamide:N,N′-bisacrylamide, 30:0.8) with a 4% stacking gel. Electrophoresis was carried out at a constant current of 16 mA for 4 h followed by 24 mA for 16-20 h. Protein bands were stained with Coomassie Blue, excised, and stored at −80°C.
³ The sequence deposited as GenBank™ nucleotide accession number AL120751 contains only 627 nt of the cDNA represented by clone DKFZp762O191. The current report confirms the entire EST sequence and extends the cDNA sequence. Alignment with other eIF4G-1 cDNAs predicts a 5510-nt mRNA corresponding to clone DKFZp762O191. The nucleotide numbers used in this report, unless otherwise noted, are based on this composite mRNA in which nt 1 is the first nt of AL120751.
⁴ Names of eIF4G-1 peptides used for designing oligonucleotide probes and making site-specific antibodies in earlier studies (2, 52) are not related to the names for eIF4G-1 peptides used in the present report (Fig. 3). To avoid confusion, the earlier peptide names are always preceded by the term "anti-" in the present work.
Protease Digestion-Gel pieces were minced and further destained with three washes (400 μl each) of 50% acetonitrile, 25 mM ammonium bicarbonate, pH 8.0. The polyacrylamide was dehydrated for 5 min with 100% acetonitrile followed by vacuum centrifugation in a Savant Speedvac for 30 min. In-gel digestion with trypsin was performed overnight at 37°C by the addition of 15 μl of enzyme (10 μg/ml) in 25 mM NH₄HCO₃, pH 8.0, to each gel piece. For Lys-C digestion, a stock enzyme solution of 50 μg/ml was made in 0.1 M Tris-HCl, pH 8.0, and 15 μl were added to each gel piece. Digestion was performed overnight at 37°C. For Arg-C digestion, the enzyme was reconstituted at 2 mg/ml in 1 mM CaCl₂, 2.5 mM dithiothreitol. In-gel digestion was carried out at room temperature overnight in 15 μl of 50 mM sodium phosphate, pH 7.6, 2.5 mM dithiothreitol by the addition of 3 μg enzyme per gel slice. Finally, for Asp-N digestion, the enzyme was reconstituted in H₂O to give a concentration of 40 μg/ml in 10 mM Tris, pH 7.5. In-gel digestion was carried out overnight at 37°C in 15 μl of 50 mM sodium phosphate, pH 8.0, by the addition of 4 μl of enzyme per gel slice.
Mass Spectrometry-Mass spectrometric analysis was performed at both the LSUHSC-S Research Core Facility and the Laboratory for Mass Spectrometry and Gaseous Ion Chemistry, Rockefeller University. At LSUHSC-S, MALDI-TOF-MS was performed on a PerSeptive Biosystems Voyager-DE PRO Biospectrometry Workstation. At Rockefeller University, two instruments were used. Peptide mapping by MALDI-TOF-MS was performed on a PerSeptive Biosystems Voyager-DE STR Biospectrometry Workstation. Sequence information was obtained by LC-ESI-MS/MS on a Finnigan LCQ-DECA ion trap mass spectrometer.
Peptides were prepared for MALDI-TOF-MS after proteolytic digestion by extraction from gel pieces twice with 50-μl portions of 50% acetonitrile, 5.0% trifluoroacetic acid. Peptides in the extract were dried, dissolved in 15 μl of 0.1% trifluoroacetic acid, and purified on a ZipTip (Millipore). They were eluted with 2 μl of the appropriate organic acid (matrix) dissolved in 50% acetonitrile, 0.1% trifluoroacetic acid, and spotted on a MALDI plate. For masses in the 800-6000 Da range, the matrix was a 0.01 mg/ml solution of α-cyano-4-hydroxy-trans-cinnamic acid (Sigma, catalog number C-2020). For masses in the 6000-26,000 Da range, the matrix was a 0.01 mg/ml solution of sinapinic acid (3,5-dimethoxy-4-hydroxy-trans-cinnamic acid; Aldrich, catalog number D13,460-0). When internal calibration was used, the eluting solution also included peptide mass standards. Data were summed over 50-100 acquisitions in delayed extraction mode.
At LSUHSC-S, data analysis was performed using the Data Explorer software, version 3.5-4.0. At Rockefeller University, analysis was performed using the software program M over Z from Proteometrics, LLC. In both cases, spectra were subjected to algorithms for base-line correction and noise removal at two S.D. values. Throughout the current report, monoisotopic masses are reported for <2500-Da peptides, whereas average masses are reported for >2500-Da peptides. Theoretical masses were determined using the Peptide Mass tool at the ExPASy Proteomics website (ca.expasy.org/tools/). Peaks seen in both the sample of interest and also in a blank gel treated identically were eliminated from further consideration. For automatic matching of observed to predicted peptide masses, we used Auto-MS Fit, version 1.2.18 (PerSeptive Biosystems). For manual matching, we considered that peptides matched if their masses were within 1 Da for the range 800-10,000 Da. Because of unresolved microheterogeneity, caused by Met oxidation for example, peptides with molecular masses above 10,000 Da were considered to be matches if the experimentally determined masses were within 0.6% of the calculated values.
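The manual matching criteria described above can be written out as a short Python sketch. The function names and the choice to select the tolerance from the theoretical (rather than the observed) mass are assumptions made for illustration; this is not the Auto-MS Fit algorithm.

```python
def masses_match(observed, theoretical):
    """Apply the manual matching rule stated in the text.

    800-10,000 Da: match if within 1 Da.
    Above 10,000 Da: match if within 0.6% of the calculated value.
    Assumption: the mass range is decided from the theoretical mass.
    """
    if theoretical > 10_000:
        return abs(observed - theoretical) <= 0.006 * theoretical
    if theoretical >= 800:
        return abs(observed - theoretical) <= 1.0
    return False  # below 800 Da lies outside the stated ranges


def match_peaks(observed_peaks, predicted):
    """Pair every observed peak with each predicted peptide it matches.

    predicted: dict mapping a peptide name to its theoretical mass.
    Returns a list of (observed, name, theoretical) tuples.
    """
    hits = []
    for peak in observed_peaks:
        for name, mass in predicted.items():
            if masses_match(peak, mass):
                hits.append((peak, name, mass))
    return hits
```

For example, `match_peaks([1000.4], {"toy peptide": 1000.0})` pairs the peak with the hypothetical peptide, whereas a peak at 1001.5 would be rejected.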
LC-ESI-MS/MS analysis was carried out using a Smart System (Amersham Biosciences) equipped with 10-ml syringe pumps and a precolumn flow splitter (Michrom Bioresources, Auburn, CA). The chromatographic eluate was monitored by on-line mass spectrometry using an electrospray ion trap mass spectrometer, model LCQ-DECA (Finnigan ThermoQuest, San Jose, CA). Peptide mixtures were diluted 10-fold with 0.01% trifluoroacetic acid (v/v) in water/methanol/acetic acid (949:50:1, v/v/v) and loaded on a reverse phase Magic C18 column (50 × 0.2 mm inside diameter; pore size, 100 Å; particle size, 5 μm) from Michrom Bioresources (Auburn, CA). Peptide separation was performed at room temperature with a fast-rising methanol gradient (in 5 min) at a flow rate of 2.8 μl/min. The eluate was transferred through a 50-μm inside diameter fused silica capillary from the column to the ion source of the mass spectrometer and electrosprayed at 2.8-3.2 kV. The transport capillary in the mass spectrometer was kept at 130-150°C in order to assist desolvation.

RESULTS
Identification of a New EST for eIF4G-1-As noted above, several cDNAs for eIF4G-1 have been reported (2, 44-46). Four of the polypeptides theoretically encoded, NP_004944, AAC82471, AAC78443, and eIF4G-Iext, are collinear, but one of them, AAC78442, deviates from the others upstream of aa 50 (Fig. 1). In some cases, the polypeptide sequences deposited in GenBank™ represent those encoded by the longest open reading frame following an AUG codon (NP_004944, AAC82471), but in other cases, the polypeptide sequence continues in the N-terminal direction despite the absence of an AUG codon (AAC78442, AAC78443, eIF4G-Iext). We used sequence information from the cDNA with the longest open reading frame (GenBank™ nucleotide accession number AF002815, which encodes GenBank™ protein accession number AAC78442; see Fig. 1) to search for additional ESTs. The result was a new EST entered in GenBank™ under nucleotide accession number AL120751 (Fig. 1). The EST sequence reported (627 nt) was not previously identified as corresponding to eIF4G-1 mRNA, but a comparison with other eIF4G-1 cDNAs indicated an identical sequence over 215 nt.
We obtained the plasmid corresponding to this GenBank™ entry from the German Genome Project. Sequencing portions of the plasmid insert from both the 5′ and 3′ termini revealed: (i) agreement over the entire EST sequence for AL120751 except for a single nucleotide sequencing ambiguity at nt 456; (ii) preservation of reading frame between the 3′-end of the EST and other eIF4G-1 cDNA sequences; (iii) the presence of a 111-nt poly(A) tract; (iv) collinearity with the cDNAs encoding AAC82471, AAC78443, and eIF4G-Iext (Fig. 1); and (v) partial collinearity with the cDNAs encoding NP_004944 and AAC78442 (see "Experimental Procedures"). If the new cDNA is collinear with these cDNAs throughout its entire length, it corresponds to an mRNA of 5510 nt.³ Surprisingly, the polypeptide encoded by AL120751 deviates from the AAC78442 polypeptide upstream of aa 50 (Fig. 1), even though sequences from the cDNA encoding it, AF002815, were used to find AL120751. Further comparison of AL120751, AF002815, and the corresponding human genomic sequence (GenBank™ nucleotide accession number AC078797.8; gi number 15887175; nt 122674-133235) indicates that AF002815 lacks two exons that are present in AL120751 (data not shown). The most 5′ exon, however, is common to AF002815 and AL120751. The absence of these two internal exons from AF002815 results in a different reading frame upstream of aa 50. On this basis, we suggest that AF002815 represents a splice variant that lacks internal exons present in the AL120751 cDNA.
The new polypeptide encoded by AL120751 is collinear with AAC82471, AAC78443, eIF4G-Iext, and NP_004944, except for the Arg in the first position of AAC78443 (aa 30 using the common numbering system²), which is Pro in eIF4G-Iext and AL120751 (Fig. 1). We assume this difference is due to an incomplete codon at the junction between the AAC78443 cDNA and the vector or a sequencing error. AL120751 predicts the longest eIF4G-1 protein reported to date, adding 22 aa residues to the eIF4G-Iext sequence. Furthermore, it provides a new potential translational initiation site. Upstream of the AUG that encodes Met-1 of AL120751 are both in-frame and out-of-frame stop codons. These features reduce the likelihood that a start codon even further upstream could be utilized.
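The reasoning about in-frame AUG codons and upstream stop codons can be illustrated with a small Python sketch. It scans one reading frame of a cDNA (written as DNA, so AUG appears as ATG) and reports the last in-frame stop codon together with the ATG codons downstream of it, which are the only ones that could begin the open reading frame continuing to the 3′-end of the fragment. The function names and the toy sequence are hypothetical and are not taken from AL120751.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq, frame):
    """Yield (position, codon) for each complete codon in frame 0, 1, or 2."""
    for i in range(frame, len(seq) - 2, 3):
        yield i, seq[i:i + 3]


def in_frame_starts(seq, frame):
    """Return (last_stop_position, [ATG positions downstream of that stop]).

    last_stop_position is -1 if the frame contains no stop codon.
    An ATG upstream of an in-frame stop cannot initiate the downstream ORF,
    so the candidate list is reset whenever a stop codon is met.
    """
    last_stop = -1
    starts = []
    for pos, codon in codons(seq.upper(), frame):
        if codon in STOP_CODONS:
            last_stop = pos
            starts = []
        elif codon == "ATG":
            starts.append(pos)
    return last_stop, starts


# Hypothetical 24-nt fragment: GCC TAA GCA ATG AAA CCC ATG GGT
toy_cdna = "GCCTAAGCAATGAAACCCATGGGT"
print(in_frame_starts(toy_cdna, 0))  # frame 0: stop at 3, ATGs at 9 and 18
```

In this toy case, the stop codon at position 3 rules out any start codon further upstream in frame 0, leaving two alternative in-frame initiation sites.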
Immunological Analysis of eIF4G-1 Electrophoretic Variants-To determine whether any of the multiple bands of eIF4G-1 typically seen by SDS-PAGE contain N-terminal sequences predicted by these cDNAs, we developed an anti-peptide antibody (anti-Peptide 10) against aa 55-65 (underlined in Fig. 1).²,⁴ Both full-length eIF4G and cpN were separated by analytical gradient gel SDS-PAGE (Fig. 2). By silver staining, three partially resolved, major bands are observed at apparent molecular mass ≈220 kDa in full-length eIF4G-1, as well as numerous minor bands (lane 2). Three major and several minor bands are also present in cpN (lane 1). Based on prior studies using various antibodies (14,38,51,52), these major bands in the cpN preparation are the N-terminal domains of the major proteins in the eIF4G preparation.
Some of the major bands in full-length eIF4G-1 detected by silver staining also reacted with the highly sensitive anti-Peptide 7 antibodies (lane 4), which were developed against eIF4G-1 aa residues 523-538 (2). As with the silver-stained preparation, the ~220-kDa cluster of proteins (lane 4) is not present in the cpN preparation (lane 3). Instead, cpN contains three major and several minor bands that are reactive with anti-Peptide 7 antibodies and migrate the same as Bands 1-3 of the silver-stained gel (cf. lanes 3 and 1). Most of the bands not reacting with this antibody are identified below by mass spectrometry. By contrast, with the new anti-Peptide 10 antibody, only the two slowest migrating forms were detected, whether in full-length eIF4G (lane 6) or cpN (lane 5). This suggests that only the two slowest forms contain aa 55-65 (Fig. 1).
Separation of cpN Isoforms-The yield of eIF4G on m7GTP-Sepharose affinity chromatography, as measured with a monoclonal antibody, is higher if the eIF4G is first cleaved by poliovirus infection (36). Furthermore, as shown in Fig. 2, cpN isoforms were separated better by electrophoresis than isoforms of full-length eIF4G. We therefore chose to separate isoforms of cpN rather than intact eIF4G. We attempted two-dimensional gel electrophoresis of cpN, but the bands were extremely broad and failed to focus in the first dimension. This behavior has been previously observed (53). Instead, we obtained optimal resolution of cpN by one-dimensional SDS-PAGE on preparative (16 × 20 cm) gradient gels.

FIG. 1. Alignment of polypeptide sequences derived by theoretical translation of cDNAs and an EST corresponding to eIF4G-1.
Protein ID generally refers to the GenBank™ protein accession numbers. One exception is AL120751, which is the GenBank™ nucleotide accession number for the sequence of an EST. The sequence shown is the conceptual translation product in the +2 reading frame. The other exception is eIF4G-Iext, which refers to the sequence published in Ref. 45. Met residues are shaded. The underlined aa residues (55-65) indicate the epitope used to generate anti-Peptide 10 antibodies. Names for hypothetical forms of eIF4G-1 appear below potential N-terminal Met residues. AAC78442, AAC78443, and AL120751 do not continue toward the C terminus, because they are derived from only partial cDNA sequences. The aa residue 61 shown for AL120751 differs from the GenBank™ entry based on sequencing reported in the present work (i.e., ANT → AGT, encoding Ser). The aa residue 214 shown for NP_004944 differs from the GenBank™ record because of resequencing (52,67) of the cDNA (AGT → GGT, encoding Gly instead of Ser).

FIG. 2. Total protein from K562 cells was subjected to m7GTP-Sepharose affinity chromatography followed by analytical gradient gel SDS-PAGE. Prior to chromatography the lysate was either untreated (lanes 2, 4, and 6) or treated with 2A protease (lanes 1, 3, and 5). Protein was detected by silver staining (lanes 1 and 2) or by immunoblotting with anti-Peptide 7 (lanes 3 and 4) or anti-Peptide 10 (lanes 5 and 6) antibodies.

Strategy for Assigning eIF4G-1 Isoforms to Bands-Application of anti-Peptide 7 and anti-Peptide 10 antibodies provided only limited information about the structures of the polypeptides making up the various electrophoretic bands. Initially, we attempted to identify them by N-terminal sequence analysis. The various cpN isoforms were separated by SDS-PAGE, blotted onto a polyvinylidene difluoride membrane, and analyzed by Edman degradation at the Macromolecular Structure Analysis Facility of the University of Kentucky. However, this failed to yield interpretable data, presumably due to blocked N termini.
We therefore turned to sequence information provided by cDNAs and ESTs. Five of the cDNA sequences (AAC82471, AL120751, AAC78443, eIF4G-Iext, and NP_004944) are theoretically translated into collinear polypeptide sequences. Due to the existence of multiple AUG codons, six eIF4G-1 isoforms can be predicted that correspond to alternative translational initiation sites (Fig. 1). We termed these hypothetical isoforms eIF4G-1a, -1b, -1c, -1d, -1e, and -1f (Table I). The proteolytic peptides of each eIF4G-1 isoform can be predicted from the composite cDNA sequence (Fig. 3). It is also possible to determine the masses of peptides actually present in proteolytic digests from the various electrophoretic bands by MALDI-TOF-MS (54). Putting information on theoretical and actual peptides together, we can deduce which hypothetical isoforms are consistent with the peptide pattern of each electrophoretic band. The masses of "diagnostic" peptides, i.e. those that are unique to the various hypothetical eIF4G-1 isoforms, are shown in Table II.
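The logic of predicting proteolytic peptides for each hypothetical isoform and then looking for peptides that distinguish the isoforms can be sketched in Python as follows. The cleavage rules are deliberately simplified assumptions (no missed cleavages, no enzyme-specific exceptions beyond the trypsin/Pro rule), the function names are hypothetical, and the toy sequence is invented rather than taken from eIF4G-1. The sketch requires Python 3.7 or later, where `re.split` accepts zero-width matches.

```python
import re

# Simplified cleavage rules, assumed for illustration only.
CLEAVAGE_PATTERNS = {
    "trypsin": r"(?<=[KR])(?!P)",  # after Lys or Arg, but not before Pro
    "lys-c": r"(?<=K)",            # after Lys
    "arg-c": r"(?<=R)",            # after Arg
    "asp-n": r"(?=D)",             # before Asp
}


def digest(sequence, enzyme):
    """Return the peptides of a complete digest, N to C terminus."""
    return [p for p in re.split(CLEAVAGE_PATTERNS[enzyme], sequence) if p]


def isoforms_from_met_starts(sequence):
    """Map each Met position to the isoform that would start there."""
    return {i: sequence[i:] for i, aa in enumerate(sequence) if aa == "M"}


def diagnostic_peptides(sequence, enzyme):
    """For each isoform, list peptides absent from every other isoform."""
    digests = {start: set(digest(iso, enzyme))
               for start, iso in isoforms_from_met_starts(sequence).items()}
    unique = {}
    for start, peptides in digests.items():
        others = set().union(*(p for s, p in digests.items() if s != start))
        unique[start] = peptides - others
    return unique


toy_protein = "MAKDGMSPKDER"  # hypothetical sequence with Met at positions 0 and 5
print(digest(toy_protein, "trypsin"))
print(diagnostic_peptides(toy_protein, "trypsin"))
```

In the toy case, the longer isoform yields two peptides not found in the shorter one, while the shorter isoform is marked only by its truncated N-terminal peptide. Peptides shared by a subset of isoforms (as Peptide 5 is in the present report) would need a less strict definition of "diagnostic" than the one used in this sketch.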
Initial Characterization of Protein Bands in cpN Preparations by MALDI-TOF-MS-To determine which bands present in cpN preparations were actually related to eIF4G-1, we subjected a preparation of cpN to electrophoresis and staining with Coomassie Blue (Fig. 4). This pattern differs from that of Fig. 2, lane 1, for two reasons. First, higher resolution was achieved on the preparative gel (Fig. 4) than the analytical gel (Fig. 2). Second, staining with Coomassie Blue (Fig. 4) is roughly proportional to protein quantity, but staining with silver (Fig. 2) is disproportionately strong for cpN polypeptides (38), perhaps because of several polyglutamic acid stretches in the sequence. Protein bands were excised, digested with endoproteases, and the resulting peptides analyzed by MALDI-TOF-MS (see "Experimental Procedures"). eIF4G-1 peptides were found in band b, band c plus band d (partially resolved), band e, and band g only, which correspond to Bands 1, 2, 3, and 4 in Fig. 2, respectively. These assignments, based on observed eIF4G-1 peptides, also agree with immunoblotting results (Fig. 2 as well as Refs. 14, 38, 51, and 52).
Some non-eIF4G-1 proteins present in the cpN preparation were potentially of interest. In addition to eIF4G-1, band c contained peptides that matched a LINE1 reverse transcriptase homologue. Bands d and n matched the p110 and p36 subunits of eIF3, respectively (55). As noted above, several heat-shock proteins bind to eIF4G, so it is interesting to note that band i is the endoplasmic reticulum-resident chaperone BiP, and band l is hsp70-1. Band j is PABP, known to associate with eIF4G (10,11). Finally, whereas the 46-kDa DEAD-box helicase eIF4A is known to bind eIF4G-1 at two sites in cpC (14,20,21), it is interesting to note that band k is a 72-kDa DEAD-box protein, and band m is a 68-kDa RNA helicase.
Band 1 Corresponds to eIF4G-1f-To correlate hypothetical eIF4G-1 isoforms (Table I) with electrophoretic bands (Fig. 2), we examined the eIF4G-1-containing bands in more detail with four different proteolytic enzymes over several mass ranges. A typical spectrum for Band 1 (which corresponds to band b in Fig. 4) in the range m/z = 500-6000 is shown in Fig. 5. Of the tryptic peptides predicted to arise from eIF4G-1 (Fig. 3), those with masses similar to peaks observed in Fig. 5 are listed in Table III. The average deviation of observed versus theoretical masses was 0.19 Da. The major peaks (other than calibrants) matched predicted eIF4G-1 peptides, indicating that the sample was not grossly contaminated. Peptides below 800 Da were excluded from further consideration because of high noise levels and paucity of significant sequence information. In addition to the tryptic peptides listed in Table III, we detected in other spectra Peptides 11·12 (missed cleavage), 13·14, 16, 29·30·31, 30·31, and 37·38, some of which were either singly or doubly oxidized in Met residues (see below).
Diagnostic tryptic peptides for Band 1 are shown in Fig. 6A (upper panel). Band 1 has peaks at m/z = 4534.7 and 4550.8, which are within 0.5 Da of the calculated masses of Peptide 2 (Table II) and the Met-oxidized form of Peptide 2 (+16.0 Da). This "one-Met signature" agrees with the presence of one Met residue in Peptide 2 (Fig. 3 and Table II) and contributes to the confidence of this assignment. Other peaks in Fig. 6A for Band 1 at m/z = 4629.7, 4645.9, and 4662.1 are within 0.6 Da of the predicted masses of Peptide 5 (Table II) and its derivatives containing either one or two oxidized Met residues, respectively. This "two-Met signature" agrees with the presence of two Met residues in Peptide 5 (Fig. 3 and Table II). Finally, the peak at m/z = 4728.6 matches Peptide 28·29 (Fig. 3 and Table III). In contrast, the spectrum of Band 2 contains peaks corresponding to Peptide 5 but not Peptide 2 (Fig. 6A, lower panel), nor were peaks corresponding to Peptides 2 or 5 observed in Bands 3 or 4 (not shown). Peptide 5 can come from eIF4G-1e or eIF4G-1f, but not from eIF4G-1a, -1b, -1c, or -1d (Fig. 3 and Table II). However, Peptide 2 can only come from eIF4G-1f (or an undescribed, longer form not predicted from the cDNA sequence). Thus, the presence of Peptide 2 in Band 1 but not in Bands 2, 3, or 4 is consistent with eIF4G-1f being present only in Band 1.
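The "one-Met" and "two-Met" signatures amount to counting how many successive +16.0-Da steps above a base peak are present in the spectrum. A minimal Python sketch of that count is given below, applied to the Band 1 peak values quoted above. The function name and the default tolerance of 0.6 Da are assumptions chosen to mirror the deviations mentioned in the text, not a description of the software used in this study.

```python
def met_signature(peaks, base_peak, tol=0.6, shift=16.0):
    """Count consecutive +16.0-Da (Met oxidation) steps above base_peak.

    peaks: observed m/z values; base_peak: the presumed unoxidized peptide.
    A step counts as present if any peak lies within tol of the expected value.
    """
    n = 0
    while any(abs(p - (base_peak + shift * (n + 1))) <= tol for p in peaks):
        n += 1
    return n


# Band 1 peaks quoted in the text (Fig. 6A, upper panel)
band1_peaks = [4534.7, 4550.8, 4629.7, 4645.9, 4662.1, 4728.6]

print(met_signature(band1_peaks, 4534.7))  # Peptide 2 base peak
print(met_signature(band1_peaks, 4629.7))  # Peptide 5 base peak
```

With these values, the ladder starting at 4534.7 extends by one step and the ladder starting at 4629.7 by two, in line with one and two Met residues in Peptides 2 and 5, respectively.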
a The numbering system is given in Fig. 1.
b Electrophoretic bands are shown in Fig. 2.
c Not determined because of the high mass (15,387 Da) and microheterogeneity of Peptide d1C1.

FIG. 3. Predicted peptides from digestion of hypothetical eIF4G-1 isoforms with trypsin, Arg-C, Lys-C, and Asp-N.
Peptides shown cover the cpN region of eIF4G-1 only. Tryptic peptides are numbered from N to C terminus using normal Arabic numerals (Peptides 1, 2, etc.). Lys-C peptide names are preceded by "k" (Peptides k1, k2, etc.). Arg-C peptide names are preceded by "r" (Peptide r1, r2, etc.). Asp-N peptide names are preceded by "d" (Peptide d1, d2, etc.). Peptides that form the N terminus of one of the hypothetical eIF4G-1 isoforms (Table I)

a See Fig. 3 for peptide nomenclature.
b Found only in the indicated hypothetical isoform of eIF4G-1 (Table I).
c The numbering system and aa sequences of these peptides are given in Fig. 1.

Confirmation that eIF4G-1f is in Band 1 was obtained with endoprotease Lys-C. When this enzyme was used with Band 1, we detected Peptides k2, k4, k7, k12, k13, k15, and k21 (Fig. 3). A diagnostic peak at m/z = 17,770 was observed in Band 1 but not Band 2 (Fig. 6B). Peptide k2 has a predicted, unmodified mass of 17,695 Da (Table II), which is 75 Da less than the peak observed in Fig. 6B. Since Peptides 2 and 5 exist primarily in the oxidized forms (Fig. 6A), it is likely that the peak at 17,770 consists of a mixture of Peptide k2 and derivatives oxidized at one to four Met residues (calculated masses: 17,711, 17,727, 17,743, and 17,759 Da). The observed peak is within 0.06% of the latter calculated mass. These results indicate that Band 1 contains Peptide k2, which can only arise from eIF4G-1f (or a hypothetical larger form), whereas Bands 2, 3, and 4 do not contain Peptide k2. Unfortunately, the tryptic and Lys-C peptide predicted to represent the extreme N terminus of eIF4G-1f, MNK, is too small to be unambiguously identified by our approach. Thus, we cannot conclude from the data in Fig. 6, A and B, whether Peptides 2 and k2 were contributed by eIF4G-1f or a hypothetical larger form. We therefore turned to a different enzyme, Arg-C, which produced a diagnostic peak at m/z = 4819.3 (Fig. 6C). The mass of Peptide r1, which is predicted to result from Arg-C digestion of eIF4G-1f, is 4908.6 Da (Table II). However, the mass of a derivative of Peptide r1 in which the N-terminal Met is removed and the resulting N-terminal Asp is Nα-acetylated (+42.0 Da) is 4819.4, within 0.1 Da of the observed peak. This agrees with our observation that the same band was refractory to analysis by Edman degradation (see above). A second peak at 4835.6 matches the mass of Met-oxidized, Nα-acetylated Peptide r1 in which the N-terminal Met is removed. This one-Met signature is consistent with the postulated removal of the N-terminal Met-1 and oxidation of Met-41 (Fig. 1).
The peaks at m/z = 4819.3 and 4835.6 were observed in Band 1 but not Bands 2-4 (data not shown), which agrees with the hypothesis that eIF4G-1f is in Band 1 but not in the other bands.
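The mass arithmetic behind the Arg-C assignment can be retraced with a short worked check. This is only a sketch: the function name is hypothetical, and the Met residue mass of 131.2 Da is an assumed average value not stated in the text, whereas the +42.0 Da (acetylation) and +16.0 Da (oxidation) shifts and the Peptide r1 mass are those quoted above.

```python
MET_RESIDUE_DA = 131.2  # assumption: average mass of one Met residue
ACETYL_DA = 42.0        # N-alpha-acetylation shift quoted in the text
OXIDATION_DA = 16.0     # Met oxidation shift quoted in the text


def processed_mass(unmodified, remove_met=False, acetylate=False, n_oxidized=0):
    """Predicted mass of a peptide after N-terminal processing (hypothetical helper)."""
    mass = unmodified
    if remove_met:
        mass -= MET_RESIDUE_DA
    if acetylate:
        mass += ACETYL_DA
    return mass + n_oxidized * OXIDATION_DA


r1_unmodified = 4908.6  # Peptide r1, Table II value quoted in the text

# Observed Arg-C peaks in Band 1 and the number of oxidized Met assumed for each.
for n_ox, observed in [(0, 4819.3), (1, 4835.6)]:
    predicted = processed_mass(r1_unmodified, remove_met=True,
                               acetylate=True, n_oxidized=n_ox)
    print(f"oxidized Met: {n_ox}  predicted {predicted:.1f}  "
          f"observed {observed:.1f}  difference {observed - predicted:+.1f} Da")
```

Under these assumptions the des-Met, acetylated form comes out at 4819.4 Da, the value given in the text, and the singly oxidized form lies about 0.2 Da from the second observed peak.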
To obtain additional evidence for eIF4G-1f in Band 1, we digested with Asp-N. A major peak with m/z = 19,371 was observed in Band 1 (Fig. 6D, upper panel) but not Band 2 (Fig. 6D, lower panel), Band 3, or Band 4 (data not shown). This is similar to the mass of Peptide d1, which is 19,278 Da (Table II). Peptide d1 contains five Met residues, although based on the data with Arg-C (Fig. 6C), the N-terminal Met may have been removed on at least some of the polypeptides. Subtracting one Met residue, adding an acetyl group, and oxidizing four Met residues gives a mass of 19,253. Although this does not agree exactly with the m/z of the observed peak, it is within 0.6% of this value. As noted above, exact matching of predicted and observed peptide masses can be compromised by microheterogeneity (partial removal of N-terminal Met, acetylation, oxidation). Nonetheless, the peak at m/z = 19,371 is at least consistent with the presence of Peptide d1 in Band 1, which is, in turn, indicative of eIF4G-1f.
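The percentage agreements quoted for the large Lys-C and Asp-N peptides can be reproduced with a one-line relative-deviation calculation. The sketch below is illustrative only: the helper name is hypothetical, the 131.2-Da Met residue mass is an assumed average value, and the deviation is expressed relative to the calculated mass, which is one of two reasonable conventions.

```python
def percent_deviation(observed, calculated):
    """Absolute deviation of an observed mass, as a percentage of the calculated mass."""
    return abs(observed - calculated) / calculated * 100.0


# Peptide d1 (19,278 Da) minus one Met residue (assumed 131.2 Da),
# plus one acetyl group (42.0 Da) and four Met oxidations (16.0 Da each).
d1_calculated = 19278 - 131.2 + 42.0 + 4 * 16.0

print(f"Peptide d1 derivative: {d1_calculated:.0f} Da, "
      f"deviation {percent_deviation(19371, d1_calculated):.2f}%")

# Peptide k2 with four oxidized Met residues (17,759 Da) against the 17,770 peak.
print(f"Peptide k2 derivative: deviation {percent_deviation(17770, 17759):.2f}%")
```

Both results round to the figures given in the text (about 0.6% for Peptide d1 and 0.06% for Peptide k2), which makes explicit how much looser the Asp-N match is than the Lys-C match.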
Band 2 Represents a Mixture of Proteins, Including eIF4G-1e-Three lines of evidence suggest that Band 2 is a mixture of proteins. First, the band is broader than Band 1, Band 3, or other non-eIF4G bands in Coomassie Blue-stained gels (Fig. 4, band c + band d), suggesting that proteins are unresolved. Second, Band 2 is, in fact, partially resolved into a doublet or triplet in some gels (data not shown). Third, the Auto MS-Fit program identified a LINE1 reverse transcriptase homologue and an eIF3 subunit in bands c and d (Fig. 4), despite the fact that they also contain numerous eIF4G-1 peptides (see below). Nonetheless, we were able to obtain structural information about the isoform of eIF4G-1 present in Band 2.
Of the tryptic peptides in cp N with masses >800 Da, we detected Peptides 3, 4, 5, 13, 30, and 38 in Band 2, either as single peptides or composite peptides resulting from missed cleavages. A set of peaks with m/z = 4630.0, 4646.3, and 4662.1 was observed for Band 2 (Fig. 6A, lower panel), corresponding to the two-Met signature of Peptide 5 (Table II). The presence of Peptide 5 is diagnostic of a protein being initiated with a Met upstream of eIF4G-1d, i.e. either eIF4G-1e or eIF4G-1f (Fig. 3). We also detected Peptide 13, which indicates the eIF4G-1 isoform present is initiated upstream of eIF4G-1a (data not shown). Yet Band 2 lacks Peptide 2 (Fig. 6A, lower panel), indicating it does not contain eIF4G-1f. It is clear from the results with Band 1 that Peptide 2 would have been detected if present (Fig. 6A, upper panel). The presence of Peptide 5 but absence of Peptide 2 indicates that Band 2 contains eIF4G-1e.
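The logic of this assignment amounts to intersecting the sets of isoforms that could yield each detected peptide and then excluding the isoforms that would yield a peptide known to be absent. The sketch below encodes only the peptide-to-isoform relationships stated in the text; the function and variable names are hypothetical, and the sketch assumes that a band contains a single eIF4G-1 isoform and that an undetected peptide is genuinely absent, neither of which is guaranteed for a mixed band.

```python
# Hypothetical eIF4G-1 isoforms, shortest (1a) to longest (1f).
ISOFORMS = ["1a", "1b", "1c", "1d", "1e", "1f"]

# Isoforms that can give rise to each diagnostic tryptic peptide (from the text).
DIAGNOSTIC = {
    "Peptide 2": {"1f"},                             # only eIF4G-1f
    "Peptide 5": {"1e", "1f"},                       # upstream of eIF4G-1d
    "Peptide 13": {"1b", "1c", "1d", "1e", "1f"},    # upstream of eIF4G-1a
}


def candidate_isoforms(detected, absent=()):
    """Isoforms compatible with the detected peptides and the absent ones."""
    candidates = set(ISOFORMS)
    for peptide in detected:
        candidates &= DIAGNOSTIC[peptide]   # must be able to yield this peptide
    for peptide in absent:
        candidates -= DIAGNOSTIC[peptide]   # would have yielded this peptide
    return sorted(candidates)


print("Band 1:", candidate_isoforms(["Peptide 2", "Peptide 5"]))
print("Band 2:", candidate_isoforms(["Peptide 5", "Peptide 13"], absent=["Peptide 2"]))
```

Under these assumptions the sketch returns eIF4G-1f for Band 1 and eIF4G-1e for Band 2, in line with the assignments made in the text.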
Digestion with Asp-N revealed a peak at m/z = 15,387 in Band 2 (Fig. 6D, lower panel). This m/z value is similar to that of Peptide d1C1 (15,300), which represents the N terminus of eIF4G-1e (Table II) 0.1%, which is considerably less than the difference for any other peptide predicted from the six hypothetical isoforms of eIF4G-1. This peak was not observed in Band 1 (Fig. 6D, upper panel), Band 3, or Band 4 (data not shown), providing additional evidence that eIF4G-1e is present only in Band 2.
Band 3 Contains eIF4G-1c-The only tryptic peptides with masses >800 Da detected in Band 3 were Peptides 5C2, 12, 14, and 15. This is considerably fewer than for Band 1, presumably due to the lower amount of protein typically seen in this band (Fig. 2). A diagnostic peak was found in Band 3 at m/z = 2426.20 that was not present in Band 1 (Fig. 7A, lower panel). This is 42.03 Da more than the mass of Peptide 5C2 (Table II), which is the postulated N terminus of eIF4G-1c. This peak is therefore consistent with an Nα-acetylated form of Peptide 5C2, which would agree with our inability to obtain information for this band from Edman degradation (see above). A second peak was detected for Band 3 at m/z = 2442.23 (Fig. 7A, lower panel). This one-Met signature agrees with the presence of one Met residue in Peptide 5C2 (Table II) and suggests that, unlike eIF4G-1f, the N-terminal Met is not removed before acetylation of eIF4G-1c. Alternatively, the nascent polypeptide may have been eIF4G-1d from which Met-88 was removed, followed by Nα-acetylation of Met-89 (Fig. 1).
Additional evidence for diagnostic peptides in Band 3 was sought using Lys-C. Peptides k2C3, k3, k4, k5, k7, k13, k14, k15, k16, k17, k18, k19, k20, and k21 were detected as either single or composite peptides. Diagnostic peptides were found at m/z = 9124.1 and 9139.5 (Fig. 7B, lower panel). These are within 1 Da of the predicted masses for the Nα-acetylated form of Peptide k2C3 containing either one (9123.3 Da) or two (9139.3 Da) oxidized Met residues. This two-Met signature is consistent with the presence of two Met residues in Peptide k2C3 (Table II). The peak at m/z = 8883.3 of Band 1 corresponds to doubly charged Peptide k2 (predicted m/z = 8847.9 for the unoxidized peptide; see Fig. 6B). Its breadth is presumably due to microheterogeneity from multiple oxidized Met residues.
Finally, we analyzed the Asp-N peptides of Band 3 (Fig. 7C). A unique peptide at m/z = 10,317 matched the predicted mass for Nα-acetylated Peptide d1C3 (10,317 Da). Similarly, the observed peaks at m/z = 10,333 and 10,349 matched the masses of the singly and doubly oxidized forms of this acetylated peptide (predicted at 10,333 and 10,349 Da). These peptides provide additional evidence for the presence of eIF4G-1c in Band 3.
In a separate experiment, MS fragmentation data (Fig. 8) confirmed our assignment of the tryptic peptide detected at m/z = 2426.20 in Fig. 7A as Nα-acetylated Peptide 5C2 (cDNA-derived sequence given in Fig. 1).
Band 4 Represents a Mixture of Proteins That May Include eIF4G-1a-The amounts of eIF4G-related material in Band 4 were insufficient to make a definitive assignment. However, peaks were detected that suggest the presence of eIF4G-1a. Specifically, we detected a peak in tryptic digests at 538.24 Da that is within 1 Da of Peptide 13C with one Met oxidized (537.25 Da). We also detected a peak in tryptic digests at 580.25 that is within 1 Da of the Nα-acetylated, Met-oxidized form of Peptide 13C (579.26 Da). In Asp-N digests, we detected a peak at m/z = 3574.1 that is within 1 Da of Peptide d3C (3575.0 Da; Table II). We also detected a peak at m/z = 3616.1 that is similar to Nα-acetylated Peptide d3C (3617.0 Da). All of these were only slightly above the background, making assignment of structure uncertain. Band 4 often occurs as a diffuse doublet, indicating heterogeneity, and immunoblotting demonstrates that the eIF4G-1 isoforms in Band 4 are much less abundant than in Bands 1-3 (Fig. 2).

a Peptide names are given in Fig. 3, but with the following exceptions: ox indicates a single Met oxidation; ox1 means a peptide containing two Met residues but with only one oxidized; ox2 means both are oxidized; acr means Cys acrylamidization; and center dots designate composite peptides resulting from missed tryptic cleavages. Peptide sequences are given in Fig. 1.

Fig. 4) was subjected to trypsin digestion and MALDI-TOF-MS as described under "Experimental Procedures." Peaks attributable to eIF4G-1 peptides are indicated using the numbering system in Fig. 3. Composite peptides resulting from missed tryptic cleavages are shown with a center dot, e.g. Peptide 11·12·13. Peptides in which Met residues are oxidized have the suffixes ox, ox1, and ox2 (see Table III). Peaks resulting from peptide standards used for internal calibration are indicated as cal1, cal2, etc. For each peak, m/z values are compared with predicted mass values in Table III.
DISCUSSION

The data presented here permit the identification of several eIF4G-1 primary structures and their assignment to electrophoretic bands. Band 1 consists of a novel isoform, here termed eIF4G-1f. This is also the most abundant form, as measured by Coomassie Blue staining and immunoreactivity to both anti-Peptide 7 and anti-Peptide 10 antibodies. Using trypsin or Lys-C, we observed Peptides 2 and k2, which can only arise from eIF4G-1f or a hypothetical protein initiated upstream of it. With Arg-C we detected the N-terminal peptide, Peptide r1, and its mass indicated that it was modified in two ways: the N-terminal Met was removed and the resulting Asn was Nα-acetylated. Asp-N produced Peptide d1, in agreement with the assignment of eIF4G-1f to Band 1, but it was too large (19,371 Da) to allow determination of the exact N-terminal structure. Thus, all of the data from Band 1 are consistent with its identity as eIF4G-1f. This isoform of eIF4G-1 was not predicted from any of the previously reported cDNA sequences. It is the longest isoform, with 22 additional aa residues at the N terminus compared with eIF4G-I ext (Fig. 1), the longest isoform previously postulated.
Several observations point to Band 2 being a mixture of proteins. Nonetheless, three types of data demonstrate that Band 2 contains eIF4G-1e. First, it reacts with anti-Peptide 10 antibodies, which are directed against an epitope in only eIF4G-1e and eIF4G-1f. Second, tryptic digests contain Peptide 5 but not Peptide 2, despite the fact that Peptide 2 is readily detectable in Band 1. Third, Asp-N digests contain Peptide d1C1 in Band 2 but not Band 1. Due to the size and microheterogeneity of Peptide d1C1, however, we were unable to determine the exact structure at the N terminus. eIF4G-1e is the isoform predicted from several of the cDNAs published before the present report (44-46), although its existence had not actually been demonstrated. Comparison of eIF4G-1e to eIF4G-1f by immunoreactivity with two different antibodies indicates that eIF4G-1e is less abundant (Fig. 2 and data not shown).
Both peptide matching and fragmentation data support the assignment of Band 3 as an Nα-acetylated form of eIF4G-1c. In digests with trypsin, Lys-C, and Asp-N, we detected the predicted N-terminal peptides and their Met-oxidation products, viz. Nα-acetylated Peptide 5C2, Nα-acetylated Peptide k2C3, and Nα-acetylated Peptide d4C3, respectively. These peptides were found in Band 3 but not other bands. Finally, the sequence of Nα-acetylated Peptide 5C2 was established by MS fragmentation. Based on the agreement between Coomassie Blue staining and immunoreactivity in comparison with Band 1, Band 3 appears to be homogeneous for eIF4G-1c. This form of eIF4G-1 is clearly present in K562 cells but is less abundant than eIF4G-1f and -1e (Fig. 2). eIF4G-1c is another novel isoform, not previously predicted from cDNA sequences.
Additional forms of eIF4G-1 may exist but were too low in abundance to be identified in the present study. Several peptides unique to eIF4G-1a were detected in Band 4, but we do not judge the evidence to be compelling. It is interesting to note that many bands besides Bands 1-4 react with anti-Peptide 7 antibodies (Fig. 2, lane 4). One might consider them to represent nonspecific binding of the antibodies to non-eIF4G-1 proteins except for one additional feature: they are not present in a preparation pretreated with 2A protease (Fig. 2, lane 3). The 2A protease of entero- and rhinoviruses is quite specific for a consensus aa sequence (38,52,56,57). A few cellular proteins other than eIF4G are cleaved by 2A proteases (58-60), but two-dimensional electrophoresis reveals that the overwhelming majority of cellular proteins are unaffected by picornavirus infection (61). The likelihood that a non-eIF4G-1 protein would both react with anti-Peptide 7 antibodies and also be a substrate for 2A protease is quite remote. Thus, the numerous weak bands detected in anti-Peptide 7 immunoblots (Fig. 2, lane 4) may represent additional eIF4G-1 isoforms.
While there may well be additional isoforms of eIF4G-1, it is unlikely that there are any larger than eIF4G-1f. Even if longer cDNAs are found, three considerations make it unlikely that they will encode proteins containing an extension of the eIF4G-1f polypeptide sequence. First, we have shown that the structure of the major form (eIF4G-1f) is initiated from the AUG that is furthest upstream of the known cDNAs. Second, there are termination codons, both in-frame and out of frame, upstream of this AUG. It is possible that other forms of eIF4G-1 may be encoded by mRNAs arising by alternative splicing that contain different N-terminal aa sequences, e.g. AAC78442 (Fig. 1). However, we detected no peaks corresponding to either eIF4G-3 (4) or AAC78442. Third, eIF4G-1f is the principal constituent of Band 1, which is the slowest migrating immunoreactive band (Fig. 2). Drawing conclusions about eIF4G structure from electrophoretic mobility is complicated by the fact that the migration of eIF4G-1 on SDS-PAGE is aberrantly slow. The slowest eIF4G-1 isoform migrates at 220 kDa and the fastest, at 205 kDa (62), yet the calculated molecular masses range from 176 to 155 kDa (Table I). Thus, something besides the additional aa sequence reported here must account for the disparity. Despite the aberrant mobility, we observed that the order of bands follows the molecular mass of the isoforms, e.g. eIF4G-1f is both the largest form (Table I) and the slowest eIF4G band (Fig. 2), etc. Larger forms of eIF4G-1 may exist but not be purified by the method used in this study; it is worth noting that our method of affinity purification on m7GTP-Sepharose would exclude any hypothetical eIF4G-1 isoforms that lack an eIF4E-binding site such as the eIF4G homologue eIF4G-2 (3,7,8).
Although it is clear that the various isoforms of eIF4G-1 contain different N termini, the origin of these proteins cannot be determined from the results presented. Several formal possibilities can be considered. The first is alternative translation initiation of the same mRNA by leaky scanning. The optimal consensus sequence for initiation in animals is A/GXXAUGG (63). A "strong" initiation codon is considered to be one containing either the purine at −4, the G at +1, or both. The corresponding sequences for the various hypothetical eIF4G-1 isoforms are: eIF4G-1f, AAAAUGA; eIF4G-1e, CAAAUGA; eIF4G-1d, GUAAUGA; eIF4G-1c, AUGAUGA; eIF4G-1b, UUGAUGA; and eIF4G-1a, AUCAUGU. Thus, the strong initiation codons are in mRNAs for eIF4G-1f, -1d, -1c, and -1a, perhaps explaining the absence of eIF4G-1b. The second possibility for the multiplicity of eIF4G-1 isoforms is internal initiation of translation. Two separate sequences derived from the 5′-portion of eIF4G-1 mRNA can function as internal ribosome entry sites in cultured cells (45,64,65). A single mRNA corresponding to the longest cDNA reported to date (this report; 5510 nt) could be initiated from the 5′-end, giving rise to eIF4G-1f, and internally, giving rise to eIF4G-1c. The nucleotide sequence representing the 5′-untranslated region of an mRNA that would encode eIF4G-1d has already been shown to direct internal initiation of translation (45). This may provide a mechanism for eIF4G-1c expression, allowing for the N-terminal processing of the nascent polypeptide (removal of Met-88 and Nα-acetylation of Met-89). Finally, there may be several mRNAs differing in their 5′-terminal sequences that encode different eIF4G-1 isoforms, some of which exclude upstream AUG codons. These could arise from alternative splicing (e.g. NP_004944 and AAC78442) or alternative promoter usage. Resolution of this question will require an investigation of the structure of the eIF4G-1 mRNA(s) present in the cell and their translational properties.
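The classification of initiation contexts in the leaky-scanning argument can be retraced with a small sketch. It applies the rule stated in the text to the seven-nucleotide windows listed there: a codon is called strong if the first position of the window is a purine, if the position immediately following AUG is G, or both. The function and dictionary names are hypothetical, and the sketch says nothing about how efficiently any of these codons is actually used in cells.

```python
# Seven-nucleotide contexts (three upstream nt, AUG, one downstream nt) from the text.
CONTEXTS = {
    "eIF4G-1f": "AAAAUGA",
    "eIF4G-1e": "CAAAUGA",
    "eIF4G-1d": "GUAAUGA",
    "eIF4G-1c": "AUGAUGA",
    "eIF4G-1b": "UUGAUGA",
    "eIF4G-1a": "AUCAUGU",
}


def is_strong(context):
    """Apply the A/GXXAUGG rule: upstream purine, downstream G, or both."""
    if len(context) != 7 or context[3:6] != "AUG":
        raise ValueError(f"expected NNNAUGN, got {context!r}")
    upstream_purine = context[0] in "AG"
    downstream_g = context[6] == "G"
    return upstream_purine or downstream_g


strong = [name for name, ctx in CONTEXTS.items() if is_strong(ctx)]
weak = [name for name, ctx in CONTEXTS.items() if not is_strong(ctx)]
print("strong:", strong)
print("weak:  ", weak)
```

Applied to the listed sequences, the rule marks the contexts of eIF4G-1f, -1d, -1c, and -1a as strong and those of eIF4G-1e and -1b as weak. In every strong case the call rests on the upstream purine alone, since none of the six contexts has G following the AUG.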
The fact that we never observed peptides diagnostic for slower bands in faster bands argues against the faster bands representing proteolytic breakdown products, or polypeptides missing internal exons. The same conclusion can be drawn from the absence of any bands running faster than Bands 1 and 2 that bind anti-Peptide 10 antibodies (Fig. 2).
We found a surprisingly large number of ligands co-purifying with cp N (Fig. 4). This type of analysis does not indicate whether these ligands are bound to the cp N fragment of all eIF4G-1 isoforms or to only a subset. These ligands include PABP and hsp70, which is consistent with previous studies showing these to be eIF4G-binding proteins (10,11,26). Less expected is the presence of the p110 and p36 subunits of eIF3. It is known that 11-subunit eIF3 has a high affinity for eIF4G, but the binding site has been localized in cp C, not cp N (14,19). The association of eIF4G with the p110 and p36 subunits may mean that there are additional points of attachment in cp N. The presence of a putative DEAD-box protein and an RNA helicase in cp N is intriguing, since eIF4A, a different member of the DEAD-box RNA helicase family, is well established to bind eIF4G-1 at two distinct sites in cp C (14,20,21).
At present we have not determined whether there are differences in the biochemical properties of the different eIF4G-1 isoforms. Since eIF4G is a protein that binds an unusually large number of protein and RNA ligands, the presence of different N-terminal sequences may direct the binding of isoform-specific ligands. The various eIF4G isoforms may have different affinities for ribosomes or initiation factors that link them to ribosomes (e.g. eIF3), opening the possibility that eIF4G-1 isoforms participate in mRNA recruitment to different extents. The only study explicitly comparing eIF4G-1 isoforms showed that transfection of cDNAs expressing eIF4G-1e and eIF4G-1a gave similar growth efficiencies in soft agar, an assay for malignant transformation (66). Yet no in vivo expression studies have been performed to date on eIF4G-1f, the longest and most abundant form in K562 cells.