Volume 272, Number 43, Issue of October 24, 1997 pp. 27042-27052
©1997 by The American Society for Biochemistry and Molecular Biology, Inc.

Structure of cDNAs Encoding Human Eukaryotic Initiation Factor 3 Subunits
POSSIBLE ROLES IN RNA BINDING AND MACROMOLECULAR ASSEMBLY*

(Received for publication, July 10, 1997)

Katsura Asano‡§, Hans-Peter Vornlocher‡, Nancy J. Richter-Cook, William C. Merrick, Alan G. Hinnebusch§, and John W. B. Hershey‡‖

From the ‡Department of Biological Chemistry, School of Medicine, University of California, Davis, California 95616, the Department of Biochemistry, Case Western Reserve University, School of Medicine, Cleveland, Ohio 44106, and the §Laboratory of Eukaryotic Gene Regulation, NICHD, National Institutes of Health, Bethesda, Maryland 20892


ABSTRACT

The mammalian translation initiation factor 3 (eIF3) is a multiprotein complex of ~600 kDa that binds to the 40 S ribosome and promotes the binding of methionyl-tRNAi and mRNA. cDNAs encoding 5 of the 10 subunits, namely eIF3-p170, -p116, -p110, -p48, and -p36, have been isolated previously. Here we report the cloning and characterization of human cDNAs encoding the major RNA binding subunit, eIF3-p66, and two additional subunits, eIF3-p47 and eIF3-p40. Each of these proteins is present in immunoprecipitates formed with affinity-purified anti-eIF3-p170 antibodies. Human eIF3-p66 shares 64% sequence identity with a hypothetical Caenorhabditis elegans protein, presumably the p66 homolog. Deletion analyses of recombinant derivatives of eIF3-p66 show that the RNA-binding domain lies within an N-terminal 71-amino acid region rich in lysine and arginine. The N-terminal regions of human eIF3-p40 and eIF3-p47 are related to each other and to 17 other eukaryotic proteins, including murine Mov-34, a subunit of the 26 S proteasome. Phylogenetic analyses of the 19 related protein sequences, called the Mov-34 family, distinguish five major subgroups, where eIF3-p40, eIF3-p47, and Mov-34 are each found in a different subgroup. The subunit composition of eIF3 appears to be highly conserved in Drosophila melanogaster, C. elegans, and Arabidopsis thaliana, whereas only 5 homologs of the 10 subunits of mammalian eIF3 are encoded in S. cerevisiae.


INTRODUCTION

The initiation phase of eukaryotic protein synthesis is promoted by at least 10 soluble proteins called eukaryotic initiation factors (eIFs)¹ (1). The largest of these, eIF3, is a multiprotein complex of ~600 kDa that plays a central role in the initiation pathway. eIF3 binds to 40 S ribosomal subunits in the absence of other initiation factors and helps maintain 40 and 60 S ribosomal subunits in a dissociated state. It is also believed to play important roles in the formation of the 40 S initiation complex by interacting with the ternary complex of eIF2·GTP·Met-tRNAi and in promoting mRNA binding (2, 3). More specifically, eIF3 interacts with eIF4G, the largest subunit of the mRNA cap-binding protein complex (4), and with eIF4B (5).

Mammalian eIF3 consists of at least 10 nonidentical subunits: p170, p116, p110, p66, p48, p47, p44, p40, p36, and p35 (6). Human cDNAs encoding p170 (7), p116 (8), p110 (9), p48,² and p36 (9) have been cloned and characterized recently. Through knowledge of the sequence of these subunits and from experiments involving their cDNAs, new insights into the structure and function of eIF3 have emerged. p170 is the site of binding of eIF3 to eIF4B, thus being implicated as playing an important role in the formation of higher order complexes of initiation factors (5). p116 has a prominent RNA recognition motif (RRM) near the N terminus that interacts with p170 (8), but it is not known if it binds to RNA. p110 is homologous to the yeast NIP1 protein (9), whose gene was isolated genetically as being involved in protein import into the nucleus (10). Although NIP1 was not found in preparations of yeast eIF3 isolated by classical biochemical fractionation techniques (11), it is present in eIF3 complexes isolated rapidly by using a polyhistidine-tagged subunit.³ The p48 subunit is identical to a mouse protein called Int-6, whose gene is a frequent site of integration by mouse mammary tumor viruses (12). This suggests that eIF3-p48 may control eIF3 activity and thus translation and thereby possibly affect growth regulation. p36 has a prominent WD-40 repeat signature that implies a role in formation of a multiprotein complex, an idea strengthened by the observation that its yeast homolog, eIF3-p39, is essential for the stability and maintenance of the yeast eIF3 complex (13). Of the five subunits whose cDNAs have been cloned and sequenced to date, all but p48 have homologous genes in Saccharomyces cerevisiae.

We have therefore turned our attention to the cloning and characterization of cDNAs encoding the remaining five subunits of eIF3. The rapidly growing data base of partial cDNA sequences, called expressed sequence tags (ESTs), is liberating researchers from cDNA screening with either antibodies or radiolabeled DNA probes. Rather, one can search the EST data base for matches to partial amino acid sequences derived from the protein of interest. In this report, we describe the isolation and sequencing of cDNAs encoding the p66, p47, and p40 subunits of eIF3 primarily by using this strategy. The cloning and characterization of p44 and p35 cDNAs is in progress and will be reported elsewhere.


EXPERIMENTAL PROCEDURES

eIF3 was prepared from human HeLa cells and rabbit reticulocytes essentially as described previously (14, 15). The polyclonal antiserum against rabbit eIF3 was prepared in a goat and characterized previously (16). Standard recombinant DNA techniques (17) were used. Affinity purification of anti-eIF3 antibodies and methods for immunoblotting and immunoprecipitation were described previously (9).

Cloning of cDNAs Encoding eIF3-p66, -p47, and -p40

Internal peptide sequences of the p66, p47, and p40 subunits of rabbit reticulocyte eIF3 were obtained by trypsin digestion of the corresponding protein obtained by SDS-PAGE, HPLC purification, and sequencing, all essentially as described (18). Internal peptide sequences of HeLa eIF3-p47 were obtained after Lys-C digestion in the gel, HPLC fractionation, and sequencing at the Protein Structure Laboratory (University of California, Davis). The peptide sequences are reported in Table I. Sequences in the EST data base that match the internal amino acid sequences were sought in the National Center for Biotechnology Information data base by using the BLAST (GCG, Madison, WI) program.

Table I. Peptide sequence analyses of eIF3 subunit polypeptides


eIF3 subunit | Peptide sequencesᵃ             | Sourceᵇ | ESTᶜ   | Matchᵈ | Similar proteinᵉ     | Size of mRNAᶠ (kb)
p66          | (E/Q/D)GNVXAT(E/D)AVLATLMSX(T) | Retic.  | T56803 | 15/17  | R08D7_3 (C. elegans) | 1.9
p47          | AYVSTHMGVPGR                   | Retic.  | T39801 | 10/12ᵈ | Mov-34 (mouse)       | 1.3
             | MGVMFYPLTVK                    | Retic.  |        |        |                      |
             | SVE(V)X(NDF)                   | HeLa    |        |        |                      |
p40          | GEPSL(E)EE(DL)S                | Retic.  | Z20189 | 9/11   |                      | 1.3
             | LFKPPQPPAR                     | Retic.  | Z20189 | 10/10  |                      |
             | FMAQALQEYN                     | Retic.  |        |        |                      |
             | LL(Q/A)LLMDR                   | Retic.  |        |        |                      |
             | ANITF(F)                       | Retic.  |        |        |                      |

a Ambiguous residues are in parentheses, with alternative possibilities indicated following the slash. X represents any amino acid.
b Source of eIF3 employed for peptide sequencing was either rabbit reticulocytes (Retic.) or human HeLa cells (HeLa).
c GenBankTM accession number of EST deposited earliest as encoding each subunit peptide sequence.
d Number of residues encoded by the EST matching to each peptide sequence. The last two residues of the p47 sequence did not match because of a frame shift error in the EST sequences.
e According to the description attached to the sequence information. The origin of the protein is in parenthesis.
f The size of mRNA detected by Northern blot of HeLa poly(A) mRNA with labeled EST DNA.
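
The notation described in footnote a can be made concrete with a short sketch that converts a Table I peptide into a regular expression. This is illustrative only: the searches reported here were run with BLAST, not by pattern matching. The function name `peptide_to_regex` is hypothetical, and treating a parenthesized residue without a slash as a wildcard is an assumption of the sketch, not a rule stated in the table.

```python
import re


def peptide_to_regex(peptide: str) -> str:
    """Convert Table I peptide notation into a regular expression.

    Assumed reading of the notation (footnote a):
      X        -> any residue
      (A/B)    -> one position, either A or B
      (AB)     -> tentative calls; each position is treated as a wildcard here
    """
    parts = []
    # Tokens are either a parenthesized group or a single residue letter.
    for token in re.findall(r"\([^)]*\)|[A-Z]", peptide):
        if token == "X":
            parts.append(".")
        elif token.startswith("("):
            inner = token[1:-1]
            if "/" in inner:
                parts.append("[" + inner.replace("/", "") + "]")
            else:
                parts.append("." * len(inner))
        else:
            parts.append(token)
    return "".join(parts)


pattern = peptide_to_regex("LL(Q/A)LLMDR")
print(pattern)
print(bool(re.fullmatch(pattern, "LLALLMDR")))
```

A pattern of this kind only tests exact agreement at the unambiguous positions; the Match column of Table I instead reports how many residues of each peptide agree with the translated EST.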

eIF3-p66

The rabbit peptide sequence from p66 matches 15 of 17 residues in the human EST clone 67165 (accession number T56803). Clone 67165 DNA inserted at the EcoRI/XhoI sites of pBluescript SK+ was obtained from the Lawrence Livermore National Laboratory (Livermore, CA), was sequenced on both strands, and was renamed pBS67165. The 1202-bp insert encodes the 347 C-terminal amino acid residues of putative p66 and exhibits 53% sequence identity to a portion of a 64.2-kDa Caenorhabditis elegans protein encoded by cosmid clone R08D7_3 (accession number Z12017). The amino acid sequence of R08D7_3 protein was then used in additional BLAST searches of the EST data base, and four overlapping human ESTs (T3564, T32034, T32020, T32801) were found that encode related proteins with 40-53% sequence identity to the N-terminal region of R08D7_3. One of these clones, 102366 (T32020), with the DNA inserted into the EcoRI/XhoI sites of pBluescript SK+, was purchased from ATCC. The plasmid was named pBSp66, and the 1867-bp cDNA insert was sequenced on both strands by generating deletions with HindIII, which cleaves the insert both at 0.5 and 0.75 kb from the 5'-end, and by using custom-made primers. The 3'-terminal portion of the sequence matches perfectly that of clone 67165, and the full-length cDNA encodes a putative protein with a mass of 63.9 kDa. The cDNA sequence in pBSp66 was deposited in GenBankTM as human eIF3-p66 cDNA (accession number U54558).

eIF3-p47

Two rabbit peptide sequences matched the human EST clone 61212 (accession number T39801) at 10 of 12 and at 9 of 11 residues. The DNA was obtained from the Lawrence Livermore National Laboratory (Livermore, CA) and was sequenced on both strands. The 797-bp insert encodes a C-terminal open reading frame (ORF) with 214 amino acid residues and was used to screen a lambda ZAP human liver cDNA library (Stratagene). Two positive clones were isolated from 6 × 10⁵ phages and the cDNA portions were excised in vivo. One, named pBSp47-17, carries a 1211-bp insert, which was sequenced on both strands by making BclI deletions (BclI cleaves the cDNA at 0.7 kb from the 5'-end) and by using custom-made primers. pBSp47-17 contains a C-terminal 352 codon ORF but lacks an apparent ATG initiation codon. To extend the sequence further upstream, 5'-rapid amplification of cDNA ends was attempted, but it added only four additional base pairs. Sequences at the 5'-end of pBSp47-17 were used to search again the EST data base. A recently deposited HeLa EST (AA179721) that overlaps pBSp47-17 with a perfect match extends the 5'-sequence by 30 bp and contains an in frame ATG codon followed by four new codons. Another human EST, AA158173, from pancreas DNA, extends the pBSp47-17 sequence by only four codons just up to the ATG and matches perfectly the sequence in AA179721. The validity of the AA179721 sequence was strengthened by the finding that two independent mouse ESTs for p47 encode an N-terminal amino acid sequence nearly identical to the human sequence (murine MASPA; human MATPA) and a similar 5'-UTR sequence: GCAAAG ATG GCT TCT CCG GCC (corresponding to nucleotides 1-21 in accession number U94855; bases shown in boldface type differ from the human sequence). The DNA sequence in pBSp47-17 plus the 21-bp 5'-extension that overlaps the mouse sequence, was deposited in GenBankTM as accession number U94855. A cDNA encoding full-length p47 was constructed as described below.

eIF3-p40

Of five peptide sequences derived from rabbit p40 (Table I), two match well (9/10 and 10/10) a 102-bp human EST (Z20189). Primers corresponding to the 5'- and 3'-regions of the EST were synthesized and used to amplify by PCR the DNA from a lambda ZAP human liver cDNA library (Stratagene). The resulting 115-bp DNA was subcloned into the EcoRI/HindIII sites of pSP73 (Promega) to generate pSPp40-115, which was sequenced to confirm its identity to the 102-bp human EST. The radiolabeled 115-bp fragment was used to screen 6 × 10⁵ phages in the lambda ZAP human liver cDNA library used above, and one positive phage was purified. After excision in vivo, pBSp40-5 was obtained that carries a 1245-bp cDNA that lacks a suitable ATG initiation codon (see "Results" below). The cDNA sequence in pBSp40-5 was determined on both strands with deletion derivatives made with restriction enzymes SphI and SpeI, which cleave the cDNA at 0.4 and 0.8 kb from the 5'-end, respectively, and it was used to search the EST data base. Eight human ESTs were found to overlap pBSp40-5 and extend its 5'-sequence: W72146, T33675, T31050, T34146, T36262, T35271, T34243, and N56412. All overlap one another and are identical in the overlapping regions, except for an insertion of C following T-7 and a deletion of C-17 in W72146 and the deletion of G-20 in N56412, all of which were judged to be errors in the EST sequences. The EST T34243 extends the sequence in pBSp40-5 by 36 bp, GAAAG ATG GCG TCC CGC AAG GAA GGT ACC GGC TCT A, and contains an in frame ATG (underlined). The T34243 sequence overlaps and matches that of the longer W72146 sequence (except as noted above), confirming that it is a true extension of pBSp40-5. Further confirmation comes from six mouse ESTs (W92967, W54757, W48250, W98176, W49960, and W33557) that contain the homologous sequence, GAAAC ATG GCG TCG CGC AAG GAA GGC ACC GG(C/T) TC(T/C) A (bases not identical between the human and mouse sequences are shown in boldface type).
The N-terminal amino acids derived from the mouse ESTs are identical to those encoded by the human sequence. To construct a cDNA encoding full-length p40, PCR amplification of pBSp40-5 was performed with the upstream primer 5'-CCCGAATTCCAT ATG GCG TCC CGC AAG GAA GGT ACC GGC TCT ACT GCC ACC TCT TCC AG-3' (tagged with EcoRI and NdeI sites and with the coding sequence in pBSp40-5 underlined) and a primer that contains the T7 promoter sequence and anneals downstream of the pBSp40-5 insert. The amplified DNA was cleaved with EcoRI and SphI, and the resulting 0.4-kb fragment was used to replace the corresponding EcoRI-SphI fragment in pBSp40-5 to yield pBSp40N.
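
The comparison of the human and mouse 5'-extensions of p40 can be reproduced with a minimal sketch. It translates the first ten codons of each sequence as quoted above using the standard genetic code and lists the nucleotide positions at which they differ. The helper names (`translate`, `mismatches`) are hypothetical, and the two mouse variants simply expand the ambiguous positions GG(C/T) and TC(T/C) in the order written.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) codon by codon."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def mismatches(a: str, b: str) -> list:
    """1-based positions at which two equally long sequences differ."""
    a, b = a.replace(" ", ""), b.replace(" ", "")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Coding portions of the sequences quoted in the text, starting at the ATG.
HUMAN = "ATG GCG TCC CGC AAG GAA GGT ACC GGC TCT"
MOUSE = "ATG GCG TCG CGC AAG GAA GGC ACC GGC TCT"      # first alternatives
MOUSE_ALT = "ATG GCG TCG CGC AAG GAA GGC ACC GGT TCC"  # second alternatives

print(translate(HUMAN), translate(MOUSE), translate(MOUSE_ALT))
print(mismatches(HUMAN, MOUSE))
```

In this window the nucleotide differences fall at third codon positions, which is why the encoded peptides remain identical.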

Northern Blot Analysis

Total RNA was isolated from exponentially growing HeLa cells (15). Poly(A)⁺ RNA was isolated from total RNA, and Northern blotting was conducted as described previously (8) with radioactive probes derived from pBS67165 (eIF3-p66; 0.35-kb EcoRI-NcoI fragment), pBS61212 (eIF3-p47; 0.35-kb EcoRI-BclI fragment), and pSPp40-115 (eIF3-p40; the 115-bp PCR DNA). Hybridizing RNAs were visualized by autoradiography.

Synthesis of eIF3 Subunits in Escherichia coli and in Rabbit Reticulocyte Lysates

For expression of eIF3-p66 cDNA in bacteria, pBSp66N was constructed as follows. A 0.95-kb EcoRI-NcoI fragment was amplified by PCR from pBSp66 with primer KAP66N (5'-CCCGAATTCCAT ATG GCA AAG TTC ATG ACA CC-3'; tagged with EcoRI and NdeI sites and with the initiator ATG of the p66 ORF underlined) and a T7 primer that anneals to the vector downstream from the 3'-end of the insert. The resulting amplified DNA was sequenced and then used to replace the 0.35-kb EcoRI-NcoI fragment of pBS67165 to generate pBSp66N. To construct pT7p66, the 1.8-kb NdeI-XhoI fragment of pBSp66N was cloned into the NdeI and SalI sites of pT7-7 (20). For p47 synthesis, pT7p47-17 was first constructed by subcloning the entire 1.2-kb EcoRI-XhoI cDNA insert of pBSp47-17 into the EcoRI and SalI sites of pT7-7. The plasmid encodes an altered form of p47 whose N-terminal five amino acids are replaced with MARTRHD encoded by the vector and by the linker sequence employed for generation of the liver cDNA library. The N-terminal sequence was corrected by PCR amplification of pBSp47-17 DNA with the primers 5'-CCCGAATTCCAT ATG GCC ACA CCG GCG GTA CCA GTA AGT GCT CCT C-3' (tagged with EcoRI and NdeI sites and with the region corresponding to the pBSp47-17 sequence underlined) and 5'-GCT GTC CAC AAT GGA GGC C-3', which anneals to the p47 cDNA at residues 321-303 downstream from a NotI site at residue 269 (numbered as in accession number U94855). The resulting 0.3-kb fragment was digested with NdeI and NotI and used to replace the 0.25-kb NdeI-NotI fragment of pT7p47-17 to generate pT7p47. For p40 synthesis, pT7p40 was constructed by cleaving pBSp40N with NdeI and XhoI and subcloning the 1.3-kb fragment into the NdeI-SalI sites of pT7-7.

pT7p66, pT7p47, and pT7p40 were linearized by digestion with ClaI, HindIII, and HindIII, respectively, and used as templates for in vitro transcription/translation (TnT system, Promega) with [³⁵S]methionine. The radioactive translation products were analyzed by SDS-PAGE (19) alongside a lane containing purified human eIF3. The gel was stained with Coomassie Blue and subjected to autoradiography.

Construction of p66 Deletion Mutants

To construct pGEXp66 encoding a GST-p66 (full-length) fusion protein, the 1.8-kb EcoRI-XhoI fragment of pBSp66N was subcloned into the EcoRI and SalI sites of pGEX-4T-1 (Pharmacia Biotech Inc.). To make N-terminal deletions of the eIF3-p66 portion of the fusion protein, we synthesized the following two upstream primers: 5'-GGGAATTCTG GCG CGC ACA CAG AAG ACG-3' (tagged with an EcoRI site and with the sequence corresponding to nucleotides 326-343 of the eIF3-p66 cDNA sequence underlined) and 5'-GGGAATTCTG CAA TTT GGG GTT AGG CAG-3' (tagged with an EcoRI site and with the sequence corresponding to nucleotides 497-514 underlined). DNA fragments encoding different parts of eIF3-p66 were amplified with one of these two primers and with a downstream primer (5'-TCT CAA GCA CTG CTG GG-3'), corresponding to nucleotides 1057-1041 of eIF3-p66 cDNA. The PCR products were digested with EcoRI and NcoI, and the resulting 0.7- and 0.5-kb DNA fragments were used to replace the 1.0-kb EcoRI-NcoI fragment of pBSp66 to produce pBSp66-D1 and pBSp66-D2, respectively. The truncated reading frames encoded in the 1.5- and 1.3-kb EcoRI-XhoI fragments of these plasmids were subcloned into pGEX-4T-2 (Pharmacia) to generate pGEXp66-D1 and pGEXp66-D2, respectively. To construct C-terminal deletions, we employed PvuII, PstI, EarI, and Bsp120I, which cleave the p66 cDNA in pBSp66N at positions 311, 425, 553, and 660, respectively. The 0.3-kb EcoRI-PvuII fragment, the 0.35-kb EcoRI-PstI fragment (with the 3'-overhang produced by PstI removed by mung bean nuclease), and the 0.4-kb EcoRI-EarI fragment (with the 5'-overhang produced by EarI filled in with Klenow enzyme) were subcloned into the EcoRI and SmaI sites of pGEX-4T-1 to generate pGEXp66ΔP2, pGEXp66ΔP1, and pGEXp66ΔE, respectively. To construct pGEXp66ΔB, the 0.6-kb EcoRI-Bsp120I fragment of pBSp66N was subcloned into the EcoRI and NotI sites of pGEX-4T-1.

Northwestern Blot Determination of RNA Binding Activity

pT7p66 was transformed into strain BL21(DE3) (Novagen) containing the isopropyl-1-thio-β-D-galactopyranoside-inducible T7 RNA polymerase gene, and expression of p66 was induced by the addition of isopropyl-1-thio-β-D-galactopyranoside, as recommended by the manufacturer. To analyze the binding of RNA to recombinant eIF3-p66 by Northwestern analysis, transformed BL21(DE3) lysate proteins were fractionated by SDS-PAGE and transferred to a nitrocellulose membrane (BA085, Schleicher and Schuell). ³²P-labeled β-globin mRNA was incubated with the membrane (Northwestern blotting) as described previously (20).

A full-length GST-p66 fusion protein and its deletion derivatives were expressed in strain BL21(DE3) and purified by affinity chromatography as recommended by the manufacturer (Pharmacia). In the case of GST-p66-D2, the purified fraction contained a single polypeptide of the expected size (73.7 kDa) and was isolated with a yield of 0.5 mg/liter of E. coli culture. The other purified proteins were of the expected size but were obtained in lesser yields, along with numerous smaller polypeptides, presumably a result of degradation. We scanned Coomassie-stained gels and used NIH Image software to quantitate the amounts of the expected products, which varied between 0.25 and 0.005 mg/liter of E. coli culture. To conduct Northwestern analysis of these GST-p66 proteins, the same molar amount of each full-length or deletion form was subjected to SDS-PAGE and blotted with ³²P-labeled β-globin mRNA as described above.

cDNA Sequences Generated from the EST Data Base

To conduct phylogenetic analyses of the Mov-34 family in the data base, the following three cDNA sequences were assembled from EST sequences obtained entirely from the data base. Although the resulting sequences are hypothetical, we checked the validity of each one by two criteria: (i) multiple independent occurrence of overlapping EST sequences and (ii) high similarity between the deduced human and mouse polypeptide sequences.

Human 34-kDa Mov-34 Homolog

During a BLAST search of sequences related to the human p47 sequence in the EST data base, we found four related ESTs showing a good match at the N-terminal conserved region (regions II-IV in Fig. 5B). They are T32502 (human), T17206 (human), W71047 (mouse), and W98454 (mouse). More ESTs were sought that overlap these ESTs. The correct ORF was deduced by comparing the human sequences with the mouse ones. Thus, four human ESTs, T17206 (1-350; N13 was deleted, N119 is A according to W98545), W22816 (71-321; C inserted after C277 based on W44355), T88904 (439-320; C408 deleted based on W32482), and W45708 (340-1; N328 is T based on H92903), which contain fewer apparent errors, were edited to make a continuous 1042-bp cDNA sequence (deposited in GenBankTM as accession number U70735). This encodes a 297-amino acid protein of 33,553 Da. The ORF (nucleotides 44-937) was confirmed by a number of overlapping murine ESTs that encode a nearly identical peptide sequence. The proposed initiator ATG is surrounded by GTGATGG, a good consensus sequence (21). Nucleotides 1-25 do not match either murine or rat sequences despite overlapping with other human ESTs, supporting the initiation codon assignment at nucleotides 44-46. The 3'-UTR contains a polyadenylation signal, AATAAA.
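
The judgment that an initiator ATG lies in a good sequence context can be sketched as a simple check of the 7-nucleotide window around it. The sketch assumes that the consensus of Ref. 21 can be reduced to its two most influential positions, a purine at -3 and a G at +4 (with the A of ATG as +1); that simplification, the function name `start_context`, and the three labels are assumptions made for illustration and are not part of the original analysis.

```python
def start_context(window: str) -> str:
    """Classify a 7-nt window NNN-ATG-N around a candidate start codon.

    Simplified rule (an assumption of this sketch):
      purine at -3 and G at +4 -> "strong"
      only one of the two      -> "intermediate"
      neither                  -> "weak"
    """
    seq = window.upper().replace("U", "T")
    if len(seq) != 7 or seq[3:6] != "ATG":
        raise ValueError("expected 7 nt with ATG at positions 4-6")
    purine_at_minus3 = seq[0] in "AG"
    g_at_plus4 = seq[6] == "G"
    if purine_at_minus3 and g_at_plus4:
        return "strong"
    if purine_at_minus3 or g_at_plus4:
        return "intermediate"
    return "weak"


# Context reported for the human 34-kDa Mov-34 homolog.
print(start_context("GTGATGG"))
```

Under this simplified rule, the window GTGATGG satisfies both positions, in line with its description above as a good consensus sequence.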


Fig. 5. Protein sequences and alignments of eIF3-p47, eIF3-p40, and related proteins. A, the human eIF3-p47 sequence aligned with its C. elegans and A. thaliana homologs. The complete amino acid sequence of eIF3-p47 (GenBankTM accession number U94855) is aligned with its C. elegans and A. thaliana homologs, CeD2013_7 (GenBankTM accession number Z47808) and Ath 32-kDa Mov34 homolog (GenBankTM accession number U54561; see "Experimental Procedures"). The alignment was conducted with the Pileup program (GCG). The conserved regions named I-VI (see Fig. 6) are shaded in gray. Sequences that match the three partial peptide sequences determined with rabbit and HeLa eIF3 are underlined, and identical matched residues are in boldface type. B, human eIF3-p40 sequence aligned with part of the human eIF3-p47 sequence. The complete amino acid sequence of eIF3-p40 (GenBankTM accession number U54559) is shown and is aligned with a portion of the human eIF3-p47 sequence (residues 90-223). The alignment was conducted with the Bestfit program (GCG). Identical residues are boxed. The eIF3-p40 sequence is 27% identical and 40% similar to that of p47 in this 140-amino acid-long region. The conserved regions named I-VI (see Fig. 6) are shaded in gray. The regions that match the five partial peptide sequences determined with rabbit eIF3 (Table I) are underlined, and identical matched residues are in boldface type.



Mouse and Human 38-kDa Mov-34 Homologs

During a BLAST search for C. elegans F37A4_5 homologs in the EST data base, we found nine ESTs with a good match. They are N50270, W92668, AA004500, T33527, N31767, and T33374 from humans, and AA009202, W13206, and W87953 from mice. More ESTs were sought that overlap these ESTs. The human and mouse sequences encode nearly identical proteins, despite a number of conservative base changes in the DNA sequences. In addition, less well conserved regions flanking the presumed coding regions suggest that they are untranslated regions. Thus, we generated a hypothetical murine cDNA sequence from W87953 (1-448; G421 was deleted based on human AA004500), AA009202 (1-393), and W13206 (1-339; T inserted after T21 based on human N90861), which together make a continuous 1190-bp cDNA sequence (deposited in GenBankTM as accession number U70736). This encodes a 334-amino acid protein of 37,547 Da. The protein is named the mouse 38-kDa Mov-34 homolog.

To edit human ESTs encoding the 38-kDa homolog, six ESTs with fewer apparent errors were used to generate the cDNA sequence: T36298 (1-359), N31767 (347-407), AA004550 (46-111), W92668 (15-263; N62 is T, N168 is A, and G inserted after G184, all based on human N50270), N50615 (95-333), and N90861 (302-1). The cDNA sequence is 1277 bp long and encodes a 334-amino acid protein of 37,427 Da (deposited in GenBankTM as accession number U70734). The human 38-kDa homolog is 98.5% identical to the mouse homolog, although the coding region is only 91.5% identical at the DNA sequence level. This confirms the authenticity of the coding regions both for the mouse and human homologs. The 3'-UTR of the human sequence contains a polyadenylation signal, AATAAA.

Arabidopsis thaliana cDNAs Encoding Mov-34 Homologs

cDNA clones 115I17 (accession number T43107) and 139G4 (accession number T46657) that encode A. thaliana Mov-34 homologs were kindly provided by the Arabidopsis Biological Resource Center (Columbus, OH). The cDNA inserts (1.3 kb for 115I17 and 1.2 kb for 139G4) were cloned into SalI (5') and NotI (3') sites of pZL1 (Life Technologies, Inc). We made deletions with restriction enzymes and sequenced the resulting constructs. The enzymes used were as follows: for 139G4, SalI (cuts at 0.15 kb from the 5'-end) and HindIII (at 0.5 kb); for 139G4, EcoRI (at 0.35 kb) and HindIII (at 0.5 and 1.0 kb). Two custom-made primers were made for each clone, and the entire sequence was determined for both strands. The cDNA sequences of clones 115I17 and 139G4 were deposited in GenBankTM as accession numbers U54560 and U54561, respectively. They encode a 308-residue protein of 34,726 Da and a 293-residue protein of 31,860 Da, which we have named here A. thaliana 35- and 32-kDa Mov-34 homologs, respectively. The 5'-proximal ATG of 139G4 is preceded by an in frame TAA codon located 46 bp upstream and lies in a good sequence context (ACCATGG), supporting the view that it serves as the initiator codon.

Partial Mov-34 Homolog Peptides Used in Fig. 5

The following four additional Mov-34-related sequences were identified in the data base: C. elegans M79827 (translation of nucleotides 1-270; 53% identity to residues 187-246 of the human 26 S proteasome S12); C. elegans D74615 (translation of nucleotides 38-208; 61% identity to residues 35-83 of human eIF3-p40); A. thaliana N96623 (translation of nucleotides 418-80; 38% identity to residues 239-350 of human eIF3-p40); and C. elegans D75207 (translation of nucleotides 147-360, N301 was deleted to continue the reading frame; 68% identity to residues 49-125 of the human 38-kDa Mov-34 homolog).


RESULTS

Cloning and Characterization of Human eIF3-p66 cDNA

To clone eIF3-p66 cDNA, a partial peptide sequence was obtained from the 66-kDa subunit of highly purified rabbit eIF3 and was used to identify human DNA sequences (ESTs) in the data base as described under "Experimental Procedures." Briefly, a human EST encoding the peptide was found, and its sequence was used to identify a homologous DNA in C. elegans (R08D7_3) that encodes a 64.2-kDa protein. The C. elegans DNA was then used to identify four overlapping human ESTs, one of which (T32020) encodes a 64.0-kDa protein. This DNA in pBluescript SK+ was sequenced and renamed pBSp66.

pBSp66 contains a 1867-bp insert with a 1647-bp ORF that encodes a putative 63,932-Da protein of 548 amino acid residues. The DNA sequence is deposited in GenBankTM (accession number U54558); the amino acid sequence is shown in Fig. 1. The 5'-proximal AUG is surrounded by the sequence AAGAUGG, which compares favorably with the consensus sequence for strong initiator codons (21). It is preceded by a 70-bp 5'-UTR, which contains an in frame TAA termination codon, located 66 bases upstream from the 5'-proximal AUG. A second AUG codon occurs at the fifth codon downstream, but its consensus sequence (TTCAUGG) is rather weak and the AUG probably does not contribute significantly to initiation. The 3'-UTR has 133 nucleotides with a polyadenylation signal AAUAAA beginning at nucleotide 1821, followed by a string of 17 A residues. The cloned cDNA appears to be nearly full-length, since Northern blots (not shown) produce a single hybridization signal at 1.9 kb.
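
The lengths quoted for the pBSp66 insert can be cross-checked with a few lines of arithmetic. The sketch below assumes that the 1647-bp ORF includes the termination codon and that the 133-nucleotide 3'-UTR does not include the 17 terminal A residues; under those assumptions the parts sum to the reported 1867 bp. The function name `cdna_layout` is hypothetical.

```python
def cdna_layout(utr5: int, orf: int, utr3: int, poly_a: int) -> dict:
    """Derive 1-based coordinates and protein length from segment lengths.

    Assumes the ORF length includes the stop codon and that the 3'-UTR
    length excludes the poly(A) tail.
    """
    if orf % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    orf_start = utr5 + 1
    orf_end = utr5 + orf
    return {
        "orf_span": (orf_start, orf_end),
        "utr3_span": (orf_end + 1, orf_end + utr3),
        "residues": orf // 3 - 1,  # subtract the stop codon
        "total_length": utr5 + orf + utr3 + poly_a,
    }


layout = cdna_layout(utr5=70, orf=1647, utr3=133, poly_a=17)
print(layout)

# The polyadenylation signal is reported to begin at nucleotide 1821.
signal_start = 1821
in_utr3 = layout["utr3_span"][0] <= signal_start <= layout["utr3_span"][1]
print(in_utr3)
```

With these inputs the derived protein length is 548 residues and the polyadenylation signal falls within the computed 3'-UTR, consistent with the figures given above.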


Fig. 1. Aligned amino acid sequences of human eIF3-p66 and its C. elegans homolog. The human eIF3-p66 sequence in pBSp66 (GenBankTM accession number U54558) is aligned with the C. elegans R08D7_3 sequence (GenBankTM accession number S24459). The alignment was conducted with the Bestfit program (GCG). Identical residues are marked by a vertical line; similar residues are indicated by two dots and are defined as KR, VLI, DE, NQ, ST, and FYW. The region that matches the partial peptide sequence determined with rabbit eIF3 is underlined, and identical matched residues are in boldface type. The N-terminal and C-terminal ends of GSTp66 deletions defined in Fig. 4A are indicated by hooked arrows. The hydrophilic, highly basic segment that contains the RNA-binding region is highlighted by a gray box.



The DNA sequence of the pBSp66 insert was compared with other known human sequences by searching the EST data base. Identical or nearly identical (>95%) partial sequences were found in ESTs from different tissues as shown in Table II. Three independent ESTs (T32801, T31738, and T35950) in addition to T32020 (same as pBSp66) contain 5'-UTRs with an in frame termination codon upstream of the putative initiator codon, ruling out the possibility that an AUG codon further upstream might be the site of initiation. The frequency and wide distribution of ESTs matching pBSp66 indicate that p66 is probably expressed in all tissues as a fairly abundant protein, as observed for the eIF3-p110, -p48, and -p36 subunits (9).² Additional EST data base searches with the eIF3-p66 protein sequence show that eIF3-p66 is conserved in higher eukaryotes. Besides the C. elegans protein expressed from R08D7_3, eIF3-p66 is similar to proteins encoded by the plant A. thaliana (ESTs T04726, F19941, and H37238, 52-59% identical), rice (EST C19631, 55% identical), the nematode Caenorhabditis briggsae (ESTs R05053 and R03891, 58-66% identical), the protozoan Toxoplasma gondii (EST W66189, 46% identical), and the fruit fly D. melanogaster (ESTs AA390516, AA390505, AA264287, and AA264798, 59-60% identical).

Table II. Expression of eIF3 subunit mRNAs in different human tissues


Tissue                Total number    Number of ESTs encoding (b)
                      of ESTs (a)     p66           p47           p40

Fetal brain           15,321          0             1 (0.006)     0
Infant brain          69,693          6 (0.009)     8 (0.01)      3 (0.004)
Brain                 14,942          1 (0.007)     2 (0.01)      3 (0.02)
Fetal liver spleen    92,148          5 (0.005)     3 (0.003)     2 (0.002)
Placenta              45,157          6 (0.01)      29 (0.06)     1 (0.002)
Fetal heart           26,642          7 (0.03)      11 (0.04)     12 (0.05)
Heart                 4150            1 (0.02)      0             0
Fetal lung            14,274          4 (0.03)      8 (0.06)      4 (0.03)
Lung                  9115            1 (0.01)      1 (0.01)      0
Melanocyte            22,901          14 (0.06)     10 (0.04)     0
Breast                17,316          2 (0.01)      20 (0.1)      0
Multiple sclerosis    13,936          2 (0.01)      0             2 (0.01)
Fibroblasts (c)       6384            4 (0.06)      5 (0.08)      0
Skeletal muscle       2846            0             7 (0.2)       1 (0.04)
Testis                1577            2 (0.1)       2 (0.1)       1 (0.06)
White blood cells     934             4 (0.4)       1 (0.1)       3 (0.3)
Uterus                333             1 (0.3)       1 (0.3)       0
Kidney                168             1             1             0
Skin                  139             0             2             1
Others                77,898          12            16            21
Total                 435,735         73 (0.02)     126 (0.03)    52 (0.01)

(a) The total number of ESTs in each library was based on Ref. 43.
(b) Percentages of ESTs found in each library are in parentheses.
(c) Most of the fibroblast ESTs (6350) are from "Soares senescent fibroblasts."
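
The parenthesized values in Table II are the matching ESTs expressed as a percentage of all ESTs in the library. A minimal sketch of that calculation follows, using three rows copied from the table; the function name is illustrative, and the rounding shown in the table appears to be to one significant figure, which is an inference rather than something stated in the footnotes.

```python
def est_percentage(hits: int, library_size: int) -> float:
    """Matching ESTs as a percentage of all ESTs in one library."""
    if library_size <= 0:
        raise ValueError("library size must be positive")
    return 100.0 * hits / library_size


# (total ESTs, p66 hits, p47 hits, p40 hits), copied from Table II.
rows = {
    "Placenta": (45157, 6, 29, 1),
    "Melanocyte": (22901, 14, 10, 0),
    "White blood cells": (934, 4, 1, 3),
}

for tissue, (total, p66, p47, p40) in rows.items():
    values = [est_percentage(n, total) for n in (p66, p47, p40)]
    print(tissue, " ".join(f"{v:.3f}" for v in values))
```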

The hydrophobicity profiles of eIF3-p66 and the R08D7_3 protein are strikingly similar, and both proteins are largely hydrophilic throughout the coding region (data not shown). One of the hydrophilic regions near the N terminus, shaded in Fig. 1, is rich in arginine and lysine (19.4% Arg and 15.3% Lys in residues 87-158 of human eIF3-p66; 19.8% Arg and 9.9% Lys in residues 95-175 of R08D7_3) and is characterized further as an RNA-binding domain (see below).
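
The Arg and Lys content of a sequence window can be computed as sketched below. The sequence is a hypothetical toy string, not the p66 sequence. As a consistency check only (an inference, not a statement from the text), the window 87-158 counted inclusively spans 72 residues, so 19.4% Arg and 15.3% Lys would correspond to 14 and 11 residues, respectively.

```python
def residue_percentages(sequence: str, start: int, end: int, residues: str = "RK") -> dict:
    """Percent composition of selected residues in a 1-based, inclusive window."""
    if not (1 <= start <= end <= len(sequence)):
        raise ValueError("window lies outside the sequence")
    window = sequence[start - 1:end]
    return {r: 100.0 * window.count(r) / len(window) for r in residues}


# Hypothetical toy sequence, for illustration only.
toy = "MAAARKRRKSEEDL"
print(residue_percentages(toy, 5, 9))  # window "RKRRK"

# Counts implied by the reported percentages for a 72-residue window.
print(round(100.0 * 14 / 72, 1), round(100.0 * 11 / 72, 1))
```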

Two methods were used to demonstrate that pBSp66 actually encodes the p66 subunit of eIF3. First, the coding region of the insert in pBSp66 was inserted into the E. coli expression vector pT7-7 (22) to form pT7p66, which was induced to express the cDNA in vivo as described under "Experimental Procedures." SDS-PAGE analysis of the bacterial lysate after a 90-min induction showed a band at 66 kDa (data not shown), which was excised and used to affinity-purify antibodies present in a crude goat anti-eIF3 antiserum. After SDS-PAGE and immunoblotting, the affinity-purified anti-p66 antibodies recognize the p66 subunit in purified eIF3 and a protein of identical mobility in a HeLa lysate (Fig. 2A, lanes 1 and 2). The second method used to show that pBSp66 encodes p66 involves in vitro transcription of the cDNA insert in pT7p66 coupled with translation of the mRNA in a rabbit reticulocyte lysate. The resulting 35S-labeled proteins are analyzed by SDS-PAGE and autoradiography. As shown in Fig. 2B, a major radiolabeled polypeptide of 66 kDa that comigrates with eIF3-p66 is detected. The presence of immunoreactive or labeled proteins of greater mobility in both analyses most likely is due to partial proteolysis of p66, consistent with the previously noted sensitivity of the p66 subunit to degradation (23).


Fig. 2. SDS-PAGE analyses of human eIF3 subunits. A, immunoblot analysis of purified eIF3 and HeLa cell extracts with anti-eIF3-p66, -p47, and -p40 antibodies. Purified eIF3 (200 ng) and HeLa cell extracts (20 µg) were fractionated by 10% SDS-PAGE (19), transferred to a nylon membrane, and subjected to Coomassie Blue staining (lane 8) or immunoblotting with antibodies affinity-purified with recombinant p66 (lanes 1 and 2), recombinant p47 (lanes 3 and 4), recombinant p40 (lanes 5 and 6), and crude anti-eIF3 antiserum (lane 7). The migration positions of molecular mass markers are indicated on the left. eIF3 subunits are identified on the right. Shown by an asterisk is an uncharacterized 42-kDa polypeptide, which may correspond to "p43" occasionally found in HeLa eIF3 (23). B, in vitro synthesis of recombinant eIF3-p66, -p40, and -p47. Plasmids pT7p66 (lane 1), pT7p40 (lane 2), and pT7p47 (lane 4), carrying, respectively, the p66, p40, and p47 cDNAs behind the T7 promoter, were expressed in the TnT coupled transcription/translation system (Promega) as described under "Experimental Procedures." The panel shows an autoradiogram of the 35S-labeled translation products fractionated by 10% (lanes 1 and 2) or 8-16% gradient (lane 4) SDS-PAGE. Purified rabbit reticulocyte eIF3 (lane 3) and human HeLa eIF3 (not shown) were subjected to electrophoresis in parallel with lanes 1 and 2 and with lane 4 and stained with Coomassie Blue to determine the migration positions of the subunits indicated on the right. Migration positions of molecular mass standards (in kDa) are indicated on the left. C, Northwestern blot of recombinant p66 expressed in E. coli. Aliquots of uninduced (lane 1) and induced (lane 2; 90-min isopropyl-1-thio-β-D-galactopyranoside induction) cultures of BL21(DE3) carrying pT7p66 were harvested, and cells were suspended in Laemmli buffer, followed by heat treatment. The protein extract was fractionated with an 8-16% Tris-glycine gradient gel (Novex) in the presence of SDS and transferred onto a BA084 membrane (Schleicher and Schuell). The renatured proteins on the membrane were probed with 32P-mRNA as described under "Experimental Procedures." The radioactive bands at 36 kDa in both lanes are due to an unidentified E. coli RNA-binding protein and serve as an internal control for the amount of lysate analyzed.



To exclude the possibility that the 66-kDa protein in eIF3 preparations is a copurifying contaminant, we immunoprecipitated purified eIF3 with affinity-purified anti-p170 antibodies as described elsewhere (9). The immunoprecipitate was fractionated by SDS-PAGE, and the gel was silver-stained (Fig. 3A) and immunoblotted with affinity-purified antirecombinant p66 antibodies (Fig. 3B). Although the presence of p66 in the silver-stained gel is obscured by bovine serum albumin, p66 is detected with the affinity-purified antibodies (Fig. 3B, lane 3). These results demonstrate that p66 is stably associated in a complex with the p170 subunit of eIF3.


Fig. 3. Immunoprecipitation of eIF3 subunits with anti-p170 antibodies. A, silver staining. Highly purified HeLa eIF3 was incubated with no antibody (lane 1) or affinity-purified anti-p170 antibody (lane 2). The immune complexes were isolated with γ-bind Protein G beads (Pharmacia) and analyzed by 10% SDS-PAGE and silver staining as described under "Experimental Procedures" and previously (9). Lane 3, 20 ng of purified eIF3. The migration positions of molecular weight standards, bovine serum albumin (BSA) and anti-p170 antibodies (Ab) are labeled on the left. Those of eIF3 subunits are labeled on the right. Shown by an asterisk is the uncharacterized 42-kDa polypeptide. B, immunoblotting. eIF3 was treated with preimmune serum and anti-eIF3 antibodies affinity-purified with p170, and the immunoprecipitates were analyzed by SDS-PAGE and immunoblotting with affinity-purified antibodies to recombinant p66 (upper panel), recombinant p47 (middle panel), and recombinant p40 (lower panel). Only a portion of each immunoblot is shown. Lane 1, immunoprecipitation control without antibody; lane 2, control with preimmune serum for the goat anti-eIF3 antibodies; lane 3, immunoprecipitate with anti-p170 antibodies. The rather weak reaction with the anti-p66 antibodies is probably due to their very low titer (23).



Analysis of the RNA Binding Activity of Recombinant eIF3-p66 Forms

The p66 subunit in eIF3 is known to bind to RNA (24). We therefore determined whether or not the recombinant 66-kDa polypeptide can bind RNA. The bacterial extract that expresses p66 from pT7p66 was fractionated by SDS-PAGE, blotted onto a nitrocellulose membrane, and probed with 32P-labeled β-globin mRNA as described in the legend to Fig. 2. Detection of a radiolabeled band at 66 kDa (Fig. 2C) only when expression is induced indicates that recombinant p66 is an RNA-binding protein. The results support the view that pBSp66 encodes the RNA-binding p66 subunit of eIF3.

The RNA-binding domain in p66 was localized by analyzing the RNA-binding activity of deletions made in a GST-p66 fusion protein as described under "Experimental Procedures." Equimolar amounts of the deleted forms depicted in Fig. 4A were subjected to Northwestern analysis (Fig. 4B) with radiolabeled β-globin mRNA as probe, as described above. An N-terminal deletion of 86 residues (GST-p66-D1) binds RNA, whereas a deletion of 143 residues (GST-p66-D2) does not, suggesting that the N-terminal boundary of the RNA-binding domain lies between residues 86 and 143. C-terminal deletions up to residue 118 (GST-p66-ΔB, -ΔE, and -ΔP1) retain RNA binding activity, whereas a deletion to residue 81 does not. Thus, the C-terminal boundary is located between positions 81 and 118. Combining these results, we conclude that the RNA-binding domain of p66 is located between residues 86 and 118. The actual RNA-binding domain could be considerably larger, however, if it contains multiple redundant elements of which N-terminal or C-terminal subsets would be sufficient for RNA binding; in that case the RNA-binding domain might coincide with the highly basic hydrophilic segment described above.
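
The boundary logic used above can be written out as a short sketch. It uses only the four end points stated in the text (N-terminal deletions of 86 and 143 residues; C-terminal end points at residues 118 and 81); the function names are illustrative, and the sketch assumes a single contiguous domain, which the text itself notes may be an oversimplification.

```python
from typing import Iterable, Tuple


def n_terminal_bounds(deletions: Iterable[Tuple[int, bool]]) -> Tuple[int, int]:
    """deletions: (residues removed from the N terminus, binds RNA).

    Returns (largest deletion that still binds, smallest deletion that does not).
    """
    items = list(deletions)
    binders = [n for n, binds in items if binds]
    non_binders = [n for n, binds in items if not binds]
    return max(binders), min(non_binders)


def c_terminal_bounds(truncations: Iterable[Tuple[int, bool]]) -> Tuple[int, int]:
    """truncations: (residue at which the C-terminal deletion ends, binds RNA).

    Returns (largest end point that does not bind, smallest end point that binds).
    """
    items = list(truncations)
    binders = [c for c, binds in items if binds]
    non_binders = [c for c, binds in items if not binds]
    return max(non_binders), min(binders)


# End points stated in the text.
n_low, n_high = n_terminal_bounds([(86, True), (143, False)])
c_low, c_high = c_terminal_bounds([(118, True), (81, False)])

# Region retained by every RNA-binding construct.
minimal_region = (n_low, c_high)
print(f"N-terminal boundary between {n_low} and {n_high}")
print(f"C-terminal boundary between {c_low} and {c_high}")
print(f"minimal region between residues {minimal_region[0]} and {minimal_region[1]}")
```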


Fig. 4. RNA-binding domain of human eIF3-p66. A, structures of GST-p66 and its deletion derivatives. GST-p66 and its deletion derivatives, GST-p66-D1, GST-p66-D2, GST-p66ΔP1, GST-p66ΔP2, GST-p66ΔE, and GST-p66ΔB, were produced in BL21(DE3) and purified as described under "Experimental Procedures." Boxes represent the structures of these fusion proteins. Filled boxes denote the GST portion, and empty boxes denote the p66 portion. Numbers in parentheses identify the N-terminal or C-terminal residue of each construct. RNA binding ability of each fusion protein as determined by experiments shown in panel B is summarized on the right. The structure of full-length eIF3-p66 is depicted at the bottom. The minimal RNA-binding domain and the basic region that does not match well with the C. elegans protein R08D7_3 are shown by lightly shaded and darkly shaded boxes, respectively. B, Northwestern blotting of GST-p66 and its deletion derivatives. 1.25, 2.5, and 5 pmol of the proteins shown in A were subjected to electrophoresis in 8-16% Tris-glycine gels in the presence of SDS, blotted on a nitrocellulose membrane, and allowed to bind 32P-mRNA as described under "Experimental Procedures." The membrane was subjected to autoradiography. Positions of molecular weight standards are shown on the right. Arrowheads indicate the migration positions of each fusion protein determined by Coomassie staining.



Construction and Characterization of a cDNA Encoding eIF3-p47

A cDNA encoding the entire p47 subunit of eIF3 was constructed as described under "Experimental Procedures." Briefly, two eIF3-p47 peptide sequences (Table I) match a protein encoded by a human EST, whose DNA was used as a hybridization probe to screen a λ cDNA library. A cDNA that encodes nearly full-length p47 was isolated as pBSp47-17, and the missing five codons encoding the N terminus were deduced from other EST sequences. Full-length cDNA was obtained by PCR amplification of pBSp47-17 DNA with an upstream primer that carries the missing DNA region and a suitable downstream primer, followed by replacement of the N-terminal region of pT7p47-17 with this PCR product to yield pT7p47.

The ORF of the 1231-bp reconstructed cDNA encodes a protein with 357 amino acids and a calculated mass of 37,540 Da. The DNA sequence is deposited in GenBank™ with accession number U94855, and the amino acid sequence is shown in Fig. 5A (labeled human p47). The two internal peptide sequences mentioned above, along with a third determined later (Table I), are underlined in Fig. 5A. The 5'-UTR of this cDNA is 6 bp long, and the AAGATGG sequence surrounding the putative initiation codon (ATG) matches well the consensus sequence for strong initiation codons (21). The 3'-UTR contains 142 nucleotides and a polyadenylation signal AATAAA followed by a string of 9 A residues. Although the 5'-UTR is relatively short, Northern blot analysis of HeLa mRNAs identified a band of 1.3 kb hybridizing to a pBSp47-17 probe (data not shown), suggesting that few nucleotides are missing from the 5'-UTR. As with eIF3-p66, ESTs corresponding to eIF3-p47 mRNA are found in essentially all tissues (Table II), as expected for a putative housekeeping protein.
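
The lengths quoted for the p47 cDNA can be reconciled by simple arithmetic, sketched below. The sketch assumes that the 142-nucleotide 3'-UTR figure excludes both the stop codon and the 9-residue A tract; this accounting is an assumption made here for illustration and is not stated in the text.

```python
def cdna_length(utr5: int, amino_acids: int, utr3: int, poly_a: int,
                count_stop_codon: bool = True) -> int:
    """Total cDNA length from its parts (all values in nucleotides except amino_acids)."""
    orf = 3 * amino_acids + (3 if count_stop_codon else 0)
    return utr5 + orf + utr3 + poly_a


# Values from the text: 6-bp 5'-UTR, 357 codons, 142-nt 3'-UTR, 9 A residues.
total = cdna_length(utr5=6, amino_acids=357, utr3=142, poly_a=9)
print(total)
```

Under these assumptions the parts sum to the reported 1231 bp, which is also close to the 1.3-kb Northern blot signal.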

Evidence that the insert in pBSp47-17 encodes a large portion of eIF3-p47 was obtained by immunoblot analysis with antibodies affinity-purified against the recombinant protein. pT7p47-17, which carries nearly the entire coding region for p47 (see "Experimental Procedures" for construction), was introduced into E. coli BL21(DE3), which contains the inducible T7 RNA polymerase. After induction and expression in E. coli, recombinant p47-like protein was used to obtain affinity-purified anti-p47 antibodies as described under "Experimental Procedures." Immunoblot analysis of purified eIF3 generates a single reactive band that comigrates with the p47 subunit of eIF3 upon SDS-PAGE (Fig. 2A, lane 3). When a HeLa cell lysate is analyzed, a second reactive band that migrates as a 40-kDa protein is detected in addition to p47 (lane 4). The results provide evidence that pT7p47-17 contains DNA encoding eIF3-p47.

To demonstrate that pT7p47 contains DNA encoding authentic, full-length p47, in vitro transcription/translation of the cDNA was carried out as described under "Experimental Procedures." pT7p47 expresses a single major radiolabeled protein that precisely comigrates with eIF3-p47 (Fig. 2B, lane 4). The full-length cDNA also was subcloned into pET28C expression vectors (Novagen) to yield pETp47 (without a poly(A) tail) and pETp47(A) (with a poly(A) tail). Following in vitro transcription/translation, results similar to those with pT7p47 were obtained with either construct (not shown), indicating that the presence of a poly(A) tail in the transcript is not required for efficient translation of the mRNA.

The comigration of the encoded products with authentic eIF3-p47 upon SDS-PAGE confirms the conclusion that the 5'-proximal ATG is correctly identified as the initiator codon. Finally, immunoprecipitation with anti-p170 antibodies was performed as described for p66 above. Immunoprecipitates dependent on anti-p170 contain p47, as determined by immunoblotting with antibodies affinity-purified from the recombinant protein (Fig. 3B). The results of these analyses strongly support the view that the cloned cDNA in pT7p47 encodes eIF3-p47, a protein present in complexes with eIF3-p170.

Cloning and Characterization of a cDNA Encoding the 40-kDa Subunit of eIF3

A cDNA carrying the entire coding region for eIF3-p40 was cloned through partial peptide sequencing, hybridization screening of a human cDNA library, and PCR amplification, as described in detail under "Experimental Procedures." A plasmid excised in vivo from a positive clone in the λZAP library (Stratagene) is called pBSp40-5 and carries a 1245-bp insert. Starting at the 5'-proximal ATG, which lies about 290 bp downstream from the 5'-end, the insert encodes a hypothetical 29.5-kDa protein containing all five peptide sequences shown in Table I for p40. In vitro transcription/translation of the insert generates only a 30-kDa product upon SDS-PAGE (data not shown), presumably from this 5'-proximal ATG, although the sequence upstream from the ATG contains no in-frame stop codon. This suggests that the cDNA lacks 5'-sequences encoding the N terminus of p40. As with eIF3-p47, the missing N-terminal codons based on other EST sequences were supplied by PCR amplification of pBSp40-5 DNA as described under "Experimental Procedures" to yield pBSp40N, which carries the full-length coding region for p40.

The insert in pBSp40N contains an ORF that encodes a 352-amino acid protein with a calculated mass of 39,905 Da, consistent with the size of p40 upon SDS-PAGE. The 1280-bp insert in pBSp40N contains a 5'-UTR of 5 nucleotides and a 3'-UTR of 219 bp, the latter with a polyadenylation signal AATAAA at position 1216, followed by a string of 57 A residues. A Northern blot signal at 1.3 kb with a pBSp40-5 probe (results not shown) indicates that the insert is nearly full-length.

That pBSp40N encodes the 40-kDa subunit of eIF3 is shown by experiments similar to those for eIF3-p66 and eIF3-p47. Affinity-purified antibodies prepared from crude serum with recombinant p40 expressed in E. coli recognize a 40-kDa band in gels with purified eIF3 and a HeLa lysate (Fig. 2A, lanes 5 and 6). In vitro transcription/translation generates a single radiolabeled protein that comigrates with eIF3-p40 upon SDS-PAGE (Fig. 2B, lane 2). Furthermore, p40 is present in immunoprecipitates formed with anti-p170 antibodies (Fig. 3B). The results provide strong evidence that pBSp40N encodes eIF3-p40 and that this protein is in a complex with eIF3-p170.

eIF3-p40 and eIF3-p47 Are Members of a Novel Mov-34 Protein Family

A BLAST search for proteins similar to eIF3-p40 and eIF3-p47 indicates that their N-terminal halves are not only similar to each other (Fig. 5B) but also are related to that of a mouse protein called Mov-34. In particular, eIF3-p47 is 29% identical to this protein throughout the coding region except at the very N-terminal alanine/proline-rich region of ~70 amino acids. The human homolog of Mov-34 has been identified as the S12 subunit of the 26 S proteasome (25, 26).

Although there are reports on four additional Mov-34-related proteins from humans, C. elegans, S. cerevisiae, and Schizosaccharomyces pombe (27-29), the extent of their sequence similarity to Mov-34 is very weak, and relationships between members of this family of proteins were not obvious. Thus, additional related proteins were sought in the data base to better establish relationships between them. One and two hypothetical proteins were identified from C. elegans and S. cerevisiae, respectively. In addition to these three proteins and the eight Mov-34 homologs discussed above, eight more groups of ESTs were obtained, each of which seems to encode a distinct protein species of this family. Among these, two A. thaliana EST DNAs were provided by the Arabidopsis Biological Resource Center (Columbus, OH), sequenced, and found to encode different Mov-34 homologs of 32- and 35-kDa that are most similar to eIF3-p47 and the 26 S proteasome S12 subunit, respectively (see "Experimental Procedures" for details). Two other human EST groups collectively encode 34- and 38-kDa Mov-34 homologs (see also "Experimental Procedures"). Recently, the human 38-kDa Mov-34 homolog has been identified as the JAB1 transcriptional coactivator (30).

We then conducted a multiple sequence alignment of all 19 proteins, and the resulting evolutionary tree is depicted in Fig. 6A. We refer to the entire group as the Mov-34 family, since Mov-34 was the first characterized member of the family (31). These analyses define a family of proteins peculiar to eukaryotes and indicate that there are two major evolutionary branches of the Mov-34 family, one containing (among others) eIF3-p47 and the second containing eIF3-p40 and others. Similarity plots of the sequences of these proteins led to the identification of six conserved regions in the N-terminal half of the proteins (Fig. 6B). Comparison of the sequences in these six regions shows the extent of the sequence conservation (Fig. 6C) and reinforces the notion that all are members of the same family of proteins.
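
The idea behind a similarity plot, in which conserved regions appear as peaks along the alignment, can be sketched as follows. This is an identity-based stand-in and not the GCG Plotsimilarity algorithm, which scores residues with a comparison matrix; the three aligned strings and the window size are hypothetical.

```python
from collections import Counter
from typing import List


def column_conservation(alignment: List[str]) -> List[float]:
    """Fraction of sequences sharing the most common non-gap residue in each column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    scores = []
    for col in range(length):
        residues = [seq[col] for seq in alignment if seq[col] != "-"]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(alignment))
    return scores


def sliding_mean(values: List[float], window: int) -> List[float]:
    """Average the scores over a moving window to smooth the profile."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


# Hypothetical toy alignment, for illustration only.
toy_alignment = ["HAHG-K", "HSHGTK", "HAHGSR"]
profile = sliding_mean(column_conservation(toy_alignment), window=3)
print([round(x, 3) for x in profile])
```

Peaks in such a smoothed profile mark stretches where most sequences agree, which is the sense in which conserved regions are identified from the plot.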


Fig. 6. eIF3-p47 and eIF3-p40 are members of a novel Mov-34 protein family. Panel A, phylogenetic relationships among 19 Mov-34 family proteins. The human eIF3-p40 (protein 11) and eIF3-p47 (protein 2) sequences are compared with 17 related sequences described under "Experimental Procedures." The evolutionary tree was constructed with the Pileup program (GCG). Other sequences used and their GenBank™ or SwissProt accession numbers are as follows: 1) A. thaliana 32-kDa Mov-34 homolog (U54561); 3) C. elegans CeD2013_7 (Z47808); 4) human S12 subunit of 26 S proteasome (998688); 5) mouse Mov-34 (P26516); 6) A. thaliana 35-kDa Mov34 homolog (U54560); 7) C. elegans EST (M79627); 8) S. cerevisiae YOR261c (Z75169); 9) human 34-kDa Mov34 homolog (U70735); 10) C. elegans EST (D74615); 12) A. thaliana EST (N96623); 13) S. cerevisiae MPR1 (P43588); 14) S. pombe PAD1 (P41873); 15) C. elegans CeF37A4_5 (P41883); 16) C. elegans EST (D75207); 17) human 38-kDa Mov34 homolog (U70734), or JAB1 transcription coactivator; 18) S. cerevisiae D0888 (X99000); and 19) human c6-1a (P46736). A proposed subgroup name is given on the right. Panel B, similarity plot of the aligned protein sequences shown in panel A. The output file, generated from the Mov-34 family of proteins by the Pileup program, was analyzed by the Plotsimilarity program (GCG). The relative similarity scores are plotted as the function of residue positions in the aligned sequences. Six prominent peaks are named I-VI. Panel C, six conserved regions of the Mov-34 family of proteins. The 19 protein sequences in the region encompassing peaks I-VI are aligned. The numbers representing each protein in the Mov-34 family are the same as those given in panel A. Residues that occur more than five times at a certain position are highlighted with a black box or in boldface type. The number of residues separating the conserved regions is shown between the regions. A consensus sequence comprising residues that are highlighted is given at the bottom (C). The residues in the consensus sequence are shown in boldface type when there are fewer than two exceptions.



It is noteworthy that the yeast S. cerevisiae has only three members of the Mov-34 family despite knowledge of the complete genome sequence, and none of these lies in either the eIF3-p47 or eIF3-p40 subgroup (Fig. 6A). Besides the A. thaliana and C. elegans homologs already mentioned, proteins encoded by the fruit fly D. melanogaster (AA263296, AA246922, AA246857, AA439766, AA391268), the parasitic nematode Brugia malayi (AA257575, AA032112), T. gondii (N82268), and Zea mays (T18442) belong to the eIF3-p40 subgroup, and proteins encoded by zebrafish (H56800), D. melanogaster (AA264189), the blood fluke S. mansoni (L47038 and L47008), and rice (D24940 and D15411) belong to the eIF3-p47 subgroup. Therefore, these sequences probably encode eIF3-p40 or p47 subunits. It appears that proteins of the Mov-34 family may have diverged in higher multicellular eukaryotes to serve at least two functions, one being to act as subunits of eIF3.


DISCUSSION

Identification of eIF3 Subunit Polypeptides

The cloning and characterization of cDNAs encoding human eIF3-p66, eIF3-p47, and eIF3-p40 are reported here. The following facts support the authenticity of the cloned coding regions. The calculated masses of the proteins encoded by eIF3-p66 and eIF3-p40 cDNAs (63.9 and 39.9 kDa) are consistent with their assigned masses measured by SDS-PAGE (66 and 40 kDa). In vitro transcription/translation of the two cDNAs generates polypeptides that migrate in SDS-PAGE at precisely the same positions as the corresponding subunits in highly purified eIF3 (Fig. 2B). Although the calculated mass of eIF3-p47 (37.5 kDa) is smaller than the mass assigned by SDS-PAGE (47 kDa), the polypeptide synthesized in vitro from p47 cDNA comigrates in SDS-PAGE with the authentic eIF3-p47 subunit. Thus, it appears that the p47 subunit migrates anomalously in SDS-PAGE, possibly due to the high proline content in the N-terminal region. Antibodies from a crude goat antiserum raised against purified rabbit eIF3 that were affinity-purified against recombinant p66, p40, and p47 specifically recognize the corresponding proteins in purified eIF3 (Fig. 2A). Furthermore, one, five, and three internal partial amino acid sequences of peptides obtained, respectively, from the p66, p40, and p47 subunits of purified rabbit or HeLa eIF3 match amino acid sequences deduced from the corresponding cloned cDNAs.
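
The comparison between calculated and SDS-PAGE masses can be made explicit with a small sketch. The masses are those quoted in the paragraph above; the function name is illustrative, and no threshold for "anomalous" migration is implied by the text.

```python
def relative_deviation(calculated_kda: float, apparent_kda: float) -> float:
    """Percent by which the SDS-PAGE mass exceeds the calculated mass."""
    return 100.0 * (apparent_kda - calculated_kda) / calculated_kda


# (calculated mass, SDS-PAGE mass) in kDa, as quoted in the text.
subunits = {
    "p66": (63.9, 66.0),
    "p47": (37.5, 47.0),
    "p40": (39.9, 40.0),
}

for name, (calculated, apparent) in subunits.items():
    print(f"{name}: {relative_deviation(calculated, apparent):+.1f}%")
```

By this measure p66 and p40 migrate within a few percent of their calculated masses, whereas p47 migrates roughly a quarter heavier, in line with the anomalous migration noted above.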

Analysis of the sequences of eIF3-p66, -p47, and -p40 by the PROSITE program (GCG) identifies numerous possible sites of post-translational phosphorylation by protein kinase C, casein kinase II, and protein kinase A. Regulation of eIF3 activity by phosphorylation is an intriguing possibility that is being investigated currently. In addition, these sequences contain sites for glycosylation and myristoylation, but there is no evidence that such modifications are found in eIF3 subunits.

The sequence of p66 does not contain any obvious RNA-binding motif (Fig. 1). We mapped the RNA-binding domain of this protein to a 33-residue region (residues 86-118) within a 72-amino acid N-terminal hydrophilic segment that is rich in arginine and lysine (Fig. 4). The fact that the corresponding region of the C. elegans homolog R08D7_3 also is rich in these residues but shows little sequence similarity to the human p66 RNA-binding domain supports the inference that the positively charged nature of this domain contributes to its RNA binding activity. Previously characterized RNA-binding domains including the RGG repeat and arginine-rich motif are rich in positively charged residues (32). Another unclassified RNA-binding domain rich in positively charged residues is found in eIF4B (33, 34). eIF3-p66 and its C. elegans homolog also carry a prominent acidic tail region (Fig. 1). This feature, comprising a negatively charged region together with an RNA-binding domain, is reminiscent of eIF1A (20, 35) and of some of the ribosomal proteins (36). Indeed, eIF3-p66 was shown to cross-link to 18 S rRNA in a complex formed with the 40 S ribosome (37).

In searching the data bases for proteins homologous to human eIF3-p40 and eIF3-p47, we found that these proteins are members of a novel Mov-34 family of proteins (Fig. 6). Since members of this family are involved in diverse functions involving multisubunit complexes, perhaps they function in macromolecular assembly. The amount of p47 appears to be less than other subunits in several independently purified eIF3 preparations (Figs. 2A and 3A), suggesting that only a fraction of eIF3 complexes contains both the p40 and p47 subunits. This issue is being addressed by construction of eIF3 complexes with subunits carrying affinity tags, followed by rapid and gentle purification. However, the occurrence of related subunits in a single multiprotein complex is not unexpected even among translation initiation factors, since the guanine nucleotide exchange factor eIF2B contains three subunits that are similar to one another in sequence and constitute a regulatory domain in eIF2B (38).

Characterization of mammalian eIF3 proteins previously has been limited primarily to immunochemical and gel electrophoretic analyses, which have led to the identification of eight subunits (15, 23, 39). Recent progress in cDNA cloning (see the Introduction), partial peptide sequencing, and analyses utilizing EST data bases (Table I) has established that eIF3 comprises 10 nonidentical subunits. Two important revisions were made concerning the p115 and p47 bands observed upon one-dimensional SDS-PAGE. Recent experiments show that the p115 band actually consists of two proteins, p110 and p116, which are similar to the yeast eIF3 subunits NIP1 and PRT1, respectively (8, 9). Although p47 is strongly antigenic and has elicited high titer antibodies in our goat anti-eIF3 antiserum, its amount in the eIF3 complex sometimes appears lower than that of other subunits, a situation masked by the presence of a previously unrecognized 48-kDa subunit in the 47-kDa band. This protein, called p48, has been identified as the Int-6 oncoprotein and is entirely different from p47 and other eIF3 subunits.2 The p116, p110, p66, p48, p47, p40, and p36 subunits of eIF3 all coimmunoprecipitate with affinity-purified antibodies to the p170 subunit (9) (see also Footnote 2 and Fig. 3). On the other hand, eIF3-p35, which is strongly recognized by anti-eIF3 antibodies, is either loosely or not at all associated with the eIF3 complex, since this putative subunit does not coimmunoprecipitate with anti-p170 antibodies (see Fig. 3 and Ref. 9).

Conservation of eIF3 in Eukaryotes

Table III summarizes possible eIF3 subunit proteins that are found in the data base. The subunit composition of eIF3 may be<