A proteasome from the methanogenic archaeon Methanosarcina thermophila.

A 645-kDa proteasome was purified from Methanosarcina thermophila which had chymotrypsin-like and peptidylglutamyl-peptide hydrolase activities and contained α (24-kDa) and β (22-kDa) subunits. Processing of both subunits was suggested by comparison of N-terminal sequences with the sequences deduced from the α- and β-encoding genes (psmA and psmB). Alignment of deduced sequences for the α and β subunits revealed high similarity; however, the N-terminal sequence of the α subunit contained an additional 24 amino acids that were not present in the β subunit. The α and β subunits had high sequence identity with α- and β-type subunits of proteasomes from eucaryotic organisms and the distantly related archaeon Thermoplasma acidophilum. The psmB gene was transcribed in vivo as a monocistronic message from a consensus archaeal promoter. The results suggest that proteasomes are more widespread in the Archaea than previously proposed. Southern blotting experiments suggested the presence of ubiquitin-like sequences in M. thermophila.

Proteasomes are prevalent in eucaryotes (Eucarya domain) where they have at least three distinct endopeptidase activities which include hydrolysis of peptide bonds on the carboxyl side of hydrophobic, basic, and acidic amino acid residues (chymotrypsin-like, trypsin-like, and peptidylglutamyl-peptide hydrolyzing activities, respectively) (1). It is proposed that the 20 S eucaryotic proteasome is the "catalytic core" in a larger 26 S complex that degrades proteins labeled with ubiquitin in an ATP-dependent process (1).
A 20 S proteasome from the archaeon Thermoplasma acidophilum has been extensively characterized. The thermoplasma proteasome has α and β (25.8- and 22.3-kDa) subunits with significant identity to the sequences of all described eucaryotic 20 S proteasomes (2,3). The quaternary structure is also highly conserved with the eucaryotic proteasome (4). The fully assembled thermoplasma proteasome is a barrel-like structure of four stacked rings (5). Each of the two inner rings is composed of 7 β subunits, and each outer ring contains 7 α subunits (5). Assembly of the proteasome proceeds by formation of the α ring, which is proposed to "chaperone" assembly of the β subunit ring (6). The integrity of the N-terminal region of the α subunit is necessary for proper assembly of rings. The β subunits are synthesized with a propeptide that is processed during assembly. Recently, the crystal structure of the thermoplasma proteasome was published (5), and it is proposed that this proteasome is a novel threonine protease (7).
Attempts to demonstrate eucaryotic-like proteasomes from procaryotes (Bacteria and Archaea domains) other than thermoplasma have been unsuccessful (8-10). The results have led to the suggestion that thermoplasma, an atypical member of the Archaea, is the only procaryote containing proteasomes (10). Recently, an enzyme with a quaternary structure similar to the 20 S proteasome was isolated from the eubacterium Rhodococcus; however, the primary structure of the eubacterial enzyme has low identity to the archaeal and eucaryotic proteasomes (11). Thus, questions remain regarding the distribution of proteasomes in the Bacteria and Archaea domains (procaryotes), as well as the origin and evolution of proteasomes from the Eucarya and Archaea domains (10).
Here we provide evidence for an enzyme highly identical to the eucaryotic-like 20 S proteasome in a representative of methanogenic microbes, the largest group known for the Archaea. These results suggest that proteasomes are more widespread among the Archaea than previously proposed. Although proteolysis is almost certain to be of fundamental importance for methanogenic microbes, the process has not been investigated and the isolation of proteolytic enzymes has not been reported. The discovery of proteasomes in methanogenic microbes marks an entry for investigating the physiology, biochemistry, and molecular biology of protein turnover in this major group of strict anaerobes.

MATERIALS AND METHODS
Proteasome Purification and Characterization-Methanosarcina thermophila was grown on sodium acetate (12). Cells were harvested (10 g wet weight) and resuspended in a 3-fold volume (w/v) of TD buffer (20 mM Tris-HCl, 1 mM dithiothreitol, pH 7.2) and passed through a French pressure cell at 20,000 psi followed by centrifugation for 30 min at 16,000 × g. The cell extract (1250 mg of protein in 30 ml) was applied to a Q-Sepharose (Pharmacia) column (2.5 × 28.5 cm) that was developed with a linear NaCl gradient (0-400 mM in 600 ml of TD buffer). Peak Q-Sepharose fractions that contained chymotrypsin-like activity were pooled (310 mg of protein), diluted 2-fold to 140 ml, and applied to a Q-Sepharose column (2.5 × 28.5 cm) previously equilibrated with 250 mM NaCl. The column was developed with a linear NaCl gradient (250-310 mM in 180 ml of TD buffer). The active fractions were pooled (16.5 mg of protein), diluted 2-fold with TD buffer to 112 ml, and then concentrated approximately 20-fold by batch elution with 1 M NaCl from an Econo-Pac High Q column (Bio-Rad). The concentrated protein solution (16 mg of protein) was applied in 1-ml samples to a Sepharose 6B (Pharmacia) gel filtration column (1.2 × 55 cm) equilibrated with 150 mM NaCl in TD buffer. The column, calibrated with high-molecular weight protein standards (Sigma), was also used to estimate the molecular mass of the proteasome complex. In the final step of purification, the active fractions (0.8 mg of protein) were loaded onto a High Q column (Bio-Rad) which was developed with a linear NaCl gradient was obtained. An accurate measurement of peptide hydrolyzing activity at each step of the purification was not possible due to nonspecificity of the assay and interference by the inherent fluorescence of cellular cofactors present in cell extract. However, based on the amount recovered from cell extract, it is estimated that the proteasome constitutes a minimum of 0.02% of the total soluble cell protein.
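The protein amounts quoted for each purification step can be tabulated as a fraction of the cell extract. The following is a minimal sketch that uses only the milligram values stated above; the step labels and function names are illustrative and are not part of the original protocol.

```python
# Protein recovered (mg) after each purification step, as quoted in the text.
# The step labels are shorthand chosen for this sketch.
STEPS = [
    ("cell extract", 1250.0),
    ("Q-Sepharose, 0-400 mM NaCl", 310.0),
    ("Q-Sepharose, 250-310 mM NaCl", 16.5),
    ("High Q batch concentration", 16.0),
    ("Sepharose 6B gel filtration", 0.8),
]


def percent_of_extract(steps):
    """Return (label, mg, percent of cell-extract protein) for every step."""
    start_mg = steps[0][1]
    return [(label, mg, 100.0 * mg / start_mg) for label, mg in steps]


def mg_from_percent(total_mg, percent):
    """Convert a percentage of the total protein into milligrams."""
    return total_mg * percent / 100.0


if __name__ == "__main__":
    for label, mg, pct in percent_of_extract(STEPS):
        print(f"{label:30s} {mg:8.1f} mg {pct:8.3f} %")
    # 0.02% of the 1250 mg of extract protein, in mg
    print(mg_from_percent(1250.0, 0.02))
```

By this arithmetic, the stated minimum of 0.02% of the soluble protein corresponds to about 0.25 mg of the 1250 mg applied to the first column.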
N-terminal sequencing was as follows. Subunits of the purified proteasome were separated by SDS-PAGE (16) using 10% polyacrylamide. The subunits were electroblotted onto a polyvinylidene difluoride (Immobilon-P) membrane (Millipore) and sequenced by automated Edman degradation at the University of Florida-Interdisciplinary Center for Biotechnology Research protein chemistry core facility.
Cloning and Sequencing-An M. thermophila Sau3AI genomic library, prepared in phage vector GEM-11 (Promega), was screened with a probe specific for the gene encoding the δ subunit of the CO dehydrogenase enzyme complex (cdhD). A 16-kb Sau3AI genomic fragment, containing the complete cdh operon and the psmB gene, was identified by DNA sequence analysis (Fig. 1). A manuscript describing the cloning, gene organization, and DNA sequence of the cdh operon is in preparation. The psmA gene was cloned as follows. The above described genomic library was screened with a degenerate oligonucleo- The oligonucleotide was 3′-end labeled with digoxigenin-11-ddUTP (2′,3′-dideoxyuridine 5′-triphosphate) (Boehringer Mannheim Biochemicals). Colony/PlaqueScreen nylon membranes (DuPont NEN) were used for plaque lifts according to the manufacturer. Hybridization for plaque screening and Southern blotting was for 12 h at 58°C in 5× SSC (where 1× SSC is 0.15 M NaCl and 15 mM sodium citrate, pH 7.5), 0.1% N-lauroylsarcosine, 0.02% SDS, 1% blocking reagent (Boehringer Mannheim). Membranes were washed twice in 2× SSC, 0.1% SDS at 21°C for 15 min and then twice in 0.5× SSC, 0.1% SDS at 58°C for 15 min. Digoxigenin-11-ddUTP label was detected colorimetrically according to the manufacturer (Boehringer Mannheim). SalI, HincII, PvuII and XmnI restriction enzymes (New England Biolabs) were used to subclone the psmA gene from a 17-kb Sau3AI genomic fragment for DNA sequence analysis.
The DNA sequence of both strands of the psmB and psmA genes was determined according to the dideoxy chain termination method (17). The nucleotide sequences of the M. thermophila psmA and psmB genes have been submitted to the GenBank data base and assigned the accession numbers U30483 and U22157.
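The salt concentrations of the hybridization and wash buffers follow directly from the definition of 1× SSC given above (0.15 M NaCl, 15 mM sodium citrate). A small sketch of that scaling, with a hypothetical helper name:

```python
# Composition of 1x SSC as defined in the text.
NACL_M_1X = 0.15        # mol/l NaCl
CITRATE_MM_1X = 15.0    # mmol/l sodium citrate


def ssc_composition(fold):
    """Return NaCl (M) and sodium citrate (mM) for a given SSC strength."""
    return {"NaCl_M": fold * NACL_M_1X, "citrate_mM": fold * CITRATE_MM_1X}


if __name__ == "__main__":
    # Hybridization (5x), first wash (2x), and stringent wash (0.5x).
    for fold in (5, 2, 0.5):
        print(fold, ssc_composition(fold))
```

The stringent wash at 0.5× SSC therefore contains one-tenth of the salt present during hybridization at 5× SSC.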
Protein Sequence Analyses-GenBank, EMBL and SwissProt data bases were searched at the National Center for Biotechnology Information (Bethesda, MD) using the BLAST network server (18) at the University of Florida-Interdisciplinary Center for Biotechnology Research computing facility in conjunction with the Genetics Computer Group program (Madison, WI) (19). CLUSTAL (20) (28); ScPRS1 (10); MtAlpha and MtBeta (this report).
Primer Extension Assays-The 5′ end of a psmB-specific mRNA was mapped by avian myeloblastosis virus reverse transcriptase (Promega) primer extension using the oligonucleotide 5′-GTACAAACTACTCCTACG-3′ complementary to nucleotides +36 to +53 of the psmB gene. The 5′ end of a psmA-specific mRNA was mapped, using similar procedures, with the oligonucleotide 5′-TCCGTCAGGGCTGAAAAC-3′ complementary to nucleotides +37 to +54 of the psmA gene.
Southern Blotting-Genomic DNA was digested to completion with the restriction endonucleases indicated. The restriction fragments were separated by electrophoresis and transferred onto nylon membranes (Boehringer Mannheim) (33). All other procedures were as described (33) except as follows. The degenerate oligonucleotide used as a psmA-specific probe, as well as the hybridization conditions, were the same as described for cloning the psmA gene. A 259-bp HincII-BamHI fragment from psmB was random prime labeled with digoxigenin-dUTP and used as a psmB-specific probe for hybridization at 51°C according to the supplier's recommendations (Boehringer Mannheim). The wash stringency was 51°C using 0.5× SSC containing 0.1% SDS. A 0.22-kb BglII-KpnI fragment from plasmid YEP96 (34), specific for a synthetic yeast ubiquitin gene, was random prime labeled with digoxigenin-dUTP and used as a ubiquitin gene probe for hybridization at 45°C (Boehringer Mannheim). The wash stringency was 45°C using 0.5× SSC containing 0.1% SDS.
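Because each primer is complementary to the mRNA, its reverse complement gives the coding-strand sequence of the targeted region. The sketch below derives those regions from the two 18-nucleotide primers quoted above and checks that each primer length matches the stated coordinate span; the function and dictionary names are illustrative only.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def span_length(start, end):
    """Number of positions in an inclusive coordinate range."""
    return end - start + 1


# Primer (written 5' to 3') and the gene coordinates it is complementary to.
PRIMERS = {
    "psmB": ("GTACAAACTACTCCTACG", 36, 53),
    "psmA": ("TCCGTCAGGGCTGAAAAC", 37, 54),
}

if __name__ == "__main__":
    for gene, (primer, start, end) in PRIMERS.items():
        assert len(primer) == span_length(start, end)
        print(gene, f"+{start}..+{end}", reverse_complement(primer))
```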

RESULTS
Identification of psmA and psmB Encoding the α and β Subunits of a Proteasome Purified from M. thermophila-DNA sequencing downstream of the M. thermophila cdh operon revealed an open reading frame (psmB) (Fig. 1) with a deduced amino acid sequence having high identity (up to 51%) with β-type subunits of proteasomes from phylogenetically divergent organisms (Fig. 2). The psmB gene encoded a putative 210-amino acid protein (PsmB) with a calculated anhydrous molecular mass of 22,981 Da. A potential ribosome binding site was identified upstream of psmB (Fig. 6) complementary to the sequence at the 3′ terminus of the 16 S rRNA in methanosarcina (3′-UCCUCCACUA) (37).
A proteasome was purified from M. thermophila which contained 24-kDa (α) and 22-kDa (β) subunits (see below). Analysis of the α subunit revealed two N-terminal sequences that were identical except for the length (Fig. 3). A GEM-11 clone bank containing M. thermophila genomic DNA was screened with a degenerate oligonucleotide probe based on the N-terminal sequence. Southern blot analysis of the DNA from a hybridizing phage isolate identified a 980-bp EcoRV-XmnI fragment which hybridized to the probe. Sequence analysis of the fragment identified an open reading frame (psmA) with a putative translational start corresponding to the longer of the two α subunit N-terminal sequences (Fig. 7). The psmA gene encoded a putative 246-amino acid protein (PsmA) with a calculated anhydrous molecular mass of 27,139 Da. PsmA was highly identical (up to 53%) to α-type proteasome subunits from phylogenetically diverse organisms (Fig. 3). The DNA sequence downstream of the only consensus ribosome binding site contained three potential translational start sites, two of which corresponded to the N termini determined for the purified α subunit and a third located two codons upstream of psmA (Fig. 7). Two of the potential start sites are separated from the consensus ribosome binding site by 7 and 13 bases, which is typical for archaeal genes (40). Although a translational start site corresponding to the shorter of the two N termini (25 bases from the consensus ribosome binding site) cannot be ruled out, it is more probable that PsmA is processed to yield the shorter N-terminal sequence.
Alignment of deduced sequences for the α and β subunits revealed high similarity (46%); however, the N-terminal sequence of the α subunit contained an additional 24 amino acids that were not present in the β subunit (Fig. 4). When the CLUSTAL program was used to generate a dendrogram relating all proteasome subunit sequences currently available, PsmA and PsmB clustered with the α- and β-type subunit groups, respectively, in the recently proposed classification scheme (10) and were closest relatives to the thermoplasma proteasome subunits (Fig. 5). No established proteinase motif was identified in either PsmA or PsmB using PROFILESCAN (41), a result which is in agreement with all proteasome subunit sequences to date.
In Vivo Transcription of psmA and psmB-Northern hybridization revealed a psmB-specific transcript of approximately 800 bp in M. thermophila cells grown on acetate, trimethylamine, or methanol (data not shown). The results suggest that proteasomes are produced during growth on all three substrates. Furthermore, psmB is transcribed in vivo as a monocistronic message, suggesting that psmA and psmB are not likely to be part of the same operon. Primer extension analysis of total RNA isolated from methanol- or trimethylamine-grown cells identified one major product using an oligonucleotide primer specific for psmB (Fig. 6). The 5′ end of the mRNA mapped to the G base located −101 bases relative to the predicted translational start site. This putative transcriptional start site was within a 4-bp sequence with 75% identity to the archaeal boxB (ATGC) consensus and began 20 bp downstream of an 8-bp sequence with 87% identity to the boxA (TTTA(T/A)ATA) consensus archaeal promoter (40). In general, transcription of genes from the methanogenic Archaea occurs at a purine-pyrimidine dinucleotide in boxB which is located 16-23 bp downstream from boxA.
Primer extension analysis of total RNA isolated from trimethylamine-grown cells identified two products using a psmA-specific oligonucleotide. The 5′ ends of the mRNA mapped to the G bases located −117 and −48 relative to the predicted translational start site (Fig. 7). No sequences resembling the consensus boxA were found at the expected 16-23 bp upstream of either of the two G bases. Potential stem-loop structures were detected upstream of both 5′ mRNA ends. Although transcriptional start sites cannot be ruled out, the results are more consistent with mRNA processing at the G bases.
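The promoter comparison described above amounts to counting matches against a consensus in which one position accepts two bases. The sketch below shows that calculation; the candidate sites `TTTATATT` and `ATGA` are hypothetical placeholders chosen only to give 7 of 8 and 3 of 4 matches, since the actual psmB upstream sequence appears in Fig. 6 and is not reproduced here.

```python
# Consensus elements from the text; each position lists the allowed bases.
BOX_A = ["T", "T", "T", "A", "TA", "A", "T", "A"]   # TTTA(T/A)ATA
BOX_B = ["A", "T", "G", "C"]                        # ATGC


def consensus_identity(site, consensus):
    """Return (matches, percent identity) of a site against a consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    matches = sum(base in allowed
                  for base, allowed in zip(site.upper(), consensus))
    return matches, 100.0 * matches / len(consensus)


def spacing_ok(distance_bp, low=16, high=23):
    """True if the boxA-to-start distance falls in the 16-23 bp range."""
    return low <= distance_bp <= high


if __name__ == "__main__":
    # Hypothetical sites, NOT the real psmB promoter sequence.
    print(consensus_identity("TTTATATT", BOX_A))
    print(consensus_identity("ATGA", BOX_B))
    print(spacing_ok(20))
```

Seven matches out of eight is 87.5%, and three out of four is 75%, consistent with the identities reported for the psmB upstream region, and the 20-bp spacing lies within the 16-23 bp range.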
Proteasome Purification and Characterization-A 645-kDa enzyme was purified from M. thermophila which contained α and β subunits of approximately 24 and 22 kDa (Fig. 8). The N-terminal sequence of the β subunit was identical to residues 10-19 deduced from psmB (Fig. 2), which indicates it is processed to expose an N terminus with an active-site threonine similar to the β subunits of proteasomes from the Eucarya and thermoplasma (7). Analysis of the α subunit revealed two N-terminal sequences corresponding to amino acids 1-25 and 5-22 deduced from psmA (Fig. 3). The purified M. thermophila enzyme displayed chymotrypsin-like activity catalyzing hydrolysis of Suc-Ala-Ala-Phe-AMC (1.2 nmol of 7-amino-4-methylcoumarin min⁻¹ mg⁻¹ protein) containing an aromatic residue adjacent to the leaving group (AMC). The enzyme also had peptidylglutamyl-peptide hydrolase activity catalyzing the hydrolysis of Cbz-Leu-Leu-Glu-β-NA (8.9 nmol of β-naphthylamine min⁻¹ mg⁻¹ protein) containing an acidic residue adjacent to the leaving group (β-NA). Little or no activity was detected with Pro-Phe-Arg-AMC, which contains an arginine residue adjacent to the leaving group. The absence of trypsin-like activity is a property similar to the 673-kDa thermoplasma proteasome; however, the thermoplasma proteasome does not have significant peptidylglutamyl-peptide hydrolase activity (4). The results demonstrate that the enzyme purified from M. thermophila is a proteasome with the α and β subunits encoded by psmA and psmB.
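The native mass can be compared with the mass expected from the subunit sizes. The sketch below assumes that the M. thermophila enzyme shares the four-ring architecture described for the thermoplasma proteasome (two outer rings of 7 α subunits and two inner rings of 7 β subunits); that stoichiometry is an assumption for illustration and was not measured here.

```python
# Apparent subunit masses (kDa) from SDS-PAGE, as reported in the text.
ALPHA_KDA = 24.0
BETA_KDA = 22.0

# Assumed architecture, borrowed from the thermoplasma proteasome.
SUBUNITS_PER_RING = 7
ALPHA_RINGS = 2
BETA_RINGS = 2

NATIVE_MASS_KDA = 645.0  # estimated by gel filtration


def predicted_complex_mass(alpha_kda, beta_kda):
    """Sum the subunit masses for an alpha7-beta7-beta7-alpha7 complex."""
    n_alpha = SUBUNITS_PER_RING * ALPHA_RINGS
    n_beta = SUBUNITS_PER_RING * BETA_RINGS
    return n_alpha * alpha_kda + n_beta * beta_kda


if __name__ == "__main__":
    predicted = predicted_complex_mass(ALPHA_KDA, BETA_KDA)
    print(predicted, NATIVE_MASS_KDA - predicted)
```

Under that assumption the predicted mass is 644 kDa (14 × 24 + 14 × 22), within 1 kDa of the gel filtration estimate.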
Evidence for M. thermophila Sequences Similar to the Yeast Ubiquitin Gene-When Southern blots of M. thermophila genomic DNA were probed with a synthetic yeast ubiquitin gene, 3.5-kb PstI and 2.1-kb EcoRI fragments hybridized specifically with the probe (Fig. 9). No hybridization of the ubiquitin gene probe was observed with Escherichia coli genomic DNA (data not shown). These results suggest that the M. thermophila genome contains ubiquitin-like sequences.

DISCUSSION
The results document a proteasome in M. thermophila representing anaerobic methanogenic microbes, the largest known group in the Archaea. Therefore, proteasomes outside the Eucarya are not restricted to the atypical archaeon thermoplasma as previously suggested (10). T. acidophilum is a thermoacidophile that can grow aerobically utilizing glucose as an energy source and is phylogenetically distant from the strictly anaerobic methanosarcina (42), which obtain energy for growth by converting simple one- and two-carbon substrates to methane at neutral pH. Thus, proteasomes occur in physiologically and phylogenetically diverse Archaea. The results lessen the probability that the thermoplasma proteasome originated by horizontal gene transfer from the Eucarya (10) and support the proposal that eucaryotic proteasomes evolved from an archaeal predecessor (2). Multiple DNA sequence alignment of the T. acidophilum and M. thermophila α and β proteasomal subunit genes revealed high identity (46-60%) among all four genes, with the highest identity between α subunit genes (not shown). This result suggests that the gene encoding the α subunit is more closely related to an ancestral proteasome gene from which both the α and β subunit genes derived.
Thermoplasma and methanosarcina are classified to the Euryarchaeota kingdom (42  an archaeal proteasome in microbes from the Crenarchaeota kingdom (10); however, additional surveys are necessary before it can be concluded that the archaeal proteasome is unique to the Euryarchaeota.
The availability of α and β subunit sequences for a second proteasome from an archaeon that is physiologically and phylogenetically distant from thermoplasma will allow comparisons to guide site-directed mutagenesis experiments for identification of amino acids involved in assembly, the catalytic mechanism, substrate targeting, regulation, and thermostability of archaeal and eucaryal proteasomes. These experiments will be especially productive because the crystal structure is known for the thermoplasma proteasome (5). Proteasomal α subunits from the Eucarya and thermoplasma (3) display putative NLS and cNLS motifs exemplified by ⁵⁰sdKKvR⁵⁵ and ²⁰²EEgEElkapE²¹¹ (Fig. 3) for thermoplasma (where upper case letters conform to consensus sequences). The cognate methanosarcina sequences (⁴⁹vdKRit⁵⁴ and ²⁰¹EgkfdagtlE²¹⁰) (Fig. 3) have low identity with the putative thermoplasma NLS and cNLS motifs, suggesting these sequences are not strictly conserved in the archaeal proteasome. It is postulated that either a target for the NLS and cNLS exists in thermoplasma or that these sequence motifs existed prior to evolution of the corresponding eucaryotic receptor (3). Two-dimensional PAGE reveals multiple forms of the thermoplasma α subunit, indicating that it is modified in a way that alters the pI (43). The thermoplasma α subunit contains the sequence ³²KKGST³⁶ (Fig. 3) which has identity to cAMP/cGMP-dependent phosphorylation sites, where serine is phosphorylated (3). The thermoplasma α subunit also contains a putative tyrosine autophosphorylation site (¹¹²LVKRVADQMQQYTQYGGVRPY¹³²) (Fig. 3), where the underlined tyrosine is phosphorylated. The cognate methanosarcina sequences ³¹KRGTT³⁵ and ¹¹¹ISKKICDHKQTYTQYGGVRPY¹³¹ (Fig. 3) have high functional similarity, suggesting they may be phosphorylation sites as predicted for the thermoplasma and eucaryal α subunits (3). The N-terminal sequence deduced for the methanosarcina α subunit is extended relative to the β subunit (Fig. 4), as is the case for thermoplasma (2). Deletion of this N-terminal extension in the thermoplasma α subunit prevents assembly of the α ring, demonstrating the importance of N-terminal sequences for assembly of an active proteasome (6). The N-terminal sequences may also be involved in recognition of substrates because the α rings are located on the ends of the barrel-like proteasome where the substrate is thought to enter (6). Unlike the proteasome purified from thermoplasma (3), N-terminal sequencing of the methanosarcina α subunit was not blocked. The sequencing revealed two N termini resulting from either dual translational start sites or processing. The significance is unknown; however, it is conceivable that subunits with different N termini may assemble into proteasomes with different substrate specificities.
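The motif comparisons in this paragraph can be made concrete by counting identical residues position by position. The sketch below does this for the four thermoplasma and methanosarcina motif pairs quoted above; it is an ungapped, case-insensitive comparison written for illustration and is not the alignment method used for the figures.

```python
def identity(seq_a, seq_b):
    """Return (identical positions, percent identity) for equal-length motifs."""
    if len(seq_a) != len(seq_b):
        raise ValueError("motifs must have the same length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return matches, 100.0 * matches / len(seq_a)


# (thermoplasma motif, methanosarcina motif) as quoted in the text.
MOTIFS = {
    "NLS": ("sdKKvR", "vdKRit"),
    "cNLS": ("EEgEElkapE", "EgkfdagtlE"),
    "Ser/Thr phosphorylation site": ("KKGST", "KRGTT"),
    "Tyr autophosphorylation site": ("LVKRVADQMQQYTQYGGVRPY",
                                     "ISKKICDHKQTYTQYGGVRPY"),
}

if __name__ == "__main__":
    for name, (thermoplasma, methanosarcina) in MOTIFS.items():
        matches, pct = identity(thermoplasma, methanosarcina)
        print(f"{name:30s} {matches}/{len(thermoplasma)} ({pct:.0f}%)")
```

On this simple count the NLS and cNLS pairs share 2 of 6 and 2 of 10 residues, whereas the putative phosphorylation sites share 3 of 5 and 13 of 21, which agrees with the contrast drawn in the text.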
The proteasome from M. thermophila is the first proteolytic enzyme isolated from methanogenic microbes for which there is considerable biochemical and physiological understanding (44). The discovery of proteasomes in M. thermophila will aid in understanding fundamental properties of the enzyme, including physiological roles in methanogenic microbes and other members of the Archaea. The native molecular mass and subunit composition of the methanosarcina proteasome were comparable to those of the thermoplasma and eucaryal 20 S proteasomes; however, the results do not rule out the existence of a larger complex similar to the eucaryal 26 S complex which degrades proteins labeled with ubiquitin in an ATP-dependent process. Recently, ubiquitin has been identified in T. acidophilum (45). The thermoplasma proteasome degrades partially unfolded and ubiquitin-associated proteins (46), suggesting the possibility of ubiquitin-dependent proteolysis in this microbe. The hybridization of an M. thermophila genomic fragment to a yeast ubiquitin gene probe suggests that homologous sequences are present in this methanogenic microbe. Several physiological roles can be envisioned for ubiquitin-dependent proteolytic pathways in M. thermophila. The methanosarcina are the most versatile methanogenic microbes, having the ability to utilize several growth substrates and regulate the synthesis of catabolic enzymes in response to the growth substrate. Two-dimensional PAGE reveals more than 100 mutually exclusive peptides present in acetate- and methanol-grown M. thermophila (47), including subunits of the CO dehydrogenase enzyme complex which comprise approximately 10% of the cellular protein during growth on acetate (48). Proteolytic pathways for rapid turnover of specific enzymes would be advantageous when cells switch from one substrate to another. The rapid turnover of specific proteins could also be important during periods of stress such as heat shock or exposure to oxygen. All of these potential physiological roles of the archaeal proteasome pose interesting questions for further investigation.