Photolyase of Myxococcus xanthus, a Gram-negative eubacterium, is more similar to photolyases found in Archaea and "higher" eukaryotes than to photolyases of other eubacteria.

We report the identification of the gene encoding a DNA photolyase (phrA) from the Gram-negative eubacterium Myxococcus xanthus. The deduced amino acid sequence of M. xanthus photolyase indicates that the protein contains 401 amino acids (Mr 45,071). By comparison of the amino acid and DNA sequences with those of other known photolyases, it has been found that it is more similar to the deduced amino acid sequences of the photolyases of "higher" eukaryotes than to the photolyases of other eubacteria. Recombinant plasmids carrying M. xanthus phrA rescue the photoreactivation activity of an irradiated strain of Escherichia coli with a deletion in phrA. This rescue is light-dependent.

Photolyases play an important role in the repair of damage to DNA by ultraviolet radiation (1). The sequences of photolyase genes from numerous organisms, both prokaryotic and eukaryotic, have been reported (for summaries, see Refs. 2 and 3). Although all known photolyases share some similarities in amino acid sequence, they appear to fall into two distinct classes in which the photolyases within a class show strong amino acid sequence similarities but show weak similarity to members of the other class. Photolyases from a number of diverse microorganisms including both fungi and eubacteria constitute one class. The other class includes photolyases from a teleost fish, a marsupial, an insect, and an Archaeum (Methanobacterium thermoautotrophicum) (3).
In this paper, we describe the identification and characterization of phrA from the Gram-negative eubacterium Myxococcus xanthus. M. xanthus is a soil-dwelling eubacterium that has light-inducible carotenoid pigments for protection from photolysis (4). Cells secrete catabolic enzymes to extracellularly digest their prey and macromolecules in their environment. If a population of cells becomes starved for any one of several essential nutrients, aggregates of 10^5 to 10^6 cells are formed in which individual cells differentiate into spores. phrA is linked to the chemotaxis genes of M. xanthus (the frz operon), but is transcribed in the opposite orientation (5,6). The deduced amino acid sequence of this photolyase is similar to the photolyases of other eubacteria in the highly conserved carboxyl terminus domain. However, it is similar to the photolyases of vertebrates and insects throughout the entire protein (3). Nevertheless, the gene is able to rescue photoreactivation activity in a phrA− mutant of Escherichia coli. The classification of the M. xanthus photolyase with the photolyases of organisms in other kingdoms rather than with photolyases of other eubacteria suggests that the evolution of the two forms of photolyase is ancient.

Strains, Plasmids, and Growth Conditions-Strains and plasmids are described in detail in Table I. SY2 was a generous gift from A. Yasui (Institute of Development, Aging and Cancer, Tohoku University, Sendai, Japan). E. coli was grown in LB medium (11). M. xanthus was grown in CYE medium (12). Components of growth media were manufactured by Difco. Salts were purchased from Fisher. Other, nonradioactive, chemicals were purchased from Sigma.
Cloning phrA and Recombinant DNA-Standard techniques were used for the construction of recombinant DNA, restriction by endonucleases, and analysis by agarose gel electrophoresis (13). The subcloning of the region of DNA encoding M. xanthus phrA is shown in Fig. 1. Plasmids are described in Table I.
Sequencing of DNA-Single-stranded DNA was purified from M13KO7 lysates of JM107 carrying either pML118 or pMW119 as described in the Promega Lab Manual (Promega, Madison, WI). Single-stranded template DNA was sequenced using a modification of the 7-deaza-dGTP Sequenase kit from U. S. Biochemical Corp. using [35S]deoxycytidine 5′-(α-thio)triphosphate (1000-1500 Ci/mmol, 12.5 mCi/ml, DuPont NEN). In addition to the reagents supplied by the manufacturer in the Sequenase kit, an extension mix consisting of 7-deaza-dATP, 7-deaza-dGTP, dCTP, and dTTP (180 µM each) and 50 mM NaCl was used. The extension mix is equivalent to the extension mix supplied by the manufacturer with the standard Sequenase kit except that the 7-deaza forms of dATP and dGTP are substituted for the standard nucleotides. 1.0 µl of extension mix was used with 1.5 µl of termination mix from the 7-deaza-dGTP Sequenase kit. The idea to include both 7-deaza-dATP and 7-deaza-dGTP in the sequencing reaction came from the catalog description of a sequencing kit manufactured by Pharmacia Biotech Inc. that is not available in the United States. After the termination reaction, terminal transferase (Promega) was added according to the manufacturer's instructions with dATP at a final concentration of 1 mM for 30 min at 37°C (14).

* This work was supported by National Institutes of Health Grant GM20509. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EMBL Data Bank with accession number(s) U44437.

... 17 was used to align overlapping DNA sequences for both strands of the DNA.
Determination of Codon Adaptation Index-The codon adaptation index (CAI)¹ for open reading frames was determined as described by Sharp and Li (19). A compilation of the codon usage of 22 protein coding regions of M. xanthus (20) was used to calculate the relative adaptiveness of each codon (w): among synonymous codons, the number of times that each is used is divided by the number of times that the optimum codon is used. For example, GAU and GAC are synonymous for aspartate. In Shimkets's compilation, GAU is used 54 times while GAC is used 319 times. The values for the relative adaptiveness of these codons are therefore 0.17 and 1, respectively. The optimum synonymous codon always has the value of 1. The CAI of the protein coding region is the geometric mean of the relative adaptiveness values of the codons used in the protein, i.e. their product raised to the power of 1 divided by the number of codons. For a protein of 500 amino acids, $\mathrm{CAI} = (w_1 \times w_2 \times \cdots \times w_{500})^{1/500}$ (see Equation 6 in Ref. 19).
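The calculation described above can be illustrated with a minimal sketch. It is not the program used in this work: the function names, the DNA alphabet (GAT for GAU), and the choice to skip codons that have no w value are assumptions made for illustration. The only data taken from the text are the two aspartate counts (54 and 319).

```python
import math
from collections import defaultdict
from itertools import product

# Standard genetic code, first base varying slowest, in the order T, C, A, G.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def relative_adaptiveness(codon_counts):
    """Return w for each codon: its count divided by the count of the
    most frequently used synonymous codon."""
    best = defaultdict(int)
    for codon, n in codon_counts.items():
        aa = CODON_TABLE[codon]
        best[aa] = max(best[aa], n)
    return {
        codon: n / best[CODON_TABLE[codon]]
        for codon, n in codon_counts.items()
        if best[CODON_TABLE[codon]] > 0
    }


def cai(orf, w):
    """Geometric mean of w over the codons of an in-frame DNA sequence.
    Codons without a positive w value are skipped (an assumption)."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    scored = [w[c] for c in codons if w.get(c, 0) > 0]
    if not scored:
        raise ValueError("no codon in the sequence has a w value")
    return math.exp(sum(math.log(x) for x in scored) / len(scored))


# Aspartate counts quoted in the text (GAU = GAT in DNA notation).
asp_counts = {"GAT": 54, "GAC": 319}
w = relative_adaptiveness(asp_counts)
print(round(w["GAT"], 2), w["GAC"])

# Hypothetical two-codon sequence, used only to exercise the formula.
print(cai("GATGAC", w))
```

Summing logarithms rather than multiplying the w values directly avoids numerical underflow for open reading frames of several hundred codons.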
Alignment of Similar Protein Sequences-MACAW (see above) was used to align protein sequences identified by BLAST to be similar to M. xanthus photolyase. Statistical parameters used are described in figure legends and text. Further modifications of the alignment were made by the researchers according to their best judgment.
Prediction of Protein Secondary Structure-The program Peptidestructure (21) of the GCG (Genetics Computer Group, Madison, WI) analysis package was used to predict the structure of E. coli and M. xanthus photolyases.
Photoreactivation-Photoreactivation was assayed in a manner similar to that described previously (2). SY2 strains were grown overnight under selection in LB medium containing chloramphenicol (25 µg/ml) and ampicillin (100 µg/ml). The overnight culture was diluted 1:100 into LB and shaken vigorously at 37°C. Cells were harvested during exponential growth by centrifugation at 3,000 × g, at 4°C, for 5 min. Cells were resuspended at 10^7 cells/ml in M9 medium (11). Aliquots (1.0 ml) of cells were placed in 50-mm plastic tissue culture dishes (Falcon 3002). The dishes were placed in a UV Stratalinker 1800 (Stratagene, La Jolla, CA), the lids were removed, and the cells were irradiated with ultraviolet light (254 nm) at the power setting described in the text. The cells were further incubated for 1 h in darkness or exposed to ambient light from fluorescent illumination on a laboratory bench to allow photoreactivation to occur. After 1 h of incubation, cells were diluted in M9 and plated on LB agar (1.5%). Plates were incubated at 37°C overnight in the dark. Colonies were counted to determine recovery from UV irradiation.
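Recovery from UV irradiation is conventionally expressed as a surviving fraction derived from colony counts. The sketch below shows one common way to do this arithmetic; it is an assumption about the bookkeeping, not a description of the calculation used here. All colony counts, dilution factors, and plated volumes are hypothetical.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml):
    """Viable titer of the undiluted suspension from one plate count."""
    return colonies * dilution_factor / plated_volume_ml


def surviving_fraction(treated_titer, untreated_titer):
    """Fraction of cells that still form colonies after treatment."""
    return treated_titer / untreated_titer


# Hypothetical plate counts (colonies, dilution factor, ml plated).
unirradiated = cfu_per_ml(100, 10_000, 0.1)
uv_dark = cfu_per_ml(50, 10, 0.1)
uv_light = cfu_per_ml(200, 100, 0.1)

s_dark = surviving_fraction(uv_dark, unirradiated)
s_light = surviving_fraction(uv_light, unirradiated)

# Ratio of light to dark survival: values above 1 indicate
# light-dependent recovery.
print(s_dark, s_light, s_light / s_dark)
```

With these made-up counts, survival in the light exceeds survival in the dark, which is the pattern expected when photoreactivation is active.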
Primer Extension and Identification of the Start Site for Transcription-Total RNA was purified from M. xanthus DZF1 in exponential growth phase using 6 M guanidinium isothiocyanate, 50 mM Tris, pH 6.8, 0.5% Sarkosyl, and 1 mM β-mercaptoethanol (22). Primer extension was performed using an avian myeloblastosis virus reverse transcriptase primer extension kit (Promega). The primer (GCTGGGCCAGCCGGGGA) is complementary to the mRNA at positions 700-716 of Fig. 2. In lieu of 5′-end-labeling of the primer, 6 µCi of [35S]deoxycytidine 5′-(α-thio)triphosphate were added to the extension reaction.
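Because the primer is complementary to the mRNA, its reverse complement gives the mRNA-sense DNA sequence that it anneals to. The sketch below makes this explicit. It assumes that the 17-nucleotide primer is written 5′ to 3′; whether the result matches positions 700-716 as printed in Fig. 2 has not been checked here.

```python
# Watson-Crick complement for the four DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


primer = "GCTGGGCCAGCCGGGGA"          # assumed to be written 5'->3'
sense_target = reverse_complement(primer)

# The primer length should equal the length of the stated interval.
print(len(primer), 716 - 700 + 1)
print(sense_target)
```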

Sequencing and Identification of Open Reading Frames-DNA was sequenced and open reading frames were identified as described under "Materials and Methods." The sequence shown in Fig. 2 has a G+C content of 68 mol%, which is consistent with previous observations of 67-71 mol% for M. xanthus DNA (for a review, see Ref. 20). Organisms with this level of G+C in their genomes tend to use codons that have a G or C residue in the third position (23). Indeed, Shimkets's (20) analysis of 22 protein coding regions of M. xanthus demonstrated that 91% of the codons contain a G or C residue in the third position. The first position is 70% G+C, and the second position is 47% G+C. These observations make it possible to predict which open reading frames are likely to be translated by determining codon usage and G+C content at the three positions. An additional analysis, the Codon Adaptation Index, has been described for determining the likelihood that a protein is highly expressed (19,24). The codon usage table compiled by Shimkets (20) was used to determine the relative adaptiveness of each codon for M. xanthus (Table II). The eight open reading frames of ≥100 amino acids were analyzed for G+C content at each position of the codon, and the CAI was determined (Table III). The analyses suggest that only one of the open reading frames is likely to be a protein coding region. The amino acid sequence for this open reading frame was compared to the protein data banks using the BLAST algorithm with the BLOSUM62 substitution matrix (15,16) as described under "Materials and Methods." Open reading frame 7 (ORF7) has strong similarity to photolyases of a number of organisms with greater similarity to the photolyases of Methanobacterium, Vertebrata, and Insecta than with Eubacteria (Table IV).
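The positional G+C analysis lends itself to a short sketch. The function below is an illustration written for this purpose, not the software used in this work, and the nine-base test sequence is hypothetical. For a candidate M. xanthus coding region, the three values would be compared with the compiled figures of roughly 70%, 47%, and 91% for positions 1, 2, and 3.

```python
def gc_by_codon_position(orf):
    """Return percent G+C at codon positions 1, 2, and 3 of an
    in-frame DNA sequence; trailing partial codons are ignored."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3
    if usable == 0:
        raise ValueError("sequence is shorter than one codon")
    percents = []
    for position in range(3):
        bases = orf[position:usable:3]   # every third base from this offset
        gc = sum(1 for b in bases if b in "GC")
        percents.append(100.0 * gc / len(bases))
    return tuple(percents)


# Hypothetical three-codon sequence (ATG GAC GCC), for illustration only.
first, second, third = gc_by_codon_position("ATGGACGCC")
print(first, second, third)
```

An open reading frame read in the wrong frame or on the wrong strand would be expected to lose the strong third-position bias, which is what makes the three values informative.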
Demonstration of Photoreactivation Activity in phrA− E. coli

1 The abbreviations used are: CAI, codon adaptation index; ORF, open reading frame; MTHF, 5,10-methenyltetrahydrofolylpolyglutamate.

... (10) into pUC118 and pUC119 for single-stranded DNA sequencing (9). Note that the lacZ and phrA promoters are both in a "clockwise" orientation in pBB12 and pML118. The two promoters are in opposite orientations in pMW119.

TABLE V
σ70 Promoters of Myxococcus xanthus
The promoter sequence for phrA was determined by inspection of the region upstream of the nucleotide identified by primer extension as the start site of the messenger RNA. Other promoter sequences were obtained from the literature: lon (29), lonD (30), gufA (4), vegA (31), tps (26), MAEP ORF (32), and frzA and frzZ (K. G. Trudeau, M. J. Ward, and D. R. Zusman, submitted for publication).
Gene    −35 region    −10 region

... Table IV). Accession numbers for the different genes are given in the legend to Table IV. Amino acids are denoted by the single-letter code. Identical amino acids are denoted by a white letter on a black background. Non-identical, conservative changes (BLOSUM62 values ≥ 1) are denoted by a white letter on a gray background. The putative first translated methionine in M. xanthus photolyase is underlined. The structure of E. coli is from (37): open rectangles are α-helices, black rectangles are β-sheets, hatched rectangles are 3₁₀ helices. Residues involved in FAD binding are marked with a ^ below the residue. > indicates a residue involved in binding FAD through H2O. q denotes residues that form hydrogen bonds with MTHF. E denotes residues which interact with MTHF through H2O.

... Fig. 3 is expressed relative to the 468 amino acids of ORF7 (Fig. 2). Conservation was determined as described in Fig. 3. The G+C mol% of the coding region for PhrA as well as for the region shown in Fig. 4 were determined using DNA Inspector IIe.

... Fig. 2. Primer extension of total RNA from M. xanthus shows that the start site for the phrA mRNA is at nucleotide 486 (data not shown) and lies within the photolyase open reading frame (Fig. 2). A promoter that is consistent with the consensus sequence and the proper spacing for σ70 promoters of M. xanthus was inferred to have a −35 region beginning at nucleotide 452 and a −10 region beginning at nucleotide 477 (Table V). The first translation start codon accompanied by a ribosome binding site occurs in the cluster MMDM beginning at nucleotide 619. The putative ribosome binding site, GGAGG, begins at nucleotide 613 (Fig. 2). This ribosome binding site is consistent with the known 3′ sequence of M. xanthus 16 S RNA (3′-HO-UCUUUCCUCCACUA; Ref. 34) and is the same as that suggested for M. xanthus dsg (35). Consensus spacing of ribosome binding sites with the start of translation suggests that the start of translation would begin with the second AUG after the ribosome binding site (within approximately 10 bases of the ribosome binding site; Ref. 36). These assumptions predict that M. xanthus photolyase would have a length of 401 amino acids and be 45,071 daltons in size.

Fig. 3 shows the alignment of M. xanthus photolyase to six other photolyases to which BLAST found similarity (Table IV): the four photolyases to which it is most similar, as well as the photolyase of Streptomyces griseus (the eubacterial photolyase to which BLAST found it to be most similar) and the photolyase of E. coli for which the crystal structure has recently been published (37). As predicted by BLAST, M. xanthus photolyase is more similar to the four photolyases of "higher" eukaryotic origin than to the two eubacterial photolyases. Note that the region of similarity at the amino terminus extends into the beginning of the open reading frame of M. xanthus which is unlikely to be translated. No DNA homology was detected in this region (MACAW, data not shown). Based upon the alignment shown in Fig. 3, M. xanthus photolyase was found to be 15-16% identical to the eubacterial photolyases while it is 30% identical to the eukaryotic photolyases shown (Table VI). As shown in Fig. 3, the carboxyl terminus is the most highly conserved region of photolyase. An analysis of the DNA in this region using MACAW finds that among these seven distantly related organisms there is still a conserved region of homology among the eukaryotic-Methanobacterium group of photolyases and the photolyase of M. xanthus (Fig. 4). MACAW also found significant homology between S. griseus and E. coli DNAs in the region shown in Fig. 4. However, MACAW did not detect significant homology between M. xanthus and either of the other two eubacterial DNAs. The homology among the group including M. xanthus persists even though the G+C content of the other genes relative to M. xanthus is markedly different (Table VI). It has been suggested that the differences between this group of photolyases and the eubacterial photolyases in this region of the DNA arose through a deletion (2).
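Percent identity values of the kind quoted above depend on the denominator chosen. The sketch below counts identical, non-gap columns in a pairwise alignment and divides by a stated reference length. It is an illustration only: the function, its default denominator, and the toy sequences are assumptions, and the values in Table VI were not recomputed with it.

```python
def percent_identity(aligned_a, aligned_b, reference_length=None):
    """Percent of identical residues between two aligned sequences.

    Gaps are written as '-'. If reference_length is omitted, the
    ungapped length of aligned_a is used as the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    if reference_length is None:
        reference_length = sum(1 for x in aligned_a if x != "-")
    if reference_length <= 0:
        raise ValueError("reference length must be positive")
    return 100.0 * identical / reference_length


# Hypothetical aligned fragments, for illustration only.
seq_a = "MKT-LLV"
seq_b = "MRTALIV"
print(percent_identity(seq_a, seq_b))                      # relative to seq_a
print(percent_identity(seq_a, seq_b, reference_length=8))  # fixed denominator
```

Passing a fixed reference length, such as the full length of an open reading frame, keeps the values comparable across partners of different lengths.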

DISCUSSION
In this paper we describe an open reading frame, ORF7, which encodes a DNA photolyase (phrA) from M. xanthus. Although the M. xanthus photolyase is not significantly similar to that of E. coli, the cloned gene does, nevertheless, rescue photoreactivation in E. coli. The increased sensitivity of E. coli SY2 (phrA−) to UV irradiation (as demonstrated by the lower dosage of UV radiation needed to kill 99.99% of the cells) when expressing M. xanthus photolyase is consistent with previous observations (38). The difference in the efficiency of photoreactivation among the three plasmids may be due to the orientation of phrA relative to the lacZ promoter of the vectors; the orientation of ORF7 in pMW119 is such that an antisense mRNA could be transcribed from the lacZ promoter, leading to a reduction in the effective expression of the protein. Alternatively, since the σ70 promoter identified by inspection of the DNA sequence is unlikely to function efficiently in E. coli, it may be that increased efficiency of photoreactivation by the pBB12 and pML118 clones is due to transcription of phrA from the lacZ promoter of the vector (Fig. 1).
The crystal structure of photolyase from E. coli has recently been published (37). When the regions of similarity of the M. xanthus photolyase were compared to the structure of E. coli photolyase, it was found that 8 of the 13 amino acids involved in the FAD binding site are identical. An additional 3 of the 13 are conserved (Fig. 3). This suggests that the FAD binding site of photolyases has been conserved throughout evolution. We do not know if M. xanthus photolyase belongs to the group of photolyases, including that from E. coli, which use 5,10-methenyltetrahydrofolylpolyglutamate (MTHF) as the second chromophore. Therefore, the state of conservation of the MTHF binding site is unclear.

FIG. 4. DNA homology in the carboxyl terminus of photolyases from diverse organisms. MACAW was used to align the DNA sequences of the seven photolyases shown in Fig. 3. Significant homology was detected among Monodelphis domestica, Drosophila melanogaster, Carassius auratus, M. thermoautotrophicum, and M. xanthus. MACAW did not detect homology between this group and either of the other two eubacterial sequences. MACAW did detect homology between E. coli and S. griseus in this region. Sequences are aligned in accordance with the alignment of the amino acid sequence in Fig. 3. Accession numbers for the genes are given in the legend to Table IV.
GCG Peptidestructure (see "Materials and Methods") was used to predict the structure of M. xanthus and E. coli photolyases. It correctly predicted the pattern of alternating α-helices and β-sheets in E. coli photolyase (37) and suggested that the pattern is conserved in M. xanthus photolyase (data not shown). The program did less well in predicting the structure of the carboxyl terminus of E. coli photolyase; therefore, it is not possible to predict the degree of conservation of structure in this region of the protein. However, the greater degree of conservation of amino acids in this region suggests that structure is likely to be conserved here as well. Indeed, there is homology among the DNAs in this region among the photolyases most closely related to that of M. xanthus (Fig. 4).
The conservation of an untranslated amino terminus of the open reading frame containing phrA was unexpected. MACAW and visual inspection detect no significant homology of the DNAs (data not shown). The evidence that this region is untranslated is strong: (i) primer extension shows the beginning of the mRNA to be internal to this region; (ii) there is a good promoter associated with the start of transcription, but there is no other identifiable promoter 5′ to this region; and (iii) the first potential ribosome binding site associated with this region is associated with an appropriate start codon (ATG) at amino acid 68 of ORF7 (Fig. 2). What is the selective pressure for this conservation? Perhaps there has been an evolutionarily recent rearrangement in the 5′ region of the M. xanthus gene with very little drift in the region that is now the promoter and no longer translated. The pattern of alternating α-helices and β-sheets is conserved in the untranslated region (GCG Peptidestructure (see above)), further suggesting that this region was recently under selection.
Codon usage by phrA is consistent with that observed for other M. xanthus genes (20). It was hoped that the CAI would enable us to estimate whether the amount of photolyase in the cell was regulated by the use of rare codons as described for E. coli (19,39). However, the data base for determining the CAI for M. xanthus open reading frames is not large enough to be used as an indicator of protein abundance. Nevertheless, a score of 0.4 or less correlated with open reading frames unlikely to be expressed based upon codon usage, while a score of 0.7 correlated with open reading frames likely to be expressed based upon codon usage. The CAI for M. xanthus phrA in E. coli is 0.3, suggesting that its abundance in E. coli would be similar to that of LacI with a CAI of 0.296 (19). In comparison, analysis of the published codon usage of E. coli phrA (39) using the relative adaptiveness of codons in E. coli (19) gives a CAI of 0.281 for E. coli phrA in E. coli.
The observation that, based upon amino acid similarity, the photolyase of M. xanthus belongs in a class containing none of the other sequenced eubacterial photolyases was surprising. It suggests that the divergence of the two classes of photolyase from a parent gene occurred in the evolutionarily distant past. Both classes of photolyase are found in the Archaea as well as the eubacteria (the photolyase of Halobacterium halobium belongs to the class containing eubacterial and fungal photolyases; Refs. 4 and 33). In all likelihood, the "eukaryotic" form of photolyase is more common among the eubacteria than has been observed to date.