An iron-sulfur cluster in the family 4 uracil-DNA glycosylases

The 25-kDa Family 4 uracil-DNA glycosylase (UDG) from Pyrobaculum aerophilum has been expressed and purified in large quantities for structural analysis. In the process we observed it to be colored and subsequently found that it contained iron. Here we demonstrate that P. aerophilum UDG has an iron-sulfur center with the EPR characteristics typical of a 4Fe4S high potential iron protein. Interestingly, it does not share any sequence similarity with the classic iron-sulfur proteins, although four cysteines (which are strongly conserved in the thermophilic members of Family 4 UDGs) may represent the metal coordinating residues. The conservation of these residues in other members of the family suggests that 4Fe4S clusters are a common feature. Although 4Fe4S clusters have been observed previously in Nth/MutY DNA repair enzymes, this is the first observation of such a feature in the UDG structural superfamily. Similar to the Nth/MutY enzymes, the Family 4 UDG centers probably play a structural rather than a catalytic role.


INTRODUCTION
Uracil-DNA glycosylases are ubiquitous DNA repair enzymes responsible for the excision of uracil bases from DNA as the first step in a base excision repair pathway. Uracil arises in DNA either as a result of the hydrolytic deamination of cytosine residues in G:C base-pairs (1), or due to incorporation of deoxyuridine monophosphate (instead of thymidine monophosphate) opposite adenine during DNA replication (2). If left uncorrected, the former process would cause G:C to A:T transition mutations (1), while the latter could result in the disruption of specific regulatory DNA-protein interactions (3).
Hyperthermophilic organisms are at especially high risk of DNA damage by cytosine deamination, which is significantly enhanced by elevated temperature (4). Since hyperthermophiles do not exhibit any greater susceptibility to this type of damage, they presumably possess more effective repair enzymes (5). However, despite the detection of UDG activity in several hyperthermophiles (6), no sequences homologous to the archetypal E.coli ung-encoded enzyme were initially apparent in archaeal genomes. Subsequently, UDGs were identified in hyperthermophilic eubacteria and archaea (7-9) with more obvious homology to a second family of uracil base-excision repair enzymes typified by the human thymine DNA glycosylase (TDG) (10) and the bacterial MUG (11). These G:T/U mismatch-specific enzymes (Family-2) are structurally and mechanistically related to the UNG-type UDGs (Family-1) (12,13) and unite the UNG-type and thermophile enzymes (Family-4) into a uracil-DNA glycosylase superfamily (14). Pyrobaculum aerophilum is a hyperthermophilic archaeon isolated from a boiling marine water hole, growing optimally at 100°C and pH 7.0 (15). A fosmid-based genomic map of the 1.7 Mb P.aerophilum genome was constructed and used to identify 474 putative genes (16), but no homologues of the UNG or MUG/TDG UDG families were initially identified.
Following the identification of TmUDG, a novel UDG weakly related to E.coli MUG, in the thermophilic eubacterium Thermotoga maritima, a homologous ORF was identified in P.aerophilum encoding a new protein (designated PaUDG) with significant homology to TmUDG (9). Here we show PaUDG to be an iron-sulphur protein with the characteristics of a 4Fe4S High Potential Iron Protein centre (HIPIP). Comparison of amino acid sequences and molecular modelling identified the residues likely to ligate the iron-sulphur cluster, and suggest that the cluster is a common, though not universal, structural feature of the Family-4 UDGs.

EXPERIMENTAL PROCEDURES

Expression and purification of Pa-UDG
Pa-UDG was expressed in E.coli strain BL21 (DE3) pLysS from plasmid pET28-Pa-UDG essentially as described (9), with an N-terminal His6 tag. The cell pellet was resuspended in buffer A (50 mM Tris pH 8, 100 mM NaCl, 10% glycerol), supplemented with 'Complete' EDTA-free protease inhibitor cocktail (Roche), and stored at -20°C. Cells were lysed by thawing, followed by a brief sonication on an ice/ethanol slurry (15 x 9 s bursts with 9 s cooling between bursts). The lysate was clarified by centrifugation at 50000 x g and the supernatant was then incubated for 5 minutes at 80°C to denature and precipitate the thermolabile E.coli proteins. The sample was cooled on ice, clarified by centrifugation at 50000 x g, then loaded onto a 5 ml Ni-NTA column pre-equilibrated in buffer A. The flow-through was discarded, as was a subsequent 10 column volume wash of buffer A supplemented with 10 mM imidazole. Pa-UDG was eluted in 5 column volumes of buffer A supplemented with 300 mM imidazole. The sample fractions were identified in the first instance by SDS-PAGE analysis (15% acrylamide), and subsequently by their yellow colour.
Sample fractions were pooled, and their volume reduced (if required) to 10 ml by concentration in a Centriprep 20 spin concentrator (5 kD cut-off) (Amicon). The sample buffer was then exchanged using a desalting column pre-equilibrated in buffer B (50 mM sodium phosphate pH 7.5, 10 mM NaCl, 10% glycerol, 1 mM DTT, 'Complete' EDTA-free protease inhibitors). A cation-exchange step was then used to complete the purification. During initial preparations, an HR5/5 Mono S column (Amersham-Pharmacia) was used, but in later preparations an XK26/10 column packed with SP-Sepharose Fast Flow resin (Amersham-Pharmacia) was used instead. Flow rates were as recommended by the manufacturer for the column selected. In both cases the sample was applied to a column already equilibrated in buffer B. Both the flow-through and a 5 column volume buffer B wash were discarded. Bound protein was eluted with a linear NaCl gradient (10-500 mM) over 20 column volumes. The purified protein fractions were pooled and concentrated (as above), then transferred into buffer A supplemented with 1 mM DTT using a PD10 desalting column (BioRad). Purity was assessed by Coomassie-stained SDS-PAGE (15% acrylamide), and the protein was stored in aliquots at -70°C.
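As a rough illustration of the elution step, the NaCl concentration at any point in a linear 10-500 mM gradient run over 20 column volumes follows from simple interpolation. The sketch below is only a worked calculation; the function name and its parameters are illustrative and are not part of the published protocol, and it assumes an ideal gradient with no mixing delay or column dead volume.

```python
def nacl_concentration(cv_elapsed, start_mm=10.0, end_mm=500.0, gradient_cv=20.0):
    """Return the nominal NaCl concentration (mM) after `cv_elapsed`
    column volumes of an ideal linear gradient.

    Defaults mirror the gradient quoted in the text (10-500 mM over
    20 column volumes); the function itself is a hypothetical helper.
    """
    if not 0.0 <= cv_elapsed <= gradient_cv:
        raise ValueError("cv_elapsed must lie between 0 and gradient_cv")
    slope = (end_mm - start_mm) / gradient_cv  # mM NaCl per column volume
    return start_mm + slope * cv_elapsed


if __name__ == "__main__":
    for cv in (0, 5, 10, 20):
        print(f"{cv:>2} CV: {nacl_concentration(cv):.1f} mM NaCl")
```

With these defaults the gradient rises by 24.5 mM per column volume, so the midpoint of the gradient (10 column volumes) corresponds to a nominal 255 mM NaCl.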

Spectroscopy
Ultra-violet/visible spectroscopy was carried out using a Shimadzu UV-2401PC recording spectrophotometer. Continuous Wave Electron Paramagnetic Resonance (CW-EPR) spectra were obtained using a JEOL RE1X spectrometer equipped with an Oxford Instruments liquid helium cryostat. Samples were analysed as prepared, following reduction with sodium dithionite, and following oxidation with potassium ferricyanide.

RESULTS
The His6-tagged Pa-UDG was overexpressed in BL21(DE3) cells using a pET28c(+)-Pa-UDG construct (9). The protein was purified from the cell lysate by means of heat treatment, immobilised metal-ion chromatography, and cation exchange chromatography to give an essentially pure sample migrating with an approximate molecular mass of 25 kD on SDS-PAGE (Figure 1A), while MALDI-TOF mass spectrometry gave a more precise mass of 24.248 kD (Figure 1B). Both results were consistent with the theoretical mass for His-tagged Pa-UDG (24.628 kD). N-terminal analysis of the purified protein prior to and following removal of the His6-tag by digestion with thrombin confirmed its identity as Pa-UDG. Uracil-DNA glycosylase activity of the purified protein at 70°C was confirmed as described (6).
The pure protein was dialysed against a minimal buffer of 50 mM Tris-HCl pH 8.0, 100 mM NaCl and 1 mM DTT for concentration and subsequent crystallographic analysis. The protein was highly soluble, and could be concentrated to > 30 mg ml⁻¹. Unexpectedly, dilute Pa-UDG (~1 mg ml⁻¹) was observed to be yellow in colour, and this colour intensified to dark olive and eventually brown as the sample was concentrated by ultrafiltration. The retention and concentration of the colour against a 5 kD cut-off membrane suggested a high molecular weight protein-associated chromophore rather than a small molecule contaminant. Consistent [...] (21). However, to our knowledge, Pa-UDG is the first example of such a feature in the uracil-DNA glycosylase structural superfamily (14).

Location of cluster-ligand residues.
Iron-sulphur clusters of the HIPIP-type are usually attached via tetrahedrally directed bonds from the iron atoms to the Sγ atoms of four cysteine residues in the polypeptide chain. The Pa-UDG sequence contains six cysteine residues, of which four are totally conserved in the characterised Thermotoga maritima and Archaeoglobus fulgidus Family-4 UDGs (7,8), and in many homologous archaeal and eubacterial (putative) UDG sequences (FIGURE 5). These four cysteine residues are not totally conserved throughout Family-4 homologues, the first and third being replaced by aromatic residues in Rickettsia for example, nor are they restricted to hyperthermophiles, being present in Family-4 UDG homologues from spirochaetes, mycobacteria, Clostridia and Deinococcus radiodurans.
In previously described HIPIP-type cuboidal iron-sulphur proteins, the sequence distribution of cysteine ligands varies considerably and consensus can only be obtained within protein families. The putative ligands in the Family-4 UDGs conform to the pattern C-X2-C-Xn-C-X(14-17)-C, where 'n' ranges from 70 to 100. This is quite distinct from the Nth/MutY DNA repair enzymes, which show a much more localised consensus pattern, C-X4PX-C-X2-C-X(6-8)-C, nor does it resemble any other known distribution of cysteine ligands in the iron-sulphur proteins characterised to date. If, as we suggest, these conserved cysteines act as ligands, then Pa-UDG must be able to fold so that the N-terminal C-X2-C motif comes into sufficiently [...]

To date, no structure for a Family-4 UDG has been reported. However, sequence threading and profile analysis techniques suggest that Family-4 UDGs will have a similar overall fold to the bacterial Family-2 MUG enzymes (14). Mapping the Pa-UDG sequence on to the crystal structure of E.coli MUG (12,13) locates the central pair of putative iron-sulphur cluster ligands on the surface-exposed face of helix four and the loop that precedes it (FIGURE 6A).
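The cysteine spacing pattern quoted for the Family-4 UDGs (C-X2-C-Xn-C-X(14-17)-C, with n between 70 and 100) can be written as a regular expression and used to flag candidate ligand sets in a protein sequence. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the test sequence is synthetic (it is not the Pa-UDG sequence), and because X may itself be a cysteine, a match indicates only one spacing-compatible set of four cysteines rather than proof of cluster ligation.

```python
import re

# Spacing pattern from the text: C-X2-C-Xn-C-X(14-17)-C, n = 70-100.
# Each cysteine is captured so that its position can be reported.
FAMILY4_CYS_PATTERN = re.compile(r"(C).{2}(C).{70,100}(C).{14,17}(C)")


def find_cysteine_motif(sequence):
    """Return the 1-based positions of four cysteines matching the
    Family-4 spacing pattern (first match only), or None if no match.

    Hypothetical helper for illustration; expects a one-letter
    amino acid sequence.
    """
    match = FAMILY4_CYS_PATTERN.search(sequence.upper())
    if match is None:
        return None
    return [match.start(group) + 1 for group in range(1, 5)]


if __name__ == "__main__":
    # Synthetic sequence: cysteines at positions 2, 5, 86 and 102.
    synthetic = "M" + "C" + "AA" + "C" + "A" * 80 + "C" + "A" * 15 + "C" + "GGGGG"
    print(find_cysteine_motif(synthetic))
```

Applying the same search with the Nth/MutY-type spacing would require a separate expression, since that consensus is far more localised.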
Cysteine residues at these two positions (corresponding approximately to residues 72 and 87 in the MUG structure) would be well located to provide two ligands for a 4Fe4S cluster. The N-terminal C-X2-C motif occurs in a segment of the Pa-UDG sequence that precedes the N-terminus of MUG, and topologically equivalent residues cannot therefore be located in the known MUG structure. However, the N-terminus of MUG is on the same face of the protein as

Functional role of an iron-sulphur cluster
Iron-sulphur clusters occur in a wide range of enzymes, primarily as redox-active co-factors participating directly in electron-transfer catalytic mechanisms. However, cuboidal 4Fe4S clusters have also been identified in non-redox enzymes, most notably in the Nth/MutY family of DNA repair enzymes (20,22,23). A variety of biochemical and biophysical studies suggest that the 4Fe4S cluster in these enzymes is not directly involved in catalysis (24). Instead, it functions as a structural 'cross-link' analogous to disulphide bonds or zinc fingers, which nonetheless contributes to substrate recognition by maintaining the structure of protein segments involved in DNA interactions (25-27). On the basis of the structural homology between the Family-4 enzymes and the Family-2 bacterial MUG, the deduced site of the 4Fe4S cluster in Pa-UDG suggests that it would not participate directly in glycosylase activity.
However, the central pair of putative conserved cysteine ligands maps to the beginning and end of a loop segment in MUG that is involved in contacts with the DNA phosphate backbone (FIGURE 6C) (12,13) so that, as in the Nth/MutY enzymes, the 4Fe4S cluster probably plays a role in substrate recognition but not catalysis. Determination of the precise role of the cuboidal 4Fe4S cluster in Family-4 uracil-DNA glycosylases must await the results of structural and mutagenesis studies, which are ongoing.

Figure 2. UV-visible Spectrum of Pa-UDG
In addition to the normal expected peak at 278 nm attributable to aromatic amino acid residues, the UV-visible spectrum of purified Pa-UDG shows an additional broad absorbance peak around 383 nm, giving the protein a yellow colour.

c) Purified species 1 as prepared showed no EPR signal (1), but developed a strong signal characteristic of an oxidised high potential iron protein (HIPIP) 4Fe4S centre on addition of the oxidant potassium ferricyanide (1ox). As prepared, the spectrum of purified species 2 (2) still showed a weak signal above g=2.00 suggesting the presence of some oxidised 3Fe3S clusters, possibly reflecting damaged centres.
Addition of ferricyanide to this protein also produced a strong characteristic HIPIP signal (2ox). Neither of these preparations showed any detectable g=1.94 signal due to the presence of the reducible 4Fe4S centre observed in the initial preparation. The EPR spectrum of species 1 is fully consistent with only a single type of iron-sulphur centre.

Figure 5. Comparative Alignment of MUG/TDG and Family-4 UDG Sequences
Sequences of the MUG/TDG enzymes from human, mouse, fission yeast (schpo),