Structure and ubiquitin interactions of the conserved zinc finger domain of Npl4.

Ubiquitylated proteins are directed into a large number of different cellular pathways through interactions with effector proteins that contain conserved ubiquitin binding motifs. Here, we report the solution structure and ubiquitin binding properties of one such motif, the Npl4 zinc finger or RanBP2/Nup358 zinc finger (NZF) domain. Npl4 NZF forms a compact module composed of four antiparallel beta-strands linked by three ordered loops. A single zinc ion is coordinated by four conserved cysteines from the first and third loops, which form two rubredoxin knuckles. Npl4 NZF binds specifically, but weakly, to free ubiquitin using a conserved 13TF14 dipeptide to interact with the "Ile-44" surface of ubiquitin. Our studies reveal the structure of this versatile class of protein binding domains and provide a means for identifying the subset of NZF domains likely to bind ubiquitin.

Protein ubiquitylation plays an important role in a large number of biological processes, including intracellular proteolysis, DNA repair, transcription, translation, signal transduction, cell cycle progression, organelle assembly, protein trafficking, and virus budding (1-3). Cells therefore dedicate an extensive array of machinery to the enzymology of ubiquitin transfer. Upon ubiquitylation, proteins must be directed to the correct intracellular locale, and pathways that utilize ubiquitin as a targeting signal therefore have effector proteins that specifically recognize ubiquitylated proteins. A number of such ubiquitin binding proteins are now known, and their study has led to the identification and characterization of several conserved ubiquitin binding motifs, including the ubiquitin-interacting motif (UIM), the ubiquitin-associated domain, and the ubiquitin E2 variant/UBC-like domain (2, 4-7). Each of these motifs forms an independent folding domain that can bind ubiquitin in vitro, although the molecular details of ubiquitin recognition are not yet understood for any of them.
Given the generality of ubiquitin (Ub) as a targeting signal, it is likely that additional ubiquitin recognition motifs remain to be identified and characterized. One example is the recently discovered Npl4 zinc finger/RanBP2 zinc finger domain (NZF) (see Fig. 1A), which was first identified in Npl4, a protein that binds ubiquitin in vitro and mediates several different ubiquitin-dependent functions of the AAA-ATPase p97/Cdc48p. A hallmark of NZF domains is the presence of four regularly spaced cysteine residues, which suggests that the motif may be a metal binding module (8-11). NZF motifs appear to be quite prevalent, and our protein data base searches revealed 243 unique peptide sequences with four cysteines and an asparagine with the same spacing as those of Npl4 NZF (Fig. 1A). This ensemble of sequences also exhibited significant conservation at 12 of the 24 remaining residues, which presumably reflects the conserved structure and function(s) of this common domain.
In addition to Npl4 itself, several other NZF-containing proteins are known to be involved in ubiquitin-dependent processes. One example is Vps36p, a protein required for vacuolar protein sorting of ubiquitylated proteins in yeast (11). Isolated NZF domains from both Npl4 and Vps36p bind mono- and poly-Ub chains in vitro, consistent with the idea that these NZF domains serve to recognize ubiquitylated protein substrates in vivo (10). However, not all NZF-like domains bind ubiquitin, suggesting that those that do may represent a specialized subset of a much larger domain superfamily that shares a common three-dimensional fold (10).
The prevalence of the NZF domain and its apparent role in ubiquitin recognition in several important biological pathways led us to study the molecular basis for NZF/Ub interactions. Toward this end, we have analyzed the sequence conservation across putative NZF domains, characterized the metal and ubiquitin binding properties of the Npl4 and Vps36p NZF domains, determined the solution structure of the NZF domain from Npl4, mapped the interaction surfaces in the Npl4 NZF/Ub complex, and tested the importance of a conserved, dipeptide motif (TF) at NZF positions 13 and 14 for ubiquitin binding.

Data Base Searches for Putative NZF Domains
The SwissProt, PIR, PRF, and KEGG GENE data bases were searched with the pattern X6-C-X2-C-X3-N-X6-C-X2-C-X5 (where X is any residue and the number gives the repeat count), which yielded 243 unique matches in 166 different proteins (duplicates were removed manually). The apparent correlation between Asn-16 and Trp-7 (72%; see Fig. 1A) was tested by a search with the pattern X4-W-X-C-X2-C-X10-C-X2-C-X5, which produced 161 matches (in 114 proteins), 84% of which had asparagine at the variable position corresponding to Asn-16 in Npl4 NZF. Searches with the more stringent RanBP2-like pattern (12) (and see, on the World Wide Web, www.expasy.ch/cgibin/nicedoc.pl?PDOC50199) X4-W-X-C-X2-C-X3-N-X6-C-X2-C-X5, with both the Trp and Asn positions fixed, revealed greater conservation at the variable positions but was judged overly stringent, since it excluded the known NZF domain of Vps36p (10). Searches of the PDBSTR data base yielded no matches, indicating that the motif has not previously been characterized structurally.
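The spacing rule used in the first search can be expressed as a regular expression. The following is a minimal sketch, not the search tool actually used for the data base queries; the function name and the toy sequence are hypothetical, and "." is used as a stand-in for "any residue".

```python
import re

# Motif from the text: X6-C-X2-C-X3-N-X6-C-X2-C-X5 (29 residues, X = any residue).
# The lookahead lets overlapping candidate windows be reported as well.
NZF_PATTERN = re.compile(r"(?=(.{6}C.{2}C.{3}N.{6}C.{2}C.{5}))")

def find_nzf_like(sequence):
    """Return (1-based start, matched 29-mer) for every window matching the motif."""
    return [(m.start() + 1, m.group(1)) for m in NZF_PATTERN.finditer(sequence)]

# Hypothetical toy sequence, constructed only to satisfy the spacing rule.
toy = ("MK" + "AAAAAA" + "C" + "AA" + "C" + "AAA" + "N"
       + "AAAAAA" + "C" + "AA" + "C" + "AAAAA" + "GG")
hits = find_nzf_like(toy)
for start, window in hits:
    print(start, window)
```

Fixing the Trp position as well (the RanBP2-like pattern) would only require replacing the first `.{6}` with `.{4}W.`.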

Protein Expression and Purification
DNA encoding NZF domains from rat Npl4 (amino acids 580-608) and yeast Vps36p (amino acids 177-205) was cloned into pGEX-4T expression vectors. These constructs encoded GST-NZF fusion proteins with thrombin cleavage sites between the GST and NZF domains (13). Mutations were introduced into the Npl4 expression construct by QuikChange™ mutagenesis (Stratagene).
Protein expression in DH5α Escherichia coli carrying the expression plasmids was induced with 0.5 mM isopropyl-1-thio-β-D-galactopyranoside (A600 = 0.4). After 4 h at 23°C, the cells were harvested by centrifugation and stored at -70°C prior to protein purification. All steps in Npl4 NZF protein purification were performed at 4°C, except where noted. Pellets from 6 liters of cells were resuspended in 40 ml of buffer A (10 mM sodium phosphate, pH 7.4, 150 mM NaCl, 5 mM β-mercaptoethanol (BME), 10 μM ZnCl2), sonicated to lyse the cells, and centrifuged for 1 h at 39,000 × g to clear insoluble debris. The soluble GST-Npl4 NZF fusion protein was purified by affinity chromatography on an FF 16/10 glutathione-Sepharose column (Amersham Biosciences). The protein was loaded, washed with ~10 column volumes of buffer A, and then eluted with 20 mM reduced glutathione in 50 mM Tris-HCl, pH 8.0, 5 mM BME, 10 μM ZnCl2.
Thrombin cleavage left two nonnative residues at the N terminus of Npl4 NZF (NH2-Gly-Ser2). These residues are included in our numbering scheme but are not shown in any of the figures. Purified Npl4 NZF was characterized by SDS-PAGE, N-terminal sequencing (NH2-Gly-Ser-Thr-Ser-Ala), and matrix-assisted laser desorption ionization mass spectrometry (calculated mass = 3374.7 Da; observed mass = 3374.7 Da). The stoichiometry of bound zinc was determined by measuring the protein concentration by optical spectroscopy (ε280 = 5840 M⁻¹ cm⁻¹) and the zinc concentration by flame atomic absorption spectrometry. Unlabeled and ¹⁵N-labeled ubiquitin were expressed and purified as described (14).

X-ray Absorption Spectroscopy
Pure Npl4 NZF was dialyzed against 2 liters of XAS buffer (either 20 mM sodium phosphate, pH 5.5, 50 mM NaCl, 5 mM BME or 20 mM Tris-HCl, pH 8.0, 50 mM NaCl, 5 mM BME), concentrated by ultrafiltration (Amicon 3), adjusted to a final concentration of 30% (w/v) glycerol, and snap-frozen in a Lucite cell covered in Kapton tape at a final protein concentration of 1 mM.
Data were collected at beamline 7-3 at the Stanford Synchrotron Radiation Laboratory using an Si(220) double crystal monochromator detuned 50% for harmonic rejection. Fluorescence excitation spectra were recorded using a 30-element germanium solid-state array detector. Samples were maintained at 10 K using a continuous flow liquid helium cryostat (Oxford). XAS spectra were measured at 5-eV steps in the pre-edge, 0.35-eV steps in the edge (9640-9690 eV), and 0.05-Å⁻¹ increments in the EXAFS region, integrating for 1-30 s in a k³-weighted manner over 45 min. EXAFS data represent the average of nine scans. X-ray energies were calibrated with simultaneous measurement of a zinc foil absorption spectrum, assigning the first inflection point to 9659 eV. Similar spectra were obtained at both pH 5.5 and 8.0.
XAS data were analyzed using the Macintosh OS X version of the EXAFSPAK program, which integrates Feff version 7.0 to generate theoretical models (see, on the World Wide Web, www-ssrl.slac.stanford.edu/exafspak.html). Data reduction employed second-order polynomial base-line flattening in the pre-edge and three-region cubic spline flattening throughout the EXAFS region. Data were converted to k-space using E0 = 9680 eV. Meaningful EXAFS data were limited to 10,325 eV due to monochromator imperfections. EXAFS data of crystallographically characterized model complexes were fit using amplitude and phase functions calculated in Feff version 7.0 for a zinc-sulfur interaction. This resulted in a scale factor of 1 and an E0 of -15.25, which were used for data fitting of the NZF protein. Coordination numbers were fixed at half-integer values, whereas R and σ² were allowed to float during the final fits.
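For orientation, the conversion from X-ray energy to photoelectron wave vector follows the standard free-electron relation k = sqrt(2 m_e (E - E0)) / ħ. The sketch below is illustrative only and is not taken from EXAFSPAK; the function name is hypothetical, and the physical constants are CODATA values. With E0 = 9680 eV, the stated upper energy limit of 10,325 eV corresponds to a k range extending to roughly 13 Å⁻¹.

```python
import math

M_E = 9.1093837015e-31   # electron mass, kg
HBAR = 1.054571817e-34   # reduced Planck constant, J s
EV = 1.602176634e-19     # joules per electron volt

def energy_to_k(energy_ev, e0_ev=9680.0):
    """Photoelectron wave vector (1/Angstrom) for an X-ray energy at or above E0."""
    delta = energy_ev - e0_ev
    if delta < 0:
        raise ValueError("energy must be at or above E0")
    # sqrt(2 m E) / hbar gives 1/m; multiply by 1e-10 to convert to 1/Angstrom.
    return math.sqrt(2.0 * M_E * delta * EV) / HBAR * 1e-10

# Upper limit of meaningful EXAFS data quoted in the text.
k_max = energy_to_k(10325.0)
print(f"k_max = {k_max:.2f} per Angstrom")
```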
Data Collection and Assignments-NMR spectra were recorded at 18°C on a Varian Inova 600-MHz spectrometer equipped with a triple-resonance ¹H/¹³C/¹⁵N probe and z axis pulsed field gradients. Backbone and side chain assignments were made using a standard suite of triple resonance experiments (15, 16), except that two-dimensional versions of the HNCACB and HN(CA)CO experiments were collected and analyzed (Fig. 2). Side chain assignments were completed using H(CCO)NH-TOCSY, (H)C(CCO)NH-TOCSY, ¹⁵N-edited TOCSY, and ¹³C-edited NOESY experiments, with aromatic resonances assigned using a combination of ¹H/¹³C HSQC and ¹³C-edited NOESY experiments centered on the aromatic carbon resonances (125 ppm). Stereospecific assignments for 28 of 36 β-methylene protons were obtained using a combination of HNHB, HN(CO)HB (17), ¹⁵N-edited TOCSY, and NOESY data. Resonance assignments were complete except for the amide protons and nitrogens of the first three amino acids, methionine methyls, and His Hε1 aromatic protons.
Structure Determination-Npl4 NZF structures were refined using torsion angle dynamics in DYANA (28) and then regularized in CNS (29) through a gentle simulated annealing with 362 NOE interproton restraints, four hydrogen-bonding restraints, and 13 ³J(HNHA) scalar coupling restraints (30). Initial rounds of refinement used only NOE data, which defined the general fold of the domain and revealed the stereochemistry about the coordinated zinc. Final refinements added EXAFS-derived restraints for zinc-Sγ (2.33-2.37 Å), zinc-Cβ (3.25-3.61 Å), and Sγ-Sγ (3.78-4.15 Å) distances to ensure approximate tetrahedral zinc coordination as well as hydrogen-bonding restraints and ³J(HNHA) coupling constants.
NOE assignments and structure calculations were independently validated using the automated assignment/structure calculation package (CANDID within CYANA) (31), which produced essentially the same structure as was determined manually. Structures were analyzed using PROCHECK-NMR (32), MOLMOL (33), and INSIGHT II (MSI) (Table I). Structure figures were created with PYMOL (DeLano Scientific).

Chemical Shift Perturbation Mapping Experiments
Chemical shift perturbation experiments were performed at 18°C in NMR buffer. To identify the ubiquitin binding sites on the Npl4 NZF domain, unlabeled ubiquitin was titrated into 1.5 mM ¹⁵N-labeled Npl4 NZF at final concentrations of 0-1.5 mM. To identify the Npl4 NZF binding site on ubiquitin, unlabeled Npl4 NZF was titrated into 0.15 mM ¹⁵N-labeled ubiquitin at final concentrations of 0-0.15 mM. ¹H/¹⁵N HSQC spectra (18) were collected at each titration point, and normalized chemical shift changes (δ) were calculated using the equation $\delta = 25\,[(\delta_{HN})^2 + (\delta_N/5)^2]^{0.5}$ (34, 35), where $\delta_{HN}$ and $\delta_N$ are the amide proton and nitrogen shift changes. Amide chemical shift assignments for human ubiquitin were obtained from the VLI Research, Inc. Web site at www.vli-research.com/ubshifts.htm.
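The normalized shift change defined above, 25 times the square root of the squared proton shift change plus the squared one-fifth-scaled nitrogen shift change, is simple to compute per residue. The following is a minimal sketch; the function names and the example shift values are hypothetical, and the cutoff of 2 is the significance threshold quoted under "Results".

```python
import math

def normalized_shift(d_hn_ppm, d_n_ppm):
    """Normalized amide shift change: 25 * sqrt(dHN^2 + (dN / 5)^2)."""
    return 25.0 * math.sqrt(d_hn_ppm ** 2 + (d_n_ppm / 5.0) ** 2)

def is_significant(d_hn_ppm, d_n_ppm, cutoff=2.0):
    """True when the normalized shift change meets the cutoff."""
    return normalized_shift(d_hn_ppm, d_n_ppm) >= cutoff

# Hypothetical residue with shift changes of 0.06 ppm (1H) and 0.40 ppm (15N).
example = normalized_shift(0.06, 0.40)
print(round(example, 3), is_significant(0.06, 0.40))
```

Dividing the nitrogen term by 5 keeps the larger ¹⁵N chemical shift range from dominating the combined value.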

Ubiquitin Binding
Ubiquitin binding affinities for wild type and mutant GST-Npl4 NZF proteins and wild type GST-Vps36p NZF captured on anti-GST antibody biosensor surfaces were quantified as described previously (36). All measurements were performed at 20°C in 25 mM Tris-HCl, pH 7.0, 5 mM BME, 2 μM ZnCl2.

RESULTS
The NZF Is a Zinc Binding Module-The Npl4 and Vps36p NZF constructs used in our studies span the conserved NZF regions, and the sequence and numbering scheme for the rat Npl4 NZF polypeptide (Npl4 NZF) are given in Fig. 1A. Npl4 NZF was expressed in E. coli, and the soluble protein was purified to homogeneity under nondenaturing conditions (see "Experimental Procedures"). The presence of a single, covalently bound zinc ion was demonstrated using atomic absorption spectroscopy (0.93 mol of zinc/mol of Npl4 NZF) and EXAFS spectroscopy (Fig. 1, B and C). The best fit model to the transformed EXAFS spectrum revealed that the zinc coordination sphere contained four sulfur ligands (n = 4.56) and an average Zn-S bond length of 2.33 Å. The Debye-Waller factor (0.00454 Å²) was consistent with a slightly distorted tetrahedral geometry about the zinc center. We therefore conclude that Npl4 NZF coordinates a single, tetrahedral Zn²⁺ ion using the four conserved cysteines.
NMR Studies of Npl4 NZF-Nearly complete resonance and stereospecific assignments for Npl4 NZF were obtained using standard ¹H/¹³C/¹⁵N heteronuclear NMR experiments (see Fig. 2 and "Experimental Procedures"). As expected for a Zn(Cys)4 metal center, the four cysteine Cβ resonances exhibited chemical shifts (31.5-32.5 ppm) that were intermediate between free cysteine (28.3 ppm) and oxidized cysteine (41.2 ppm). The structure of Npl4 NZF was initially calculated using 362 NOE restraints, which were sufficient to define both the protein structure and the zinc coordination geometry. Subsequent refinements included restraints for observed hydrogen bonds (eight restraints), ³J(HNHA) scalar coupling (13 restraints), and tetrahedral zinc coordination (4 experimental, 10 idealized restraints). The final ensemble of 20 low energy structures was of high quality (Fig. 3 and Table I), with root mean square deviations from the mean structure of 0.11 ± 0.05 Å (backbone) and 0.72 ± 0.15 Å (all heavy atoms).

FIG. 1. Primary sequence, conservation, and zinc binding by the Npl4 NZF domain. A, the primary sequence, numbering scheme, and secondary structure of the Npl4 NZF domain are shown above with the four cysteines that coordinate zinc circled in red. Shown below is the sequence conservation in putative NZF/RanBP2-like domains from 243 sequences with four cysteines and one asparagine in the same spacing as Npl4 NZF. Residues present in more than 18% of NZF domains are shown explicitly, with bar heights representing their relative frequencies (see "Experimental Procedures"). B, untransformed EXAFS spectra (black) and fit to a Zn(Cys)4 model (green) of Npl4 NZF. C, Fourier-transformed EXAFS spectra (black) and fit (green) of Npl4 NZF.
The rubredoxin knuckle was initially recognized in the iron-binding centers of rubredoxins and has subsequently been observed at the zinc coordination sites in a number of proteins (39-42). Rubredoxin knuckles are six-residue loops that connect two β-strands and display two metal binding Cys ligands (from loop positions 1 and 4). The canonical knuckle configuration orients the backbone amide nitrogens at positions 3 and 4 to hydrogen-bond with the first cysteine Sγ atom and orients the backbone amide at position 6 to hydrogen-bond with the second cysteine Sγ. This unusually tight metal binding loop is favored by a glycine residue at position 5, which adopts a positive backbone torsion angle. Variations in the details of this hydrogen bonding scheme are seen in proteins within the rubredoxin family, for example in rubrerythrin (Protein Data Bank entry 1DVB) (43). The knuckles in Npl4 NZF are slightly different from canonical rubredoxin knuckles, because both NZF knuckles lack glycines at position 5 (Figs. 1A, 3C, and 4). This opens up the loops slightly and alters the predicted hydrogen bonding pattern (see Fig. 3C and legend). Moreover, in the second knuckle, the side chain amide nitrogen of Asn-16 donates a hydrogen bond to the first cysteine (Cys-23, loop position 1). Nevertheless, the overall trajectories in the two Npl4 knuckles are similar to one another and to other rubredoxin knuckles (Fig. 4).

TABLE I footnotes (table body not recovered)
a ... (28). ⟨CNS⟩ is the same ensemble after 1000 steps (15 ps each) of simulated annealing at 25 K, 1000 slow cooling steps to 0 K, and 10,000 steps of restrained Powell minimization in cartesian space (anneal.inp protocol) (29).
b Only meaningful and nonredundant restraints as determined by the DYANA CALIBA function.
c Two upper limit distance restraints were used to define each hydrogen bond.
d NA, not applicable.
e Energies for structures input into CNS (from DYANA) were estimated within the generate_easy.inp program after initial regularization without restraints.
f Violations from DYANA have units of Å, while violation energies from CNS are in kcal/mol.
g Determined using PROCHECK-NMR (32).
h Superposition and overall rmsds were calculated using the program MOLMOL (33).
As summarized in Table II, nearly all of the residues that are conserved across NZF domains perform identifiable structural roles. The NZF tertiary structure is stabilized by a small hydrophobic "core," which lies just below the zinc and consists primarily of the conserved Trp-7 residue (Figs. 1A and 3). The Trp-7 Hε1 proton is hydrogen-bonded to the Thr-20 carbonyl oxygen of strand 3, and the indole ring makes a series of hydrophobic contacts with residues from both strands S1 and S4. One face of the Trp-7 ring is shielded from solvent by the guanidinium group of Arg-30, and the other is shielded by the Asn-16 side chain. The conserved Asn-16 side chain also serves to bridge strands 2 and 3, since its amide nitrogen and oxygen hydrogen-bond across to the Thr-20 hydroxyl and Cys-23 sulfur, respectively (Fig. 3C).
Interaction Surfaces in the NZF·Ub Complex-The Npl4 NZF motif can bind both mono- and poly-Ub, as analyzed by affinity co-purification (10). Backbone amide NMR chemical shift perturbation experiments were used to map the interaction surfaces of both proteins within the 1:1 Npl4 NZF·Ub complex (Fig. 5). The Npl4 NZF·Ub complex was in fast exchange, as is typical for complexes with dissociation constants in the high micromolar range (see below). Chemical shift changes observed in titration experiments therefore reflected the population-weighted average of shifts for the free and bound species.
A series of ¹H/¹⁵N HSQC spectra of ¹⁵N-labeled ubiquitin were collected during the stepwise titration of 0-1 eq of unlabeled Npl4 NZF (Fig. 5). Comparison of the spectra revealed that 15 of 70 observable ubiquitin backbone amide resonances shifted significantly upon complex formation (δ ≥ 2; see Fig. 5A). The shifted ubiquitin residues generally clustered about the three C-terminal strands of the β-sheet (with the exception of Thr-14), which presumably corresponds to the NZF binding surface. A prominent feature of the binding site is an exposed, hydrophobic surface formed by Ub residues Leu-8, Ile-44, Val-70, Leu-71, and Leu-73 (highlighted in Fig. 5B). This corresponds to the Ile-44 surface of ubiquitin, which has been shown by genetic analyses to function in endocytosis, proteasomal degradation, and human immunodeficiency virus budding (44, 45). Ubiquitin residue Lys-48 also shifted significantly upon NZF binding, which is notable because polyubiquitin chains linked via the Lys-48 side chain target proteins for proteasome degradation. Our studies do not reveal, however, whether NZF binding is favored or disfavored by conjugation at this position.
In the reciprocal experiment, a series of ¹H/¹⁵N HSQC spectra of ¹⁵N-labeled NZF were collected during the stepwise titration of 0-1 eq of unlabeled ubiquitin (Fig. 5C). Ten of 25 observable NZF backbone amide resonances shifted significantly upon ubiquitin binding (δ ≥ 2). All 10 shifted residues were clustered about the rubredoxin knuckles above the zinc coordination site and include the zinc ligands Cys-12, Cys-23, and Cys-26 as well as Thr-13, Phe-14, and Met-25. Like its complementary counterpart on ubiquitin, the binding surface on NZF is primarily hydrophobic but also includes residue Thr-13. Thus, it appears that the interface between NZF and Ub is likely to resemble many other protein-protein interfaces that exhibit buried hydrophobic cores as well as key complementary hydrophilic interactions that provide orientation and specificity.
Ubiquitin Binding by the NZF Domain-To quantify the interaction between Npl4 NZF and free ubiquitin, we performed biosensor binding experiments in which pure recombinant ubiquitin was allowed to bind to immobilized GST-NZF (Fig. 6A). Ubiquitin bound to the Npl4 NZF surface with rapid, reversible kinetics, and the interaction was specific, as ubiquitin did not bind to a control GST surface (Fig. 6A, inset). A fit of the Npl4 NZF/Ub binding data to a simple 1:1 model yielded a dissociation constant (KD) of 122 ± 2 μM. We speculate that this relatively weak binding affinity may reflect the fact that Npl4 NZF normally recognizes ubiquitylated proteins (rather than free ubiquitin) and may therefore gain additional binding energy through contacts with the conjugated C-terminal region of ubiquitin, additional contacts elsewhere in the Npl4 protein complex, and/or avidity effects.
We note that the affinity of the Npl4 NZF/Ub interaction is similar to those observed for other well characterized ubiquitin recognition motifs. One example is the UIM, a widespread motif with well established ubiquitin binding functions. Measurements from several groups show that UIMs bind to ubiquitin with dissociation constants in the high micromolar range. Specifically, Shekhtman and Cowburn (46) used NMR chemical shift titrations to estimate that the UIM motif from Hrs bound ubiquitin with a dissociation constant of 230 ± 50 μM, and Raiborg et al. (47) performed biosensor experiments to show that ubiquitin bound to a large, immobilized Hrs fragment spanning the UIM (residues 1-289), with a dissociation constant of 300 μM.
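To make the "simple 1:1 model" concrete: at equilibrium, the fraction of immobilized NZF sites occupied at a free ubiquitin concentration C is C / (KD + C). The sketch below is illustrative and is not the fitting procedure of Ref. 36; it assumes that the injected ubiquitin is in large excess so that its free concentration equals the injected concentration, the function name and the concentration series are hypothetical, and the KD is the Npl4 NZF value reported above, taken in micromolar.

```python
def fraction_bound(ub_conc_um, kd_um):
    """Equilibrium occupancy of immobilized sites for a simple 1:1 interaction."""
    if ub_conc_um < 0 or kd_um <= 0:
        raise ValueError("concentration must be >= 0 and KD must be > 0")
    return ub_conc_um / (kd_um + ub_conc_um)

KD_NPL4_UM = 122.0  # Npl4 NZF/Ub dissociation constant from the text, in micromolar

# Hypothetical ubiquitin concentrations (micromolar) for illustration.
concentrations_um = [0.0, 61.0, 122.0, 488.0]
occupancy = [fraction_bound(c, KD_NPL4_UM) for c in concentrations_um]
for conc, occ in zip(concentrations_um, occupancy):
    print(f"{conc:6.1f} uM ubiquitin -> {occ:.2f} of sites occupied")
```

Half of the sites are occupied when the ubiquitin concentration equals KD, which is why an interaction this weak needs ubiquitin concentrations well into the hundreds of micromolar to approach saturation.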
Biosensor binding experiments were also performed to quantify the interaction of ubiquitin with the immobilized NZF domain of yeast Vps36p (Fig. 6B). Both the affinity (KD = 199 ± 17 μM) and specificity (not shown) of the Vps36p NZF/Ub interaction were similar to those of the Npl4 NZF/Ub interaction. The observation that Vps36p NZF binds ubiquitin is not surprising from a biological perspective, since Vps36p is required for the sorting of ubiquitylated proteins into the yeast vacuole (11). However, the observation is of interest from a structural perspective, because Vps36p NZF has an asparagine residue in place of the conserved Trp-7 residue that forms the hydrophobic core of Npl4 NZF. Our experiments demonstrate that the Vps36p NZF can nevertheless form a ubiquitin binding module, despite the absence of this hydrophobic core residue.

TABLE II (title and first residue label not recovered)
Residue (a)    Structural role
(label lost)   Hydrophobic core
C9             Zinc coordination
P(Q)10         Rubredoxin knuckle metal binding loop
C12            Zinc coordination
T/G13          Positive backbone torsion angle at position 5 of the rubredoxin knuckle (G), ubiquitin binding (T)
Y/F/L14        Ubiquitin binding
N16            Interstrand hydrogen bonding
A/P18          i + 1 position of type-1 turn
R/A(T)20       Unclear from Npl4 NZF structure
K(H)22         Surface-exposed side chain
C23            Zinc coordination
C26            Zinc coordination
G(S)27         Positive backbone torsion angle at position 5 of the rubredoxin knuckle
P29            C terminus of strand 4
K/R30          Surface-exposed side chain
P(T)31         Domain terminus
a Conserved residues in NZF domain (see Fig. 1). The Npl4 NZF residue is given in parentheses when it does not correspond to the consensus sequence.
Our chemical shift mapping experiments indicated that ubiquitin probably binds to Npl4 NZF by contacting the exposed 13TF14 Npl4 dipeptide. As discussed below, NZF motifs appear to fall into several different classes that are characterized by distinct dipeptide sequences at positions 13 and 14. We suggest that NZF motifs that bind ubiquitin contain 13TF14, whereas those that bind other ligands have other pairs of amino acids at positions 13 and 14. To test this idea, the Npl4 NZF 13TF14 residues were mutated singly and in pairs to LV and NE, which are the residue pairs found in NZF motifs from RanBP2 and Mdm2, respectively. As shown in Fig. 6C, all four single mutations increased the KD for ubiquitin binding at least 25-fold, and the two double mutants did not bind detectably to ubiquitin (KD > 10 mM). We therefore conclude that the 13TF14 dipeptide is an essential component of the ubiquitin binding epitope of Npl4 NZF.

DISCUSSION
Our studies demonstrate that the Npl4 NZF domain forms a tightly folded zinc binding module and provide a structural rationale for nearly every conserved residue in this domain family (Table II). NZF now joins the ubiquitin-associated domain (48, 49), ubiquitin E2 variant/UBC-like domain (16, 50, 51), and UIM as a structurally characterized ubiquitin binding domain. Interestingly, at least three of these structurally distinct motifs (the NZF, ubiquitin E2 variant/UBC-like domain, and UIM) contact ubiquitin on the hydrophobic patch that surrounds the exposed surface of the C-terminal three strands of the β-sheet (the Ile-44 surface) (16, 46, 49, 53-55), highlighting the importance of this surface for ubiquitin function.

Similarities between Npl4 NZF and Other Metalloproteins-The Npl4 NZF bears an unexpected resemblance to the "β ribbon" zinc fingers found in a series of transcription-related proteins from archaea and eucarya, including TFIIS and RPB9 (56-59). In all of these proteins, a single Zn²⁺ ion is coordinated by four Cys residues presented by the knuckles that link two β-hairpins. As shown in Fig. 4A, zinc binding loops from Npl4 NZF and TFIIS are similar and overlay with a heavy atom r.m.s. deviation of less than 1.2 Å. The overall topologies of the zinc binding domains of TFIIS (Protein Data Bank entry 1TFI), RPB9 (1QYP), and Npl4 are also similar, although Npl4 NZF is more compact than the other two and seems to represent a "stripped down" version of the fold. More generally, the zinc center of Npl4 NZF is also structurally related to the iron binding sites of the rubredoxin protein family, particularly to rubrerythrin (Fig. 4B). Despite these structural similarities, however, there are no obvious functional connections between Npl4 NZF and the other proteins, beyond the possibility that all may mediate protein-protein interactions. Instead, the similarities appear to reflect the fact that the NZF fold provides a particularly favorable geometry for metal binding.
Ubiquitin Binding and Conservation of the NZF Domain-We have identified 243 sequences from 166 different proteins that exactly match the identity and spacing of the four cysteines and single asparagine of Npl4 NZF (see "Experimental Procedures" and Fig. 1A). Others had previously noted the presence of a highly related "zinc finger" sequence motif found in the nuclear pore protein, RanBP2 (and related proteins) (60-63). Searches based upon the RanBP2/Nup358 zinc finger motif have required the presence of residues equivalent to the Npl4 NZF cysteines as well as Trp-7 and Asn-16 but allowed the spacing between the first two cysteines to vary between 2 and 4 residues (12). These search criteria identify a number of proteins comparable to ours, because they include family members that do not match our stringent spacing criteria but exclude NZF proteins like Vps36p (that lack the Trp-7 residue).
Thus, it appears that more than 200 known proteins contain NZF motifs, with the precise estimate depending upon the search criteria. Indeed, it will probably be difficult to devise criteria that unambiguously define the NZF motif, given that there may be a continuum of protein structures that link the NZF and β ribbon zinc motifs.
The prevalence of the NZF motif makes it important to understand the features that dictate different NZF functions. As shown above, the Npl4 and Vps36p NZF motifs both bind specifically to ubiquitin, as do the NZF motifs from two other proteins, TAB2 and RBCK2 (10). However, the NZF domains from two other proteins, Mdm2 and RanBP2/Nup358, do not detectably bind ubiquitin in vitro (10). Interestingly, all four known NZF motifs with ubiquitin binding activities display the 13TF14 dipeptide, whereas the two known NZF motifs that do not bind ubiquitin have very different residues at these positions (LV in RanBP2 and NE in Mdm2). Note that on the basis of structural considerations alone, glycine should be heavily overrepresented at residue 13, since this position in the rubredoxin knuckle adopts a positive backbone torsion angle (39). Thus, the presence of nonglycine residues at position 13 suggests a functional role in ligand binding.
Consistent with this idea, our chemical shift mapping studies demonstrate that the 13TF14 dipeptide forms the primary binding site for Ub on Npl4 NZF (with the T13 amide exhibiting the greatest chemical shift change of any NZF residue upon Ub binding). The side chains of Thr-13, Phe-14, and Met-25 form an exposed surface on the Npl4 NZF motif, and our experiments indicate that this is the site of ubiquitin binding. We further suggest that Thr-13 and Phe-14 (and possibly also Met-25) may be generally used by NZF domains whose biological function is to recognize ubiquitin. Consistent with these ideas, our mutagenesis experiments demonstrate that both Thr-13 and Phe-14 are essential for ubiquitin binding, and preliminary structural studies of the Npl4 NZF·Ub complex confirm that the TF side chains contact the ubiquitin Ile-44 surface. To explore this idea further, we tested for correlations between the identities of residues at positions 13 and 14 of the 243 NZF domains identified by our initial search criteria. Strikingly, we found that when a threonine is present at position 13, a large hydrophobic residue (Φ) almost always follows at position 14 (97% correlation, 63/65; Phe = 32, Tyr = 16, Leu = 15). In contrast, when any residue except threonine is present at position 13, a large hydrophobic residue follows only 16% of the time (29/178 total). This analysis again suggests that the NZF motif forms a scaffold that presents the surface-exposed 13TF14 dipeptide (and surrounding residues) to recognize the hydrophobic Ile-44 surface of ubiquitin.
Our sequence analyses also revealed that the LV dipeptide co-varies at NZF positions 13 and 14 (LV; 67% co-variation). We speculate that this dipeptide may also form a recognition epitope for another NZF domain. Interestingly, several proteins, including RanBP2/Nup358 and Nup153, contain multicopy tracts of NZF domains that bind the GDP form of the Ran GTPase (9, 64). These tracts are composed primarily of multiple copies of 13LV14-containing NZF domains, suggesting that RanGDP may be the preferred protein binding partner for this class of NZF domain. The RanBP2 zinc fingers also bind exportin-1 (65), which is apparently yet another protein that can be recognized by the NZF domain.
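The co-variation figures quoted above can be checked with a few lines of arithmetic. The sketch below simply re-derives the percentages from the counts given in the text; the variable names are illustrative, and no sequence data beyond those counts are assumed.

```python
# Counts quoted in the text for the 243 NZF-like sequences.
hydrophobic_after_thr13 = {"Phe": 32, "Tyr": 16, "Leu": 15}
thr13_total = 65            # sequences with Thr at position 13
other_hydrophobic = 29      # non-Thr-13 sequences with Phe/Tyr/Leu at position 14
other_total = 178           # sequences with any other residue at position 13

thr13_hits = sum(hydrophobic_after_thr13.values())
frac_thr13 = thr13_hits / thr13_total
frac_other = other_hydrophobic / other_total

print(f"Thr-13 followed by large hydrophobic: {thr13_hits}/{thr13_total} = {frac_thr13:.0%}")
print(f"Other residue at 13 followed by large hydrophobic: "
      f"{other_hydrophobic}/{other_total} = {frac_other:.0%}")
print("Sequences accounted for:", thr13_total + other_total)
```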
Biological Implications-The NZF motif was first recognized in the mammalian protein, Npl4, which is a subunit of the heterodimeric UN complex (together with Ufd1) (10). UN is one of at least two alternative adapters that target the AAA-ATPase p97/Cdc48p to specific protein substrates. The UN complex is required for at least three p97/Cdc48p-mediated reactions, including retrotranslocation of ubiquitylated proteins from the ER into the cytosol (ERAD) and the topologically related ubiquitin-dependent processing of transcription factors Spt23 and Mga2 in the ER membrane and subsequent mobilization by Cdc48p (RUP) (66,67). p97-UN is also required for an as yet undefined reaction that leads to the formation of a closed nuclear envelope after mitosis (68).
Although the exact mechanistic role of UN in ERAD and RUP remains to be determined, UN can bind ubiquitin conjugates by means of the Npl4 NZF domain and can simultaneously recruit the p97 ATPase (10). One possibility is therefore that UN functions by recognizing ubiquitylated protein substrates in the ER membrane and recruiting p97, which then uses the energy of ATP hydrolysis either to pull its substrates out of the membrane or to separate them from other proteins. Interestingly, the other known p97 adapter, p47, which functions in the reassembly of the Golgi apparatus (69) and expansion of the nuclear envelope after mitosis (68), probably also links the ATPase to ubiquitin conjugates (10). In that case, however, ubiquitin recognition is mediated by a ubiquitin-associated domain in p47, and the ubiquitylated substrate remains to be identified.
Vps36p provides another example in which an NZF motif could help to recruit ubiquitylated proteins into a biological pathway. We have shown that the Vps36p NZF can bind ubiquitin (10 and this work), and others have shown that Vps36p is a member of the ESCRT-II complex, which is required for sorting of ubiquitylated proteins into multivesicular bodies (11). It is therefore plausible to suppose that Vps36p uses its NZF motif to recognize ubiquitylated protein substrates, since they are sorted through the multivesicular body pathway. In further analogy to Npl4, there is even the potential that Vps36p could also act as an adaptor for Vps4p, a AAA ATPase that functions late in this pathway (70 -72).
Our studies also suggest that many other less well characterized proteins that contain NZF motifs of the "TF" class will also exhibit ubiquitin binding activities. Strengthening this suggestion is the fact that a number of these proteins contain additional domains that are linked to various aspects of ubiquitin biochemistry. Examples include proteins with E3 RING finger motifs that bind ubiquitin-conjugating (E2) enzymes (e.g. Ubc7ip3; NCBI entry Q9BYM8) (10,73), proteins that bind E3 ubiquitin transferases (e.g. RYBP, NP_036366) (52), proteins with their own ubiquitin-like domains (e.g. Sharpin, NP_112415), and proteins that contain other known ubiquitin binding domains (e.g. MGC45404; NP_690000). There are also a number of proteins with NZF motifs of the TF class without previously known links to ubiquitin, and we suggest that investigations in this direction are likely to be fruitful.
In summary, the NZF motif is a zinc binding module that can be incorporated into multifunctional proteins and used to bind ubiquitin, RanGDP, exportin-1, and probably other proteins. Our studies reveal the NZF structure, define the NZF/Ub interface, and provide criteria for distinguishing functional variants of this versatile scaffold.