Structure and Biochemical Function of a Prototypical Arabidopsis U-box Domain*

U-box proteins, as well as other proteins involved in regulated protein degradation, are apparently over-represented in Arabidopsis compared with other model eukaryotes. The Arabidopsis protein AtPUB14 contains a typical U-box domain followed by an Armadillo repeat region, a domain organization that is frequently found in plant U-box proteins. In vitro ubiquitination assays demonstrated that AtPUB14 functions as an E3 ubiquitin ligase with specific E2 ubiquitin-conjugating enzymes. The structure of the AtPUB14 U-box domain was determined by NMR spectroscopy. It adopts the ββαβ fold of the Prp19p U-box and RING finger domains. In these proteins, conserved hydrophobic residues form a putative E2-binding cleft. By contrast, they contain no common polar E2 binding site motif. Two hydrophobic cores stabilize the AtPUB14 U-box fold, and hydrogen bonds and salt bridges interconnect the residues corresponding to zinc ion-coordinating residues in RING domains. Residues from a C-terminal α-helix interact with the core domain and contribute to stabilization. The Prp19p U-box lacks a corresponding C-terminal α-helix. Chemical shift analysis suggested that aromatic residues exposed at the N terminus and the C-terminal α

The ubiquitin proteolytic pathway plays an important role in regulated protein degradation (1). Proteins designated for degradation are covalently modified by attachment of a ubiquitin polymer and degraded by the 26 S proteasome. A ubiquitin-activating enzyme (E1) catalyzes ATP-dependent formation of a thioester bond between ubiquitin and itself and transfers the activated ubiquitin to a ubiquitin-conjugating enzyme (E2). Formation of an isopeptide bond between ubiquitin and a substrate is facilitated by a ubiquitin-protein ligase (E3) that can bind both the E2-ubiquitin complex and the substrate. Members of the HECT and RING protein families are the best characterized E3 ligases, but recently U-box proteins have also been shown to function as E3s (2,3). This may be the general function of U-box proteins, although they were initially suggested to function as ubiquitin chain assembly factors (E4s) (4). Recently, research on U-box proteins, especially the carboxyl terminus of Hsc70-interacting proteins (CHIP), has focused on their ability to interact with molecular chaperones and selectively ubiquitinate unfolded proteins. Thus, ubiquitination can also function in protein quality control (5).
The U-box motif is a peptide chain that contains ~70 amino acid residues, with characteristics suggesting that it is a structural variant of the RING fold but lacks the signature zinc-binding amino acids of the RING domain (6). Recently, the first three-dimensional structure of a U-box domain was published (7). The structure of the yeast pre-mRNA splicing factor Prp19p U-box was determined by NMR spectroscopy and verified the structural similarity to the RING domain, with an extensive hydrogen-bonding network replacing the zinc binding sites of the RING domains.
Only two U-box proteins are present in yeast, and typically six are present in animals (5). By contrast, Arabidopsis contains many U-box proteins (8). The largest class of Arabidopsis U-box proteins contains an ARMADILLO (ARM) repeat region linked to the U-box. ARM repeats are short leucine-rich protein-interacting motifs (9), first identified in the segment polarity protein, armadillo, from Drosophila melanogaster (10). Only a few plant U-box proteins have been characterized biochemically, and so far three-dimensional structures have not been available. The Brassica U-box protein ARC1 binds to the S-locus receptor kinase, the female determinant of pollen self-incompatibility (11), via its ARM repeats (12). It was suggested that ARC1 promotes ubiquitination and degradation of compatibility factors in the pistil leading to pollen rejection (13). Arabidopsis also contains a CHIP orthologue that functions as an E3 in vitro (14). AtCHIP is up-regulated by certain stress conditions, and overexpression of AtCHIP rendered Arabidopsis more sensitive to both low and high temperatures, suggesting a link between protein ubiquitination and stress responses in plants.
Ubiquitin-dependent protein degradation has been shown to play important roles in plant growth and development (15). The abundance of U-box proteins, their ability to interact functionally with E2 enzymes to ubiquitinate protein substrates (16), and their expected important physiological roles in plants make structure determination of an Arabidopsis U-box domain of interest. Here we report the NMR solution structure of a prototypical Arabidopsis U-box domain from the AtPUB14 E3 protein (8).

EXPERIMENTAL PROCEDURES
Identification and Analysis of Predicted Arabidopsis U-box Proteins-Arabidopsis U-box proteins were identified by BLAST homology searches (17) and InterPro (18) and SMART (19) database searches. Redundant sequences were identified by ClustalW sequence alignments (20). Non-redundant U-box proteins were analyzed for domain architecture and intrinsic sequence features by SMART and BLAST domain analysis.
For production and purification of recombinant protein for structure determination, AtPUB14-(249-321)-GST was transformed into Escherichia coli BL21(DE3), grown in minimal medium (M9 minimal medium supplemented with 100 mg/ml ampicillin, 3 g/liter (15NH4)2SO4 and 4 g/liter [13C6]glucose or [12C6]glucose) at 37°C for 16-18 h and then diluted 1:100 in labeled minimal medium. The cells were incubated at 37°C until an A600 of 0.6-0.7 was reached. After induction with 0.2 mM isopropyl-β-D-thiogalactopyranoside, the culture was incubated at 15°C for 16-18 h. Cells were harvested by centrifugation for 15 min at 7,000 × g and lysed in 20 mM sodium phosphate, pH 7.5, 0.15 M NaCl, 0.1% Triton X-100, 1 mM dithiothreitol (DTT), 1 mM phenylmethylsulfonyl fluoride, and 0.1 mg/ml lysozyme (Miles-Seravac) by sonication. Insoluble cell material was removed by centrifugation for 15 min at 12,000 × g, and the supernatant was passed through a 0.45-μm HA filter (Millipore). The fusion protein was allowed to bind to a glutathione-Sepharose column, and the column was washed with phosphate buffer (20 mM sodium phosphate, pH 7.5, 0.15 M NaCl, 1 mM DTT) and subsequently with Tris buffer (50 mM Tris-HCl, pH 8.0, 0.15 M NaCl, 1 mM DTT). The column was incubated with Tris buffer supplemented with 10 mM ATP and 5 mM MgSO4 for 30 min at 4°C and then washed extensively with Tris buffer to elute the E. coli chaperone DnaK.
Bovine thrombin (Amersham Biosciences), 25 units per milligram of fusion protein, was dissolved in Tris buffer supplemented with 2.5 mM CaCl2, and the fusion protein was cleaved on-column for 2 h at room temperature. Cleaved AtPUB14-(249-321) protein was eluted, pooled, and separated from thrombin by benzamidine affinity chromatography (Amersham Biosciences). Eluted protein was concentrated, and the buffer was changed to phosphate buffer (20 mM sodium phosphate, pH 7.5, 0.15 M NaCl, 1 mM DTT) using Amicon Centriprep (Millipore). Matrix-assisted laser desorption ionization time-of-flight mass spectrometry and SDS-PAGE were used to examine the purity of the protein. The final product contained five additional N-terminal residues, GSPEF. However, the protein used for the NMR analysis will be referred to only as PUB14-(249-321).
Reverse Transcription-PCR Analysis of Expression-Arabidopsis plants were grown in a greenhouse at 21°C with 18-h light and 6-h darkness. Tissues were harvested and frozen in liquid nitrogen. Total RNA was isolated by LiCl precipitation, phenol/chloroform extraction, and ethanol precipitation followed by DNase treatment in a reaction coupled to RT using oligo(dT) primers and murine leukemia virus reverse transcriptase. 1 μg of mRNA was used, and 5% of the cDNA products were used for PCR amplification. Pyrococcus furiosus DNA polymerase was used for PCR with the following program: 30 min at 95°C, then 30× (45 s at 94°C, then 45 s at 45°C, then 3 min at 72°C), then 15 min at 72°C. The primer sets were 5′-GTTTCACATCAACATTGTGGTCATTGG and 5′-GAGTACTTGGGGTAGTGGCATCC for EF1α (elongation factor, constitutively expressed control) and 5′-CCTCGGACTCAGGAACAT and 5′-TCATGGAACAGTAGTTAC for AtPUB14.
In Vitro Ubiquitination Assay-The assays were performed essentially as described previously (21). Reaction mixtures (30 μl) contained 0.5 μg of recombinant protein, 20 ng of rabbit E1 and E2 (UbcH5b or UbcH13, Boston Biochemicals), and 10 μg of His6-ubiquitin (Sigma) in a buffer containing 50 mM Tris at pH 7.4, 2 mM ATP, 5 mM MgCl2, and 2 mM DTT. When indicated, the mixture also contained 1 μl of E. coli extract. The reactions were incubated for 2 h at 30°C and analyzed by Western blot.
Inter-residue NOEs were automatically assigned by CYANA (26) and checked by computer-assisted spectrum inspection. Structures of PUB14-(249-321) were calculated using standard XPLOR simulated annealing protocols (27) and iteratively analyzed for violations of experimental constraints. Finally, 200 structures were calculated, of which none had NOE violations of >0.5 Å or angle violations of >5°. From these, the 30 lowest energy structures were chosen for water refinement using CNS (28). After the water refinement, none of the 30 structures had NOE violations of >0.4 Å or angle violations of >5°.
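The filtering and ranking step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the XPLOR or CNS protocol itself: the `Model` record, its field names, and the `select_models` helper are hypothetical, and the default cutoffs are simply the values quoted in the text (0.5 Å and 5°).

```python
from dataclasses import dataclass


@dataclass
class Model:
    """One calculated structure (hypothetical record, for illustration only)."""
    name: str
    energy: float          # total energy, arbitrary units
    max_noe_viol: float    # largest NOE distance violation, in Å
    max_angle_viol: float  # largest dihedral angle violation, in degrees


def select_models(models, n_keep, noe_cutoff=0.5, angle_cutoff=5.0):
    """Drop models exceeding either cutoff, then keep the n_keep lowest in energy."""
    accepted = [
        m for m in models
        if m.max_noe_viol <= noe_cutoff and m.max_angle_viol <= angle_cutoff
    ]
    return sorted(accepted, key=lambda m: m.energy)[:n_keep]
```

With 200 calculated structures, `select_models(models, 30)` would correspond to the set passed on to water refinement, assuming energy is the only ranking criterion at that stage.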
From the sample of 30 water-refined structures an ensemble of 20 structures comprising those with the lowest number of violations and with the lowest energy was chosen to represent the PUB14-(249-321) U-box structure. These structures were analyzed and checked using Procheck-NMR (29), XPLOR (27), WHAT-IF (30), and NACCESS (31). For a comparison of related structures a set of six RING structures and one U-box structure was selected using CE (32) and MODELLER 6.2 (33). The selection criterion was an alignment of more than 80% of their Cα atoms within 2.5 Å of the structure of PUB14-(249-321). For the structures in the sets determined by NMR spectroscopy, only model 1 was used for comparison. The structures were displayed using the molecular graphics computer programs Web Lab Viewer and MOLMOL (34).
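The selection criterion (more than 80% of Cα atoms within 2.5 Å) can be illustrated with a short sketch. It assumes the two structures have already been superimposed and their residues paired by an alignment program such as CE; the function names and the choice of the first coordinate list as the denominator are assumptions made for illustration, not details taken from the text.

```python
import math


def fraction_within(ref_ca, other_ca, cutoff=2.5):
    """Fraction of Cα atoms in ref_ca whose aligned partner lies within cutoff Å.

    ref_ca and other_ca are equal-length lists of (x, y, z) tuples in the same
    frame; a partner of None marks a residue with no aligned counterpart.
    """
    if not ref_ca:
        raise ValueError("reference coordinate list is empty")
    close = sum(
        1
        for a, b in zip(ref_ca, other_ca)
        if b is not None and math.dist(a, b) <= cutoff
    )
    return close / len(ref_ca)


def passes_criterion(ref_ca, other_ca, min_fraction=0.8, cutoff=2.5):
    """True when strictly more than min_fraction of the Cα atoms are within cutoff."""
    return fraction_within(ref_ca, other_ca, cutoff) > min_fraction
```

The strict inequality mirrors the wording "more than 80%"; a structure with exactly 80% of its Cα atoms inside the cutoff would be rejected by this sketch.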
HYDROPRO Prediction of the Stokes Radius-HYDROPRO was used to compute the Stokes radius of the calculated PUB14-(249-321) U-box structure (36). The three lowest energy structures of the ensemble of 20 structures representing the PUB14-(249-321) U-box were used for the calculations, and an average of the three resulting Stokes radii was calculated. Standard parameters were used, and molecular mass and temperature were set to 8754 Da and 298 K, respectively.
Coordinates-The coordinates for each of the 20 structures in the selected ensemble of PUB14-(249-321) U-box structures together with all NMR assignments have been deposited at the Protein Data Bank under the accession code 1T1H.

RESULTS
Arabidopsis U-box Proteins-The Arabidopsis thaliana genome encodes many U-box proteins (8). To select a representative U-box protein for structural and functional studies we analyzed the Arabidopsis genome for predicted U-box proteins using BLAST (17) homology searches and keyword searches of the SMART (19) and InterPro (18) databases. A large fraction of these proteins contains one or more ARM repeat regions (9). Many of the proteins identified by the search also contain an ~200-amino acid residue N-terminal domain recently named UND (U-box N-terminal domain (16)).
FIG. 1. a, Arabidopsis U-box proteins were identified by SMART (19) and InterPro (18) domain database searches and analyzed for sequence motifs and domains using BLAST and SMART. For each group of U-box proteins the presence in the sequence of the U-box and other types of domains is shown. Several U-box ARM repeat proteins were also predicted to contain coils; these are not shown. The number of proteins in each group is indicated to the right of each structural outline. Outlines are not drawn to scale, and the number of domains or motifs may vary in the different proteins. The domains and motifs shown were all recognized by SMART and InterPro except UND, defined by Mudgil et al. (16). b, sequence alignment of U-box domains. The sequences of five prototypical Arabidopsis U-box domains (at top) are aligned with the sequences of four additional Arabidopsis U-box domains, the yeast Prp19p U-box domain, the human BRCA1, RAG1, c-Cbl, Rbx1, and MAT1, and the rice EL5 RING finger domains. The Arabidopsis proteins were named as suggested previously (8,14), and in addition the genomic loci are shown. Residues forming α-helix and β-strands in the AtPUB14 and ScPrp19p U-box domains are shown as α and β, respectively, and residues of importance for the interaction between the E3 ubiquitin ligase c-Cbl and the E2 enzyme UbcH7 are marked by a star (37). The symbols "∧∧∧" indicate that 22 amino acids were removed from Rbx1 to fit the alignment. c, comparison of the hydrogen bond network in AtPUB14 and ScPrp19p U-box sequences. Residues marked red on black correspond to the zinc ion binding site 1 in the RING motif. Residues marked white on black correspond to the zinc ion binding site 2 in the RING motif. The stars above and below the sequences mark positions that are involved in hydrogen bond formation in the respective U-boxes (this work and Ref. 7) using the same code (red on black, site 1; white on black, site 2) as the marking in the sequences. Residues marked black on gray are the hydrophobic E2 binding site residues.
AtPUB14, a Model Arabidopsis U-box Protein-The sequences of the Arabidopsis U-box domains are highly conserved (Fig. 1b) (8), and the sequence similarity extends beyond the C-terminal border of the Prp19p U-box fragment of known tertiary structure (7). This makes structure determination of a typical Arabidopsis U-box domain important. The 632-amino acid residue Arabidopsis AtPUB14 protein (8) was selected for structure studies, because it contains a typical Arabidopsis U-box domain (Fig. 1b) and has a domain architecture consisting of a UND domain (16) followed by a U-box domain and several ARM repeats (Fig. 1a). Furthermore, based on the number of available expressed sequence tag clones the AtPUB14 gene is relatively well expressed compared with other U-box genes. This was confirmed by RT-PCR analysis of different Arabidopsis tissues, which suggested that the AtPUB14 gene is expressed in several tissues, including flowers, green siliques, seeds, and rosettes (Fig. 2).
The structure of the complex between the RING domain from c-Cbl and the E2 UbcH7 (37) serves as a model for interactions between RING domain E3 and E2 enzymes. According to this, Ile 383 and Trp 408 of c-Cbl form a hydrophobic groove that interacts with a phenylalanine in UbcH7, and Trp 408 is involved in determining the specificity of the E2/E3 interactions (37). Identical amino acids are present in corresponding positions in the AtPUB14 U-box domain and in many of the Arabidopsis U-box domains (Fig. 1b) (8). The Arabidopsis U-box domains also contain a proline in a position that is part of the hydrophobic E2 interaction surface of c-Cbl. These similarities suggest that AtPUB14 is an E3 ubiquitin ligase.
FIG. 3. a, E3 activity of AtPUB14. GST-PUB14-(249-632) was incubated for 2 h at 30°C in the presence of bacterial lysate, rabbit E1, human E2 as indicated, ATP, and ubiquitin and subjected to immunoblotting using an anti-ubiquitin antibody. b, auto-ubiquitination activity of AtPUB14. GST-PUB14-(249-632) was incubated for 2 h at 30°C in the presence of rabbit E1, human UbcH5a, ATP, and ubiquitin and subjected to immunoblotting using an anti-GST antibody. As negative control one of the components was omitted from the reaction mix. The arrow shows the position corresponding to purified GST-PUB14-(249-632).
AtPUB14 fragments fused to GST (GST-PUB14-(249-632)) or MBP (MBP-PUB14-(219-632)) were soluble and contained putative substrate-binding ARM repeats. For the assays GST-PUB14-(249-632) was incubated with recombinant ubiquitin, rabbit E1, and human recombinant E2 (UbcH5b or UbcH13) in the presence of E. coli proteins. UbcH5b was used, because it interacts functionally with several U-box and RING proteins (3,21). The reactions were probed by Western blotting using anti-ubiquitin or anti-GST antibodies for the detection. When the reaction mixture contained E. coli proteins and was analyzed using anti-ubiquitin antibodies, GST-PUB14-(249-632) mediated ubiquitination of protein substrates in the reaction mixture (Fig. 3a). This modification depended on ubiquitin, ATP, an E1, the type of E2, and GST-PUB14-(249-632). UbcH5b, but not UbcH13, functioned with GST-PUB14-(249-632), showing that AtPUB14 requires a specific E2 for activity. In addition, the results showed that the UND domain is not required for in vitro ubiquitination activity of the Arabidopsis UND/U-box/ARM proteins, as suggested previously (16).
Auto-ubiquitination-E3s often undergo auto-ubiquitination, and auto-ubiquitination of U-box proteins has also been reported (2). Therefore, the ability of GST-PUB14-(249-632) to undergo auto-ubiquitination was examined. When detection was performed using an antibody against GST, a substantial proportion of the 68-kDa GST-PUB14-(249-632) protein exhibited a shift in electrophoretic mobility toward the top of the gel indicative of polyubiquitination. This reconfirmed the ability of the U-box domain to function as an E3 ubiquitin ligase.
Structure Determination of the PUB14-(249-321) U-box Domain by NMR Spectroscopy-The ability of AtPUB14 to function as an E3 ubiquitin ligase has made structure determination of its U-box domain important for an in-depth understanding of how this function is executed. Analysis of the NMR spectra of PUB14-(249-321) U-box resulted in the assignment of the NMR signals from more than 95% of the NMR active nuclei in the peptide backbone and the amino acid side chains. A total of 1083 non-redundant NOEs was assigned and applied in the structure calculations together with 88 backbone dihedral angle constraints derived from secondary chemical shift database mining using TALOS (Table I). [...] β-sheet of three strands formed by two strands in a β-hairpin (Pro 264 -Tyr 273 ) and an additional β-strand (Thr 302 -Asn 304 ), and of two long loops, loop1 (Tyr 251 -Asp 263 ) and loop2 (His 286 -Leu 301 ) (Figs. 1c and 4b). The structure is globular with helix1 packed against the antiparallel β-sheet and helix2 packed against the N-terminal loop. The structure is stabilized by a number of long range interactions (Fig. 4b). [...] (Fig. 4c). The network apparently consists of at least three hydrogen bond donors, two from side chains and one from the peptide backbone. A single carboxylate group (Glu 259 ) provides a pair of oxygen acceptors, one of which may form a bifurcated hydrogen bond to two of the hydrogen donors.
The residues in AtPUB14 in the positions corresponding to the second zinc ion binding site in the RING motif are Thr 269 , Thr 272 , Cys 289 , and Ser 292 . A network of hydrogen bonds and salt bridges in the AtPUB14 U-box domain involves two of these residues, Cys 289 and Ser 292 , which together with Lys 291 and Glu 294 seem to replace the interactions in the second zinc ion binding site in the RING motif. This network is not defined very well in the ensemble of NMR structures. However, slow hydrogen exchange of the Hγ1 in both Thr 269 and Thr 272 suggests that these form hydrogen bonds that stabilize the network. The analysis of the hydrogen bonds in the structures (Fig. 4d) indicates the presence of a salt bridge between the carboxylate of Glu 294 and the amino group of Lys 291 , which may stabilize the loop2 conformation, and a hydrogen bond between the side chain carbonyl of Gln 271 and the hydrogen of the sulfhydryl group of Cys 289 , which is the only interaction seen that connects the C-terminal strand of the hairpin and loop2 as in the zinc ion site in the RING motif. The ScPrp19p U-box and the AtPUB14 U-box differ in a number of the residue positions that are involved in the hydrogen bond network and that replace the corresponding zinc ion ligation sites in the RING motifs (Fig. 1c). It is noteworthy, however, that the residues surrounding the conserved E2 binding site residues in the two loops are engaged in the hydrogen bond network in the two U-box structures.
Structural Comparison of RING and U-box Folds-The typical mixed α+β fold of the RING finger, found both in Prp19p and in AtPUB14 U-box, is seen in a number of protein structures in the Protein Data Bank. The scaffold of these structures is remarkably similar to ten RING structures, all of which are classified as belonging to the RING finger domain, C3HC4 family, of the SCOP database (38). An alignment of these 12 three-dimensional protein structures using combinatorial extension (32) and MODELLER 6.2 (33) revealed a total of seven proteins that align more than 80% of their Cα atoms within 2.5 Å of the lowest energy structure of the PUB14-(249-321) U-box domain (Table II). A superimposition of these eight structures is shown in Fig. 5 and includes the RING and U-box domains of c-Cbl, Rbx1, BRCA1, RAG1, EL5, Prp19p, MAT1, and AtPUB14. All of these proteins, except MAT1, have been shown to function as ubiquitin protein ligases (7, 39-43). The similarity of the orientation of the secondary structures consisting of one α-helix (helix1) and three anti-parallel β-strands is obvious, but also the orientation and packing of the two loops against the α-helix is equivalent.
Interactions with E2 Enzymes-In the structure of the complex between c-Cbl and UbcH7, three residues, Ile 383 , Trp 408 , and Pro 417 , have been proposed to form a hydrophobic binding groove for the E2 interaction (37). In six of the seven structures aligning with AtPUB14, the proline and a large aliphatic residue are conserved, and they all contain a hydrophobic groove (Table II and Fig. 6). Substitution of Ile 26 in BRCA1 with an alanine abolished the ubiquitin ligase activity of the BRCA1-BRCA1-associated RING domain protein complex (44), demonstrating the importance of this aliphatic position. MAT1 contains an arginine in this position, which, besides being engaged in the hydrophobic groove, protrudes its guanidino group right in the middle of the putative E2 binding site. This could explain why no E3 activity has been demonstrated for MAT1. The U-box structure collapses upon mutation of the proline to alanine (7). This suggests that, besides its role in the E2 binding site, this proline is also mandatory for stabilizing the loop and the structural scaffold. In AtPUB14, the van der Waals interactions of Pro 290 in loop2 are mainly with residues Tyr 273 in the β-hairpin, Trp 281 in helix1, and Cys 289 in loop2, but also with Ile 256 in loop1. The three residues in the putative E2 binding site in the AtPUB14 U-box are Ile 256 in loop1, Trp 281 in helix1, and Pro 290 in loop2. To establish the appropriate geometry for a binding site of this type the three elements that provide the residues have to be fixed in a well defined way relative to each other. In the RING motif this is in part ensured by the two zinc ion binding sites, both of which link the loop regions to a well defined secondary structure element. Furthermore, the two binding site residues are placed right between zinc ion binding site ligand residues in the two loops (Fig. 1c). 
In the AtPUB14 U-box the two hydrophobic cores and the long range hydrogen bonds and salt bridges may serve this purpose together with the network of local and long range hydrogen bonds that is observed in the structures. In comparison to the Prp19p U-box the loop1 position in the AtPUB14 U-box is further stabilized by the interactions to the C-terminal helix, which is not observed in the Prp19p U-box structure.
In addition to the hydrophobic binding surface in the E2 binding site it has been proposed that the exposed side of the common α-helix (helix1) in the RING and U-box domains is part of the binding site (Fig. 6) (7,37,43,44). Several of the residues in this site are polar and charged; however, there is no clear sequence homology in this region (Fig. 1b). In c-Cbl Cys 404 , Ser 407 , and Ser 411 are part of the polar E2 interface, whereas in AtPUB14 the corresponding residues are Ser 277 , Lys 280 , and Ala 284 . In BRCA1 NMR studies have shown that several polar residues are in the proximity of the binding site for UbcH5c (44), and similar indications were seen for E2 binding by EL5 (43). However, a comparison, which involves eight highly similar U-box and RING structures, reveals that there is no common surface charge pattern in this part of the structure (Fig. 6). Hence, there is no common polar E2 binding site motif in these eight structures or in the Arabidopsis U-box proteins (Fig. 1b).
Dimer Formation of the AtPUB14 U-box Domain-The C-terminal α-helix (helix2) in the AtPUB14 U-box domain is not present in the Prp19p U-box domain. However, a corresponding C-terminal α-helix is also found in the RING domain of RAG1, where it is involved in dimerization (45), and in the RING domain of BRCA1, where it forms intramolecular interactions with an N-terminal helix (Fig. 6) (46). The AtPUB14 U-box domain presents solvent-accessible non-polar residues from the N terminus (Tyr 251 and Phe 252 ) and from the C-terminal α-helix (Leu 313 and Trp 314 ). In RAG1, similar residues (Phe 280 , Phe 284 , Phe 344 , and Ile 347 ) form a homodimerization interface (45). This similarity may suggest that the AtPUB14 protein can form a dimer. Therefore, the molecular weight of the AtPUB14 U-box was determined by two different methods, gel filtration (data not shown) and pulsed field gradient NMR techniques. 
Both techniques measured a Stokes radius of 21 ± 1 Å for the AtPUB14 U-box domain. This is different from the predicted Stokes radius of 17.6 Å of the monomeric AtPUB14 U-box structure determined by NMR spectroscopy and suggests that the AtPUB14 U-box might form a dimer in solution. An attempt was made to dissociate the dimer by dilution and to monitor this by chemical shift changes in the NMR spectrum. At a low concentration, 0.032 mM, of AtPUB14 U-box a few very small but significant chemical shift changes were observed as well as specific line broadening effects (Fig. 7). It is of interest to note that, although these effects are very small, they coincide with the position of the dimerization site in RAG1, and they involve non-polar residues in the N-terminal region and in the C-terminal α-helix.
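As a rough plausibility check on the comparison above, the radius expected for a compact dimer can be estimated from the predicted monomer radius. The sketch below assumes that the hydrodynamic radius of a compact particle scales with the cube root of its mass; this scaling rule and the function names are assumptions introduced here for illustration and are not part of the analysis reported in the text. A real dimer is not a sphere, so the estimate is only indicative.

```python
def compact_oligomer_radius(r_monomer, n_subunits):
    """Radius of a sphere holding n_subunits times the monomer volume (R ~ M**(1/3))."""
    return r_monomer * n_subunits ** (1.0 / 3.0)


def closest_state(r_measured, r_monomer, max_subunits=4):
    """Number of subunits whose compact-sphere estimate lies nearest r_measured."""
    estimates = {
        n: compact_oligomer_radius(r_monomer, n)
        for n in range(1, max_subunits + 1)
    }
    return min(estimates, key=lambda n: abs(estimates[n] - r_measured))


# Predicted monomer radius (17.6 Å) and measured radius (21 Å) from the text.
print(round(compact_oligomer_radius(17.6, 2), 1))  # 22.2
print(closest_state(21.0, 17.6))                   # 2
```

Under this crude assumption the measured value of 21 ± 1 Å lies closer to the dimer estimate (about 22 Å) than to the monomer prediction of 17.6 Å.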
Although there is a strong indication from several different methods that the AtPUB14 U-box forms a dimer in solution, the present study offers no result to describe the structure of the interface between two monomer structures. The 15N-1H HSQC NMR spectrum has only one peak per residue, and there are no signs of signal doubling for any atoms in the spectra. This suggests that if the molecule forms a dimer it is highly symmetrical. The set of NOEs that has been used for the structure calculations defines the monomer structure well. Attempts to model a dimer structure, using otherwise assigned intramolecular NOEs from the suggested dimerization region as intermolecular NOEs between two molecules, never resulted in structures that complied with the new set of restraints. Because there are no additional unassigned NOEs in the spectra, this suggests strongly that the set of NOEs used in the structure calculations contains all intramolecular NOEs, and apparently there are no intermolecular NOEs in the set that can be used to describe the dimer interface. The similarity of the monomer structure to seven other RING and U-box structures (Fig. 5 and Table II) also supports the view that the structure determined for the monomeric form of the AtPUB14 U-box is representative, even as part of the suggested dimer.

DISCUSSION
By far the largest subgroup of Arabidopsis U-box proteins contains one or several ARM repeats C-terminal to a highly conserved U-box domain (Fig. 1a) (47). Important physiological functions have been assigned to these proteins (13,48), and the origin of the different functions may be due to differences in the binding specificities of the ARM repeats (47). There is increasing evidence that these proteins function as ubiquitin ligases (16) and thus participate in regulated protein degradation. The structure of the AtPUB14 U-box domain reveals similarities, as well as differences, to other RING/U-box domains and based on predictions allows considerations of the activity of the many plant U-box proteins and their specificity toward E2s.
UbcH5b, but not UbcH13, functions with AtPUB14 to ubiquitinate protein substrates. UbcH5b contains a phenylalanine in a position that is likely to interact with Trp 281 of the AtPUB14 U-box domain (37). Most of the Arabidopsis U-box domains contain a tryptophan in this position, but in some domains the position is occupied either by a histidine, a tyrosine, or a cysteine (Fig. 1b) (8). The same residues, and leucine, are also found in other eukaryotic U-box domains (Fig. 1b). This position was initially proposed to be a specificity determinant for the interaction with E2 enzymes based on the c-Cbl-UbcH7 complex structure (37). This has since been supported by independent studies. For example, Prp19p contains a tyrosine and functions specifically with Ubc3 (7), and U-box domains with a histidine functionally interact with Ubc4 and Ubc5 (2, 3). Whereas the identity of this position is essential for the activity of c-Cbl (39), containing a tryptophan, and Prp19p, containing a tyrosine (7), the leucine in BRCA1 is dispensable for the ubiquitination activity of the BRCA1-BRCA1-associated RING domain protein ubiquitin-ligase complex (44). This could be explained by the differential importance of this position depending on the chemical nature of the residue. Whereas leucine is one of several hydrophobic contributors to the hydrophobic E2-binding groove, the effect of the structurally more characteristic tryptophan and tyrosine could be more specific. However, it was recently demonstrated that five different Arabidopsis U-box proteins, all containing a tryptophan, were active with different E2s having a preference that did not correlate with their phylogenetic relationship (16). Thus, additional residues and regions must be of importance to specificity. The structure of AtPUB14 and comparisons suggest that the distribution of exposed charged and polar residues of the central α-helix of AtPUB14 may act as specificity determinants. 
In this structure element, highly conserved positions are mixed with positions showing sequence variance required for specificity (Fig. 1b). This structural information now allows rational examination of the specificity of the many Arabidopsis U-box domains and E2 enzymes.
Based on sequence alignments, the consensus sequence of the Arabidopsis U-box domains was expanded at the C terminus compared with U-box domains from other organisms. This region was predicted (49) to form an α-helix, which was confirmed by the determined structure of AtPUB14. In the RAG1 RING domain a similar C-terminal and an N-terminal α-helix expose hydrophobic residues that form a dimerization surface (45). An N-terminal α-helix was not predicted for the Arabidopsis proteins, many of which contain several α-helix-breaking prolines in this region (Fig. 1b). The biochemically active AtPUB14 U-box fragment does not contain an N-terminal α-helix. This supports its absence from the Arabidopsis U-box protein motif. However, hydrophobic amino acids are exposed from both the N-terminal loop region and the C-terminal α-helix (helix2) of the AtPUB14 U-box domain and can explain the ability of the domain to form a dimer, in accordance with the chemical shift analysis (Fig. 7).
The C-terminal α-helix is also likely to play an important role in stabilization of the U-box domain. The interactions between residues in loop1 and helix2 contribute not only to the stabilization of the molecule but also to the formation of an important part of the hydrophobic binding site. The Arabidopsis U-box domains are thus additionally stabilized compared with what was reported for the Prp19p U-box domain (7). The AtPUB14 structure can be used to facilitate the understanding of the E2 recognition by the widespread and abundant plant U-box proteins. Moreover, the AtPUB14 U-box structure with its additional C-terminal α-helix also provides a framework to study the stability and folding of this small and commonly found protein fold.