The Structural Basis of DNA Target Discrimination by Papillomavirus E2 Proteins*

The papillomavirus E2 proteins regulate the transcription of all papillomavirus genes and are necessary for viral DNA replication. Disruption of the E2 gene is commonly associated with malignancy in cervical carcinoma, indicating that E2 has a role in regulating tumor progression. Although the E2 proteins from all characterized papillomaviruses bind specifically to the same 12-base pair DNA sequence, the cancer-associated human papillomavirus E2 proteins display a unique ability to detect DNA flexibility and intrinsic curvature. To understand the structural basis for this phenomenon, we have determined the crystal structures of the human papillomavirus-18 E2 DNA-binding domain and its complexes with high and low affinity binding sites. The E2 protein is a dimeric β-barrel and the E2-DNA interaction is accompanied by a large deformation of the DNA as it conforms to the E2 surface. DNA conformation and E2-DNA contacts are similar in both high and low affinity complexes. The differences in affinity correlate with the flexibility of the DNA sequence. Preferences of E2 proteins from different papillomavirus strains for flexible or prebent DNA targets correlate with the distribution of positive charge on their DNA interaction surfaces, suggesting a role for electrostatic forces in the recognition of DNA deformability.

Networks of hydrogen bonds between amino acid side chains and the functional groups of DNA bases are a well documented mechanism of DNA sequence recognition by proteins. Not as well understood is the stereochemical and energetic basis by which protein-DNA binding affinity is modulated when all of these direct components of interaction are conserved. Biologically, such discriminatory abilities are essential when DNA-binding proteins have to select between multiple binding sites present among a vast excess of nonspecific DNA. This is the case in the papillomaviruses, where the primary transcriptional regulatory protein, E2, is confronted by numerous binding sites on the viral genome (Fig. 1a). These sites must be occupied in a defined order such that transcription and viral DNA replication proceed in a regulated fashion (1-7).
The papillomaviruses are a large family of double-stranded DNA viruses that vary in host species specificity, tissue tropism, and the clinical outcome of infection (8, 9). The E2 protein has the same functions in the over 100 characterized papillomavirus strains: to activate or repress transcription in a context-dependent fashion and to facilitate the initiation of viral DNA replication via interactions with the viral replication protein E1. Differences in the details of transcription and replication control have been documented between the viral strains that infect different host tissues and between the papillomavirus strains that are associated with different conditions in humans such as warts or cervical cancer (2-7). The E2 proteins from all viral strains have in common the fact that they bind a palindromic DNA sequence, ACCgNNNNcGGT, referred to as the E2 binding site (E2BS; lowercase letters indicate preferred nucleotides, and the NNNN region is called the "spacer"). However, there exist virus strain-specific differences in the abilities of various E2 proteins to discriminate between binding sites. The E2 proteins from the human papillomavirus (HPV) strains that infect mucosa (including the cancer-associated strains HPV-18 and HPV-16 and the wart-causing strain HPV-11) bind with significantly greater affinity to E2BS with spacers rich in AT base pairs (1, 10-12). On the other hand, the E2 protein from bovine papillomavirus type 1 (BPV-1) displays no distinctive spacer sequence preference (12). The viral genomes reflect these trends; the mucosal HPV genomes have E2 binding sites with AT-rich spacers (Fig. 1a), while the genomes of the nonprimate animal viruses (including BPV-1) have no such predominance of AT-rich spacer-containing binding sites (13). The crystal structures of the E2 DNA-binding domain from strains BPV-1 (14), HPV-16 (15), and HPV-31 (16) have been reported, as has the co-crystal structure of BPV-1 E2 bound to DNA (17).
However, there is no structural information yet on a DNA complex of any HPV E2 protein. Here we report the crystal structures of the E2 DNA-binding domain (E2/D) from human papillomavirus type 18 and its complexes with high and low affinity binding sites.
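The consensus described above can be illustrated with a short Python sketch that scans a sequence for the minimal identity elements (ACC, six unconstrained positions, GGT), extracts the four-nucleotide spacer, and counts its A/T content. This is only an illustration under stated assumptions: the function name `find_e2bs` and the test sequence are hypothetical, the preferred g/c positions are treated as unconstrained, and nothing here reproduces a procedure used in this work.

```python
import re

# Minimal E2 binding site: ACC, six unconstrained positions, GGT (12 bp).
# The lookahead allows overlapping candidate sites to be reported.
E2BS_PATTERN = re.compile(r"(?=(ACC[ACGT]{6}GGT))")


def find_e2bs(sequence):
    """Return (start, site, spacer, at_count) for each minimal E2BS match.

    Hypothetical helper for illustration only.
    """
    hits = []
    for match in E2BS_PATTERN.finditer(sequence.upper()):
        site = match.group(1)
        spacer = site[4:8]  # central NNNN of ACCgNNNNcGGT
        at_count = sum(base in "AT" for base in spacer)
        hits.append((match.start(), site, spacer, at_count))
    return hits


# Hypothetical test sequence containing one site with an AATT spacer
# and one with an ACGT spacer, separated by arbitrary flanking bases.
example = "TTACCGAATTCGGTAAGGACCGACGTCGGTCC"

for start, site, spacer, at_count in find_e2bs(example):
    print(start, site, spacer, at_count)
```

Ranking candidate sites by the A/T count of the spacer is a crude proxy only; as discussed later in the text, spacers of equal A/T content (for example AATT and TTAA) can differ markedly in affinity.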
The HPV-18 E2-DNA interaction uses an induced-fit mechanism, with a large deformation of DNA and only modest local rearrangements of the protein. Comparisons of the various E2/D structures reveal an unexpected similarity between the quaternary structures of the HPV-18 and BPV-1 E2/D proteins as compared with those of the more closely related (in sequence, evolutionary distance of viral strain, and pathology) HPV-16 and HPV-31 E2/D proteins. HPV-18 E2/D binds with very different affinities to the two E2BS sequences used in the crystallographic studies, yet no base sequence-specifying protein-DNA contacts differ in the two complexes. Like the related HPV-16 E2/D protein, but in contrast to BPV-1 E2/D, HPV-18 E2/D recognizes and discriminates against conformational flexibility in DNA. There exists a correlation between the distribution of positive charge on the DNA interaction surfaces of these proteins and their preferences for prebent or flexible DNA targets.

* This work was supported by Grant CA66964 from the National Institutes of Health. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

FIG. 1. a, structure of the HPV-18 genome. The 7857-base pair circular genome consists of two regions coding for early (E) and late (L) genes. An 800-base pair noncoding region is present upstream of the E6 gene and is called the long control region (LCR). The major early promoter (P105) and a differentiation-specific promoter are indicated by arrows. The long control region contains both transcriptional regulatory elements and the origin of replication. A region of the long control region (of HPV strains -18, -16, -11, and -31) that contains the four E2 binding sites (E2BS1-4) is detailed below. In each case, the spacer sequence is shown in blue. The inset is the consensus DNA-binding site of the E2 proteins and the sequences of the two binding sites referred to as E2BS(AATT) and E2BS(ACGT). The identity elements are in red, and the preferred nucleotides are in green. b, amino acid sequence alignments of the HPV-18, HPV-16, and BPV-1 E2 DNA-binding domains. Identical residues are in red, and similar residues are in green. The residue numbering for HPV-18 E2 and BPV-1 E2 is shown, as is the secondary structure of HPV-18 E2/D. The arrows represent β-strands, and zigzag lines indicate helices. c, ribbon diagram of the HPV-18 E2 DNA-binding domain. The two subunits (gold and lavender) associate to form an eight-stranded antiparallel β-barrel. Disordered regions are represented by dotted lines. The topology is shown beside them. The β-barrel is formed by curling the figure toward the viewer such that the β2 strands at the edges can form hydrogen bonds with each other. The ribbon diagram was generated by RIBBONS (42).

TABLE I (footnotes)
... |F_O| is the observed structure factor amplitude, and |F_C| is the structure factor amplitude from the refined coordinates. c, Free R, R factor calculated on 10% of the data excluded from refinement. d, root mean square deviation.

TABLE II Equilibrium binding constants for HPV-18 E2/D
The design of oligonucleotides used in these experiments is shown in Fig. 3. Hairpin oligonucleotides of the sequence shown in Fig. 3a were used to determine the affinity of HPV-18 E2/D for E2BS with the spacer sequences indicated in the first section. Dumb-bell shaped oligonucleotides (Fig. 3b) were used to introduce nicks in the phosphodiester backbone of the DNA probe. a, K_rel refers to K_ACGT/K_eq. b, ±65% refers to the confidence limits. c, ‖ refers to a break in the sugar-phosphate backbone. d, K_rel refers to K_intact spacer/K_eq.

Protein and Oligonucleotide Preparation
His-tagged HPV-18 E2/D (amino acids 286-365 and N-terminal residues GSHM that arose from the cloning procedure) was obtained by overexpression in Escherichia coli strain BL21(DE3)pLysS. The His-tagged protein was purified by nickel affinity chromatography. The His tag was removed by thrombin treatment, and the cleaved protein was further purified by ion exchange chromatography (Fast-S; Amersham Pharmacia Biotech), resulting in pure protein as determined by silver-stained SDS gels. BPV-1 E2/D was purified as described previously (17). All oligonucleotides were obtained from the Yale Keck Biotechnology Facility and purified by ion exchange chromatography (Mono-Q; Amersham Pharmacia Biotech) (17).

Crystallizations
All of the crystals were grown by vapor diffusion from hanging drops at room temperature.
... reservoir solution containing 2.7 M ammonium sulfate and 0.1 M sodium acetate, pH 4.7. Crystals of dimensions 0.25 × 0.25 × 0.15 mm appeared in 3-5 days. Diffraction data were recorded on a RAXIS II image plate detector from a crystal maintained at -170°C.
BPV-1 E2/D-E2BS(AATT)-An equimolar mixture of BPV-1 E2/D and an annealed oligonucleotide of sequence CCAACCGAATTCGGTTG was made in 25 mM Tris, pH 7.5, 100 mM NaCl, 10 mM DTT. Drops were made with 6.25 mg/ml of the complex and an equal volume of a reservoir solution containing 30% PEG3350, 47 mM MES, pH 6.2, 2 mM CaCl2. Rod-shaped crystals of dimensions 0.15 × 0.1 × 0.08 mm appeared in 5-7 days. Diffraction data were recorded on a RAXIS II image plate detector from a crystal maintained at -170°C.

Structure Solution and Refinement
All images were indexed; the reflections were integrated, scaled, and postrefined with the HKL package (programs DENZO and SCALEPACK (18)); and the structures were determined by molecular replacement using the program AMoRe (19).
A polyalanine model constructed from the refined structure of the BPV-1 E2/D-E2BS(AATT) complex was used in molecular replacement to determine the structure of the HPV-18 E2/D-E2BS(AATT) complex. Using data from 15 to 4.8 Å, the correct solution had a correlation coefficient of 0.491.
A model consisting of one subunit of HPV-18 E2/D was constructed from the HPV-18 E2/D-E2BS(AATT) complex structure and used to obtain a molecular replacement solution for the HPV-18 E2/D structure. Using data from 15 to 4.5 Å, the correct solution had a correlation coefficient of 0.507.
Using the refined structure of the E2BS(AATT) complex as a model, rigid body refinement was conducted on the HPV-18 E2/D-E2BS(ACGT) complex data. Using difference maps, the correct oligonucleotide sequence was modeled.
All structures (except the HPV-18 E2/D-E2BS(ACGT) complex) were refined using a combination of simulated annealing, bulk solvent correction, positional refinement, and restrained individual B-factor refinement in CNS (20). Simulated annealing omit maps were systematically calculated and examined to minimize the effects of model bias. Only restrained grouped B-factors were refined for the HPV-18 E2/D-E2BS(ACGT) structure.
The final model of HPV-18 E2/D-E2BS(AATT) includes residues 287-323 and 329-364 (residues 309-312 had weak electron density, and some of these side chains were modeled as Ala). No unambiguous density corresponding to the C-terminal residue (Met365) was present, and it is not included in the refined model.
In three out of four subunits that comprise the asymmetric unit, the final model of HPV-18 E2/D includes residues 287-322 and 329-365 (residues 308-312 had weak electron density, and these side chains were modeled as Ala). One subunit had clear density corresponding to residues 323-328, which are included in the model. Since crystal packing influences this region, the loop conformation is not included in any discussions.
The final model of HPV-18 E2/D-E2BS(ACGT) includes residues 287-323 and 328-364 (residues 310-312 had weak electron density, and some of these side chains were modeled as Ala). No unambiguous density corresponding to the C-terminal residue (Met365) was present, and it is not included in the refined model.
The final model of BPV-1 E2/D-E2BS(AATT) includes residues 326-410. There are two complexes in the asymmetric unit. In one complex, one 5′-overhanging C is flipped out and packs against the protein surface, and no density corresponding to the other 5′-C is present. In the other complex, the C nucleotide at position -8 of one strand is flipped out and packed against the protein surface. The 5′-C then base pairs with G+8 of the other strand. Since the DNA conformation at the 5′- and 3′-ends is clearly affected by crystal packing, all analyses of DNA conformation in the discussions are restricted to the central 14 base pairs that include the E2BS. For all structures, the final model had all non-Gly residues in allowed regions of the Ramachandran plot. The data and refinement statistics are summarized in Table I. A section of a composite simulated annealing omit map calculated on the HPV-18 E2/D-E2BS(AATT) co-crystal structure is shown in Fig. 2.

DNA Binding Studies
The oligonucleotides were engineered so that the termini of the DNA were either at one end or within the spacer region (Table II and Fig. 3) of the E2 binding site. This strategy allowed the introduction of breaks in the phosphodiester backbone of the E2 binding site. Oligonucleotides were purified by acrylamide gel electrophoresis, end-labeled, and annealed by heating the probe (2 nM concentration) to 90°C for 10 min and transferring immediately to ice for 10 min. Quantitative gel mobility shift experiments were conducted following published protocols (21, 22). Briefly, serial dilutions of the proteins were added to binding buffer resulting in a final mixture containing 2 nM γ-32P-labeled oligonucleotide, 150 μg/ml bovine serum albumin, 5 μg/ml sonicated salmon sperm DNA, 5.5 mM DTT, 22 mM HEPES, pH 7.9, 150 mM KCl, 5 mM MgCl2, and 10% (v/v) glycerol. The reaction mixtures were incubated in a water bath at 25°C for 30 min and then loaded onto prerun 12% polyacrylamide gels in 0.5× TBE, pH 8.0. The electrophoresis was run at 200 V for ~90 min. The gels were dried and the reaction products visualized by exposure to phosphor storage plates, which were scanned using a PhosphorImager (Molecular Dynamics, Inc., Sunnyvale, CA).
The density of the electrophoretic band representing the protein-DNA complex was quantitated using the ImageQuant software. Binding isotherms were obtained by monitoring the density of the electrophoretic band representing the protein-DNA complex as a function of protein concentration and analyzed by nonlinear least-squares analysis. The equilibrium binding constant, K, was determined by analysis of the titration curves against coupled equations in which Θ1 is the DNA fraction unbound, Θ2 is the DNA fraction complexed with protein, K is the equilibrium association constant, and [X] is the free active protein dimer concentration. The DNA binding activity of the HPV-18 E2/D protein preparations was determined from stoichiometric titrations (22); the values presented are corrected for this activity. Since the γ-32P-labeled oligonucleotide concentration is much lower than the equilibrium dissociation constants being measured, the approximation that total protein concentration is equal to free protein concentration is made. The standard state Gibbs free energy of binding was calculated from the equilibrium association constant by ΔG° = -RT ln K, where R is the gas constant and T is temperature. Each of the values of ΔG° reported was determined by the global analysis of at least two independent titrations.
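The analysis above can be sketched numerically in Python. The sketch assumes the simplest single-site forms consistent with the definitions given, Θ1 = 1/(1 + K[X]) and Θ2 = K[X]/(1 + K[X]); this functional form is an assumption. The function names, the grid search standing in for the nonlinear least-squares routine, and the value K = 1e9 M^-1 are likewise hypothetical and chosen only for illustration.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def fraction_free(K, x):
    """Theta_1, fraction of DNA unbound (assumed single-site form)."""
    return 1.0 / (1.0 + K * x)


def fraction_bound(K, x):
    """Theta_2, fraction of DNA in complex (assumed single-site form)."""
    return K * x / (1.0 + K * x)


def delta_g0(K, temp_c=25.0):
    """Standard free energy of binding, dG0 = -RT ln K, in kcal/mol."""
    return -R_KCAL * (temp_c + 273.15) * math.log(K)


def fit_K(concs, thetas, log_lo=3.0, log_hi=12.0, n=9001):
    """Least-squares estimate of K by a grid search over log10(K).

    A simple stand-in for a proper nonlinear least-squares fit.
    """
    best_ssr, best_K = None, None
    for i in range(n):
        K = 10.0 ** (log_lo + (log_hi - log_lo) * i / (n - 1))
        ssr = sum((fraction_bound(K, x) - t) ** 2
                  for x, t in zip(concs, thetas))
        if best_ssr is None or ssr < best_ssr:
            best_ssr, best_K = ssr, K
    return best_K


# Hypothetical titration: free protein dimer concentrations (M) in
# twofold steps and noise-free fractions bound for K = 1e9 M^-1.
concs = [1e-10 * 2 ** i for i in range(10)]
thetas = [fraction_bound(1e9, x) for x in concs]

K_fit = fit_K(concs, thetas)
print(K_fit, delta_g0(K_fit))
```

The temperature default of 25°C matches the incubation temperature of the binding reactions; with real data, a gradient-based least-squares routine and confidence limits on K would replace the grid search.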

RESULTS
Overview of the Structures-The 1.9-Å crystal structure of HPV-18 E2/D and the 2.4-Å structure of its complex with the high affinity binding site E2BS(AATT) are described in detail below. In order to examine the stereochemistry at a low affinity HPV-18 E2/D-DNA interface, the structure of HPV-18 E2/D bound to E2BS(ACGT) has been determined. While the limited resolution (3 Å) of this crystal structure does not permit detailed analyses, it is of sufficient quality to allow comparisons of global features and direct protein-DNA interactions. To provide a direct comparison between the modes of DNA recognition of the HPV-18 and BPV-1 E2/D proteins, the 2.3-Å structure of BPV-1 E2/D bound to E2BS(AATT) has also been determined. This complex is similar in most respects to the previously reported crystal structure of BPV-1 E2/D bound to E2BS(ACGT) (17), and only salient features are presented here. In all of the discussions below, comparisons are drawn between the complexes of the HPV-18 or BPV-1 E2/D proteins with the same E2BS sequence, unless explicitly stated otherwise.
The HPV-18 E2 DNA-binding Domain-Each monomer of HPV-18 E2/D folds into an open-faced β-sandwich with a β-α-β-β-α-β topology (Fig. 1c). One α-helix in each subunit contains all of the amino acid residues involved in half-site recognition and is termed the "recognition helix." Two E2/D monomers associate such that the β-strands at the edges (β2 and β4) form hydrogen bonds with their symmetry mates (Fig. 1c). This results in a continuous eight-stranded antiparallel β-sheet. Buried in the interface are numerous large side chains including Thr287, Ile289, His291, Lys293, Trp320, Trp322, Ile332, Thr334, Thr336, Met363, and Met365 (Fig. 4a). The dimerization interface is extensive, occluding 1625 Å² of surface area. A herringbone-type packing of four Trp residues in the dimer interface is common to all of the mucosal HPV E2 proteins (Fig. 4) and is likely to be a major contributor to the stability of the dimer. As a result of the intricate and closely packed dimer interface, any rearrangement of subunits upon DNA binding would entail a significant energetic penalty. There is a cavity at the center of the barrel, and solvent molecules have been modeled into the electron density clearly apparent in this region. Electron density corresponding to the loop between the recognition helix and β2 (residues 308-312) is weak, indicating local disorder in this region. The loop between strands β2 and β3 is disordered.
Comparison of the Structures of E2 Proteins from Different Viral Strains-The E2 proteins from different viral strains differ in quaternary structure (Fig. 5a). The HPV-18 and BPV-1 E2/D proteins are alike in the relative orientation of their subunits while differing from the HPV-16 and HPV-31 E2/D proteins. When one subunit of each protein is superimposed, the nonsuperimposed recognition helices of HPV-16 and HPV-18 E2/D are related by an average translation of 7 Å (Fig.  5a). Functionally, these features of the E2 proteins are critical, since they dictate the spatial arrangement of side chains presented to the major grooves for DNA sequence recognition. The large variation in subunit orientation among the HPV E2/D proteins suggests either that they undergo unique subunit rearrangements upon DNA binding or that DNA is bent very differently in each of the protein-DNA complexes.
E2 dimer architecture is governed by two features: first, the alignment of strands β2 and β4 against their symmetry mates (Fig. 5b) and, second, the packing of side chains in the barrel core (Fig. 4). While the alignment of the β2 strands is invariant, E2 proteins from different viral strains differ in the register of their dyad-related β4 strands (Fig. 5b). When one subunit each of the HPV-18 E2/D and HPV-16 E2/D proteins are superimposed, residues Gly361-Thr364 of HPV-18 E2/D align with residues Gly361-Ser364 of HPV-16 E2/D. The nonsuperimposed subunit is out of register by two residues; Tyr362-Met365 of HPV-18 E2/D lines up with Thr360-Met363 of HPV-16 E2/D. As a result, while the Phe362 residues of HPV-16 E2/D straddle a 2-fold axis, the corresponding Tyr362 residues in HPV-18 E2/D are not symmetrically disposed about a 2-fold axis. In this regard, HPV-18 E2/D resembles BPV-1 E2/D, while the HPV-16 and HPV-31 E2/D proteins are similar to each other (23). It has been proposed that strand register is specified by cross-strand side chain interactions (24), and this is clearly evident in the case of these E2/D proteins. The invariant glycine in the β4 strand of all E2 proteins (Gly361 in HPV-18 E2/D and HPV-16 E2/D and Gly403 in BPV-1 E2/D) has the potential to destabilize the β-sheet. In HPV-18 E2/D, the β-sheet is disrupted at Gly361, with the side chain of Thr364 participating in a cross-strand hydrogen bond while the β-strands diverge. In BPV-1 E2/D, the side chain of Phe410 participates in a cross-strand rescue of the Gly403-mediated β-sheet destabilization. Such pairing of glycine and aromatic residues across β-strands provides an increase in protein stability (25). The aromatic side chains of Tyr362 in HPV-18 and Phe362 in HPV-16 E2/D do not participate in analogous cross-strand stabilization. Thus, the surfaces formed by the β4 strands of different E2/D proteins are not alike.
Interestingly, this surface of BPV-1 E2 has been implicated in an interaction with the E1 helicase that is essential for viral DNA replication (26). The differences in structure described above could contribute to the inability of heterologous E1 and E2 proteins (from different viral strains) to interact with each other (27).
The core of the β-barrel is densely packed in all E2 proteins. A common feature of the HPV-16 E2/D and HPV-31 E2/D dimer interfaces is the presence of an ordered solvent molecule bridging two histidine residues (His290 from β1 of HPV-16 E2/D; Fig. 4). No such bridging molecule is present in either free or DNA-bound HPV-18 E2/D. Instead, the corresponding His291 side chain participates in an intrasubunit hydrogen-bonding network that also includes residues Trp320, Thr334, and Thr364 (Fig. 4). Despite these differences in oligomerization, both HPV-18 E2/D and HPV-16 E2/D bury similar surface areas (1625 Å² for HPV-18 E2/D and 1553 Å² for HPV-16 E2/D), approximately 70% of which is nonpolar.
The HPV-18 E2/D-DNA Complex-DNA surrounds one face of the HPV-18 E2/D β-barrel, engulfing both recognition helices in successive major grooves (Fig. 6a). The global structure of the HPV-18 E2/D protein is similar in its free and DNA-bound states, with a root mean square deviation of 1.1 Å for the peptide backbone of the dimer. There is a very small adjustment of recognition helix orientation that facilitates symmetric placement of the helices in contact with the identity elements of the E2BS. On average, the temperature factors of recognition helix backbone atoms are lowered by 11 Å² upon DNA interaction, indicating that the helices assume more rigid conformations as a network of protein-DNA interactions is formed.
HPV-18 E2/D-DNA binding is not accompanied by any large disorder-to-order transition. In particular, the disordered loop connecting β2 and β3 is not conformationally restricted upon DNA binding. In contrast, DNA binding significantly rigidifies the corresponding segments of the BPV-1 E2/D protein; six intersubunit hydrogen bonds and two tethering contacts with DNA phosphate oxygens contribute to this stabilization (Fig. 6c).
DNA Conformation in the E2-DNA Complexes-DNA wraps around the outside surface of the E2 β-barrel, tracing a smooth curve. The overall bend angle was measured between normals to the base pairs at positions +6 and -6 from the dyad axis, thus encompassing the consensus E2BS and eliminating any effects of end-to-end lattice packing (Table III). The bend angle is very similar for the HPV-18 E2/D and BPV-1 E2/D complexes with both E2BS sequences studied. The DNA is the B form with compressed major and minor grooves facing the protein. Three local parameters, roll, helix twist, and slide, that are the most sensitive to DNA deformation were analyzed and are listed in Table III. The roll angles in the contacted ACC/GGT half-sites are positive, thus compressing the major groove, and the roll angles in the central NNNN region are negative, compressing the minor groove. No kink at any single base step is observed. The DNA axis traces an out-of-plane bend. In all cases, the central four base pairs are overtwisted, while the ACCG/CGGT half-sites are underwound.
The complexes formed by the HPV-18 and BPV-1 E2/D proteins differ in two related respects: the central minor groove width and the roll angle at the central dinucleotide step. The central minor groove is narrower in the HPV-18 E2/D complexes (8.5 Å) than it is in the BPV-1 E2/D complexes (9.8 Å), regardless of spacer sequence. A large negative roll angle of -11.4° at the central dinucleotide step of the HPV-18 E2/D-E2BS(AATT) complex contributes to this narrowing of the minor groove. No classical spine of hydration is apparent in the minor groove of any E2-DNA complex. The complex between BPV-1 E2/D and E2BS(ACGT) has a more positive slide in the central AT step than all the other complexes. This results in a deepening of the central minor groove.
In summary, both E2 proteins induce a similar global deformation of DNA characterized by compressed major and minor grooves at the interaction surface. The trends in roll angle and groove width are characteristic of the particular E2 protein present in the complex. Since both HPV-18 E2/D and BPV-1 E2/D must induce the same overall deformation in DNA, any differences in their preferences for intrinsic DNA conformational propensities must reflect differences in their innate abilities to stabilize the appropriate distortion.
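The overall bend angle used in this section is defined as the angle between the normals to the base pairs at positions +6 and -6. A minimal Python sketch of that geometric step is given below. It assumes the two normal vectors have already been obtained (for example, from a DNA structure analysis program); the function name `bend_angle_deg` and the example vectors are hypothetical and do not correspond to any value in Table III.

```python
import math


def bend_angle_deg(normal_a, normal_b):
    """Angle in degrees between two base-pair normal vectors.

    The vectors need not be unit length; they are normalized here.
    """
    dot = sum(a * b for a, b in zip(normal_a, normal_b))
    len_a = math.sqrt(sum(a * a for a in normal_a))
    len_b = math.sqrt(sum(b * b for b in normal_b))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_theta = max(-1.0, min(1.0, dot / (len_a * len_b)))
    return math.degrees(math.acos(cos_theta))


# Hypothetical normals at positions -6 and +6: the second is tilted
# by 45 degrees away from the first within the x-z plane.
normal_minus6 = (0.0, 0.0, 1.0)
tilt = math.radians(45.0)
normal_plus6 = (math.sin(tilt), 0.0, math.cos(tilt))

print(bend_angle_deg(normal_minus6, normal_plus6))
```

Because only the two terminal normals enter the calculation, the same overall angle can arise from a smooth curve or from a localized kink; the local roll, twist, and slide parameters are needed to distinguish these cases.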
TABLE III
... (45). These parameters were calculated with respect to the local axis of the base pair step. Parameters corresponding to the spacer region of the E2BS are in boldface type. DNA parameters for the BPV-1 E2/D-E2BS(ACGT) complex were calculated from 2bop.pdb. Minor groove widths were calculated from the shortest P-P distances across the groove. Bend angles were measured between normals to the base pairs at positions +6 and -6 from the dyad axis.

Half-site Recognition by HPV-18 E2/D-Each recognition helix of HPV-18 E2/D lies along the floor of a DNA major groove and presents numerous side chains to the exposed edges of the bases (Fig. 7a). Recognition of the minimal E2 binding site sequence ACCNNNNNNGGT is mediated by a network of hydrogen bonds (Fig. 7b) comparable with those seen in the BPV-1 E2/D-E2BS(AATT) complex (Fig. 7d). Specifically, residues Asn297, Lys300, and Cys301 of HPV-18 E2/D make direct hydrogen bonds with bases at positions 4-6. These interactions are symmetric about the dyad axis defined by the palindromic DNA sequence. Additionally, Arg305 contacts the adenine base at position -7 outside the minimal E2BS. This interaction may contribute to the 16-fold greater affinity with which HPV-18 E2/D binds to sites that have an AT base pair at this location (e.g. BS2 versus BS1 in the HPV-18 genome (1) (Fig. 1a)). Mutagenesis data show that an Arg305 to Ala mutation abrogates this specificity (28). HPV-18 E2/D also contacts the phosphodiester backbone of the DNA in both the major and the minor grooves. In total, 14 direct and 20 water-mediated phosphate contacts have been identified. Phosphate contacts made by Thr319, Arg303, and Ser298 at positions 3, 4, and 7 of the DNA have exact counterparts in the BPV-1 E2-DNA complex via residues Thr359, Arg342, and Gln337. Thus, all major groove phosphate contacts are conserved between these two complexes. The minor groove phosphate contact made by Arg296 is unique to the HPV-18 E2/D-DNA complex.
Notably absent from the HPV-18 E2/D-DNA complexes is any contact between phosphates and amino acids in the β2/β3 loop, such as that mediated by the Arg370 side chain of BPV-1 E2/D.
All of the direct protein-DNA contacts described above are present in the low affinity complex between HPV-18 E2/D and E2BS(ACGT). Although solvent molecules were not modeled, clear electron density is present at positions in the interface where solvent-mediated protein-DNA contacts are seen in the high affinity E2BS(AATT) complex.
Noncontacted Spacer Sequence Preferences of HPV-18 E2/D-The affinity of HPV-18 E2/D toward a representative set of E2BS that differed in spacer sequence was measured. These included the spacer sequences found on the HPV-18 genome and those used in the crystallographic analyses reported here. The results are presented in Table II. HPV-18 E2/D binds with greater affinity to E2BS containing AT-rich spacer sequences. The relationship between binding affinity and spacer sequence can be summarized as AATT > AAAT > AAAA > TTAA >> ACGT. Similar preferences for AT-rich spacer sequences were reported for HPV-16 E2/D and HPV-11 E2/D (10, 12). The co-crystal structures reported here eliminate the possibility that direct contact with the bases is responsible for the ability of HPV-18 E2/D to discriminate between E2BSs based upon their spacer sequence.
A-tract-containing binding sites would be predisposed to adopt the narrowed central minor groove conformation found in the E2-DNA complex (29), thus lowering the enthalpic cost of DNA deformation upon E2 binding. Conversely, E2BS that are not conformationally complementary to the protein surface would bind to E2 with lower affinity, reflecting the distortion free energy of the DNA as it assumes the conformation found in the complexed state. This is clearly evident in the nearly 2 kcal/mol greater affinity of HPV-18 E2/D for E2BS(AATT) than for E2BS(TTAA). While both of these binding sites have AT-rich spacer sequences, the TpA step in E2BS(TTAA) rolls into the major groove, resulting in central minor groove widening (30), a conformation not complementary to the E2 DNA-binding surface. A similar widening of the central minor groove was seen in the crystal structure of free E2BS(ACGT) (31), toward which HPV-18 E2/D also displays low affinity. The AT-rich spacer sequence preference of the HPV E2 proteins thus correlates with the conformational complementarity of such intrinsically curved DNA to the E2 surface.
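To relate free energy differences such as the one quoted above to ratios of binding constants, the relation ΔΔG° = -RT ln(K1/K2) can be applied. The Python sketch below is illustrative only: it assumes a temperature of 25°C (the temperature of the binding reactions), and the function names are hypothetical. The inputs are round numbers taken from the text (a difference of 2 kcal/mol and a 16-fold affinity ratio), not entries from Table II.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def ddg_from_ratio(ratio, temp_c=25.0):
    """ddG0 in kcal/mol for an association-constant ratio K1/K2."""
    return -R_KCAL * (temp_c + 273.15) * math.log(ratio)


def ratio_from_ddg(ddg, temp_c=25.0):
    """Association-constant ratio K1/K2 for a given ddG0 in kcal/mol."""
    return math.exp(-ddg / (R_KCAL * (temp_c + 273.15)))


# A site bound 2 kcal/mol more favorably (ddG0 = -2) than another.
print(ratio_from_ddg(-2.0))

# A 16-fold difference in affinity expressed as a free energy.
print(ddg_from_ratio(16.0))
```

On this scale, a difference approaching 2 kcal/mol corresponds to an affinity ratio of a few tens, and a 16-fold ratio to somewhat more than 1.5 kcal/mol.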
Although BPV-1 E2/D also distorts DNA in a globally analogous fashion, it does not show any significant preference for AT-rich spacer-containing E2BS (12). Two differences between the HPV-18 and BPV-1 E2/D-DNA complexes may contribute to this effect. First, the central minor groove is not as narrow in BPV-1 E2/D-DNA complexes as it is in the HPV-18 E2/D-DNA complexes (Table III). Thus, a predisposition toward narrow minor grooves may be less important for the BPV-1 E2/D-DNA interaction. Second, Arg370 in the β2/β3 loop of BPV-1 E2 makes a critical phosphate interaction in the central minor groove of the DNA. This contact could significantly attenuate repulsion between phosphate groups on the two strands caused by minor groove narrowing. The spacer sequence preferences of HPV-18 E2/D may reflect its inability to compensate for this electrostatic repulsion between the strands of a narrowed minor groove.
In order to explore the relationship between DNA flexibility and HPV-18 E2/D binding, we measured the affinity of HPV-18 E2/D toward representative nicked E2BS sequences (Table II). HPV-18 E2/D binds with significantly lower affinity to a nicked E2BS(AATT) target than it does to the corresponding intact oligonucleotide. In contrast, upon introducing a nick in a DNA target that is not conformationally complementary to the E2 DNA-binding surface (E2BS(TTAA)), the affinity of the interaction is slightly enhanced. It appears that HPV-18 E2/D discriminates against flexibility in the E2BS. In a more exhaustive analysis, it was shown that HPV-16 E2/D-DNA binding affinity was also drastically attenuated when the flexibility of the E2BS was increased via nicks in the phosphodiester backbone (12). Further support for the conclusion that these HPV E2 proteins discriminate against flexibility in their DNA targets comes from an analysis of DNA deformability (32), which suggests that A runs (of 3 or more A nucleotides) are more deformable than AA dimers. This correlates with the greater affinity exhibited by HPV-18 E2/D for E2BS(AATT) than for E2BS(AAAT) or E2BS(AAAA). DNA flexibility can reduce the energy expended in deforming DNA upon ligand or protein binding, but it also increases the entropic cost in going from a flexible, free DNA to a relatively rigid complexed state. The fact that the HPV E2/D-DNA interaction is adversely affected by flexibility in DNA indicates that the entropic cost is not balanced by favorable protein-DNA interactions. Interestingly, the BPV-1 E2 protein does not disfavor flexibility in the E2BS (12), suggesting that the DNA-bending mechanisms of these E2 proteins may be dissimilar. Furthermore, in the case of the E. coli phage 434 repressor-operator interactions (33,34), where a preference for AT-rich spacer sequences was also observed, the repressor-DNA interaction was favored by DNA flexibility. Thus, no general relationship between DNA flexibility and protein binding affinity has yet been derived.
There is a substantial difference in the distribution of positive potential on the DNA contact surfaces of HPV-18 E2/D and BPV-1 E2/D (Fig. 8). A large accumulation of positive charge is present in the central region of HPV-18 E2/D, which is juxtaposed against the minor groove of the E2BS (Fig. 8a). This is primarily due to the basic residues Arg296, Arg303, and Arg307. According to finite difference Poisson-Boltzmann analyses, the DNA minor groove potentials are more negative for ApT sequences than they are for GpC sequences. This is because the electrostatic potentials on T(O2) and A(N3) atoms are more negative than the corresponding values for C(O2) and G(N3) (35). E2BS with AT-rich spacer sequences thus have minor grooves that are electrostatically complementary to the HPV-18 E2/D surface. Interestingly, HPV-16 E2/D does not have a similar accumulation of positive charge at the center of its DNA-binding surface (15), yet it too prefers AT-rich spacer-containing binding sites (11,12). Thus the spacer sequence discrimination ability of HPV-16 E2/D may rely less on simple electrostatic complementarity than in the case of HPV-18 E2/D. In contrast to both of these HPV E2/D proteins, positive charge on BPV-1 E2/D is concentrated on surfaces flanking the E2BS (Fig. 8b), correlating with its inability to favor binding sites with AT-rich spacer sequences. Thermodynamic and structural studies are in progress to critically assess the relationships between electrostatics, DNA flexibility, and the mechanism of E2/D-DNA binding.

DISCUSSION
Viral DNA replication and gene expression patterns reflect the relative occupancy of E2 binding sites as the concentration of the E2 protein varies (1,36,37). The mucosal HPVs have a conserved arrangement of four E2BS on their genomes (Fig. 1a). E2 binding at BS4 can specifically up-regulate viral early gene expression, including the expression of oncogenes E6 and E7. E2 binding at the promoter-proximal sites BS1 and BS2 leads to repression of transcription of the early genes, including the oncogenes E6 and E7. E2BS3 is important for viral DNA replication.
Papillomaviruses replicate in terminally differentiated cells that have exited the cell cycle. Since viral DNA replication is dependent on host cell factors, the viruses need to activate the transcription of the E6/E7 genes. The expressed E6 and E7 proteins can then bind and inactivate two negative regulators of cell proliferation, p53 and Rb (38). However, elevated E6 and E7 protein levels are associated with malignant progression (39-41), and hence their levels need to be kept low for a productive viral life cycle. A fine balance between activation and repression of the early promoter P105 is essential.
The order in which the E2 binding sites on the genome are occupied depends on both the intrinsic affinity of the E2-DNA interaction at each site and the cooperativity of E2 binding to multiple sites. The HPV-18 E2 protein binds with the highest affinity to BS4 and with reduced affinity to sites BS1 and BS2 (1). Modulation of this intrinsic affinity by cooperative binding to BS1 and BS2 does not alter the order of E2BS occupancy as evidenced by footprinting experiments conducted on the HPV-18 long control region (1), which included all four E2BS. It is likely that at low concentrations of E2, E2BS4 is occupied and P105 is activated. As E2 concentrations rise, P105 repression occurs, concomitant with BS1 and BS2 occupancy. E2 bound to BS4 could continue to counteract total repression of transcription and ensure that the E6 and E7 proteins are expressed at a level appropriate to maintain a replication-competent cellular environment (3).
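The concentration dependence of site occupancy described above can be illustrated with a simple binding isotherm. This is a minimal sketch, not a model of the actual HPV-18 system: it treats each site as independent (ignoring the cooperativity between BS1 and BS2 noted in the text), takes the free E2 concentration as given, and uses hypothetical dissociation constants chosen only so that BS4 is the tightest site. None of the numerical values come from the measurements reported here.

```python
def fractional_occupancy(e2_conc, kd):
    """Single-site binding isotherm: theta = [E2] / (Kd + [E2]).

    e2_conc and kd must share the same concentration units.
    """
    return e2_conc / (kd + e2_conc)


# Hypothetical dissociation constants in nM, for illustration only.
# They encode the qualitative ordering in the text (BS4 tighter than BS1/BS2).
HYPOTHETICAL_KD_NM = {"BS4": 1.0, "BS1": 20.0, "BS2": 20.0}


def occupancy_profile(e2_conc_nm, kd_by_site=HYPOTHETICAL_KD_NM):
    """Fractional occupancy of each site at a given free E2 concentration."""
    return {
        site: fractional_occupancy(e2_conc_nm, kd)
        for site, kd in kd_by_site.items()
    }


if __name__ == "__main__":
    # Scan free E2 concentration over three orders of magnitude.
    for conc in (0.1, 1.0, 10.0, 100.0):
        profile = occupancy_profile(conc)
        print(conc, {site: round(theta, 2) for site, theta in profile.items()})
```

In this simplified picture, BS4 fills first as E2 accumulates, and BS1 and BS2 approach saturation only at higher concentrations, which matches the qualitative order of occupancy described in the text.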
Recognition of DNA sequence-dependent conformational propensities contributes to the ability of the E2 proteins to discriminate among consensus E2 binding sites present on the viral genomes. Thus, the physico-chemical properties of the E2 proteins are uniquely tailored for optimal regulation of gene expression and replication.