Crystal Structure of the BARD1 Ankyrin Repeat Domain and Its Functional Consequences

BARD1 is the constitutive nuclear partner of the breast and ovarian cancer-specific tumor suppressor BRCA1. Together, they form a heterodimeric complex responsible for maintaining genomic stability through nuclear functions involving DNA damage signaling and repair, transcriptional regulation, and cell cycle control. We report the 2.0 Å structure of the BARD1 ankyrin repeat domain. The structure includes four ankyrin repeats with a non-canonical C-terminal capping ankyrin repeat and a well ordered extended loop preceding the first repeat. Conserved surface features show an acidic patch and an acidic pocket along the surface typically used by ankyrin repeat domains for binding cognate proteins. We also demonstrate that two reported mutations, N470S and V507M, in the ankyrin repeat domain do not result in observable structural defects. These results provide a structural basis for exploring the biological function of the ankyrin repeat domain and for modeling BARD1 isoforms.

In the United States, breast cancer is the second leading cause of cancer-associated death in women (1). Among the highly penetrant genes associated with breast and ovarian cancer (BRCA1, BRCA2, CHEK2, TP53), inherited mutations in BRCA1 and BRCA2 directly account for 5-10% of all breast cancer cases and 10-15% of all ovarian cancer cases (2). Low-penetrance genes may play minor but contributory roles in the predisposition to these specific cancers (3). In this regard, BARD1 is an interesting candidate for two major reasons. First, missense mutations in the BARD1 gene are associated with an increased risk of breast and ovarian cancers. Second, BARD1 is the constitutive nuclear partner of BRCA1.
In the nucleus, BARD1 and BRCA1 form an obligate heterodimer that is important for mediating DNA damage recognition, signaling, and repair (4). Both proteins are tumor suppressors required for cell viability and genome integrity (5). In mice, deletion of either protein leads to early stage embryonic lethality and high levels of genomic instability as indicated by the accumulation of chromosomal abnormalities. Together, the BARD1-BRCA1 complex favors survival and repair of the damaged cell. This is evidenced by co-localization of BARD1 and BRCA1 to nuclear ionizing radiation (IR)-induced foci, believed to be the location of stalled replication machinery and damaged DNA. BARD1 and BRCA1 associate with several complexes important for nuclear transcription, DNA repair, and cell cycle checkpoint regulation, although no detailed role in these processes has been described.
The N-terminal RING domain and the two C-terminal BRCT domains of BARD1 share homology with BRCA1. The only direct interaction between the BARD1 and BRCA1 proteins is through helices flanking the RING motif of each protein. Unlike BRCA1, BARD1 has three predicted ankyrin (ANK) repeats adjacent to the first BRCT domain. ANK repeats fold into a helix-turn-helix-extended-loop topology, and together, they stack to form an ankyrin repeat domain (ARD). ARDs contain from 3 to 25 repeats, and their elongated and variable surfaces allow them to interact with a wide range of proteins (6,7).
The ARD and BRCT domains appear to be functionally linked in BARD1 as both domains are required for BARD1-p53-dependent apoptosis (8). Furthermore, the sequence connecting the ARD and BRCT domains is important for BARD1 interactions with p53 and CstF-50 and contains two BARD1 breast and ovarian cancer-predisposing mutations, C557S and Q564H. C557S increases the risk of breast cancer in patients also harboring a BRCA2 mutation (3,9). Q564H increases the risk of ovarian cancer, disrupts interactions with CstF, the polyadenylation cleavage specification complex, and abrogates p53-dependent apoptosis (10-12).
We present a 2.0 Å resolution crystal structure of BARD1-ARD containing four ANK repeats. The crystal structure is consistent with solution properties as determined by NMR studies. In addition, two reported missense mutations, N470S and V507M, within the ARD have been structurally characterized. Finally, we provide evidence for the independent behavior of the ARD and BRCT domains in solution. The structure of BARD1-ARD provides a basis for prediction of ARD function and the mechanism of interactions of the protein. It also provides key structural insights into functionally relevant isoforms of BARD1 and cancer-associated missense mutations.

EXPERIMENTAL PROCEDURES
Plasmids/Primers/Construct Definition-Constructs encoding BARD1 residues 397-777 and 566-777 were cloned into the pET151/D-TOPO vector using the Invitrogen directional cloning strategy. Site-directed mutagenesis was used to generate the BARD1 425-555 and other constructs, as well as the V507M and N470S mutants. Details can be found in the supplemental material.
Protein Expression and Purification-pET151/D-TOPO BARD1 plasmids were expressed in Escherichia coli BL21-DE3. BARD1 425-555 and 425-565 were grown in LB medium. Expression was induced by the addition of 1 mM isopropyl β-D-1-thiogalactopyranoside. The cultures grew for 24 h at 22°C. BARD1 397-777, 425-777, and 566-777 were expressed both as described above and in ZYP-5052 autoinduction medium with similar expression times and temperatures. All BARD1 constructs were purified by taking advantage of the N-terminal His6-TEV tag. Further details concerning the purification and trypsin digestion of the proteins can be found in the supplemental material.
Crystallization and Data Collection-Crystals of BARD1-ARD were originally produced by sitting drop vapor diffusion with sparse matrix crystallization buffers at 4°C using 10 mg/ml BARD1 425-565 in buffers containing 10 mM sodium phosphate (pH 7.0), 50 mM NaCl, 1 mM EDTA, and 2 mM DTT. After 3 months, small, thin hexagonal crystals were produced in Index screen condition 85 (Hampton Research, 0.2 M magnesium chloride, 0.1 M Tris hydrochloride, pH 8.5, 25% w/v polyethylene glycol 3350). Analysis of the protein by SDS-PAGE suggested that BARD1 425-565 had been proteolytically cleaved to 425-555 during the crystallization. Several crystals from the first screen were diluted into 50 µl of mother liquor and manually crushed with a glass rod. The mixture was diluted 10-fold with additional mother liquor and used for seeding during optimization screens. 1 µl of BARD1 425-555 protein at 20 mg/ml in the same buffer as above was combined with 1 µl of crystal seeds and 2 µl of mother liquor for sitting drop vapor diffusion crystallization experiments at 4°C. A large crystal was directly frozen in liquid N2, and diffraction data were collected at 100 K on a Rigaku R-AXIS IV++ system to 1.68 Å resolution. The crystals were monoclinic (space group P2 or P2₁) with unit cell dimensions of a = 37.8 Å, b = 74.1 Å, c = 51.2 Å, and β = 105.1°. The data were integrated and scaled with HKL2000 (13). Data reduction statistics are summarized in supplemental Table 1.
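As a rough plausibility check on the reported cell, the unit cell volume and Matthews coefficient can be estimated from the cell constants above. The sketch below is illustrative only and is not part of the original analysis: the molecular mass is an assumption (131 residues at about 110 Da each), and the two copies per asymmetric unit are taken from the molecular replacement result reported in the following section.

```python
import math

def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume (A^3) of a monoclinic unit cell: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))

def matthews_coefficient(volume, mol_weight_da, n_symops, n_per_asu):
    """Matthews coefficient V_M (A^3/Da) = V / (Z * MW), with Z = symops * copies per ASU."""
    return volume / (n_symops * n_per_asu * mol_weight_da)

def solvent_fraction(vm):
    """Approximate solvent fraction from V_M using the usual 1 - 1.23 / V_M estimate."""
    return 1.0 - 1.23 / vm

# Cell constants are taken from the text.
volume = monoclinic_cell_volume(37.8, 74.1, 51.2, 105.1)

# ASSUMPTION: mass of the 425-555 construct estimated as 131 residues x 110 Da.
assumed_mass_da = 131 * 110.0

# P2 and P2(1) each have two symmetry operations; two monomers per asymmetric unit.
vm = matthews_coefficient(volume, assumed_mass_da, n_symops=2, n_per_asu=2)

print(f"V = {volume:.0f} A^3, V_M = {vm:.2f} A^3/Da, solvent ~ {solvent_fraction(vm):.0%}")
```

Under these assumptions the estimate falls in the range commonly seen for protein crystals, which is consistent with two monomers per asymmetric unit.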
Structure Determination and Refinement-The structure of BARD1-ARD 425-555 was determined by molecular replacement with four ANK repeats from a designed ARD (Protein Data Bank (PDB) ID 1SVX (14)) as the search model. Two monomers were located in the asymmetric unit using the program Phaser (15). Refinement and model building proceeded using REFMAC5 (16), ARP/wARP (17), and Coot (18). Translation/libration/screw (TLS) refinement was added with six segments defined for each monomer in the asymmetric unit (19,20). The final refined 2.0 Å model contains 1945 atoms and 229 water molecules with final values of R and Rfree of 19.8 and 26.8%, respectively. Three residues at the N terminus and nine BARD1 residues at the C terminus (547-555) are absent from the model due to poor electron density in these regions. Molprobity was used to check the stereochemistry (21). Coordinates for BARD1-ARD can be found in PDB entry 3C5R. Refinement statistics are reported in supplemental Table 1, and the electron density map is shown in supplemental Fig. 1.
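For readers unfamiliar with the R and Rfree statistics quoted above, the following minimal sketch shows the standard definition, R = Σ||Fo| − |Fc|| / Σ|Fo|, evaluated separately on a working set and on a held-out free set of reflections. The amplitudes are made-up toy numbers, the function name is hypothetical, and scaling between observed and calculated amplitudes is omitted; refinement programs such as REFMAC5 handle this internally.

```python
def r_factor(f_obs, f_calc):
    """Crystallographic R = sum(| |Fo| - |Fc| |) / sum(|Fo|) over paired amplitudes."""
    if len(f_obs) != len(f_calc):
        raise ValueError("observed and calculated amplitude lists must have equal length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator

# Toy amplitudes (hypothetical, for illustration only).
work_obs, work_calc = [100.0, 80.0, 60.0, 40.0], [90.0, 85.0, 50.0, 45.0]
free_obs, free_calc = [70.0, 30.0], [55.0, 40.0]

r_work = r_factor(work_obs, work_calc)   # reflections used in refinement
r_free = r_factor(free_obs, free_calc)   # reflections excluded from refinement

print(f"R = {r_work:.3f}, Rfree = {r_free:.3f}")
```

Because the free reflections are never used to adjust the model, Rfree is normally somewhat higher than R, as is the case for the values reported here.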

RESULTS
Domain Boundary Determination-The C-terminal half of BARD1 (residues 423-777) encompasses a highly conserved region predicted to fold into three ANK repeats followed by tandem BRCT domains. A linker of approximately 40 residues with no homology to known domains is found between the ANK repeats and the BRCT domains and might include an additional ANK repeat. "Capping" repeats are topologically similar to internal ANK repeats but have polar amino acid substitutions at positions that typically favor internal non-polar contacts. Therefore, they often remain unidentified by domain prediction servers.
The domain boundaries were determined experimentally by limited trypsin proteolysis on a protein construct comprising BARD1 residues 397-777. Analysis by SDS-PAGE and matrix-assisted laser desorption/ionization-time of flight showed relatively stable fragments including ARD residues 425-565 and BRCT domains 566-777. Rapid proteolysis of residues 397-424 reduced the likelihood of an additional N-terminal ANK repeat. The beginning of a highly conserved region coincides with this N-terminal boundary. On the other hand, the proteolytic stability of the region C-terminal to the last predicted ANK repeat (residues 522-565) suggested the presence of a C-terminal capping ANK repeat. The final constructs used in the studies reported here are shown in Fig. 1.
Overall Structure of the Four Ankyrin Repeats of BARD1-The tertiary structure of BARD1-ARD is composed of four ANK repeats that pack together to form an elongated structure measuring 35 × 24 × 18 Å (Fig. 2A). As in all ARDs, a hydrophobic core runs the entire length of the long axis (Fig. 2B). The hydrophobic core ends with the N- and C-terminal capping ANK repeats that contain polar amino acids at positions typically conserved for small non-polar residues in the core. In the N-terminal ANK repeat, Ser-443 and Tyr-446 substitute for the canonical valine and leucine positions, respectively, and in the C-terminal ANK repeat, Glu-537 and Lys-540 replace leucine and valine (Fig. 2, B and D).
Typical of known ARD structures, extended loops primarily occur between ANK repeats and are stabilized by polar contacts within and between loops. BARD1-ARD also has a well ordered loop preceding the first ANK repeat in addition to the three internal loops. A similarly structured loop was observed preceding the first structured ankyrin repeat in the human Notch-ARD (22). However, unlike the BARD1-ARD, Notch has a poorly structured ankyrin repeat preceding the first structured loop.
The C-terminal capping repeat differs significantly from the canonical ANK repeat. Its two helices are shorter (by one turn) than the canonical helices, yet it maintains the overall tertiary ANK repeat fold (Fig. 2, C and D). Moreover, the highly conserved histidine in the TPLH motif, conserved in non-terminal ANK repeats, is replaced by an aspartate, thus removing the histidine side chain that usually contributes to stabilization of the "i+1" extended loop.
A similar half-ANK repeat was previously observed in GABPβ-ARD (23). Structural alignment of BARD1-ARD with ANK repeats 2-5 of GABPβ-ARD demonstrates high tertiary homology (0.7 Å Cα root mean square deviation) despite low overall sequence conservation (Fig. 2E, left). The C-terminal half-ANK repeats have slight topological differences as three fewer residues connect the two helices in the BARD1 repeat relative to GABPβ, resulting in the α1 helix of the final repeat of BARD1 being one turn shorter (Fig. 2E, right).
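The Cα root mean square deviation quoted above is the square root of the mean squared distance between paired Cα atoms after superposition. The sketch below only illustrates that final step and assumes the two coordinate sets have already been superposed and paired (the alignment itself was done in PyMOL according to the Fig. 2 legend). The coordinates and the function name are hypothetical.

```python
import math

def paired_rmsd(coords_a, coords_b):
    """RMSD (same units as input) between two equally long lists of paired (x, y, z) points.

    Assumes the two sets are already superposed; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = [
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    ]
    return math.sqrt(sum(squared) / len(squared))

# Toy data (hypothetical): every atom in set B is displaced 0.7 A along z from set A.
set_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0)]
set_b = [(x, y, z + 0.7) for (x, y, z) in set_a]

rmsd = paired_rmsd(set_a, set_b)
print(f"C-alpha RMSD = {rmsd:.2f} A")
```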
NMR Analysis of BARD1-ARD in Solution-Backbone resonance assignments for the BARD1-ARD+Linker construct, including residues C-terminal to the fourth repeat and terminating one residue before the first BRCT domain, were determined from standard triple resonance experiments. Nearly complete backbone and side-chain assignments were obtained. The secondary structure predicted from NMR chemical shifts closely matches that found in the crystal structure (supplemental Fig. 2). The NMR data for residues 547-565, C-terminal to the final helix in the crystal structure, do not indicate the existence of regular secondary structure.
To determine how ordered each residue is in solution, we calculated the NMR-derived order parameter, S². An average rotational correlation time of 13.3 ns was determined, consistent with a species with a molecular mass of 20 kDa, approximately the monomeric size for the construct. Residue-specific order parameter values determined for residues 428-545, including many residues within the extended loops, indicate motions consistent with a rigid body (supplemental Fig. 2B). Accurate values for the T1 and T2 relaxation times for residues 547-565 could not be determined due to their significantly longer relaxation times relative to the majority of the BARD1-ARD resonances. Heteronuclear nuclear Overhauser effects are negative for these residues and reflect motions consistent with a flexible disordered chain (supplemental Fig. 2C). This observation is also consistent with the absence of electron density for residues 547-555 in the crystal structure. Thus, the NMR data indicate that the extended loops observed in the crystal structure are well ordered in solution and behave as part of the globular domain, and that residues 547-565 of BARD1-ARD are disordered in solution.
Analysis of BARD1-ARD Evolutionarily Conserved Residues-Analysis of all known ANK repeats shows that positions 2, 4-7, 9, 13, 21-22, and 25-26 within a canonical repeat are well conserved, and positions 8, 15-18, 20, 28, and 30-31 are semiconserved. The conserved regions include residues within the hydrophobic core, the TPLH motif that initiates the first helical turn of a repeat, and the two glycines present in sharp turns. We assume that these residues are required for the structural integrity of the general ANK motif. The remaining positions in each repeat were evaluated for their conservation in an alignment of BARD1 homologs. As shown in supplemental Fig. 3, BARD1-ARD is highly conserved through evolution. Conserved residues specific to BARD1 are indicated by asterisks.
To identify potentially important surface features that could be involved in protein-protein interactions, a surface representation of the BARD1-ARD structure was painted with the BARD1-specific conserved residues (Fig. 3A) and an electrostatic surface potential map (Fig. 3B). A majority of the residues form a contiguous region on the surface. A wide acidic patch is present along the N-terminal half of the ARD, spanning ANK repeats 1-3 and involving residues in the respective loops and inner helices. The acidic patch coincides with several BARD1-specific conserved residues (Arg-427, Glu-429, Asp-458, Ala-460, Trp-462, and Tyr-492). Conserved residue Glu-467 forms the base of the patch and has been included in the BARD1-specific analysis as the amino acid type is significantly divergent from the consensus sequence, despite its conserved position in the general ANK repeat.
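One simple way to score per-position conservation in an alignment of homologs is the fraction of sequences sharing the most common residue in each column. The sketch below illustrates that idea only; the scoring scheme, the function name, and the toy alignment are assumptions and do not reproduce the analysis behind supplemental Fig. 3.

```python
from collections import Counter

def column_conservation(aligned_seqs):
    """Per-column fraction of sequences carrying the most common residue ('-' = gap)."""
    if len({len(seq) for seq in aligned_seqs}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*aligned_seqs):
        residues = [res for res in column if res != "-"]
        if not residues:
            scores.append(0.0)  # all-gap column carries no information
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        # Gaps count against conservation because the denominator is the full column.
        scores.append(top_count / len(column))
    return scores

# Toy alignment (hypothetical fragments, not real BARD1 homolog sequences).
toy_alignment = ["GTPLH", "GTPLH", "GSPLD", "G-ALH"]
scores = column_conservation(toy_alignment)
print(scores)
```

Positions scoring near 1.0 in the homolog alignment but falling outside the general ANK consensus positions would be candidates for BARD1-specific conserved residues.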
A second feature is an acidic pocket that lies adjacent to the acidic patch. The pocket is made up of residues from ANK repeats 2-3. Conserved BARD1-specific residues that form the pocket include Trp-462, Gly-491, Tyr-492, Asp-495, and Lys-503. Conserved residues His-466 and Asp-500 also help form the pocket but again lie in positions generally conserved in all ANK repeats. Asp-500 has been included in the BARD1-specific conservation analysis as it too is significantly divergent from the consensus sequence. The pocket is separated from the acidic patch by a hydrophobic ridge formed by the side chains of three strictly conserved BARD1 residues, Ala-460, Trp-462, and Tyr-492. Finally, a small hydrophobic patch is formed by the extended loop prior to the fourth repeat and involves conserved residues Ile-525 and Phe-526.
Characterization of BARD1-ARD N470S and V507M-Two mutations, N470S and V507M, have been reported within the structured BARD1-ARD, although their linkage to cancer predisposition remains unclear (24,25). Asn-470 is neither a general ANK motif residue nor conserved among BARD1 homologues. It is buried within the acidic surface in the inner helix of the second ANK repeat with no solvent accessibility for the main chain atoms and only 28% solvent accessibility for its side chain. On the other hand, Val-507 is conserved in BARD1 and is a general ANK motif residue that is partially buried on the periphery of the conserved hydrophobic core, with one methyl group pointing to the interior hydrophobic core (4% solvent accessibility) and the other methyl group pointing into the solvent (32% solvent accessibility).
We introduced each missense mutation into the recombinant construct 425-565 and compared their solution properties with the wild-type protein. Analysis of secondary structure content and thermal stability of BARD1-ARD N470S and V507M by circular dichroism revealed no significant differences (data not shown). Furthermore, limited proteolysis with trypsin under conditions similar to those used to define the protein domain boundaries did not indicate additional proteolytic susceptibility in the mutant proteins (data not shown).
NMR spectroscopy was used to detect detailed changes in the BARD1-ARD. Two-dimensional HSQC spectra of 15N-labeled wild type, N470S, and V507M were collected, and chemical shift perturbations (CSPs) from the native protein were calculated. A number of CSPs occurred in both sets of spectra and were localized around the site of mutation (supplemental Fig. 4, A-C). Likely due to its low solvent accessibility, the N470S CSPs include some perturbations within the hydrophobic core (Cys-469, Val-476, and Val-477) as well as several adjacent surface-exposed residues (Gly-461, Trp-462, His-471, Gly-472, His-473, and His-506). V507M also introduced several CSPs (supplemental Fig. 4, D-F). Met-539 has the largest CSP as its side chain is facing the buried methyl of Val-507. Perturbations were also observed in residues within the hydrophobic core (Ile-509, Val-510, and Met-539) in addition to residues peripheral to Val-507 and the hydrophobic core (Gly-505, His-506, Lys-511, Thr-534, Ser-538, Lys-540, and Leu-545).
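The text does not state how the CSPs were combined across the 1H and 15N dimensions. A commonly used convention, shown here purely as an assumption, is a weighted Euclidean distance in which the 15N shift difference is scaled down (a factor of about 0.14 is typical) before being combined with the 1H difference. Peak positions, residue labels, the threshold, and the function names below are hypothetical.

```python
import math

def combined_csp(delta_h_ppm, delta_n_ppm, n_scale=0.14):
    """Combined 1H/15N perturbation: sqrt(dH^2 + (n_scale * dN)^2), in ppm.

    ASSUMPTION: n_scale = 0.14 is a conventional weighting, not a value from the paper.
    """
    return math.sqrt(delta_h_ppm ** 2 + (n_scale * delta_n_ppm) ** 2)

def perturbed_residues(wild_type, mutant, threshold_ppm):
    """Return residues whose combined CSP exceeds the threshold, in sorted order."""
    flagged = []
    for residue in sorted(wild_type.keys() & mutant.keys()):
        h_wt, n_wt = wild_type[residue]
        h_mut, n_mut = mutant[residue]
        if combined_csp(h_mut - h_wt, n_mut - n_wt) > threshold_ppm:
            flagged.append(residue)
    return flagged

# Toy HSQC peak lists (hypothetical): residue -> (1H ppm, 15N ppm).
wt_peaks = {"res_A": (10.10, 128.0), "res_B": (8.50, 120.00), "res_C": (7.90, 118.5)}
mut_peaks = {"res_A": (10.16, 128.5), "res_B": (8.50, 120.05), "res_C": (7.91, 118.5)}

flagged = perturbed_residues(wt_peaks, mut_peaks, threshold_ppm=0.05)
print(flagged)
```

Mapping the flagged residues onto the structure is what allows perturbations to be described as localized around the mutation site.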

FIGURE 2. Crystal structure of BARD1-ARD. A, stereo view of BARD1-ARD 425-546 with Cα backbone trace colored from blue (N terminus) to red (C terminus). The first three ANK repeats are composed of an inner α-helix 1 → sharp turn → outer α-helix 2 → 90° turn → extended loop topology. The first ANK repeat has an additional extended loop preceding the first α-helix, whereas the fourth ANK repeat loses regular secondary structure following the second turn of α-helix 2. B, hydrophobic core of BARD1-ARD with the ribbon structure colored gray. Residues belonging to the conserved hydrophobic core are represented as space-filling side chains and colored by atom type (hydrogen (white), carbon (green), oxygen (red), nitrogen (blue), and sulfur (yellow)). The hydrophobic core is strictly composed of aliphatic, non-polar side chains, whereas residues at the solvent interface have more polar, hydrophilic side chains. C, comparison of the third ANK repeat (left) with the non-canonical fourth ANK repeat (right). The inner and outer α-helices of the fourth repeat are one helical turn shorter than those of the canonical ANK repeat. However, the overall orientation of α-helix 1 and α-helix 2 remains the same. All ribbon diagrams and polar contacts were generated with PyMOL (32). D, sequence alignment of the four BARD1 ANK repeats along with a canonical ANK consensus sequence (6). Conserved residues are shown in red, and semiconserved residues are shown in green. The first three repeats of BARD1 are 33 residues long and share many of the conserved residues. The fourth repeat of BARD1 shares only a few conserved positions with the consensus sequence. Residues highlighted yellow indicate BARD1-capping ANK repeat residues that mediate solvent interactions at hydrophobic core positions typically conserved in canonical interior ANK repeats. Arrows shown in panel B indicate two of these four residues. E, left, BARD1-ARD (green, 3C5R) was structurally aligned to GABPβ 2-5 (blue, 1AWC) ANK repeats using PyMOL (32) with a Cα root mean square deviation of 0.7 Å for 106 atoms. Right, BARD1-ARD ANK-4 (green) and GABPβ-ARD ANK-5 (blue) are shown in ribbon for comparison and to highlight the shortened α-helix 1 resulting from three fewer residues in the BARD1 repeat.

The ARD and BRCT Domains in the Intact C-terminal Region Behave Independently in Solution-The fourth ANK repeat in BARD1 alters the previously assumed domain boundaries between the ARD and BRCTs. In particular, the length of the linker sequence between the two structured domains is only 20 residues (547-567) rather than the ~40 residues assumed based on the three-ANK model. The short length of the linker and the presence of two cancer-predisposing mutations within it suggest that the ARD and BRCT domains may either adopt a preferred conformation relative to one another or cooperate in the binding of partner proteins. If the domains make direct contact with each other, residues in the binding interface will be perturbed relative to the individual domains in solution. If the ARD and BRCT domains behave as independent structural units, the local environment for the individual domains will be (nearly) identical to that of the intact C-terminal region. To address the relationship between the two structured domains, we compared the HSQC spectra of three constructs of BARD1: ARD+Linker (residues 425-565), BRCT tandem domains (residues 566-777), and ARD-BRCT (residues 425-777). Spectra of the individual domains overlaid on that of the intact C-terminal region match nearly perfectly for each resonance (Fig. 4, A and B). Therefore, the ARD and BRCT domains experience similar environments whether isolated or in the context of the same polypeptide chain.
An HSQC titration in which non-isotopically labeled BRCT construct was added to 15N-labeled ARD+Linker construct similarly showed no significant CSPs (data not shown). Together, these data suggest that in the absence of any interacting protein, the BARD1-ARD and tandem BRCT domains behave independently in solution.

DISCUSSION
Implications for BARD1 Function-A direct role of the BARD1-ARD in BRCA1-dependent or -independent cellular pathways remains to be elucidated. Given the numerous known protein-protein interactions involving ARDs, it is highly likely that BARD1-ARD interacts with one or more proteins. These interactions, including those with p53, CstF-50, IκBα (NF-κB inhibitor α), Bcl-3 (B-cell chronic lymphoma-3), and Ewing sarcoma, are reported to involve both the ARD and BRCT domains (4). Confirming initial reports of protein interactions will be important to understanding BARD1 function; however, many of the protein interactions have not been independently confirmed. Knowledge of the proper domain boundaries is critical for parsing out the ARD protein interactions and functions independent of the adjacent tandem BRCT domains. Many BARD1 functional studies have been based on a three-ANK repeat model, terminating constructs prior to the fourth and essential capping ANK repeat. Expression and purification of the BARD1 ANK repeats 1-3 produces mostly insoluble protein (Footnote 3). Based on the structure, loss-of-function mutations can be designed that do not disrupt the folding and stability of the ARD yet are targeted to the predicted protein binding hotspots. The structure reveals a contiguous surface patch formed by conserved residues from all four ANK repeats. That surface matches those typically used to mediate ARD protein interactions (Gankyrin-ARD and S6-ATPase, for example (26)) and correlates well with the acidic patch and pocket shown by electrostatic potential calculations. Selectively mutating these surfaces to disrupt their native features should prove useful for testing the role of the ARD in BARD1 and BRCA1 cellular functions.
BARD1 Cancer-associated Mutations and Isoforms-Characterization of the C-terminal half of BARD1 has revealed the structural context of two breast and ovarian cancer-associated mutations, C557S and Q564H. Comparison of the NMR spectrum of the intact ARD-BRCT construct and the individual spectra of ARD+Linker and BRCT shows that residues within the linker, ARD, and tandem BRCT domains experience similar chemical environments whether within an intact polypeptide or parsed into separate domains. In addition, information derived from the ARD+Linker construct shows that residues following the last amino acid in the ARD are unstructured. We can model the intact C-terminal half of BARD1 as two structured segments, the ARD and tandem BRCT domains, tethered by a flexible linker composed of 20 residues. Cys-557 and Gln-564 are located near the center of the linker and do not play structural roles in either domain. Therefore, we speculate that these residues are important for protein-protein interactions, as reported previously for interactions with CstF-50 and p53 (10,11).
We have also characterized two mutations, N470S and V507M, found within the BARD1-ARD. We predict that the missense mutations are not linked to cancer predisposition because they do not show significant alterations in either structure or stability. The link between an observed polymorphism or mutation, its functional consequences, and disease association is possible only in the presence of large amounts of genetic information. These data are in short supply for BARD1. However, in cases where known disease-linked mutations have been characterized biochemically and/or structurally, the mutant proteins are often severely compromised for structure and/or stability (27). The two reported mutations within the BARD1-ARD disrupt neither the structure nor the stability of the ARD.
Footnote 3: D. Fox III and R. E. Klevit, unpublished observations.
The BARD1 structure also provides insight into the structure of two isoforms ( and δ/ΔRIN) that include one of the four ANK repeats (Fig. 5A). These isoforms, among several others recently identified, serve as markers of proliferative and invasive growth and may link BARD1 to breast and ovarian tissue-specific cancers (28-30). The δ-isoform of BARD1 is shown because it can be easily broken down into discrete structural building blocks (Fig. 5B). Similar structural inferences can be made for the -isoform. We note that the single helix from the RING domain and the intact fourth ANK repeat from the ARD are each amphipathic with one face normally buried within the hydrophobic core of its relevant intact domain. We speculate that these hydrophobic surfaces pack together to form a new "hybrid" domain in δ-BARD1, likely folding independently of the adjacent tandem BRCT domains. The function of the new δ-BARD1 domain remains to be determined but is likely independent of BRCA1 and coupled to cytoplasmic functions of the intact tandem BRCT domains as the BRCA1 interaction domain and the BARD1 nuclear localization signal are absent.
Intriguingly, the linker between the fourth ANK repeat and the tandem BRCT domains, important for mediating protein interactions with p53 and CstF-50, is intact in these isoforms. Indeed, CstF-50 has been reported to interact with the δ/ΔRIN isoform of BARD1 in the cytoplasm (31). Perhaps the isoforms of BARD1 serve a dominant negative role by shifting the nuclear-associated protein complexes to the cytoplasm.
In summary, as only full-length BARD1 contains the intact ARD as well as an intact BRCA1 interaction domain, we propose that the function of the BARD1-ARD is correlated with BRCA1-dependent nuclear functions including DNA damage signaling and repair, ubiquitination of substrates, cell cycle control, and transcriptional regulation.