Human Origin Recognition Complex Binds Preferentially to G-quadruplex-preferable RNA and Single-stranded DNA*

Background: ORC binds to replication origins, but human ORC does not exhibit apparent sequence-specificity.
Results: G-quadruplex (G4)-preferable RNA or single-stranded DNA competes for DNA binding of ORC.
Conclusion: Human ORC binds preferentially to RNA and single-stranded DNA that form G4, and a specific domain in ORC1 is involved in this binding.
Significance: This ability may correlate with the G4-formable motif in human replication origins.

Origin recognition complex (ORC), consisting of six subunits ORC1-6, is known to bind to replication origins and function in the initiation of DNA replication in eukaryotic cells. Whereas Saccharomyces cerevisiae ORC recognizes the replication origin in a sequence-specific manner, metazoan ORC has not exhibited strict sequence-specificity for DNA binding. Here we report that human ORC binds preferentially to G-quadruplex (G4)-preferable G-rich RNA or single-stranded DNA (ssDNA). We mapped the G-rich RNA-binding domain in the ORC1 subunit, in a region adjacent to its ATPase domain. This domain by itself can preferentially recognize G4-preferable sequences of ssDNA. Furthermore, we found, by structure modeling, that the G-rich RNA-binding domain is similar to the N-terminal portion of the AdoMet_MTase domain of mammalian DNA methyltransferase 1. Therefore, in contrast with its binding to double-stranded DNA, human ORC has an apparent sequence preference with respect to its RNA/ssDNA binding. Interestingly, this specificity coincides with the common signature present in most human replication origins. We expect that our findings provide new insights into the regulation of the function and chromatin binding of metazoan ORCs.

Origin recognition complex (ORC), the protein complex consisting of six subunits (ORC1-6), plays crucial roles in the initiation of DNA replication in eukaryotes (1,2). ORC was first discovered in the yeast Saccharomyces cerevisiae as a protein complex that binds to the autonomously replicating sequence, which is virtually identical to the replication origin (3). Since then, ORC has been identified in various eukaryotic species including Drosophila, Xenopus, and mammals (1). ORC bound to replication origins serves as a landing platform for the assembly of the pre-replicative complex, which is subsequently converted into the pre-initiation complex before DNA synthesis starts (1). These steps are ATP-dependent and highly dynamic, and the molecular mechanisms have been studied extensively (1). Besides its role in the initiation of DNA replication, ORC has also been shown to be involved in cytokinesis (4,5) and in processes carried out in heterochromatin and centrosomes (6,7).
The role of ORC in the pre-replicative complex assembly at replication origins is conserved among various eukaryotes (1). In contrast, the mechanism for ORC to recognize the origins seems different in eukaryotic species. In S. cerevisiae, ORC can bind sequence specifically to the consensus sequence within autonomously replicating sequences (3,8). In another yeast, Schizosaccharomyces pombe, ORC can recognize the AT-tracts present in autonomously replicating sequences; the AT-hook motif of the ORC4 subunit is responsible for this recognition (9).
In sharp contrast with yeast ORCs, metazoan ORCs do not exhibit apparent sequence-specificity for their DNA binding, although it was reported that ORC relatively preferred AT-rich sequences or negatively supercoiled DNA (10-13). Consistently, the organization of metazoan replication origins is diverse in terms of length and sequence when the previously well characterized origins are compared (2). On the other hand, origins clearly exist at specified regions on chromosomal DNA in somatic cells, whereas DNA replication seems to initiate randomly in early embryonic cells such as Xenopus eggs, in which any source of DNA can be replicated (2). Thus, metazoan ORCs may be recruited to replication origins by a mechanism different from that in yeast cells. Although previous studies suggest that ORC recruitment to the origin sites involves several factors including sequences such as CpG islands, chromatin structure, histone modification, DNA-binding proteins such as transcription factors, or transcription itself (1,2), the details of the mechanism of ORC recruitment to the specified origin sites are still elusive. Recent genome-wide mapping of replication origins revealed that a G-rich repeat element is present in most of the replication origins in mouse and human cells (14,15). This G-rich element potentially forms a G-quadruplex (G4), but the biological role of this element has not been examined yet.
In addition, the structure of the target DNA for ORC has not been elucidated yet. Previous studies examined binding of metazoan ORCs to double-stranded DNA (dsDNA) (10,11), but their binding to single-stranded DNA (ssDNA) has not been examined extensively, whereas the human ORC1-derived polypeptides were shown to bind to G-rich RNA (16). Thus, to get further insights into metazoan ORCs, we investigated ssDNA binding of human ORC. Here we find that human ORC has apparent sequence-specificity for its ssDNA binding and for RNA binding.

Preparation of Recombinant Human ORC-Complementary DNAs for human ORC subunits, including N-terminally FLAG-tagged (Sigma) ORC3, were cloned into pFastBac vector (Invitrogen) (17). Detailed information concerning their construction is available upon request. The baculoviruses for ORC expression were prepared and used for infection (multiplicity of infection ~2) of approximately 5 × 10^8 High Five cells (Invitrogen). Forty-eight hours after infection, the cells were harvested, and the expressed ORC was affinity-purified with anti-FLAG antibody (Sigma) as described previously (18). Although ORC6 was not visible in a Coomassie Blue-stained gel of purified recombinant ORC, it was detected by immunoblotting with anti-ORC6 antibody (rabbit polyclonal, Bethyl Laboratories, Inc.) (see Fig. 1A). ORC without the ORC6 subunit was also prepared as described above, except for exclusion of the ORC6-expressing baculovirus.
Preparation of Glutathione S-Transferase (GST)-fused ORC1 Derivatives-The expression vectors for GST-fused ORC1 derivatives were constructed with pGEX-6P-1 (GE Healthcare). The cDNA for the Ala substitution mutant of ORC1(413-511) was constructed by PCR and cloned into the same vector. The recombinant proteins were expressed in Escherichia coli Arctic Express or BL21 pLysS (Agilent Technologies) and purified by using glutathione-Sepharose beads (GE Healthcare). To remove the bacterial RNA potentially co-purified with recombinant proteins, the protein-bound glutathione-Sepharose beads were treated with 5 μg/ml RNase A for 2 h at 4°C and then washed extensively before elution from the beads. The purified proteins were dialyzed against 25 mM HEPES-KOH (pH 7.7), 0.3 M NaCl, 10% glycerol, and 1 mM dithiothreitol (DTT).
Probes and Competitors-The full description of probe and competitor sequences is in Table 1. The FITC-labeled or unlabeled RNA was synthesized by Eurofins Operon. For the biotin-labeled dsDNA probes (GGGTT/CCCAA or A/T-rich), the corresponding repeated (or A/T-rich) DNA (40 bp) was cloned into the SacI-KpnI sites in pBluescript SK(−) (Agilent Technologies) and amplified by PCR with biotin-labeled T7/T3 primers. For the biotin-labeled, (GGGUU)8-containing RNA probe (58 nucleotides), the plasmid as above was linearized and used for in vitro transcription with T7 RNA polymerase. The RNA was purified by denaturing polyacrylamide gel electrophoresis and biotinylated at the 3′-end with the RNA 3′-end biotinylation kit (Thermo Scientific).
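The repeat notation used for the probes and competitors can be expanded mechanically. The sketch below builds the core repeats only, under the assumption that flanking vector or linker sequence is ignored; the authoritative sequences are those in Table 1. The function names are illustrative. It also shows why the strand complementary to (GGGTT)8 is written (AACCC)8 when read 5′ to 3′, and how the DNA repeat maps to the (GGGUU)8 RNA.

```python
# Illustrative helpers (names are assumptions, not from the paper).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def repeat(unit: str, count: int) -> str:
    """Expand repeat notation such as (GGGTT)8 into a plain sequence."""
    return unit * count


def reverse_complement(dna: str) -> str:
    """Return the complementary DNA strand, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def to_rna(dna: str) -> str:
    """Write a DNA sequence as the RNA of the same sense (T becomes U)."""
    return dna.replace("T", "U")


if __name__ == "__main__":
    g_strand = repeat("GGGTT", 8)            # 40-nt core of the G-rich strand
    c_strand = reverse_complement(g_strand)  # opposite strand of the duplex
    print(len(g_strand), g_strand)
    print(len(c_strand), c_strand)
    print(to_rna(g_strand))
```

Only the 40-nucleotide cores are generated here; the 58-nucleotide RNA probe and the ~110-bp dsDNA probes contain additional plasmid-derived sequence that this sketch does not attempt to reproduce.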
Gel Electrophoresis of G4-formed ssDNA-Electrophoretic separation of G4-formed and unstructured ssDNAs was done by 15% polyacrylamide gel electrophoresis essentially as described previously (19). Briefly, 2 pmol each of ssDNA was heated at 100°C for 3 min in water or 0.1 M KCl and then cooled slowly to room temperature, or incubated in the binding buffer without ORC (see below) at 30°C for 30 min. Then, ssDNA was electrophoresed at 4°C and stained with SYBR Gold (Invitrogen).
Electrophoretic Mobility Shift Assay (EMSA)-Purified ORC or GST-ORC1 derivatives were incubated in a 10-μl reaction mixture containing 25 mM HEPES-NaOH (pH 8.0), 10 mM magnesium acetate, 100 mM sodium glutamate, 0.1 mM EDTA, 5% glycerol, 1 mM DTT, 3 mM ATP, 2 mg/ml bovine serum albumin (BSA) (DNase-, RNase-free), and 20 units of RNase inhibitor (TAKARA Bio), with a probe in the presence or absence of competitors, for 30 min at 30°C. The typical reaction mixture also contained 150 mM (for ORC) or 60 mM NaCl (for GST-ORC1 derivatives), which was carried over from the preparation of ORC or GST-ORC1 derivatives. After incubation, the reaction mixture was subjected to 5% or 3.5% (for Fig. 3C) polyacrylamide gel electrophoresis (at 4°C) with a buffer containing 25 mM Tris, 192 mM glycine (pH 8.7). For FITC-labeled probes, the gel was scanned directly by Molecular Imager FX Pro (Bio-Rad). For biotin-labeled probes, the gel contents were transferred to a nylon membrane, and the biotin label was visualized with the Chemiluminescent nucleic acid detection module (Thermo Scientific). The signals were detected and quantified by using a Lumino-Image analyzer LAS4000mini and Multi Gauge software ver. 3.0 (Fuji Film). For unlabeled RNA or ssDNA probes, the gel was stained with SYBR Green II or SYBR Gold. To detect the ORC subunits, the gel was treated with SDS and then subjected to immunoblotting with anti-ORC1 antibody (rat monoclonal 7A7; Santa Cruz Biotechnology) and anti-ORC3 antibody (goat polyclonal; Abcam).
Three-dimensional Structure and RNA-binding Site Prediction-The three-dimensional structure of the proposed RNA-binding domain in human ORC1 was predicted using FUGUE (20). The domain is located between approximately residues 413 and 511, on the N-terminal side of the AAA+ domain, and has no obvious sequence similarity to proteins with known three-dimensional structures. The partial sequence of human ORC1 from residue 1 to 530 was submitted to the FUGUE server. Once the three-dimensional structure was predicted, RNA-binding residues were further predicted using the KYG method (21).

RESULTS
Human ORC Exhibits Sequence Preference for Binding to Single-stranded DNA-We investigated DNA binding of ORC by EMSA with purified recombinant human ORC (Fig. 1A), in which the ORC3 subunit was tagged with a FLAG peptide. We confirmed that this purified ORC was functionally active because it could partially rescue DNA replication in ORC-depleted Xenopus egg extracts (data not shown), as shown previously (11). A biotin-labeled, ~110-bp dsDNA containing (GGGTT)8 or a 40-bp AT-rich element was used as a probe for EMSA, and binding specificity was investigated by adding unlabeled competitors into the binding reaction (see Table 1 for their sequences). First, we compared two competitors, AT-rich and (GGGTT)8 dsDNA (~40 bp), to evaluate the conditions for EMSA. We detected a slight preference for AT-rich dsDNA (Fig. 1, B and C), consistent with the previous result (11).
In contrast, when ssDNA (~40-mer) was used as a competitor, we could see sequence-specific effects of competitors on ORC binding. (GGGTT)8 ssDNA, but not (AACCC)8 ssDNA, inhibited DNA binding of ORC more efficiently than (GGGTT)8 dsDNA (Fig. 1D); a 10-fold molar excess of (GGGTT)8 ssDNA almost completely inhibited ORC binding to an AT-rich dsDNA probe, but (GGGTT/CCCAA)8 dsDNA or (AACCC)8 ssDNA was less effective. (GGGTT)8 ssDNA inhibited as efficiently as AT-rich dsDNA (Fig. 1E), which is thought to be a better binding substrate as shown previously (11) and also in Fig. 1B. In contrast, A-rich or T-rich ssDNA competitors did not inhibit ORC binding under these conditions (Fig. 1F). These results indicate that human ORC binds to ssDNA with apparent preference; (GGGTT)8 ssDNA was the most effective competitor among those used in Fig. 1. Although we observed multiple shifted bands in EMSA, and the exact composition of each band is currently unknown, it is likely that they contained the entire ORC, ORC1 dissociated from the complex, or the core complex consisting of ORC2-5 (lacking ORC1 and probably ORC6) (see below; data not shown).
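Competition in an EMSA is commonly summarized as the shifted-probe signal remaining in each lane relative to the lane without competitor. The following is a minimal sketch of that normalization, assuming background-subtracted band intensities are already available; the exact normalization used for the figures is not stated in the text, and the lane labels and numbers below are hypothetical placeholders, not measurements from this study.

```python
def relative_binding(shifted_signal: float, reference_signal: float) -> float:
    """Percent of shifted-probe signal remaining versus the no-competitor lane."""
    if reference_signal <= 0:
        raise ValueError("reference signal must be positive")
    return 100.0 * shifted_signal / reference_signal


def summarize(lanes: dict, reference_label: str = "none") -> dict:
    """Normalize every lane to the lane run without competitor."""
    reference = lanes[reference_label]
    return {
        label: relative_binding(signal, reference)
        for label, signal in lanes.items()
    }


if __name__ == "__main__":
    # Hypothetical intensities in arbitrary units; NOT data from the paper.
    example_lanes = {"none": 1000.0, "competitor_x_10fold": 250.0}
    for label, percent in summarize(example_lanes).items():
        print(f"{label}: {percent:.1f}% of probe still shifted")
```

A lower percentage indicates a more effective competitor at the given molar excess.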
Human ORC Preferentially Binds to G4-formable, G-rich ssDNA or RNA-To obtain further insight into the sequence preference in ssDNA binding of ORC, we tested other GT-based sequences of ssDNA. We found that (GGGTT)8 ssDNA was the most effective among the tested competitors, which include (GTTGT)8, (GTTTT)8, and (GTGTGT)7 ssDNA (Fig. 2, A and B, and Table 1). Furthermore, we detected complex formation between ORC and (GGGTT)8 or (GGGAA)8 ssDNA (lanes 2, 3, 17, and 18, Fig. 2C). (GGGTT)8 and (GGGAA)8 are the only G4-preferable G-rich sequences among those tested (22). Thus, we examined, by native polyacrylamide gel electrophoresis (19), whether these ssDNAs would form a G4 structure under the binding condition for EMSA. (GGGTT)8 ssDNA that had been incubated in the binding buffer migrated as fast as the one restructured in 0.1 M KCl, which is favorable for G4 formation, and clearly faster than the unstructured one (lanes 1-3, Fig. 2D). In contrast, migration of (AACCC)8 and other GT-based ssDNAs was not altered by pretreatment. These observations indicate that (GGGTT)8 indeed forms a G4 structure under the binding condition. Therefore, these results suggest that human ORC has an ability to bind preferentially to G4-formable G-rich sequences of ssDNA.
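The notion of a G4-preferable sequence can be illustrated with a widely used sequence heuristic: four runs of at least three guanines separated by loops of 1-7 nucleotides. The sketch below applies that pattern to the core repeats named in the text. The pattern, the loop limits, and the function name are assumptions for illustration and are not taken from this study, which classified the sequences according to Ref. 22. A match indicates folding potential only and does not show that a G4 actually forms.

```python
import re

# Heuristic G4 motif (an assumption, not the paper's criterion):
# three G-runs each followed by a 1-7 nt loop, then a fourth G-run.
G4_PATTERN = re.compile(r"(?:G{3,}[ACGTU]{1,7}){3}G{3,}")


def has_g4_motif(sequence: str) -> bool:
    """Return True if the DNA or RNA sequence contains the heuristic motif."""
    return G4_PATTERN.search(sequence.upper()) is not None


# Core repeats only; flanking sequence from Table 1 is not included.
CORE_REPEATS = {
    "(GGGTT)8 ssDNA": "GGGTT" * 8,
    "(GGGAA)8 ssDNA": "GGGAA" * 8,
    "(GGGUU)8 RNA": "GGGUU" * 8,
    "(AACCC)8 ssDNA": "AACCC" * 8,
    "(GTTGT)8 ssDNA": "GTTGT" * 8,
    "(GTTTT)8 ssDNA": "GTTTT" * 8,
    "(GTGTGT)7 ssDNA": "GTGTGT" * 7,
    "(AAACC)8 RNA": "AAACC" * 8,
}

if __name__ == "__main__":
    for name, sequence in CORE_REPEATS.items():
        print(f"{name}: motif present = {has_g4_motif(sequence)}")
```

Under this heuristic only the GGG-containing repeats match, which agrees with the grouping of sequences described in the text, although the experimental evidence for G4 formation comes from the native gel analysis rather than from sequence alone.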
It should be noted that when (GGGAA)8 ssDNA was either incubated in the binding buffer or restructured in KCl and separated by native gel electrophoresis, we observed a similar but complex pattern of bands, which migrated differently than the unstructured one (data not shown). Although the nature of the complex pattern is unknown, this might explain the inconsistent effect of (GGGAA)8 seen in Fig. 2, A and C (also see Fig. 5E).
In addition to the entire ORC, we tested the ability of ORC1-5, the complex without ORC6 (Fig. 1A). We did not see any clear difference in the complex formation with ORC1-5, compared with that with the entire ORC (Fig. 2C). Thus, it seems unlikely that ORC6 contributes directly to preferential recognition of G4-formable sequences by ORC.
We next examined whether G-rich RNA would bind to ORC. We found that (GGGUU)8 RNA also inhibited ORC-DNA binding efficiently, whereas (AAACC)8 RNA did not (Fig. 3, A and B). Interestingly, (GGGUU)8 RNA was a slightly stronger competitor than (GGGTT)8 ssDNA (Fig. 3, A and B). Furthermore, we could detect complex formation between ORC and (GGGUU)8 RNA and confirmed that the complex contained at least ORC1 and FLAG-ORC3 (Fig. 3, C and D). Note that the ORC-RNA complex seems more stable in the presence of ATP than in its absence (stronger signals of ORC1 and FLAG-ORC3 in the presence of ATP) (Fig. 3D, compare lanes 3 and 4 in the middle or bottom panel). Because (GGGUU)8 RNA is also G4-formable, these results strongly suggest that human ORC binds preferentially to G4-formable sequences of ssDNA and RNA.
ORC1 Has the G-rich RNA/ssDNA-binding Domain-We next investigated which subunit of human ORC is responsible for G-rich RNA/ssDNA binding. We found that the ORC1 subunit, which had been fused to GST, bound to a fluorescently labeled (GGGUU)8 RNA probe (Fig. 4A), whereas our experiments, although preliminary, did not show G-rich sequence-specific binding by any of the other subunits, ORC2-6 (data not shown). Thus, we investigated further the binding specificity of ORC1 by EMSA with competitors as above. Among the competitors tested, (GGGUU)8 RNA and (GGGTT)8 ssDNA efficiently inhibited RNA binding, but others did not (Fig. 4A). The slowly migrating signals were smeared, probably due to the presence of ORC1-derived smaller fragments (see 2-861 in Fig. 4B), but the specificity of RNA/ssDNA binding of ORC1 was the same as that of the entire ORC as shown above. This suggests that ORC1 is the main contributor to the G-rich sequence-specific RNA/ssDNA binding of the entire ORC.
It was previously shown that ORC1-derived fragments (amino acids 1-200 and 201-511) have G-rich RNA-binding activity (16). We therefore constructed various truncation derivatives of ORC1 (Fig. 4B) and tested for the ability to bind to (GGGUU)8 RNA. We found that ORC1(413-511) could bind to (GGGUU)8 RNA (Fig. 4C). More importantly, EMSA with competitors showed that (GGGUU)8 RNA and (GGGTT)8 or (GGGAA)8 ssDNA were stronger competitors, compared with dsDNA, A-rich or T-rich ssDNA, or ssDNA with other GT-based sequences (Fig. 5), indicating that ORC1(413-511) by itself exhibits the same specificity for RNA/ssDNA binding as that of the entire ORC. These results imply that ORC1(413-511) confers G-rich RNA/ssDNA-binding specificity on the entire ORC. Therefore, we designated the amino acid 413-511 region of ORC1, which is adjacent to the ATPase domain of ORC1, as the G-rich RNA/ssDNA-binding domain. Although we could narrow this region down further to amino acids 413-487 (Fig. 4C), most of the following analyses were done with ORC1(413-511).
We also examined whether the homologous region (amino acids 439-537) of Xenopus ORC1 would bind to G-rich RNA or ssDNA. We observed a similar pattern of shifted bands in EMSA with a (GGGUU)8 RNA probe for Xenopus ORC1(439-537) and human ORC1(413-511) and confirmed that each GST-fused ORC1 fragment was included in these shifted bands (Fig. 6, A and B). Furthermore, the competition assay indicates that Xenopus ORC1(439-537) binds preferentially to (GGGTT)8 ssDNA as well as (GGGUU)8 RNA compared with (AAACC)8 RNA (Fig. 6C). These results imply the conservation of G-rich RNA/ssDNA-binding activity among metazoan ORCs.
The G-rich RNA/ssDNA-binding Domain Resembles Part of Mammalian DNA Methyltransferase 1 (DNMT1)-We predicted a three-dimensional structure of this G-rich RNA/ssDNA-binding domain using FUGUE (20) and found that the N-terminal half (amino acids 10-540) of human ORC1 fits the three-dimensional structure of segment 731-1231 of mouse DNMT1 (Protein Data Bank ID code 3PT9) (23) with a Z-score of 16.22, which is equivalent to a >99% confidence level (20). The G-rich RNA/ssDNA-binding domain of ORC1 corresponded to the N-terminal portion of the AdoMet_MTase domain (Fig. 7), which is an independent folding unit and binds S-adenosylmethionine (SAM). This structural similarity enabled us to model the three-dimensional structure shown in Fig. 8.
Residues critical for RNA/ssDNA binding within the domain were predicted using the KYG method (21). There are two candidate surfaces for RNA/ssDNA binding, one of which coincides with the S-adenosyl-L-homocysteine (SAH) binding site (Fig. 8). SAH consists of ribose and adenosine, suggesting that G-rich RNA/ssDNA molecule binds to the domain similarly as does SAH. The model predicted close contact between Arg-441, Arg-444, and Arg-465 of human ORC1 and an RNA replacement molecule (Fig. 9A).
To evaluate this prediction, we simultaneously substituted Arg-441, Arg-444, and Arg-465 with Ala in ORC1(413-511). In agreement with the prediction, this substitution almost completely eliminated the activity of the domain to bind to (GGGUU)8 RNA (Fig. 9B). We thus propose that the G-rich RNA/ssDNA-binding domain of human ORC1 is a novel domain for selective binding to G4-preferable RNA/ssDNA.
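The residue numbering of the triple substitution refers to full-length ORC1, whereas the construct starts at residue 413, so each position has to be offset when it is applied to the fragment. The sketch below shows that bookkeeping with a check that the expected residue is actually present. The function name is illustrative, and the fragment sequence is a placeholder built only to exercise the code; it is not the ORC1 sequence, which should be taken from a sequence database.

```python
def substitute_residues(fragment: str, first_residue: int, substitutions: dict) -> str:
    """Apply point substitutions given in full-length protein numbering.

    substitutions maps position -> (expected residue, replacement residue).
    """
    residues = list(fragment)
    for position, (expected, replacement) in sorted(substitutions.items()):
        index = position - first_residue  # convert to 0-based fragment index
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} lies outside the fragment")
        if residues[index] != expected:
            raise ValueError(
                f"expected {expected} at {position}, found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


FIRST_RESIDUE, LAST_RESIDUE = 413, 511
TRIPLE_ALA = {441: ("R", "A"), 444: ("R", "A"), 465: ("R", "A")}

# Placeholder fragment (NOT the real ORC1 sequence): glycine everywhere,
# with arginine only at the three positions named in the text.
_placeholder = ["G"] * (LAST_RESIDUE - FIRST_RESIDUE + 1)
for _position in TRIPLE_ALA:
    _placeholder[_position - FIRST_RESIDUE] = "R"
PLACEHOLDER_FRAGMENT = "".join(_placeholder)

if __name__ == "__main__":
    mutant = substitute_residues(PLACEHOLDER_FRAGMENT, FIRST_RESIDUE, TRIPLE_ALA)
    print(len(mutant), mutant)
```

The check on the expected residue guards against off-by-one numbering errors, which are easy to make when a construct does not start at residue 1.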

DISCUSSION
Our results suggest that human ORC has a higher affinity for G-rich RNA/ssDNA than for dsDNA. ORC1 seems to be largely responsible for the G-rich RNA/ssDNA binding of the entire ORC. Although our findings were obtained from binding assays with synthetic RNA/DNA, they are likely related to the following previously reported observations. First, EBNA1, the viral protein required for the latent DNA replication of Epstein-Barr virus, has an ability to bind to the replication origin (called oriP) present in the viral genome and also to G-rich RNA, and it has been proposed that EBNA1 bound to oriP utilizes G-rich RNA to recruit human ORC to oriP before the initiation of viral DNA replication (16,24). In this scenario, the viral DNA replication would utilize the G-rich RNA-binding activity of human ORC. Second, ORC was also identified as one of the proteins that bind to TERRA, the noncoding RNA that is transcribed from telomeres and associated with them (25,26). In this case, TRF2, a telomere-binding protein, binds to TERRA RNA and recruits ORC to telomeres through the TERRA interaction. The specificity of RNA binding of human ORC, revealed by our study, is consistent with the fact that TERRA contains G4-formable UUAGGG repeats (26).
In this context, it should be noted that the DNA-binding protein HMGA1a, which has been shown to target ORC to DNA, has an RNA-binding motif (16,27). In addition, AlF-C, which has been shown to bind to the origin region in the rat aldolase B gene and interact with ORC, is a member of the hnRNP family (28,29). These facts raise the possibility that certain proteins that bind to both DNA and RNA can recruit ORC to the sites where they bind. The importance of the RNA-binding activity of such RNA/DNA-binding proteins for specifying ORC binding in vivo is the next question to be examined.
In contrast with RNA-binding activity, the significance of G-rich ssDNA-binding activity of human ORC is enigmatic. Nevertheless, it is intriguing that the sequence-specificity of RNA/ssDNA binding of human ORC coincides with a common signature of human replication origins (i.e. G4-preferable sequences) (14,15,30). Human ORC does not exhibit a preference for G-rich dsDNA as shown in Fig. 1B; hence it is unlikely that human ORC can identify the G-rich sequence of dsDNA in the origin region out of the whole DNA sequence. It has been shown that S. cerevisiae ORC can also bind to ssDNA (31), and in this previous report, it was suggested that ORC binding to ssDNA is needed for ORC to stay on the unwound DNA in the origin region, so that ORC remains bound on origins even after initiation of DNA replication. In line with this scenario, the G-rich ssDNA-binding activity of human ORC may function in targeting the protein to the G-rich sequence-harboring strand just after DNA unwinding of the origin region.
Another scenario is that the G-rich sequence serves as a target for ORC to specify the active replication origin. This scenario requires some mechanism by which a single-stranded G-rich region within the origin is formed before ORC binds to the origin. This situation may be possible when an R-loop is formed upon transcription within the origin region (32). The correlation between transcription and origin activation has been suggested previously (33-39), but R-loop formation in the origin region is totally unknown. On the other hand, it should be noted that Tetrahymena ORC binds to certain RNA molecules that can recruit ORC to the specific origin region (40). Thus, it is worth investigating whether the G-rich RNA/ssDNA-binding activity of ORC is involved in specifying the active replication origin in metazoan cells, probably by using a mutant ORC1 lacking such activity.
It is worth noting that the G-rich RNA/ssDNA-binding domain of human ORC1 resembles part of DNMT1 (41). The similarity between ORC1 and DNMT1 is also observed beyond this domain; bromo-adjacent homology (BAH) domains exist in both at the similar positions relative to the RNA/SAM-binding domains. It would be interesting to compare the overall structure between ORC1 and DNMT1. As shown previously (16) and also in Fig. 4C, there must be another RNA-binding domain in the N-terminal region of ORC1 (for example, amino acids 2-412 in Fig. 4C). Thus, multiple domains in ORC1 may be involved in RNA/ssDNA binding.
Although our results suggest that ORC recognizes G4-formed RNA/ssDNA in a structure-specific manner, it will be necessary to examine further whether the G4 structure is sufficient for preferential binding of ORC. Nevertheless, it is worth considering the possibility that the G4 structure, which can be induced by transcription (42), may modulate ORC function in replication and/or other nuclear functions.
G-rich RNA/ssDNA inhibited dsDNA binding of human ORC in our assay. However, these results do not necessarily mean that the same inhibition occurs in vivo, because the length of probes/competitors used in this study is limited. It has been reported that ORC4 has an affinity to DNA triplex (43) and that ORC6 has a TFIIB-like DNA-binding domain (44). Thus, it may be possible that the entire ORC binds simultaneously to dsDNA and RNA/ssDNA if an appropriate substrate is used. It would be necessary to determine the precise architecture of the entire ORC bound to a relatively long dsDNA substrate that contains single-stranded DNA or R-loop.