To whom correspondence may be addressed: Structural Genomics Consortium, University of Toronto, 101 College St., Toronto, ON M5G 1L7, Canada. Tel.: 416-946-3868.
Structural Genomics Consortium, University of Toronto, Toronto, Ontario M5G 1L7, and Department of Physiology, University of Toronto, Toronto, Ontario M5S 1A8, Canada
* The SGC is a registered charity (number 1097737) that receives funds from AbbVie, Bayer Pharma AG, Boehringer Ingelheim, Canada Foundation for Innovation, Eshelman Institute for Innovation, Genome Canada, Innovative Medicines Initiative (the European Federation of Pharmaceutical Industries and Associations (European Union)) under ULTRA-DD Grant 115766, Janssen, Merck & Co., Novartis Pharma AG, Ontario Ministry of Economic Development and Innovation, Pfizer, Sao Paulo Research Foundation-Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP), Takeda, and the Wellcome Trust.
N6-Methyladenosine (m6A) is the most abundant internal modification in RNA and is specifically recognized by YT521-B homology (YTH) domain-containing proteins. We recently reported that YTHDC1 prefers guanosine and disfavors adenosine at the position preceding the m6A nucleotide in RNA and preferentially binds to the GG(m6A)C sequence. Here, we systematically characterized the binding affinities of the YTH domains of three other human proteins and of the yeast YTH domain protein Pho92 and determined the crystal structures of the YTH domains of human YTHDF1 and yeast Pho92, each in complex with a 5-mer m6A RNA. Our binding and structural data revealed that the YTH domain used a conserved aromatic cage to recognize m6A. Nevertheless, none of these YTH domains, except YTHDC1, displayed sequence selectivity at the position preceding the m6A modification. Structural comparison of these different YTH domains revealed that only YTHDC1 harbors a distinctly selective binding pocket for the nucleotide preceding the m6A nucleotide.
), and its distribution and functions are just beginning to be appreciated with the aid of recently developed high throughput sequencing technologies (
). In mRNA, m6A is found to be enriched in the 3′-UTR and stop codons, suggesting a role in the regulation of gene expression at the post-transcriptional level (
), its function had remained largely unknown until recently when its writer, erasers, and readers were identified. Three groups independently reported that the METTL3-METTL14 complex catalyzes the N6-adenosine methylation with high efficiency (
). The Schizosaccharomyces pombe YT521-B homology (YTH) domain-containing protein Mmi1 was reported to be an RNA binding protein and responsible for the selective elimination of meiosis-specific transcripts during vegetative growth (
). The only known YTH domain-containing protein in Saccharomyces cerevisiae Pho92 also decreases the Pho4 mRNA stability by binding to its 3′-UTR during the phosphate metabolism (
). In the YTH domain complex structures, the m6A base is positioned in a hydrophobic pocket consisting of two or three aromatic residues. The m6A binding mode is reminiscent of the methyllysine and methylarginine binding mode (
). Furthermore, our structural and binding studies of YTHDC1 also established that YTHDC1 slightly prefers a G nucleotide and disfavors an A nucleotide at the position preceding the m6A nucleotide in RNA (
). In the human genome, there are at least five YTH domain-containing proteins, and we wonder whether these YTH domains have distinct sequence selectivity, similar to the Royal family or PHD domain histone readers (
In this study, we systematically characterized the m6A binding ability of four human YTH domains and the YTH domain of yeast Pho92 and determined the crystal structures of the YTH domain of human YTHDF1 and its complex with a 5-mer m6A RNA, and the YTH domain of yeast Pho92 in complex with a 5-mer m6A RNA. Our results indicate that the YTH domains recognized the m6A nucleotide in a conserved mode. Whereas YTHDC1 preferentially bound to the GG(m6A)C sequence (
), the other YTH domains bound to m6A RNA regardless of sequence context. Structural comparison of the YTHDC1 and YTHDF1 complexes revealed that YTHDC1 and YTHDF1 used different binding pockets to accommodate the nucleotide preceding m6A, corresponding to divergent binding grooves for accommodating RNA 5′ to m6A. Our structural and mutagenesis studies also pinpointed the key residues responsible for the differential sequence selectivity.
Experimental Procedures
Cloning, Expression, and Purification of the YTH Domains of YTHDF1, YTHDF2, YTHDC2, and Pho92
The Pho92 YTH domain (amino acids 141–306), two constructs of the human YTHDF1 YTH domain (amino acids 361–559 and amino acids 365–554), the YTHDF2 YTH domain (amino acids 380–579), and MJECL36 (amino acids 1–147) were subcloned into a pET28a-MHL vector. All recombinant proteins were overexpressed at 18 °C as N-terminal His6-tagged proteins in Escherichia coli BL21 (DE3) codon plus RIL (Stratagene) and purified on a HiTrap nickel column. The His tag was removed by the addition of 0.05 mg of TEV protease per milligram of recombinant protein, followed by dialysis at 4 °C for 12 h to remove imidazole. The samples were then passed through a nickel-nitrilotriacetic acid column and further purified by size exclusion chromatography (Superdex 75; GE Healthcare). The mutants of the Pho92 and YTHDF1 YTH domains were cloned using site-directed mutagenesis kits (Invitrogen) and were expressed and purified in the same way as the wild type recombinant proteins.
Crystallization, Data Collection, and Structure Determination
All crystals were obtained using sitting drop vapor diffusion at 20 °C. Crystals of the YTHDF1 YTH domain (361–559) were obtained by mixing 1.0 μl of purified protein (15 mg/ml) with 1.0 μl of reservoir solution containing 0.2 M potassium thiocyanate and 25% PEG 3350 and subsequent equilibration against 800 μl of reservoir buffer. The purified YTHDF1 YTH domain (365–554, 12 mg/ml) was mixed with the 5-mer GG(m6A)CU RNA in a ratio of 1:2 and incubated for 30 min at 0 °C before co-crystallization. Crystals of the complex were obtained by mixing 1.0 μl of the complex with 1.0 μl of reservoir solution containing 0.1 M Bis-Tris, pH 6.5, 0.2 M sodium chloride, and 25% PEG 3350 and subsequent equilibration against 800 μl of reservoir buffer. The purified Pho92 YTH domain (141–306, 10 mg/ml) was mixed with the 5-mer UG(m6A)CU RNA in a ratio of 1:2 and incubated for 30 min at 0 °C before co-crystallization. A crystal of the complex was obtained by mixing 1.0 μl of the complex with 1.0 μl of well solution containing 0.2 M potassium isocyanate and 20% PEG 3350 and subsequent equilibration against 800 μl of reservoir buffer.
Diffraction intensities were recorded under sample cooling to 100 K at rotating anode or synchrotron sources (see Table 2) and processed with XDS (
) was used for automated model building. The preliminary model was further refined against higher resolution diffraction data from another, isomorphous crystal.
The YTHDF1 oligoribonucleotide complex structure was solved by molecular replacement with coordinates from the YTHDF1 “apo” structure. Some nucleotide link restraints were prepared with JLIGAND (
The Pho92 structure was solved by molecular replacement with an ensemble of “apo” models of YTHDC1 (related to PDB entry 4R3H) and YTHDF1 (see above). Density improvement was performed with ARP/WARP (
All ITC experiments were performed at 298 K using a MicroCal ITC200 (GE Healthcare). All RNAs used for ITC experiments were purchased from Thermo Fisher Scientific except the unmodified GGACU, which was purchased from Integrated DNA Technologies, Inc. The purity of all purchased RNAs was >90%. All proteins and RNAs were dialyzed or dissolved into the same buffer containing 20 mM Tris, pH 7.5, 150 mM NaCl before the binding experiments. 10–17 injections were recorded by injecting 2 μl of 500–1000 μM RNA into a sample well containing 15–60 μM protein. The concentrations of the purified proteins and RNA oligonucleotides were estimated by absorbance spectroscopy (NanoDrop) using their extinction coefficients at 280 and 260 nm, respectively. Each binding isotherm was plotted, analyzed, and fitted with a one-site binding model in Origin software (MicroCal Inc.) after subtraction of the respective control.
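The one-site binding model mentioned above can be illustrated with a minimal sketch. This is not the Origin fitting routine; it only solves the standard 1:1 mass-action equation for the concentration of complex, which is the quantity a one-site fit relates to the measured heats. The function names and the example concentrations (30 μM protein, a 5 μM dissociation constant) are assumptions chosen for illustration and are not values from this study.

```python
import math


def complex_concentration(p_total, l_total, kd):
    """Concentration of a 1:1 protein-ligand complex.

    All arguments share the same concentration unit (e.g. micromolar).
    Solves Kd = [P][L]/[PL] with mass conservation, taking the
    physically meaningful (smaller) root of the quadratic.
    """
    b = p_total + l_total + kd
    return (b - math.sqrt(b * b - 4.0 * p_total * l_total)) / 2.0


def fraction_bound(p_total, l_total, kd):
    """Fraction of protein that carries a bound ligand."""
    return complex_concentration(p_total, l_total, kd) / p_total


if __name__ == "__main__":
    # Hypothetical titration: 30 uM protein, Kd = 5 uM (illustrative only).
    for ligand in (0.0, 15.0, 30.0, 60.0, 120.0):
        print(ligand, round(fraction_bound(30.0, ligand, 5.0), 3))
```

In a real fit, the heat of each injection is proportional to the change in complex concentration between successive injections, and the dissociation constant, enthalpy, and stoichiometry are adjusted until the predicted heats match the isotherm.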
Results and Discussion
YTH Domain Is an Evolutionarily Conserved m6A Reader
The m6A modification has been found from yeast to human, and recent m6A transcriptome analysis reveals that m6A is dominantly present in the RRACU (where R = A/G) consensus motif in mammals (
). Consistently, a previous study using the borate gel chromatography reveals that the m6A modification occurs within a G(m6A)C or A(m6A)C motif with probabilities of 70 and 30% in mammals, respectively (
). In the human genome, there are at least five YTH domain-containing proteins, and our recent structural and biochemical studies showed that YTHDC1 utilizes an aromatic cage to recognize m6A and has a modest preference for the GG(m6A)C sequence (
). To understand the binding specificity and sequence selectivity of other YTH domain-containing proteins, we cloned the YTH domains of YTHDF1/2, YTHDC2, and S. cerevisiae Pho92 for quantitative binding and structural studies. Our ITC binding results show that both the human and yeast YTH domains recognized m6A-containing RNA but not unmodified RNA, regardless of the RNA length (Table 1), implying that the YTH domain is an evolutionarily conserved m6A-dependent RNA binding domain. Furthermore, the YTHDF1/2, YTHDC2, and Pho92 YTH domains did not display sequence selectivity at the −1 position (the position preceding the m6A nucleotide), unlike the YTHDC1 YTH domain, which preferred GG(m6A)CU over GA(m6A)CU containing 9-mer RNA (Table 1; see Fig. 4, A–D), consistent with our previous findings using 5-mer RNAs or 16-mer RNAs (
). In addition, the YTH domains bound to the 9-mer m6A RNAs with similar binding affinities to the 16-mer m6A RNAs, but much weaker to the 5-mer GG(m6A)CU RNA (Table 1), suggesting that the immediate surrounding nucleotides of m6A also contribute to the YTH domain binding.
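The RRACU consensus motif (R = A/G) discussed in this section can be located in a sequence with a short pattern search. The sketch below is illustrative only: the function name and the example sequence are assumptions, and it scans for the unmodified consensus context rather than predicting which sites are actually methylated.

```python
import re

# R stands for a purine (A or G). The lookahead lets overlapping
# occurrences be reported instead of only non-overlapping ones.
RRACU = re.compile(r"(?=([AG][AG]ACU))")


def find_rracu_sites(rna):
    """Return (index of the central A, matched 5-mer) for each RRACU site.

    Indices are 0-based. DNA-style input is accepted: T is read as U.
    """
    rna = rna.upper().replace("T", "U")
    return [(m.start() + 2, m.group(1)) for m in RRACU.finditer(rna)]


if __name__ == "__main__":
    # Hypothetical sequence containing one GGACU site.
    print(find_rracu_sites("CCGGACUAA"))
```

The central adenosine index is reported because that is the position that would carry the N6-methyl group, so the nucleotide at index minus one is the −1 position examined in the binding experiments.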
TABLE 1. Binding affinities of different YTH domains to RNAs
The data are from Ref. 24. Note that the data represent the mean values ± S.E., and standard deviations are calculated from the ITC curve fitting by Microcal Origin software.
FIGURE 4. YTHDF1 specifically recognizes m6A RNA without sequence selectivity. A–D, the representative ITC binding curves of YTHDF1 binding to different 16-mer m6A RNAs. A, 16-mer GG(m6A)CU. B, 16-mer GA(m6A)CU. C, 16-mer GC(m6A)CU. D, 16-mer GU(m6A)CU. E–G, the methylated nucleotide is shown as green sticks. The residues involved in binding the m6A are shown as orange sticks. E, the m6A binding pocket of wild type YTHDF1. F, the hypothetical model of the m6A binding pocket of YTHDF1 with the D401N mutation. A hydrogen bond could be formed between the side chain of Asp401 and the N1 atom of the m6A nucleotide. Asp401 is marked in black. G, the m6A binding pocket of the Pho92 YTH domain.
To provide structural insights into the m6A specific recognition and the different sequence selectivity of the YTH domains, we determined the crystal structure of YTHDF1 (amino acids 361–559) at a resolution of 1.97 Å (Table 2). The YTHDF1 YTH domain adopted a similar architecture to that of the YTHDC1 YTH domain (
), and the root mean square deviation between the backbone Cα atoms of the YTH domains of YTHDF1 and YTHDC1 was 0.9 Å (calculated with PyMOL), although the sequence identity between the two YTH domains was only 27%. The YTHDF1 YTH domain consisted of five α helices (α0–α4), six β strands (β1–β6), and one 3₁₀ helix following the β5 strand (Fig. 1A). The six β strands were arranged in an atypical β barrel fold in the order of β6-β1-β3-β4-β5-β2. The only parallel β strands were β1 and β3, whereas the others were anti-parallel (Fig. 1, B and C). Three helices (α1–α3) packed against the β barrel and constituted a hydrophobic core together with the six β strands. The α1 helix also packed against α0 and α4 (Fig. 1B). The α0 helix was followed by a long loop linker, whereas the α4 was a kinked helix with its axis perpendicular to that of α1 (Fig. 1B).
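For readers unfamiliar with how a root mean square deviation between two sets of Cα atoms is obtained, the following is a minimal sketch of the standard Kabsch superposition followed by an RMSD calculation. It is not the PyMOL procedure used here (PyMOL additionally performs a sequence-based alignment and outlier rejection); it assumes the two coordinate arrays are already matched atom for atom, and the function name and test coordinates are illustrative.

```python
import numpy as np


def kabsch_rmsd(p, q):
    """RMSD between two (N, 3) coordinate sets after optimal superposition.

    Row i of p must correspond to row i of q (e.g. equivalent C-alpha atoms).
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if p.shape != q.shape or p.ndim != 2 or p.shape[1] != 3:
        raise ValueError("p and q must both have shape (N, 3)")

    # Remove translation by centring both sets on their centroids.
    p_c = p - p.mean(axis=0)
    q_c = q - q.mean(axis=0)

    # Optimal rotation from the SVD of the covariance matrix.
    u, _, vt = np.linalg.svd(p_c.T @ q_c)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    p_rot = p_c @ rot.T
    return float(np.sqrt(np.mean(np.sum((p_rot - q_c) ** 2, axis=1))))


if __name__ == "__main__":
    # Hypothetical coordinates: q is p rotated about z and translated.
    p = np.array([[0, 0, 0], [1, 0, 0], [0, 2, 0], [0, 0, 3], [1, 1, 1]], float)
    t = 0.7
    rz = np.array([[np.cos(t), -np.sin(t), 0], [np.sin(t), np.cos(t), 0], [0, 0, 1]])
    print(kabsch_rmsd(p, p @ rz.T + np.array([5.0, -3.0, 2.0])))
```

Because the value depends on which atoms are paired and whether outliers are discarded, RMSD figures from different programs are only comparable when those choices are the same.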
FIGURE 1. Overall structure of the YTHDF1 YTH domain. A, sequence alignment of the YTH domains of human YTHDF1-3, YTHDC1-2, and yeast Pho92 by ESPript 3. B, overall structure of the YTHDF1 YTH domain (amino acids 361–559) shown in orange cartoon. The invisible residues in the apo structure are denoted with dashes. C, topology of the YTH domain of YTHDF1 with marked secondary structure elements.
m6A Is Specifically Recognized by an Aromatic Cage of YTHDF1
We attempted co-crystallization of the human YTHDF1 YTH domain with m6A-containing RNAs of different lengths, but only obtained crystals from the complex with the 5-mer GG(m6A)CU RNA, which diffracted to 1.60 Å resolution (Table 2). In the complex structure, nucleotides 5′-GG(m6A) exhibited density for both the backbone and nucleobases. Additional density was interpreted as representing nucleotide (C+1) and the phosphate of (U+2)-3′ (Fig. 2E). The m6A RNA bound to a positively charged concave surface of YTHDF1 in an extended conformation (Fig. 2, A and B). The m6A binding pocket of the YTHDF1 YTH domain was similar to that of the reported YTHDC1 YTH domain (
) and was composed of the C termini of β1, α1, β2, the N terminus of α2, and the loop between β4 and β5 (Fig. 2A). Specifically, the m6A was accommodated in a pocket composed of Trp411, Trp465, and Trp470, with the ring planes of Trp411 and Trp470 parallel to each other and perpendicular to the ring plane of Trp465 (Fig. 3, bottom right panel). The N6-methyl bond was coplanar with the m6A purine ring, with the methyl group pointing toward Trp465 and the m6A adenine moiety sandwiched by the rings of Trp411 and Trp470 (Figs. 3, bottom right panel, and 4E). The CH-π interaction between the N6-methyl moiety and the aromatic cage, together with the π-π interactions between the adenine base and the aromatic residues, constituted the basis of the m6A specific recognition (Fig. 3, bottom right panel). Additionally, the m6A adenine formed hydrogen bonds with the YTHDF1 YTH domain, including the N3 of m6A and the backbone NH of Tyr397, and the N6 of m6A and the backbone carbonyl of Cys412 (Figs. 3, bottom right panel, and 4E). The side chain oxygen of Asp401 was located in close proximity of the N1 of m6A (Figs. 3, bottom right panel, and 4E).
FIGURE 2. Complex structures of the YTHDF1 YTH domain with GG(m6A)CU and the Pho92 YTH domain with UG(m6A)CU. A, overall structure of the YTHDF1 YTH domain (amino acids 365–554) with a 5-mer RNA GG(m6A)CU. The four visible nucleotides (GG(m6A)C) are shown in green cartoon. m6A is shown in a green stick model. The protein is shown in orange cartoon. B, electrostatic surface representation of the YTHDF1 complex structure by PyMOL. C, overall structure of the Pho92 YTH domain in complex with a 5-mer RNA UG(m6A)CU, with only m6A visible and shown as a green stick model. D, electrostatic surface representation of the Pho92 complex structure. E, the simulated annealing m|Fo| − D|Fc| omit map of the RNA (green) is contoured at 2.8σ. The protein is shown in orange cartoon. F, the simulated annealing m|Fo| − D|Fc| omit map of the m6A (green) is contoured at 2.8σ. The protein is shown in orange cartoon.
FIGURE 3. Detailed interactions between the YTH domain of YTHDF1 (orange) and the 5-mer RNA −2GG(m6A)CU+2 (green). Top left panel, overall view. Nucleotides and residues involved in binding to RNA are shown in stick mode. The intermolecular hydrogen bonds are shown as black dashes. Bottom left panel, G-1 interactions. Bottom right panel, the m6A pocket. Top right panel, C+1 interactions.
To further evaluate the role of the m6A interacting residues of YTHDF1 in binding, we performed mutagenesis and ITC binding experiments. Mutating Trp411, Trp465, or Trp470 to an alanine disrupted the binding of YTHDF1 to the 5-mer GG(m6A)CU RNA, confirming that the aromatic residues were critical for the m6A specific recognition (Table 3).
TABLE 3. Binding affinities of different RNA constructs to the wild type and mutants of the YTH domains of YTHDC1, YTHDF1, and Pho92
Note that the data represent the mean values ± S.E. Standard errors and n values are calculated from the ITC curve fitting by Microcal Origin software.
The Binding Site of the YTHDF1 YTH Domain for the Nucleotide Preceding m6A Is Distinct from That Found in the YTHDC1 YTH Domain
In addition to the m6A specific binding, the m6A RNA also made contacts with the YTHDF1 YTH domain through other nucleotides (Fig. 3, top left panel). To understand the different sequence selectivity of the YTH domains of YTHDC1 and YTHDF1 toward the −1 position of the m6A RNA, we superimposed the crystal structures of YTHDC1 and YTHDF1, respectively, in complex with the 5-mer GG(m6A)CU RNA and found that only the 3′ ends including m6A of the two RNAs coincided, whereas the 5′ ends of the two RNAs deviated (Fig. 5A). In the YTHDF1 complex, G-1 was sandwiched between G-2 and the ring of Tyr397 (Fig. 3, bottom left panel). Tyr397 is conserved in YTHDF1/2/3, and mutating Tyr397 to alanine significantly reduced its binding affinity to both the 5-mer and 16-mer m6A RNA (Table 3). Whereas Tyr397 was important to m6A RNA binding, it appeared able to stack with any nucleotide via π-π interactions, consistent with our binding data of similar affinities of the m6A RNA with different substitutions at the −1 position (Table 1). In the previously published YTHDC1 complex, on the other hand, G-1 interacts with Leu380 and Met438 and forms a direct H-bond with the backbone NH of Val382 and a solvent-mediated H-bond with the side chain of Asn383 (
) (Fig. 5B). Specifically, replacing the G with an adenosine at the −1 position would disrupt the Val382 hydrogen bond or introduce potential steric clashes with Val382, which explains why YTHDC1 disfavors an A nucleotide at the −1 position in the m6A RNA (
). In addition, neither Met438 nor Leu380 is conserved in other YTH domains except YTHDC1 (Figs. 1A and 6A), and mutating either of them in YTHDC1 would diminish its binding to the m6A RNA and attenuate the nucleotide preference at the −1 position for YTHDC1 (Table 3). Therefore, YTHDC1 and YTHDF1 used different binding pockets to accommodate the nucleotide preceding m6A, which leads to different nucleotide selectivity at the −1 position (Figs. 3, bottom left panel, and 5B).
FIGURE 5. YTHDC1 has a selective binding pocket at the −1 position to accommodate the nucleotide preceding m6A, distinct from other YTH domains. A, superposition of the crystal structures of the YTHDF1 (orange cartoon)-GG(m6A)CU (green cartoon) and YTHDC1 (blue cartoon)-GG(m6A)CU (yellow cartoon) (PDB entry 4R3I). The m6A nucleotides are shown in stick mode. B–D, the G-1 binding pocket of the YTHDC1 YTH domain is superimposed with the corresponding pocket of the YTHDF1 YTH domain (B), the Pho92 YTH domain (C), and the YTHDC2 YTH domain (D, PDB entry 2YU6). The G-1 nucleotide is shown as a yellow stick model. Leu380, Val382, Asn383, and Met438 of YTHDC1 and their corresponding residues in the YTH domains of YTHDF1, YTHDC2, and Pho92 are shown as sticks.
FIGURE 6. Structure comparison between YTHDF1, Pho92, and YTHDC2 (PDB entry 2YU6). A, sequence alignment of the YTH domains of human YTHDC1, YTHDC2, and YTHDF1-3. The secondary structures of the YTHDC1 YTH domain are shown at the top of the sequences according to Ref.
. The two critical G-1 binding residues of YTHDC1, Leu380 and Met438, which are not conserved in other YTH family members, are marked in blue. B, superposition of the crystal structures of YTHDF1 (salmon cartoon)-GG(m6A)CU (green cartoon), YTHDC2 (yellow cartoon), and Pho92 (blue cartoon). The m6A nucleotide is shown as green sticks. C, superposition of the m6A binding pockets of the YTH domains of YTHDF1 (salmon stick), YTHDC2 (yellow stick), and Pho92 (blue stick). D, the binding pocket for the nucleotide at the −1 position in the YTHDC1 (gray ribbon) and GG(m6A)CU (green stick for G-1) complex with YTHDF1 (salmon ribbon), YTHDC2 (yellow ribbon), and Pho92 (blue ribbon) superimposed on it. Met438 of YTHDC1 and its corresponding residues in the YTH domains of YTHDF1, YTHDC2, and Pho92 are shown as stick models. E, the binding pocket for the nucleotide at the −1 position in the YTHDF1 (salmon ribbon) and GG(m6A)CU (green stick for G-1) complex with Pho92 (blue ribbon) superimposed on it. Tyr397 and its corresponding residue in Pho92 are shown as stick models.
The base of C+1 in the YTHDF1-GG(m6A)CU complex could stack with the side chain of Arg506, judging from the weak electron density. In addition, the 5′-phosphate of C+1 formed a hydrogen bond with the backbone of Asp507 and could interact electrostatically with the side chain of Arg506. Arg506 is highly conserved in the YTH domain (Fig. 1A), and mutating Arg506 to an alanine diminished the binding of YTHDF1 to the 5-mer RNA (Table 3). The 5′-phosphate moiety of U+2 formed hydrogen bonds with the backbone NH of Gly442 and the side chain of Lys395 (Fig. 3, top right panel). Except for m6A, no base-specific interactions were observed in the YTHDF1 complex. Nevertheless, the nucleotides surrounding m6A made significant contributions to the m6A RNA binding, and deleting U+2 and G-2 significantly reduced binding of m6A RNA (Table 3).
Crystal Structure of Yeast Pho92 with a 5-mer m6A RNA Reveals a Conserved m6A Binding Pocket
To explore whether the m6A binding mode of the YTH domain is also conserved in yeast, we determined the crystal structure of the S. cerevisiae Pho92 YTH domain (141–306) in complex with a 5-mer m6A RNA at a resolution of 1.80 Å (Table 2). We only located m6A of the 5-mer RNA in the electron density map (Fig. 2F). Overall, the Pho92 YTH domain adopted the canonical YTH fold similar to those of the human YTH domains, consisting of three α helices and six β strands (Fig. 1A). The root mean square deviation between the YTH domains of Pho92 and YTHDF1 was 1.4 Å (
) at a sequence identity between YTH domains of 35% (Figs. 1A; 2, C and D; and 6B). Despite the overall similar architectures between the two structures, the Pho92 YTH domain did not contain the helices α0 and α4 at the N and C termini but contained long loops in their places, respectively. Both loops packed against β5 (Figs. 2, C and D, and 6B).
In the Pho92-m6A complex structure, m6A was accommodated in an aromatic cage consisting of Trp177, Trp231, and Tyr237 in a similar way to that of YTHDF1 (Figs. 4G and 6C). Therefore, the YTH domain used an evolutionarily conserved aromatic cage to recognize m6A, and mutating the cage residues Trp177 and Trp231 or the C+1 binding residue Arg273 of Pho92 to an alanine would diminish or disrupt its binding (Table 3).
Sequence alignment between the YTH domains of Pho92 and YTHDC1 showed that the nucleotide binding pocket at −1 position of YTHDC1 was not conserved in Pho92 (Figs. 5D and 6, A and D). Like YTHDF1, the Pho92 YTH domain bound to m6A RNA without obvious sequence preference (Table 1). However, Tyr397 of YTHDF1 corresponded to Ser163 in Pho92 (Figs. 1A, 3D, and 6A); thus Pho92 would potentially recognize the nucleotide at −1 position with a pocket different from either that of YTHDC1 (Fig. 5D) (
N1 Atom of m6A Nucleotide Favors an Asn over an Asp in the Aromatic Cage Pocket
When comparing the m6A binding pocket of YTHDF1 with that of YTHDC1, we found that Asn367 in YTHDC1 is replaced by an Asp (Asp401) in YTHDF1. m6A is neutral in charge. Assuming the N1 atom of m6A in the “free” rather than protonated state (
), deprotonation of Asp401 in YTHDF1 should weaken binding. On the other hand, mutating Asp401 to an asparagine should enhance the binding by introducing a potential hydrogen bond donor to complement the “free” N1 of m6A (Fig. 4, E and F). Accordingly, we found that the YTHDF1 D401N mutant bound the GG(m6A)CU 5-mer RNA 16-fold more strongly than the wild type (Table 3 and Fig. 4, E and F). Furthermore, we mutated Asn367 of YTHDC1 to an Asp and found that the N367D mutant disrupted binding of the GG(m6A)CU 5-mer RNA (Table 3). Together, these results indicate that this key residue difference could well explain the differential binding ability of the YTH domains toward the m6A RNA.
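To put the 16-fold affinity gain of the D401N mutant into energetic terms, a fold change in dissociation constant can be converted to a difference in binding free energy with ΔΔG = −RT ln(fold change). The sketch below is a back-of-the-envelope illustration, not an analysis performed in this study; the function name is an assumption, and the temperature of 298 K is taken from the ITC conditions.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def ddg_from_fold_change(fold_tighter, temperature_k=298.0):
    """Binding free energy change (kcal/mol) for a fold change in affinity.

    fold_tighter is Kd(reference) / Kd(variant); values above 1 mean the
    variant binds more tightly and give a negative (favourable) result.
    """
    if fold_tighter <= 0:
        raise ValueError("fold_tighter must be positive")
    return -R_KCAL * temperature_k * math.log(fold_tighter)


if __name__ == "__main__":
    # 16-fold tighter binding at the ITC temperature of 298 K.
    print(round(ddg_from_fold_change(16.0), 2))
```

A 16-fold gain corresponds to roughly −1.6 kcal/mol, which is within the range commonly attributed to a single additional hydrogen bond and is therefore compatible with the proposed Asn401 to N1 interaction.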
From the sequence alignment of human YTH domain-containing proteins and yeast Pho92, we found that two of the three aromatic cage residues (such as Trp411 and Trp465 in YTHDF1 and Trp177 and Trp231 in Pho92) were absolutely conserved in all YTH domains (Fig. 1A). In contrast, Trp470 of YTHDF1 (or Tyr237 of Pho92) corresponds to a leucine residue in YTHDC1/2 (Figs. 1A and 6C).
The Fold of the YTH Domain Is Similar to That of the EVE Domain
A structural similarity query on the Fatcat server (
). YTHDF1 and MJECL36 had a root mean square deviation of 3.2 Å when we aligned the backbone Cα atoms of these two structures (Fig. 7A). Interestingly, MJECL36 also harbored a similar aromatic cage comprising Trp25, Phe79, and Phe90 (Fig. 7B). Despite their structural similarity, we found that the backbone of Asn12 in MJECL36 deviated from that of Tyr397 of YTHDF1, which might disrupt the base-specific hydrogen bond with N3 of m6A (Fig. 7B). Consistently, our ITC results indicated that MJECL36 did not bind to the 5-mer GG(m6A)CU RNA (Table 3).
FIGURE 7. The YTH fold is similar to that of the EVE domain in prokaryotes. A, superposition of the YTHDF1 (orange cartoon)-GG(m6A)CU (green cartoon) complex and MJECL36 (PDB entry 2P5D, blue cartoon). The m6A nucleotide is shown as green sticks. B, MJECL36 (PDB entry 2P5D, blue ribbon) has a similar pocket to the m6A binding pocket in the YTHDF1 YTH domain (orange ribbon). One hydrogen bond is formed between Tyr397 of the YTHDF1 YTH domain and the N3 atom of the m6A; in contrast, no hydrogen bond is formed between Asn12 of MJECL36 and the N3 atom of the m6A. The m6A nucleotide and the residues forming the pockets are shown as sticks.
Implications for Division of Labor among the YTH Domain Proteins
The subject of the present study, m6A, is the most abundant internal modification underlying various eukaryotic RNAs, including mRNAs, tRNAs, and long noncoding RNAs (
) analysis reveals that m6A is usually present within the GAC or AAC sequence motifs in mammals, and consistently these motifs are also the substrates for the only known m6A methyltransferase METTL3-METTL14 complex (
). The corresponding pockets of other YTH proteins do not structurally support this selectivity (Figs. 5, B and D, and 6, D and E), consistent with no sequence selectivity at the −1 position in non-YTHDC1 proteins (Table 1). On the other hand, YTHDF2 was reported to bind a G(m6A)C motif in an RNA pulldown assay using the total RNA transcripts from the cells (
). The YTH domain of YTHDF2 is 87% identical to that of YTHDF1 in sequence, and all the critical residues in m6A recognition are conserved between YTHDF1 and YTHDF2 (Fig. 1A). Therefore, they should bind a very similar sequence motif, which was confirmed by our ITC binding results (Table 1), i.e. both YTHDF1 and YTHDF2 are m6A specific binders without sequence selectivity at the −1 position (Table 1). One possible explanation for the different motifs obtained from the two different binding assays is: whereas YTHDF1 and YTHDF2 are able to bind to any m6A-containing RNA in vitro, the only known m6A methyltransferase complex, METTL3-METTL14, selectively catalyzes the methylation of the adenosine in a G(m6A)C (70%) or A(m6A)C (30%) context in vivo (
). On the basis of our binding and structural studies, we propose that the preferred physiological ligands of YTHDC1 could be G(m6A)C RNAs. The other human YTH domains could recognize both G(m6A)C and A(m6A)C RNAs as their ligands.
Author Contributions
C. X. and J. M. conceived the research. C. X. cloned, expressed, and purified all YTH domain proteins and their mutants listed in the manuscript with assistance from K. L. and P. L. C. X. performed extensive structural experiments and analyzed the data. C. X. and K. L. performed ITC binding experiments. C. X. and J. M. wrote the manuscript. All authors discussed the manuscript. H. A. and M. S. contributed to the analysis and discussion of the binding characteristics of YTH domain proteins. J. M. supervised the project.
Acknowledgments
We thank Dr. Wolfram Tempel and Dr. John R. Walker for the assistance with data collection and structure determination and for reviewing the YTHDF1 crystal structure. We also thank Dr. Cheryl Arrowsmith and Dr. Adelinda Yee for kindly providing the Methanocaldococcus jannaschii DNA and yeast DNA.