MACROH2A2, a new member of the MACROH2A core histone family.

MACROH2As are core histones that have a unique hybrid structure consisting of an amino-terminal domain that closely resembles a full-length histone H2A followed by a large nonhistone region. The human MACROH2A1 gene, on chromosome 5, encodes two MACROH2A subtypes, MACROH2A1.1 and MACROH2A1.2, produced by alternate splicing. Here we report the identification of MACROH2A2, a new MACROH2A subtype encoded by a separate gene on human chromosome 10, MACROH2A2. The amino acid sequence of human MACROH2A2 is 68% identical to human MACROH2A1.2. We show by immunofluorescence on mouse tissue sections that MACROH2A2, like MACROH2A1.2, is concentrated in the inactive X chromosome. However, MACROH2A2 has a very different pattern of expression in the cell types present in the liver and kidney. When MACROH2A2 and MACROH2A1.2 are present in the same nucleus, they have a similar, though nonidentical, pattern of localization, with both subtypes present in the inactive X chromosome. Our results suggest a developmental role for MACROH2A subtypes.

The MACROH2A core histones were first observed as two 42-kDa proteins that remained associated with rat liver mononucleosomes in 0.5 M NaCl (1). The sequences of cDNAs that encode these proteins revealed that they have a unique hybrid structure consisting of an amino-terminal domain that closely resembles a full-length histone H2A followed by a large nonhistone domain. MACROH2As are released from the nucleosome core at the same salt concentration as conventional H2As and appear to be released as a heterodimer with H2B (1). This indicates that the H2A region of MACROH2A replaces conventional H2A in the nucleosome core. It was estimated that 1 in 30 nucleosomes in rat liver would contain MACROH2A, assuming one MACROH2A per nucleosome (1). The H2A region (amino acids 1-122) is followed by a short linker region (10 amino acids) and then a region that is rich in basic amino acids (28 amino acids) (see Fig. 1). This basic region resembles basic tails present in other histones and most likely binds DNA.
The nonhistone region of MACROH2As is a feature not found in other known core histones. It constitutes 57% of the protein and contains a putative leucine zipper motif consisting of four heptad repeats (1). The majority of the nonhistone region appears to have evolved from a gene of unknown function that originated prior to the appearance of eukaryotes (2). Among the sequences that are similar to the nonhistone region is a conserved domain present in proteins involved in the replication of RNA viruses (2).
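The region sizes quoted above can be cross-checked with a little arithmetic. The sketch below assumes that the nonhistone region begins immediately after the basic region and that the 57% figure is exact, so the total length it derives is an illustrative estimate rather than a value reported here. All variable names are hypothetical.

```python
# Region lengths (amino acids) as stated in the text.
H2A_REGION = 122
LINKER = 10
BASIC_REGION = 28
NONHISTONE_FRACTION = 0.57  # stated share of the protein; assumed exact here

# Assumption: everything before the nonhistone region is H2A + linker + basic.
n_terminal = H2A_REGION + LINKER + BASIC_REGION

# If the N-terminal part is the remaining 43%, estimate the full length.
estimated_total = round(n_terminal / (1.0 - NONHISTONE_FRACTION))
estimated_nonhistone = estimated_total - n_terminal

print(n_terminal, estimated_total, estimated_nonhistone)
```

Under these assumptions the three stated region lengths and the 57% figure are mutually consistent; a rounded percentage would allow the estimated total to vary by a few residues.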
We identified two distinct MACROH2A cDNA sequences (1) and used specific antibodies to assign these sequences to the two MACROH2A proteins that were resolved by electrophoresis (3). We named these subtypes MACROH2A1.1 and MACROH2A1.2. The nucleotide sequences of cDNAs that encode MACROH2A1.1 and MACROH2A1.2 differed only in a single internal segment that encodes part of the nonhistone region (1,3), and these subtypes are formed by alternate splicing of a single gene, macroH2A1, on mouse chromosome 13 (4). Western blots of nuclear extracts from adult and fetal tissues revealed distinct tissue expression patterns that change during development (3). Both subtypes are highly conserved in mammals and birds (2). These results suggest that MACROH2A1.1 and MACROH2A1.2 differ in some aspect of their function (3).
We demonstrated that the inactive X chromosome of female mammals can be distinguished in interphase nuclei as a large MACROH2A-dense domain called a macrochromatin body (MCB) (5). The preferential association of MACROH2A1.2 with the inactive X chromosomes of differentiating female mouse embryonic stem cells is a relatively late event, occurring several days after gene silencing (6). In female embryos, however, association of MACROH2A1.2 with one of the X chromosomes begins as early as the 12-cell stage (7), a time when the trophectoderm cells begin to differentiate. This indicates that MACROH2A1.2 associates with the inactive X chromosome at or near the time of initiation of inactivation in these cells. The only other known feature of X inactivation in preimplantation embryos that occurs before MACROH2A accumulation is the accumulation of Xist RNA (8), a cis-acting RNA transcribed from the X inactivation center of the inactive X chromosome (9-13). The Xist gene is required for the initiation of X inactivation (14-17), but it is not known how Xist RNA functions in the inactivation process. The homology of the nonhistone region of MACROH2A to a viral protein involved in RNA replication, together with these results, led us to hypothesize a possible functional role for Xist RNA in localizing MACROH2A1.2 to the inactive X chromosome (2,5). A connection between Xist and MACROH2A1 was established in a mouse fibroblast model in which deletion of part of the Xist locus from the inactive X chromosome leads to the loss of MACROH2A1 association (18).
Although MACROH2A1.2 is not solely localized to the inactive X chromosome, even in female cells (5), that association provides a framework for understanding the function of MACROH2A proteins. In our current working model, one aspect of their function is to establish and/or maintain transcriptionally silent chromatin domains. Here we present the discovery of a new MACROH2A subtype, MACROH2A2, encoded by a second MACROH2A gene, MACROH2A2. Analysis of the sequence, tissue distribution, and nuclear distribution of MACROH2A2 suggests a developmental role for MACROH2A subtypes.
* This work was supported by Grant GM49351 from the National Institutes of Health. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence(s) reported in this paper has been submitted to the GenBank™/EBI Data Bank with accession number(s) AF151534.
‡ To whom correspondence should be addressed. Tel.: 215-898-0454; Fax: 215-573-5189; E-mail: pehrson@vet.upenn.edu.

EXPERIMENTAL PROCEDURES
MACROH2A2 Identification-A BLAST search of the expressed sequence tag data base (dbEST) (19) with MACROH2A1 nonhistone domain sequences was performed, and a unique but clearly related sequence was found. Three representative clones were obtained, and one (clone identification number 50058 from the Soares infant brain 1NIB library, made from the whole brain of a 73-day-old human female) was sequenced using an ABI 373A sequencer with Taq FS dye terminator chemistry and found to contain the entire open reading frame (GenBank™ accession number AF151534). The amino acid sequence was aligned to the available published MACROH2A1 sequences (1-3, 20, 21) using ClustalW (22).
MACROH2A Genes-The rat macroH2A1 gene was cloned from a rat phage library from which five overlapping clones were isolated and mapped by polymerase chain reaction, and intron-exon junctions were sequenced using primers derived from the cDNA sequence (3). The mouse macroH2A1 gene was previously mapped (4). The human MACROH2A genes were discovered in the data base from the Human Genome Sequencing Project using the human cDNA sequences (20,21) in a BLAST search. MACROH2A1 is in gi 12731492 from human chromosome 5, and MACROH2A2 is in gi 12735407 from human chromosome 10.
Antibodies-The nonhistone region of the MACROH2A2 cDNA was amplified with the forward tailed primer GGAAGGATCCCAAAGGACAGCGATAAA and the reverse tailed primer CGAAGAATTCCCCTGCTGGAAAGTGCGG, digested with BamHI and EcoRI, and cloned into pGEX-2TK (Amersham Pharmacia Biotech). The GST-MACROH2A2 nonhistone region fusion protein was expressed in bacteria, and the thrombin-cleaved protein was isolated as described (5). Antisera were raised in rabbits (Cocalico Biologicals), bound to a fusion protein affinity column, eluted, and passed through a GST-MACROH2A1.2 nonhistone region affinity column to remove cross-reacting antibodies (5). The rabbit antibody against the nonhistone region of MACROH2A1.2 has been described (5). Direct antibody labeling was accomplished using the FluorReporter Texas Red-X, Fluorescein-EX, Alexa Fluor 488, and Alexa Fluor 594 protein labeling kits from Molecular Probes as suggested by the manufacturer. Antibodies against connexin 26 (N-19) and connexin 32 (C-20) were obtained from Santa Cruz Biotechnology, Inc.
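The cloning strategy above relies on each tailed primer carrying the recognition site of one of the two enzymes. As a minimal sketch, the check below locates the standard BamHI (GGATCC) and EcoRI (GAATTC) recognition sequences in the two primers as printed, with the line-break hyphens in the printed sequences removed. The function and variable names are illustrative and are not part of the original protocol.

```python
# Standard recognition sequences for the two enzymes named in the text.
SITES = {"BamHI": "GGATCC", "EcoRI": "GAATTC"}

# Primer sequences as given in the text (5'->3'), line-break hyphens removed.
PRIMERS = {
    "forward": "GGAAGGATCCCAAAGGACAGCGATAAA",
    "reverse": "CGAAGAATTCCCCTGCTGGAAAGTGCGG",
}


def find_sites(seq):
    """Return {enzyme: 0-based start index} for each site present in seq."""
    seq = seq.upper()
    return {name: seq.find(site) for name, site in SITES.items() if site in seq}


for name, seq in PRIMERS.items():
    print(name, find_sites(seq))
```

Each primer contains exactly one of the two sites, preceded by four extra bases at its 5′ end, which is consistent with directional cloning of the amplified fragment after the double digest.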
Western Blot-To compare the relative levels of the MACROH2A subtypes in mouse tissue, nuclei were isolated from adult mouse liver and kidney following the procedure of Blobel and Potter (23), except that the nuclear isolation buffer was 0.4 M mannitol, 60 mM KCl, 15 mM NaCl, 0.15 mM spermine, 0.5 mM spermidine, 2 mM EDTA, 0.5 mM EGTA, 15 mM triethanolamine, pH 7.4, containing 0.3 mM phenylmethanesulfonyl fluoride and 6 µg/ml aprotinin, and the nuclei were centrifuged through a lower layer that contained 2 M, rather than 2.3 M, sucrose. Nuclear extracts were prepared (2), and equivalent amounts of the extracts (in terms of DNA content, as determined by absorbance at 260 nm) were loaded, except for the MACROH2A2 blot, where 2.5 times more liver extract was loaded to detect a band. Proteins were separated by SDS gel electrophoresis and transferred onto polyvinylidene difluoride membranes (3). The blocked membranes were incubated overnight with the primary antibody followed by a secondary peroxidase-conjugated donkey anti-rabbit IgG, and the signal was detected using SuperSignal West Femto Maximum Sensitivity Substrate (Pierce).
To compare the specificity of labeled versus unlabeled MACROH2A antibodies, nuclei were isolated from adult mouse kidneys, and proteins were separated and transferred as described above. One strip was stained directly with Coomassie Brilliant Blue, the others were incubated overnight with the primary antibody followed by a secondary alkaline phosphatase-conjugated mouse anti-rabbit IgG, and the signal was detected using the 1-Step NBT/BCIP detection system (Pierce).
Tissue Immunofluorescence-Immunofluorescence on adult mouse frozen tissue sections was performed as described (5).
Immunofluorescence/Fluorescence in Situ Hybridization-Immunofluorescence/fluorescence in situ hybridization on adult mouse frozen tissue sections was performed as described (5).

RESULTS
Sequence Analysis of Histone MACROH2A2-We identified cDNAs in dbEST (19) that encode a new MACROH2A subtype that we named MACROH2A2. A full-length human cDNA clone was sequenced, and the encoded amino acid sequence was aligned with the published MACROH2A1 sequences (Fig. 1a). Although MACROH2A2 has the same basic architecture as the MACROH2A1 subtypes, it is not encoded by the MACROH2A1 gene, because its cDNA and amino acid sequences differ from the MACROH2A1 subtypes along its entire length. Overall, the amino acid sequence of MACROH2A2 is 68% identical to that of MACROH2A1.2. The H2A region of MACROH2A2 is significantly more similar to the H2A region of the MACROH2A1 subtypes, 84% identical, than it is to conventional H2A, 66% identical to a human H2A (Fig. 1b). The amino acid sequence of the basic region of MACROH2A2 is only 25% identical to that of the MACROH2A1 subtypes, although its size and basic character are very similar.
The nonhistone region of MACROH2A2 is 64% identical to MACROH2A1.2. The region corresponding to the alternately spliced exons in MACROH2A1.1 and MACROH2A1.2 (underlined in Fig. 1a) is more similar to MACROH2A1.2 both in length (33 amino acids) and sequence (48% identical). We have no evidence that the transcript from the MACROH2A2 gene is alternately spliced. The essential elements of the putative leucine zipper region are conserved except in the third repeat, where there is a threonine in the "d" position (Fig. 1a). On the basis of studies of other leucine zipper proteins, the presence of a threonine in this position would not necessarily preclude a coiled-coil interaction (24). The bacterial and viral proteins that are similar to the nonhistone region of MACROH2A1 subtypes (2) show a similar degree of homology to the nonhistone region of MACROH2A2 (data not shown).
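The percent identities quoted in this section are derived from pairwise comparisons within the alignment. The text does not state which denominator was used, so the following is only a sketch of one common convention: identical residues divided by the number of aligned columns in which neither sequence has a gap. The function name and the toy sequences are hypothetical and are not MACROH2A sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns in which both sequences carry a residue.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Toy aligned fragments (hypothetical): 6 comparable columns, 4 identical.
print(round(percent_identity("MKV-LSA", "MKITLSG"), 1))
```

Other conventions (for example, dividing by the length of the shorter sequence or by the full alignment length) give slightly different values, which is worth keeping in mind when comparing identities across reports.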

FIG. 2.
Comparison of the exon structure of MACROH2A genes. The human MACROH2A2 gene is from gi 12735407 from human chromosome 10. The human MACROH2A1 gene is from gi 12731492 from human chromosome 5. There are apparently two optional E1s, the first found in only one reported sequence (21). The mouse macroH2A1 gene is on mouse chromosome 13 and was described previously (4). The structure of the rat macroH2A1 gene is from this report, and its chromosomal location is unknown. The transcription initiation sites are not known, and, therefore, E1 sizes are minimums. The coding regions are shaded. bp, base pairs.

MACROH2A Gene Structure-The gene for mouse MACROH2A1 contains 10 exons and is located on chromosome 13 (4). Here we report a description of the rat and human MACROH2A1 genes as well as the human gene for MACROH2A2 (Fig. 2). The rat gene was identified from five overlapping clones from a rat phage library. The human genes were identified from the sequence data base from the human genome sequencing project. The complete MACROH2A1 locus is available. The rat macroH2A1 gene consists of nine exons spread over more than 60 kilobases. The human MACROH2A1 gene consists of 11 exons spread over almost 65 kilobases. The first exon of the macroH2A1 gene of rat and mouse is noncoding. In the human gene the first two exons are noncoding and alternately spliced. In fact there is only one example of the first of the two being used (21), and it has yet to show up in dbEST. The same situation may apply to rat and mouse; however, there is no evidence for this in the data base yet. The last exons of all known MACROH2A1 genes have long 3′ untranslated regions. The mouse MACROH2A1 message was reported with a short 3′ untranslated region (4); however, a survey of mouse expressed sequence tags indicates that the 3′ untranslated region of mouse macroH2A1 cDNAs is about 600 nucleotides (the discrepancy is most likely due to an internal stretch of As being mistaken for a poly(A) tail).
One difference between the MACROH2A1 genes of rats, mice, and humans is in the exons encoding the histone region. Exon 2 of the rat macroH2A1 gene is split into two exons in the mouse and human genes, which we therefore here refer to as E2 and E2a.
One peculiarity of the MACROH2A1 gene is the 5′ splice site selection of exon 4, which encodes the very end of the basic region and the beginning of the nonhistone region. We noticed that the two reported cDNA sequences for human MACROH2A1 (20,21) differ here, with the basic region of one ending with three lysines and the other with just two. Examination of the human gene sequence revealed that the first lysine comes from the 3′ end of E3 and that the next two come from an alternate splice site selection of a tandem pair of lysine codons at the 5′ end of E4 as shown in Scheme I. A survey of dbEST suggests that in the human the second site is twice as likely to be selected as the first (33 of 47 informative sequences). For the mouse the splice site selections are equally represented (6 of 12 informative sequences). The 5′ end of E4 in the rat also shows this potential; however, rat sequences from this region are not represented in dbEST. The only rat sequences available from this part of the cDNA use the second site (1, 3). The significance of this phenomenon is unknown.
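The dbEST tallies quoted above can be converted into proportions directly. The sketch below assumes that each count in parentheses is the number of informative sequences using the second site out of the total number of informative sequences; the function name and dictionary keys are illustrative.

```python
def splice_site_usage(second_site, informative):
    """Summarize use of the second splice site among informative sequences."""
    first_site = informative - second_site
    return {
        "first_site": first_site,
        "fraction_second": second_site / informative,
        # Ratio of second-site to first-site usage; None if the first is unused.
        "ratio_second_to_first": (second_site / first_site) if first_site else None,
    }


# Counts as quoted in the text (second site, informative sequences).
counts = {"human": (33, 47), "mouse": (6, 12)}
usage = {species: splice_site_usage(*c) for species, c in counts.items()}

for species, u in usage.items():
    print(species, u["first_site"],
          round(u["fraction_second"], 2), round(u["ratio_second_to_first"], 2))
```

Under that reading, the human ratio is 33 to 14, or a little more than twofold in favor of the second site, while the mouse counts split evenly.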
Whereas MACROH2A1 has been mapped to human chromosome 5 (21), we found the MACROH2A2 gene in sequences from chromosome 10. Most of the MACROH2A2 locus is available, including all the known exons and more than half of the known introns. MACROH2A2 is organized identically to MACROH2A1 (Fig. 2). It also starts with a noncoding exon and ends with a long 3′ untranslated region. The only difference is in exon 5. Whereas MACROH2A1 has two alternately spliced exon 5s, which we refer to here as E5.2 and E5.1, MACROH2A2 has only one, which happens to be slightly more similar to E5.2 of MACROH2A1. No other spliced variants of MACROH2A2 have been found in dbEST, and a BLAST search of dbEST with the complete intron sequences between E4 and E6 of MACROH2A2 did not reveal any other exon candidates.
MACROH2A1.2 and MACROH2A2 Have Distinct but Overlapping Patterns of Expression-We raised antibodies against the nonhistone region of MACROH2A2 to examine the pattern of MACROH2A2 expression and its distribution within the nucleus. Antibodies were affinity-purified and absorbed against a GST-MACROH2A1.2 nonhistone region fusion protein to eliminate cross-reaction with MACROH2A1 subtypes (5). The specificity of these purified antibodies and the MACROH2A1.2 antibodies was tested on a Western blot of mouse liver and kidney nuclear extracts. The MACROH2A2 antibodies detected a single band that runs slightly slower than MACROH2A1.2 (Fig. 3). The MACROH2A1.2 content of mouse liver and kidney is similar. However, the MACROH2A2 content of the kidney extract was higher than that of the liver, even when we loaded 2.5 times more liver extract (Fig. 3).
We examined the distributions of MACROH2A1.2 and MACROH2A2 in different cell types and their distributions within the nucleus by immunofluorescence on frozen tissue sections of mouse liver and kidney. In the liver, MACROH2A1.2 antibodies stained hepatocytes brightly and, to a lesser extent, cells of the bile ducts (Fig. 4, a and b), as reported (5) (hepatocytes are labeled H, and bile ducts are labeled D). In contrast, hepatocytes were only faintly stained with the MACROH2A2 antibodies, whereas the bile ducts, certain cells around the central vein, and a small number of parenchymal cells were brightly stained (Fig. 4, c and d). The parenchymal cells that stain brightly for MACROH2A2 do not appear to be hepatocytes, on the basis of their nuclear morphology.
To see which individual cells stained for MACROH2A1.2 or MACROH2A2, we used direct immunofluorescence with a fluorescein-labeled MACROH2A1.2 antibody and a Texas Red-labeled MACROH2A2 antibody. We used female mouse liver, because inactive X chromosome-associated MCBs are readily apparent in hepatocytes and ductal cells stained for MACROH2A1.2 (Figs. 4a and 5a). In several hepatocytes, two inactive X chromosome-associated MCBs were seen (Fig. 5a), and we have shown that such nuclei are polyploid (5). In the bile ducts both subtypes stained the ductal epithelial cells (Fig. 5, DE), but only MACROH2A2 antibodies stained the periductal cells significantly (Fig. 5, PD). MACROH2A2-containing MCBs were apparent in the ductal epithelium. These MCBs were only present in females (data not shown) and colocalized with MACROH2A1.2 MCBs (Fig. 5, a-c). This indicates that these MCBs involve the inactive X chromosome. Those cells around the central vein (Fig. 5, V) and the non-hepatocyte parenchymal cells (Fig. 5, NHP) that stain with the MACROH2A2 antibody do not stain with the MACROH2A1.2 antibody. MCBs were not evident in many of these nuclei, although they may have been obscured by the relatively high level of MACROH2A2 seen throughout these nuclei.
These subtypes also showed distinct expression patterns in the kidney (Fig. 6). The cortex of the kidney contains several prominent structures including the proximal and distal convoluted tubules and the glomeruli (the site of blood filtration). Proximal convoluted tubules (Fig. 6, P) were identified by their staining with an antibody to the gap junction protein connexin 32 (25) (shown in green in b and d). MACROH2A1.2 and MACROH2A2 were both found in the proximal convoluted tubules (Fig. 6, a and b for MACROH2A1.2; c and d for MACROH2A2). We identified distal tubules (Fig. 6, D) by their appearance and lack of connexin 32 staining. They stained more brightly for MACROH2A1.2 than for MACROH2A2. Glomeruli (Fig. 6, G) showed very little staining for MACROH2A1.2 (a and b) but were stained for MACROH2A2 (c and d). The strongest MACROH2A2 signal was in the parietal layer of Bowman's capsule (Fig. 6d, B), a structure that forms the outer layer of the glomerulus. In the medulla of the kidney there was less overlap in the staining patterns. In this part of the kidney MACROH2A1.2 was detected at a high level in the straight proximal tubules that we identified by costaining with antibodies to connexin 26 (25) (Fig. 6, e and f). Very little MACROH2A1.2 staining was seen in the cells between these tubules. In contrast, MACROH2A2 staining was very low in the connexin 26 positive tubules but was present at high levels in the cells between them (Fig. 6, g and h).
A distinct expression pattern was also seen in the adrenal gland. MACROH2A2 staining predominated in the outer cells of the capsule, and MACROH2A1.2 staining predominated in the inner cells of the cortex and medulla (Fig. 7).
The staining patterns of different cell types presented here are summarized in Table I. We also stained frozen sections of dog liver and kidney for MACROH2A1.2 and MACROH2A2, and the staining patterns were virtually identical to those in the mouse (data not shown).

MACROH2A1.2 and MACROH2A2 Have Similar although Nonidentical Nuclear Distributions-To confirm that the female-specific MCBs observed with MACROH2A2 antibodies involve the inactive X chromosome, we performed immunofluorescence with MACROH2A2 antibodies followed by fluorescence in situ hybridization with an X chromosome paint probe. This analysis was done on a section of female mouse kidney, because female-specific MACROH2A2 MCBs are easily detected in the nuclei of the cells of the proximal convoluted tubule. The MACROH2A2 MCBs in these cells colocalized to one of the X chromosomes (Fig. 8).
To determine whether there is overlap in the distribution of MACROH2A2 and MACROH2A1.2, we again used direct immunofluorescence, this time with a MACROH2A2 antibody labeled with Alexa 488 and a MACROH2A1.2 antibody labeled with Alexa 594. In this case we first wanted to establish that the labeling did not appreciably alter the specificity of the antibodies. We tested them on a Western blot of kidney nuclear extract and saw little or no change in specificity (Fig. 9). Immunofluorescence analysis of a cell type that has relatively high levels of both MACROH2A subtypes, the epithelial cells of the proximal convoluted tubule of the kidney, showed a similar pattern of nuclear staining (Fig. 10). A distinct colocalizing inactive X chromosome-associated MCB, plus relatively diffuse nuclear staining not restricted to Hoechst bright domains, could be seen with both antibodies (Fig. 10, a-c). Small regions where one subtype predominated over the other were also present (Fig. 10d).

DISCUSSION
MACROH2A2 is virtually identical in size and architecture to the MACROH2A1 subtypes and has significant amino acid sequence homology with those subtypes throughout its length. This suggests that the fundamental functions of the H2A domain, the basic region, and the nonhistone domain of MACROH2A2 are similar to their functions in the MACROH2A1 subtypes. This idea is supported by our observation that, like MACROH2A1.2, MACROH2A2 is concentrated in the inactive X chromosome in certain cell types. The simplest interpretation of this result is that both of these MACROH2A subtypes perform a similar function in the inactive X.
On the other hand, MACROH2A2 differs from the MACROH2A1 subtypes in several interesting ways. In primary structure it is only 68% identical to MACROH2A1.2. This contrasts with the exceptional evolutionary conservation of both MACROH2A1 subtypes, which are ~95% identical between mammals and birds (2). We do not have any other complete MACROH2A2 sequences, but a comparison of the human MACROH2A2 sequence to partial mouse sequences from dbEST shows that mouse MACROH2A2 is nearly identical to human MACROH2A2 (98% identical covering 97% of the amino acid sequence (data not shown)). Thus, the sequence differences between MACROH2A2 and the MACROH2A1 subtypes are conserved. MACROH2A2 also has a very different pattern of expression from that of MACROH2A1.2 (Table I), and our studies with dog tissues (data not shown) indicate that these cell type differences are conserved in evolution. When MACROH2A1 subtypes and MACROH2A2 are present in the same nucleus, they have similar although nonidentical distributions (Fig. 10). Our results suggest that the MACROH2A that is not associated with the inactive X is also localized to specific regions of chromatin and that the mechanism(s) involved in localizing MACROH2A may have subtype specificity.
Our observation that MACROH2As have a preference for the inactive X has been questioned in a recent report that examined the distribution of MACROH2A1.2 and other core histones in primary human fibroblasts (26). These authors suggested that the labeling of the inactive X chromosome by MACROH2A antibodies or green fluorescent protein-tagged MACROH2A may only reflect a higher density of chromatin in the inactive X. This conclusion was based on their observations that the inactive X was also labeled by antibodies against conventional core histones or green fluorescent protein-H2A. Indeed the inactive X in many cells, including human fibroblasts, is often readily identifiable by DNA staining as the most prominent chromatin domain in the nucleus, i.e. the Barr body. Our conclusion that MACROH2A preferentially associates with the inactive X is based on the relative staining of the inactive X compared with other chromatin. Our studies have focused on cells of mouse tissues where the inactive X cannot be identified by DNA or chromatin stains due to the presence of other large domains of similar or greater density (Fig. 10 and Refs. 5, 7, and 27). However, in these cells the inactive X is readily identified by MACROH2A staining (Fig. 10 and Refs. 5 and 7). Consistent with our conclusions, Chadwick and Willard (28) found that myc epitope-tagged MACROH2A was preferentially localized to the Barr body of primary human fibroblasts, whereas myc epitope-tagged H2B showed no such preference in comparison to other chromatin.
On the basis of the association of MACROH2A1.2 with the inactive X chromosome, we suggested that MACROH2As are involved in establishing and/or maintaining transcriptionally silent chromatin domains (5). Our present results demonstrate a new level of complexity and specificity to MACROH2A protein utilization. One interesting possibility is that each cell type has a specific complement of the genome associated with MACROH2A-containing nucleosomes. This would require a mechanism to localize MACROH2A to numerous specific chromosomal domains. MACROH2As could be localized to specific chromatin domains by cis-acting RNAs like Xist (5), and that targeting could involve a direct interaction of the nonhistone region of MACROH2A with RNA (2). This model predicts that cis-acting RNA genes like Xist are strategically located throughout the genome. The expression of such genes could be developmentally regulated and cell type-specific. The cell type-specific expression of MACROH2A subtypes demonstrated in the present work and in previous studies of the MACROH2A1 subtypes (3) could provide another level of specificity and control for such a system of transcriptional regulation.