C-to-U RNA Editing: Mechanisms Leading to Genetic Diversity*

Substitutional RNA Editing: Biochemical Mechanisms and Targets for C-to-U RNA Editing in Mammals
RNA editing is an important mechanism for regulating genetic plasticity through the generation of alternative protein products from a single structural gene. Substitutional RNA editing employs a variety of genetic mechanisms, the biochemical basis of which has been elucidated following the development of in vitro assays that recapitulate important elements of this process. Two types of substitutional RNA editing exist in mammals, namely A-to-I and C-to-U RNA editing (1, 2). Important biochemical distinctions between these two processes provide an informative basis for understanding the mechanisms of C-to-U RNA editing and the adaptations that control target specificity.
A-to-I RNA editing is mediated by a family of adenosine deaminases acting on double-stranded RNA (ADARs) with partially overlapping target specificity (1, 2). The absolute requirement for a double-stranded RNA template distinguishes A-to-I from C-to-U RNA editing because the former requires a pre-mRNA template containing intronic regions and is thus biochemically confined to unspliced transcripts. A further biochemical distinction is that ADAR enzymes do not require additional cofactors. ADARs contain both double-stranded RNA binding domains and a deaminase domain and function as modular editing enzymes (2, 3).
The best characterized example of C-to-U RNA editing involves the nuclear transcript encoding intestinal apolipoprotein B (apoB) (4). ApoB RNA editing changes a CAA to a UAA stop codon, generating a truncated protein, apoB48 (4). ApoB RNA editing has important effects on lipoprotein metabolism, and its emergence defines distinct pathways for intestinal and hepatic lipid transport in mammals (4). C-to-U editing of apoB RNA requires a single-strand template (Fig. 1) with well-defined characteristics in the immediate vicinity of the edited base, as well as protein cofactors that assemble into a functional complex referred to as a holoenzyme or editosome. This functional complex includes a minimal core composed of apobec-1, the catalytic deaminase, and a competence factor, apobec-1 complementation factor (ACF), that functions as an adaptor protein by binding both the deaminase and the RNA substrate (Fig. 1). The interaction of these protein components and their higher order interactions with the nuclear transcript illustrate the complexity of site-selectivity in C-to-U RNA editing.
A second example of C-to-U RNA editing in mammals involves site-specific deamination of a CGA to UGA codon in the neurofibromatosis type 1 (NF1) mRNA (5). NF1 RNA editing generates a translational termination codon at position 3916 that is predicted to truncate the protein product neurofibromin at the 5′ end of a critical domain (6) involved in GTPase activation (Fig. 2). Unlike apoB RNA editing, there is no formal proof that a truncated protein is generated. This example of C-to-U RNA editing has been demonstrated in peripheral nerve sheath tumors from patients with NF1 and may share elements of the same machinery as apoB RNA editing, as discussed below.
A third target for C-to-U editing, NAT1, was revealed following forced transgenic overexpression of apobec-1 in murine and rabbit hepatocytes (7). NAT1 is homologous to the translational repressor eIF4G and undergoes C-to-U editing at multiple sites, with the creation of stop codons that in turn reduce protein abundance (7).
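The codon-level consequence of C-to-U editing described above (CAA to UAA in apoB, CGA to UGA in NF1) can be illustrated with a minimal Python sketch. The function names and the toy transcript are hypothetical and chosen only for illustration; the sketch simply replaces one cytidine with uridine and reports the first in-frame stop codon.

```python
from typing import Optional

# Standard stop codons written in the RNA alphabet.
STOP_CODONS = {"UAA", "UAG", "UGA"}


def edit_c_to_u(rna: str, pos: int) -> str:
    """Return a copy of rna with the cytidine at 0-based index pos changed to uridine."""
    if rna[pos] != "C":
        raise ValueError(f"position {pos} holds {rna[pos]!r}, not C")
    return rna[:pos] + "U" + rna[pos + 1:]


def first_stop_codon(rna: str) -> Optional[int]:
    """Return the 0-based codon index of the first in-frame stop codon, or None."""
    for i in range(0, len(rna) - 2, 3):
        if rna[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


# Hypothetical in-frame toy transcript (not a real apoB sequence):
# AUG GCU CAA GGA UUC, with the glutamine codon CAA as the third codon.
toy = "AUGGCUCAAGGAUUC"
edited = edit_c_to_u(toy, 6)  # deaminate the C of the CAA codon
```

In this toy case the unedited transcript has no in-frame stop codon, whereas the edited transcript acquires UAA at the third codon, which is the same logic by which editing is predicted to truncate the encoded protein.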


Intracellular Topology of C-to-U RNA Editing: Lessons from ApoB RNA
C-to-U RNA editing of endogenous apoB in vivo is largely confined to spliced and polyadenylated nuclear transcripts (8). More recent studies using an apoB editing cassette cloned into constructs containing intronic sequences and transfected into cells that support C-to-U RNA editing suggest two important findings (9). First, introns suppress editing, an effect rescued by mutating the splice donor and acceptor sites in the chimeric cassette (9). Second, a Rev complementation assay used to produce conditional export of unspliced apoB RNA demonstrated that unspliced RNA is intrinsically capable of undergoing editing (9). These results, considered together with the earlier in vivo findings, suggest several testable conclusions. First, apoB RNA editing occurs in the nucleus and is selective for spliced transcripts rather than pre-mRNA. This is a crucial finding, because A-to-I RNA editing occurs before splicing. Second, spliceosome assembly itself or alternatively targeting RNA to the splicing pathway inhibits C-to-U editing. Plausible mechanisms for such inhibition include protein-protein or protein-RNA interactions between components of the spliceosome and the apoB RNA editing holoenzyme that hinder enzymatic activity in the latter. How might these possibilities be reconciled in the in vivo situation? In the context of mammalian apoB, splice donor and acceptor interactions at the termini of a large exon (exon 26, >7 kb) would likely be physically remote from the site of assembly and action of the editing enzyme. It is worth noting, however, that formation of a double-stranded template for A-to-I RNA editing requires base-pairing between the exonic region containing the targeted base and sequences within an adjacent intron, which in some instances are located ~2 kb downstream (3). The availability of additional structural models for large transcripts should permit direct experimental validation of these possibilities.

C-to-U RNA Editing Machinery: Cis-acting Elements
ApoB RNA editing is exquisitely site-specific, targeting a single cytidine in a transcript spanning greater than 14,000 residues. However, the minimal sequence information for this process appears to be contained within ~30 nucleotides flanking the edited base (10,11), with more distant elements both 5′ and 3′ permitting enzymatic deamination with greater efficiency (12,13). An 11-nucleotide mooring sequence located 4-6 nucleotides downstream of the edited base is particularly important (10,11), but selection of the editing site likely also depends upon RNA secondary structure. RNase mapping and folding algorithms together predict that the apoB template folds into a stem-loop structure with the targeted cytidine located within an exposed loop of RNA (13,14). A consensus binding site for apobec-1 (UUUN(A/U)U) was determined by circular permutation analysis and is predicted to be located 3 nucleotides downstream of the editing site, partially overlapping the 5′ terminus of the mooring sequence, at the apex of a stem-loop (Fig. 1) (15). Identification of a consensus binding site for apobec-1 provides an additional screen for other RNAs that may be substrates for this enzyme. Among these candidate targets are RNA transcripts containing the consensus binding site embedded within the canonical destabilization element (UUAUU(A/U)(A/U)), which is found in the 3′-untranslated region of RNAs known to be regulated through alteration in stability (15).
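The sequence rules in this paragraph lend themselves to a simple pattern scan. The following Python sketch is illustrative only and rests on several assumptions that are not stated in the text: the literal mooring sequence (UGAUCAGUAUA, as commonly reported for apoB), the reading of "4-6 nucleotides downstream" as a 4-6 nucleotide spacer between the cytidine and the start of the mooring sequence, and the toy RNA fragment, which is modeled on the apoB site but has not been checked against a reference sequence. The function names are hypothetical, and the scan ignores secondary structure entirely.

```python
import re

# Assumption: commonly reported apoB mooring sequence; the text gives only its length (11 nt).
MOORING = "UGAUCAGUAUA"

# Consensus apobec-1 binding site from the text, UUUN(A/U)U, wrapped in a
# lookahead so that overlapping occurrences are all reported.
APOBEC1_SITE = re.compile(r"(?=UUU[ACGU][AU]U)")


def candidate_editing_sites(rna, mooring=MOORING, min_spacer=4, max_spacer=6):
    """Return 0-based indices of cytidines followed, after a spacer of
    min_spacer..max_spacer nucleotides, by the mooring sequence."""
    pattern = re.compile(
        "C(?=[ACGU]{%d,%d}%s)" % (min_spacer, max_spacer, re.escape(mooring))
    )
    return [m.start() for m in pattern.finditer(rna)]


def apobec1_sites(rna):
    """Return 0-based start indices of every match to the UUUN(A/U)U consensus."""
    return [m.start() for m in APOBEC1_SITE.finditer(rna)]


# Illustrative fragment (assumption): C at index 4, a 4-nt spacer, then the mooring sequence.
toy = "GAUACAAUUUGAUCAGUAUAUUAAAG"
sites = candidate_editing_sites(toy)
binding = apobec1_sites(toy)
```

In this fragment the only candidate cytidine lies at index 4 and the only consensus match begins at index 7, that is, 3 nucleotides downstream of the candidate cytidine. A sequence-only scan of this kind would be expected to overpredict sites, since the text emphasizes that RNA secondary structure also contributes to site selection.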
A second candidate C-to-U editing template was identified through homology searches, which revealed that a CGA codon in the NF1 mRNA was changed to a UGA stop codon (5,16). Alignment of a 40-nucleotide NF1 mRNA encompassing the editing site reveals 50% identity with apoB mRNA, with 6 of 11 mismatches in the downstream cassette encoding the mooring sequence (Fig. 2). Recent studies have demonstrated several important features concerning C-to-U RNA editing of NF1 in tumors from patients with NF1. First, the subset of tumors that demonstrate C-to-U RNA editing of NF1 contain high levels of an alternatively spliced downstream exon (exon 23A, Fig. 2) (17). RNA modeling predictions suggest that inclusion of the 63-nucleotide exon resulting from alternative splicing permits the transcript to fold into a more favorable configuration with respect to access of the active site of the deaminase to the targeted cytidine. Second, tumors that demonstrate C-to-U RNA editing express apobec-1 mRNA, the transcript encoding the catalytic deaminase of the apoB RNA editing holoenzyme. In addition, a consensus apobec-1 binding site has been identified upstream of the editing site and appears functional as evidenced by UV-crosslinking studies (17). Taken together, the findings suggest that C-to-U RNA editing of NF1 may share certain elements of the apoB RNA editing machinery, although editing occurs at lower efficiency (<20% for NF1 versus >90% for intestinal apoB), presumably the result of mooring sequence mismatches.
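The identity and mismatch figures quoted for the NF1 and apoB alignment are simple position-by-position counts over an ungapped alignment. A minimal Python sketch of that arithmetic follows; the two sequences are hypothetical placeholders rather than the NF1 or apoB sequences, and the function names are introduced here only for illustration.

```python
def percent_identity(a, b):
    """Percent of aligned positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def mismatch_positions(a, b, start=0, end=None):
    """0-based positions within [start, end) at which the aligned sequences differ."""
    end = len(a) if end is None else end
    return [i for i in range(start, end) if a[i] != b[i]]


# Hypothetical aligned toy sequences (assumption; not NF1 or apoB).
seq_a = "ACGUACGUAC"
seq_b = "ACGUAGCAUG"
identity = percent_identity(seq_a, seq_b)
downstream_mismatches = mismatch_positions(seq_a, seq_b, start=5)
```

Restricting the mismatch count to a window, as with the `start` and `end` arguments here, mirrors the way the text reports mismatches specifically within the downstream cassette containing the mooring sequence.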

C-to-U RNA Editing Machinery: Trans-acting Factors and Complex Formation
ApoB RNA editing is mediated by a multicomponent complex with a minimal, two-component core composed of the catalytic deaminase apobec-1 (4) and a competence factor, ACF (18, 19).
FIG. 1. The model for an ~35-nucleotide region of apoB RNA flanking the edited base (asterisk) is shown. A schematic representation illustrates apobec-1 (red) and ACF (blue) binding to RNA both 5′ and 3′ of the edited base and depicts the presence of additional proteins that may modulate assembly of the holoenzyme (green). Note that the stoichiometry of apobec-1 and ACF molecules with respect to the active enzyme is unknown. The model emphasizes the role of both cis-acting elements within the vicinity of the edited base (mooring sequence is bolded) and the requirement for an optimal structure, conferred by both 5′ and 3′ efficiency elements.
FIG. 2. C-to-U RNA editing of neurofibromatosis type 1. The genomic organization of exons 21-24 and the alternatively spliced exon 23A (pink) are aligned above the region of NF1 RNA containing the edited base (3916). A 41-nucleotide region of NF1 RNA is aligned with the corresponding region from human apoB RNA (hapoB) and demonstrates both the conservation in the mooring sequence and the presence of apobec-1 binding sites in proximity to the edited base (adapted from Ref. 17). The edited RNA is predicted to encode a truncated protein that eliminates the GTPase activating domain although it is possible that the edited transcript is unstable and is degraded.
Apobec-1. Apobec-1 is highly conserved and likely evolved from a common ancestral cytidine deaminase (20). Like most cytidine deaminases, apobec-1 functions as a dimer (20), with a composite active site representing asymmetric contributions from each monomer that permits both substrate binding and deamination, together with a leucine-rich pseudoactive site at the carboxyl terminus, involved in dimerization (20). The active site residues and their spacing are conserved in all cytidine deaminases and define a signature motif (HXE(X)nPCXXC) for this gene family. Apobec-1, unlike the Escherichia coli homolog, demonstrates RNA binding activity, which requires two aromatic residues (Phe-66, Phe-87) within the catalytic site (20). Apobec-1 has no canonical RNA binding motifs in its primary sequence. However, mutants in which RNA binding is impaired do not function in C-to-U RNA editing despite retaining cytidine deaminase activity (15,20). Accordingly, apobec-1 is proposed to have evolved from an ancient cytidine deaminase that has acquired RNA specificity, allowing targeted nucleoside deamination of an RNA (20).
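The signature motif HXE(X)nPCXXC can be expressed as a regular expression for screening protein sequences, which is in the spirit of the database searches for related genes. The Python sketch below is a hedged illustration: the text does not specify n, so the gap bounds (23 to 28 residues) are an assumption supplied as adjustable parameters, and the peptides are synthetic placeholders rather than real deaminase sequences. The function name is hypothetical.

```python
import re


def signature_motif(min_gap=23, max_gap=28):
    """Compile a regex for H-X-E-(X)n-P-C-X-X-C over one-letter amino acid codes.

    min_gap and max_gap bound n; the default values are an assumption,
    because the text gives the motif only as HXE(X)nPCXXC.
    """
    return re.compile(r"H[A-Z]E[A-Z]{%d,%d}PC[A-Z]{2}C" % (min_gap, max_gap))


# Synthetic placeholder peptides (assumption; not real protein sequences).
with_motif = "MK" + "HAE" + "G" * 25 + "PCAAC" + "LL"
short_gap = "MK" + "HAE" + "G" * 10 + "PCAAC" + "LL"

motif = signature_motif()
hit = motif.search(with_motif)
miss = motif.search(short_gap)
```

With the assumed bounds, the first peptide matches starting at index 2 and the second does not, because its spacer is too short. A match to the motif identifies only a candidate family member; as the text notes for several homologs, possession of the active site signature does not by itself imply RNA editing activity.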
As alluded to above, forced transgenic overexpression of apobec-1 led to a cancer phenotype in the setting of promiscuous editing of cytidines in both apoB RNA and NAT1, a translational inhibitor involved in early embryogenesis (7,21). This gain of function phenotype implies that overexpression of apobec-1 is deleterious to the organism, although the proximate mechanism leading to dysplasia and cancer is yet to be defined. Consistent with this implication, host adaptations modulate the expression of apobec-1 in sporadic human colorectal cancer and in experimental colonic adenomacarcinoma formation in carcinogen-treated rats (22-24).
Apobec-1-related Genes. EST database searches for apobec-1-related proteins based on the signature motif have revealed new members of the gene family. Among these, apobec-2/ARCD1 is located on chromosome 6 and represents an abundant transcript in heart and skeletal muscle (25,26). Apobec-2/ARCD1 is an authentic cytidine deaminase with apoB RNA binding activity but does not mediate C-to-U editing of apoB RNA, and its primary target, if any, is unknown (26). However, ARCD1 interacts physically with both apobec-1 and ACF and inhibits apoB RNA editing in trans, possibly through interactions that alter the composition or stoichiometry of the holoenzyme (26). This observation is comparable with the inhibition of A-to-I RNA editing of GluR-B by ADAR3, a third member of the ADAR family (27).
Another apobec-1 homolog, activation-induced deaminase (AID), is specifically expressed in B lymphocytes during immunoglobulin class switch recombination and is required for somatic hypermutation. The chromosomal locus for human AID is adjacent to that of APOBEC1 on 12p13.2, suggesting that they may represent a gene duplication (26,28). Targeted deletion of AID yielded a murine model of hyper-IgM syndrome resulting from a block in class switch recombination (29), and mutations in the human AID gene were demonstrated in the autosomal recessive form of hyper-IgM syndrome (HIGM2) (30). AID is structurally related to apobec-1 and demonstrates monomeric cytidine deaminase activity yet does not demonstrate RNA binding or RNA editing activity on any of the known substrates (26,29). Indeed the intriguing possibility exists that its primary target may be DNA and not RNA (31).
An additional cluster of apobec-related genes, phorbolins/ARCD2-7/apobec-3 (A through G), has been found on chromosome 22 (26,28). They all display typical structural features of the cytidine deaminase active site described above, including the aromatic residues critical for RNA binding activity, and apobec-3G demonstrates RNA binding activity for AU-rich templates. Nevertheless, none of these gene products demonstrates C-to-U editing activity on any known RNA template (28). Distribution of mRNAs encoding these genes, particularly apobec-3C and -3G, appears widespread, and several are found at increased abundance in tumor tissues and cancer cell lines, raising the intriguing possibility that they may be involved in growth or proliferation (28).
ACF. Glycerol gradients of tissue extracts revealed that C-to-U RNA editing activity fractionated as a 27 S particle (32) leading to the concept of a multicomponent editing enzyme complex. Approaches to identify putative complementation factor(s) were predicated on the assumption that these factor(s) would bind apoB mRNA and/or apobec-1. This objective was conceptually simple but technically daunting, and the recent cloning of ACF by two independent groups represents a major advance in the field (18,19). Recombinant ACF together with recombinant apobec-1 is sufficient to mediate C-to-U editing of synthetic apoB RNA (18,19), and these two components, expressed either in vitro or in yeast, represent the minimal core of the holoenzyme (33). ACF is a novel 65-kDa protein, widely expressed in human tissues (18,19). Alignment of the predicted structural motifs within ACF reveals the presence of three non-identical RNA recognition motifs at the amino terminus, with a putative double-stranded RNA binding domain and 6 RG repeats within the carboxyl terminus (18,19). UV-crosslinking experiments identified an ACF binding site in apoB RNA spanning a 12-nucleotide sequence surrounding the editing site and partially overlapping the proximal end of the mooring sequence (Fig. 1) (34). Systematic mutagenesis of recombinant ACF has identified the functional domains involved in apoB RNA binding, apobec-1 interaction, and apobec-1 complementation of C-to-U editing. Both the apobec-1 interaction and apoB RNA binding domains are required for C-to-U editing (35,36). However, the other putative RNA binding domains in ACF are dispensable, raising the question of whether these domains may play other roles in RNA processing.
The human chromosomal ACF locus spans ~80 kb on chromosome 10 and encodes 9 distinct splice variants from 15 exons (37). The pattern of splice variants in human small intestine and liver reveals two dominant ACF isoforms, ACF64 and ACF65, the latter containing an 8-amino acid insertion as a result of alternative splicing of exon 12 (35,38). There is no functional distinction between ACF64 and ACF65 (35,38). Other splice variants have been demonstrated at lower abundance in human tissues and demonstrate either low or no activity. None of these splice variants undergo developmental regulation (37). Whether these isoforms play a role in relation to apoB RNA editing or other aspects of RNA metabolism and function is unknown.
Apobec-1/ACF/ApoB RNA-binding Proteins. Identification of other components of the holoenzyme remains elusive. Among these various candidate components are two apoB RNA editing inhibitors: CUGBP2 and GRY-RBP (34,39,40). CUGBP2 is a nucleocytoplasmic RNA-binding protein that cofractionates with ACF and binds apoB RNA upstream of the editing site but inhibits apoB RNA editing in vitro (39). GRY-RBP, by contrast, is homologous to ACF (34,40) and shares functional characteristics including the ability to bind apobec-1 as well as apoB RNA (34). In addition, GRY-RBP colocalizes in the nucleus with ACF and with apobec-1 but does not complement apobec-1 in C-to-U RNA editing. Rather, GRY-RBP inhibits editing in vitro and in vivo in a dose-dependent manner, potentially by sequestering ACF or alternatively apobec-1 and/or apoB RNA (34). This range of possible interactions raises the question of whether a hierarchy of cognate targets exists for these proteins.
Interestingly, several of the candidate proteins proposed for the apoB RNA holoenzyme and identified through their ability to bind apoB RNA and/or through their interaction with apobec-1 have been demonstrated to participate in other RNA processing events including splicing, mRNA turnover, and translation. Thus, GRY-RBP has been recently identified as a member of the heterogeneous nuclear ribonucleoprotein Q family (41). Two-hybrid screens performed using apobec-1 as bait revealed interactions with several different proteins. Among these, heterogeneous nuclear ribonucleoprotein C1 inhibits apoB RNA editing in isolated rat liver extracts (42). Two other proteins, ABBP-1 and ABBP-2/HEDJ, are homologous to human heterogeneous nuclear ribonucleoprotein A/B and Hsp40 chaperone, respectively (43,44). No direct evidence exists for a primary role for either protein in apoB RNA editing in vivo, although immunodepletion of either protein reduced in vitro apoB RNA editing from extracts prepared from editing-competent cells (43,44). The importance of these proteins in apoB RNA editing awaits further study.

Subcellular Localization of the Components of the Editing Machinery: Another Level of Regulation of ApoB RNA Editing
Immunocytochemical localization of endogenous apobec-1 has proven challenging because this is a low abundance protein, but the data from transfection studies point toward a predominantly cytoplasmic localization (34,45). Thus the determinants of nuclear-cytoplasmic distribution of apobec-1 presumably require interaction with one or more of the protein components of the holoenzyme, most plausibly ACF. The primary amino acid sequence of ACF contains a short stretch of basic residues forming an SV40-type nuclear localization signal (18,19), and immunofluorescence microscopy of cells transfected with FLAG-tagged ACF reveals a predominantly nuclear distribution (34,35). However, mutagenesis of the basic residues within this putative nuclear localization signal failed to modify the nuclear localization of ACF, suggesting that another domain targets ACF to the nucleus (35). Co-expression of apobec-1 with ACF leads to colocalization within the nucleus of transfected cells (34,35), and fractionation of nuclear extracts indicates the presence of ACF within large 27 S complexes demonstrated to contain editing activity (46). These findings suggest that interaction of apobec-1 with ACF facilitates their combined translocation into the nucleus and subsequent formation of a holoenzyme complex. Further resolution of the mechanisms regulating the distribution of the core components of the apoB RNA editing holoenzyme will be of great importance.
Recent findings suggest that nuclear translocation of apobec-1 and ACF may represent a novel mechanism accounting for the metabolic regulation of C-to-U editing activity, presumably by altering the effective concentration and delivery of the enzyme components to the site of accumulation of the substrate (46). One possible scenario involves an "anchoring/release" cycle mechanism similar to that described for the nuclear localization of the Xenopus transcription factor xnf7 or the type II cAMP-dependent protein kinase (47,48). This possibility is consistent with the finding of increased nuclear abundance of rat hepatocyte ACF following insulin or ethanol exposure (46).

Conclusions
C-to-U RNA editing is an important mechanism for amplifying mammalian genetic diversity in a regulated manner. Key to the success of this adaptation is the ability to define and limit access of the machinery to avoid enzymatic modifications within unintended targets. The identification of the core components of the apoB RNA editing holoenzyme and the ability to examine the role of new candidate genes that represent elements of the larger complex will likely reveal further functions in RNA metabolism. Establishing functional links between these distinct events should represent an exciting challenge for future years.