Three N-terminal variants of the AE2 Cl-/HCO3- exchanger are encoded by mRNAs transcribed from alternative promoters.

Multiple AE2 Cl-/HCO3- exchanger mRNAs have been identified in rat. To determine the genetic basis for these mRNAs and whether they encode different variants of the exchanger, we used both rapid amplification of cDNA ends and S1 nuclease protection protocols and examined the organization of the gene. mRNAs encoding three N-terminal variants of AE2 (AE2a, AE2b, and AE2c) were identified and shown to be transcribed from alternative promoters. The AE2a transcription unit consists of 23 exons, with exons 1 and 2 containing 5'-untranslated sequence and the first 17 codons. The first exon of AE2b is located in intron 2; it contains 5'-untranslated sequence and an alternative 3-amino acid N-terminal coding sequence and is spliced to exon 3. The first exon of AE2c is located in intron 5; it consists of 5'-untranslated sequence and is spliced to exon 6, which contains the translation initiation codon corresponding to Met-200 of AE2a. Northern analysis shows that AE2a is expressed in all tissues, AE2b exhibits a more restricted distribution with highest levels in stomach, and AE2c is expressed only in stomach. Thus, the use of alternative promoters leads to the production of three N-terminal variants of AE2 that exhibit tissue-specific patterns of expression.

Electroneutral Cl-/HCO3- exchange in mammalian tissues is mediated by a family of proteins encoded by multiple mRNAs from at least three genes, termed AE1,¹ AE2, and AE3 (reviewed in Refs. 1-3). The AE1 gene encodes both erythrocyte band 3 (4) and kidney band 3 (5-8), an N-terminal truncated variant of the exchanger. Kidney AE1 mRNA is transcribed from an alternative promoter located in the third intron of the erythrocyte transcription unit (7), and utilizes a Met codon in exon 5 as the translation start site (7, 8). In the case of AE3, a 4.4-kb mRNA encoding a 1227-amino acid variant is expressed in brain and several other tissues (9, 10), and a 3.8-kb mRNA is expressed in heart (9-13). The cardiac AE3 mRNA, which encodes a 1030-amino acid N-terminal variant of the exchanger, is transcribed from an alternative promoter located in intron 6 of the brain transcription unit (11-13). Thus, for both AE1 and AE3, the use of tissue-specific alternative promoters leads to the production of at least two proteins that differ in their N-terminal sequences.
Full-length cDNAs encoding what appears to be a single variant of AE2 have been cloned from multiple tissues and four mammalian species (10, 14-17); however, Northern blot analyses suggest that there are at least three different AE2 mRNAs (10). These include a ubiquitous 4.4-kb mRNA that corresponds to the form that has already been cloned, a 4.2-kb mRNA that is particularly abundant in stomach but is also present at lower levels in some other tissues, and a 3.8-kb mRNA that is expressed only in stomach. Because AE2 mRNAs are expressed in all tissues, it is likely that AE2 plays a housekeeping role such as the regulation of intracellular pH or cell volume. There is evidence, however, that it also serves more specialized functions in polarized epithelial cells. In stomach, AE2 is expressed at high levels on the basolateral membrane of gastric parietal cells (18), where it presumably mediates the exchange of Cl- and HCO3- that occurs during acid secretion across the apical membrane. In small intestine, it has been localized on the apical membranes of both villus and crypt enterocytes in ileum (16), consistent with a role in both NaCl absorption and HCO3- secretion. The mechanisms underlying the large variations in AE2 mRNA expression levels in different tissues and the mechanisms by which the protein is localized to either apical or basolateral membranes are unknown.
Interestingly, the size variations among the AE2 mRNAs seem to be due to differences at their 5' ends (10), consistent with the possibility that they are derived by the use of alternative promoters and encode protein variants that differ in their N-terminal sequences. If the multiple AE2 mRNAs are produced by the use of alternative promoters, it could contribute to the regulation of Cl-/HCO3- exchange in polarized epithelia and other tissues, either as a mechanism for producing variations in the primary structure of the protein, which in turn could lead to differences in membrane location or functional properties, and/or as a mechanism for transcriptional control. Therefore, we considered it important to determine the primary structures of the proteins encoded by the multiple AE2 mRNAs, their genetic basis, and their tissue-specific patterns of expression. The results of this study demonstrate that the AE2 gene contains three distinct promoters, which exhibit significant differences in their tissue specificity, and that the use of these promoters leads to the production of four mRNAs encoding three N-terminal variants of the exchanger.

EXPERIMENTAL PROCEDURES
Rapid Amplification of cDNA Ends (RACE) Protocol-The 5' RACE procedure was performed using the 5'-Amplifinder™ RACE Kit (Clontech). First strand cDNAs were synthesized using reverse transcriptase, 2 µg of rat stomach poly(A)+ RNA, and an AE2-specific primer (P1) complementary to sequences near the 5' end of the AE2 mRNA to be reverse transcribed. After alkaline hydrolysis of RNA templates, an oligonucleotide anchor supplied with the RACE kit was ligated to the first strand cDNAs using T4 RNA ligase. PCR was performed using anchor-ligated first strand cDNAs and two primers; one primer was complementary to the anchor sequence (supplied with the RACE kit), and the other was an AE2-specific primer (P2) that was nested 5' to the P1 primer. PCR conditions were: denaturation at 94°C for 45 s, annealing at either 55°C (experiment shown in Fig. 1) or 60°C (all other experiments) for 45 s, and extension at 72°C for 2 min. Either 35 (experiment shown in Fig. 1) or 40 cycles (all other experiments) of PCR were performed with 15 min of final extension following the last cycle.

* This work was supported by National Institutes of Health Grants DK39626 and HL41496. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

¹ The abbreviations used are: AE, anion exchanger (the specific gene is designated by a number and variants produced by the use of alternative promoters are designated by a lowercase letter); kb, kilobase(s); nt, nucleotide(s); bp, base pair(s); RACE, rapid amplification of cDNA ends; PCR, polymerase chain reaction.

For amplification of the 5' ends of AE2a and AE2b described in Fig. 1, the P1 primer was complementary to nt +235 to +255, and the P2 primer was complementary to nt +197 to +214 of the rat AE2a coding sequence (10). For amplification of the 5' end of AE2c described in Fig. 2, the P1 primer was complementary to nt +1047 to +1065, and the P2 primer was complementary to nt +674 to +699. After fractionation by agarose gel electrophoresis, PCR products were visualized by ethidium bromide staining. Specific PCR products were captured on NA45 paper (Schleicher & Schuell), eluted, and reamplified under the same conditions used for the first round of PCR. The reamplified products were subcloned into the pCR™II plasmid vector (Invitrogen), and sequence analysis was performed by the chain termination method.
For amplification of the 5' end of AE2a described in Fig. 5, the P1 primer was complementary to nt +32 to +52, and the P2 primer was complementary to nt +17 to +36 (numbered as in top panel of Fig. 7; see also Fig. 5). In this experiment the P2 primer was 5' end-labeled with T4 kinase and [γ-32P]ATP before being used in the PCR reaction. PCR products were fractionated by electrophoresis on a 6% polyacrylamide gel and visualized by autoradiography. Regions of the gel containing PCR products were excised, and the PCR products were eluted, reamplified with unlabeled primer, and subcloned into the pCR™II plasmid vector. Bacterial colonies harboring plasmids with AE2 sequences were identified using a 32P-labeled oligonucleotide probe from near the 5' end of the AE2a mRNA (complementary to nt +7 to +22 as numbered in top panel of Fig. 7). Forty-eight positive clones were isolated and sequenced by the chain termination method.
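The primer coordinates given above can be checked for internal consistency. The following is a minimal sketch, not part of the original protocol: it records each P1/P2 pair by the mRNA positions it is complementary to, derives the primer lengths, and tests that P2 lies 5' of P1 (partial overlap allowed). The class and function names are illustrative assumptions, and the Fig. 5 pair uses the transcript numbering of Fig. 7 rather than coding-sequence numbering.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Primer:
    name: str
    start: int  # 5'-most mRNA position the primer is complementary to
    end: int    # 3'-most mRNA position the primer is complementary to

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def is_nested_5prime(p2: Primer, p1: Primer) -> bool:
    """True if P2 anneals 5' of P1 on the mRNA (partial overlap allowed)."""
    return p2.start < p1.start and p2.end < p1.end


# (P1, P2) pairs as stated in the text; the dictionary keys are labels only.
PRIMER_PAIRS = {
    "AE2a/AE2b 5' ends (Fig. 1)": (Primer("P1", 235, 255), Primer("P2", 197, 214)),
    "AE2c 5' end (Fig. 2)": (Primer("P1", 1047, 1065), Primer("P2", 674, 699)),
    "AE2a start site (Fig. 5)": (Primer("P1", 32, 52), Primer("P2", 17, 36)),
}

for label, (p1, p2) in PRIMER_PAIRS.items():
    print(f"{label}: P1 {p1.length} nt, P2 {p2.length} nt, "
          f"P2 nested 5' of P1: {is_nested_5prime(p2, p1)}")
```

Under these coordinates every P2 primer lies 5' of its P1 partner, as the nested design requires.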
Cloning and Characterization of the Rat AE2 Gene-The rat cosmid library used in our studies of the AE1 and AE3 genes (7, 11) was screened with a 32P-labeled XbaI-BamHI restriction fragment spanning nt -77 to +163 of the 5' end of the rat AE2 cDNA (10) using the same conditions described previously (7, 11). Four clones were identified and colony-purified. Southern blot analysis of EcoRI-digested cosmid DNAs suggested that all four clones contained the full-length AE2 gene. Five EcoRI fragments of clone 10-1 that hybridized with the AE2a cDNA, with sizes of 10.2, 4.9, 4.3, 2.1, and 1.8 kb, were subcloned into a plasmid vector and sequenced by the chain termination method using primers corresponding to sequences of the AE2a cDNA (10) or the AE2b and AE2c RACE products. The known intron/exon organization of the AE1 (19) and AE3 (11) genes was used as a guide in designing the primers. The sequences of all exons and most of the introns were determined. The sizes of introns that were not completely sequenced were determined by size analysis of PCR products containing the introns, which were amplified using primers derived from the adjacent exons.
S1 Nuclease Protection Analysis-Probes used for S1 nuclease analysis of the transcription initiation sites of the AE2a, AE2b, and AE2c transcripts were termed S1-a, S1-b, and S1-c, respectively. S1-a was a synthetic oligonucleotide complementary to nt -15 to +52 from the 5' end of the AE2a transcription unit (see Fig. 7, top panel), and S1-b was a synthetic oligonucleotide complementary to nt -35 to +62 of the AE2b transcription unit (see Fig. 7, middle panel). S1-a and S1-b were 5' end-labeled using T4 kinase and [γ-32P]ATP. S1-c was prepared from a PCR fragment corresponding to nt -244 to +155 from the 5' end of the AE2c transcription unit (see Fig. 7, bottom panel). The PCR fragment was cleaved at nt +82 with Sau3AI, dephosphorylated with calf intestinal phosphatase, 5' end-labeled with T4 kinase and [γ-32P]ATP, and then cleaved at nt -237 with ApaLI. The products were fractionated by polyacrylamide gel electrophoresis, and the 5' end-labeled probe, spanning nt -237 to +62, was isolated. Hybridization of 2.5 or 5 µg of rat stomach poly(A)+ RNA with radiolabeled probes and S1 nuclease digestion was carried out as described previously (11).
Northern Blot Hybridization Analysis-For the analysis described in Fig. 3, a blot containing 10 µg of rat stomach poly(A)+ RNA was prepared, hybridized, and washed as described previously (10). The blot was hybridized with probes specific for each AE2 variant and then stripped before proceeding with the next analysis. For the experiment shown in Fig. 8, we used a blot from a previous study (10), which had been prepared using 5 µg of poly(A)+ RNA from each of the rat tissues shown. Single-stranded antisense probes were generated from PCR products by extension of the antisense PCR primer using the Klenow fragment of DNA polymerase I in the presence of [α-32P]dCTP. The specific probes for each mRNA were: (i) AE2a, nt -155 to +51 of the AE2a cDNA, corresponding to 155 nt of 5'-untranslated sequence and 51 nt of coding sequence from AE2a-specific exons 1 and 2; (ii) AE2b, nt +1 to +62 of the AE2b transcription unit, corresponding to the first exon of AE2b, plus 48 nt from the RACE anchor primer; (iii) AE2c, nt +1 to +210 of the AE2c transcription unit, corresponding to the first exon of AE2c1 (this probe recognizes both the AE2c1 and AE2c2 mRNAs); and (iv) intron 5, the 330-nt sequence from the 3' end of intron 5, which is located between the first exon of AE2c1 and exon 6 (this probe recognizes only the AE2c2 mRNA). The probe common to all AE2 mRNAs was a SacI-PvuII fragment, corresponding to nt +1118 to +2001 of the AE2a coding sequence (10), that was labeled with [α-32P]dCTP using random hexamers and the Klenow fragment of DNA polymerase I.

RESULTS
Cloning of the 5' Ends of mRNAs Encoding Three N-terminal Variants of AE2-Our previous Northern blot analyses (10) demonstrated the existence of at least three AE2 mRNAs. These include a 4.4-kb mRNA corresponding to the AE2 cDNA that has already been cloned, and mRNAs of 4.2 and 3.8 kb that seem to differ at their 5' ends. Throughout this paper the proteins encoded by the 4.4-, 4.2-, and 3.8-kb mRNAs will be referred to as AE2a, AE2b, and AE2c, respectively. To obtain information that would facilitate the cloning of the 5' ends of the AE2b and AE2c mRNAs, we first characterized the intron/exon organization of the rat gene (discussed below), and then performed Northern blot analysis to determine the points at which the sequences of the mRNAs diverged (data not shown). A probe corresponding to exons 1a and 2 of the AE2a transcription unit hybridized only with the 4.4-kb AE2a mRNA. Probes corresponding to exon 3 or to exons 3-5 of AE2a hybridized with both the 4.4-kb AE2a mRNA and the 4.2-kb AE2b mRNA, but not with the 3.8-kb AE2c mRNA. A probe corresponding to exon 6 of AE2a hybridized with all three mRNAs. These results suggested that unique sequences in the AE2b mRNA would be immediately 5' to exon 3 and that unique sequences in the AE2c mRNA would be immediately 5' to exon 6.
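The inference drawn from the exon-specific probes can be written out as simple set logic. The sketch below is illustrative only: it encodes the hybridization results stated in the preceding paragraph and reports, for each mRNA, the 5'-most AE2a exon whose probe detected it. The data structure and function name are assumptions made for this example.

```python
# Probes ordered 5' to 3' along the AE2a transcription unit. Each entry gives
# the probe label, the first AE2a exon it covers, and the mRNAs it detected
# (as reported in the text).
PROBE_RESULTS = [
    ("exons 1a-2", 1, {"AE2a"}),
    ("exon 3", 3, {"AE2a", "AE2b"}),
    ("exons 3-5", 3, {"AE2a", "AE2b"}),
    ("exon 6", 6, {"AE2a", "AE2b", "AE2c"}),
]


def first_shared_exon(mrna: str) -> int:
    """Return the 5'-most AE2a exon whose probe detects the given mRNA."""
    return min(exon for _, exon, hits in PROBE_RESULTS if mrna in hits)


divergence = {m: first_shared_exon(m) for m in ("AE2a", "AE2b", "AE2c")}
print(divergence)  # {'AE2a': 1, 'AE2b': 3, 'AE2c': 6}
```

Sequences unique to each variant are then expected immediately 5' of the exon reported for it, which is the reasoning used to place the RACE primers.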
To clone the 5' end of the AE2b transcript, a RACE cloning protocol was employed. First strand cDNA was synthesized using rat stomach poly(A)+ RNA and a primer from exon 4. The cDNA was then ligated to the anchor oligonucleotide and PCR-amplified using the anchor primer and a primer from exon 3. Two products, 470 and 290 bp in length, later shown to correspond to the 5' ends of the AE2a and AE2b mRNAs, respectively, were visible over background when the RACE reaction mixture was analyzed by agarose gel electrophoresis and ethidium bromide staining (Fig. 1, top left). These products were isolated from the gel, reamplified in separate reactions, and again analyzed by agarose gel electrophoresis (Fig. 1, top right). The 470- and 290-bp products were then isolated from the second gel, subcloned, and sequenced.
The two RACE products differed in sequence at their 5' ends but were identical at their 3' ends, with the common sequence beginning at the acceptor site of exon 3 (Fig. 1, bottom panel). As expected, the 470-bp product was derived from the AE2a mRNA; it contained sequences from exons 1, 2, and 3 of the AE2a transcription unit, with the first 17 codons occurring in exon 2. The 290-bp product was derived from an mRNA, subsequently shown to be the AE2b mRNA, in which exon 3 was immediately preceded by a unique 5' sequence. Sequence analysis of seven independent AE2b subclones revealed that they extended to a position ranging between 52 and 62 bp 5' to exon 3, with four subclones containing the entire sequence. The unique sequence, termed exon 1b, begins with a 53-nt 5'-untranslated sequence and ends with a 9-nt sequence that is headed by a potential initiation methionine codon. The 3-codon open reading frame at the end of exon 1b is in-frame with the coding sequence of exon 3. If this ATG codon serves as the translation initiation site, then the AE2b mRNA would encode an N-terminal variant of AE2 in which the first 17 amino acids of AE2a are replaced by an alternative 3-amino acid sequence.
The 5' end of the 3.8-kb AE2c mRNA was cloned using a similar RACE protocol. First strand rat stomach cDNA was synthesized using a primer from exon 8, ligated to the anchor oligonucleotide, and then PCR-amplified using the anchor primer and a primer from exon 6. Fractionation of the RACE products by agarose gel electrophoresis revealed a broad smear containing distinct bands at approximately 320 and 380 bp (Fig. 2, upper left), which were within the expected size range of the AE2c product. Each product was isolated and reamplified, analyzed by agarose gel electrophoresis (Fig. 2, upper right), and then subcloned and sequenced. The two products were identical at their 3' ends, with the common sequence corresponding to exon 6, but they differed at their 5' ends (Fig. 2, bottom panel). The 5' end of the 320-bp product terminated within exon 5 and consisted of sequences identical to common regions of AE2a and AE2b. The most likely explanation for this result is that it was due to premature termination of reverse transcriptase activity during first strand cDNA synthesis. The 380-bp product contained a 210-nt sequence at its 5' end, designated exon 1c, that did not correspond to previously characterized AE2 sequences. Exon 1c does not contain an ATG triplet and would therefore serve as 5'-untranslated sequence. The first ATG triplet of the 3.8-kb AE2c mRNA, which is in an acceptable context for initiation of translation, occurs in exon 6 and corresponds to Met codon 200 of AE2a. Thus, the AE2 variant encoded by this mRNA would lack the first 199 amino acids occurring in AE2a.
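The consequences of the two alternative 5' ends for protein length follow from simple arithmetic. The sketch below is an illustration, assuming only the residue counts stated above (17 residues replaced by 3 in AE2b; the first 199 residues absent from AE2c). The full length of AE2a is not given in this section, so it is left as a caller-supplied parameter, and all names are hypothetical.

```python
# N-terminal changes relative to AE2a, as described in the text:
# (AE2a residues removed, alternative residues added).
N_TERMINAL_CHANGES = {
    "AE2a": (0, 0),
    "AE2b": (17, 3),   # first 17 residues replaced by 3 encoded in exon 1b
    "AE2c": (199, 0),  # translation begins at the codon for Met-200 of AE2a
}


def length_change(variant: str) -> int:
    """Net change in residue count relative to AE2a."""
    removed, added = N_TERMINAL_CHANGES[variant]
    return added - removed


def variant_length(ae2a_length: int, variant: str) -> int:
    """Predicted length of a variant, given the AE2a length (caller-supplied)."""
    return ae2a_length + length_change(variant)


for name in N_TERMINAL_CHANGES:
    print(name, length_change(name))
```

The predicted differences are 14 residues for AE2b and 199 residues for AE2c, both confined to the N terminus.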
Northern Blot Analysis of AE2 mRNAs in Rat Stomach-To confirm that the unique sequences identified above are present in distinct AE2 mRNAs of the expected sizes, Northern blot hybridization analysis of rat stomach poly(A)+ RNA was performed. A probe corresponding to codons 372-666 of the AE2a cDNA, which should be present in all of the AE2 variants, hybridized with multiple mRNAs ranging in size between 4.4 and 3.8 kb (Fig. 3, lane 1). As expected, a probe corresponding to unique sequences from the 5' end of AE2a hybridized with a distinct 4.4-kb mRNA (Fig. 3, lane 2), and a probe corresponding to unique sequences at the 5' end of AE2b hybridized with a distinct 4.2-kb mRNA (Fig. 3, lane 3). However, the unique AE2c sequence hybridized with both the expected 3.8-kb mRNA (termed AE2c1) and an mRNA of about 4.1 kb (termed AE2c2) that was slightly smaller than the AE2b mRNA. It seemed possible that the additional sequence in the larger AE2c mRNA might be due to retention of a 330-nt sequence from the 3' end of intron 5, which is located between the unique AE2c sequences identified by RACE analysis and exon 6. Hybridization analysis confirmed that the 4.1-kb AE2c2 mRNA contained this intron sequence (Fig. 3, lane 5). Inclusion of this sequence introduces six upstream open reading frames, ranging in length from 3 to 24 codons, including one that overlaps the beginning of the open reading frame of the AE2c1 mRNA.
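Upstream open reading frames of the kind described for the AE2c2 leader can be enumerated by scanning a 5' leader for ATG triplets and reading in frame to the first stop codon. The sketch below is a generic illustration, not the authors' procedure. The leader sequence is a made-up toy string (the retained intron sequence is not reproduced in this section), and the convention of counting the ATG but not the stop codon is an assumption.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_uorfs(leader: str):
    """List (start_index, codon_count, terminated) for each ATG in the leader.

    codon_count includes the ATG and excludes the stop codon. terminated is
    False when no in-frame stop occurs within the leader, i.e. the reading
    frame runs on into the downstream sequence.
    """
    leader = leader.upper()
    orfs = []
    for start in range(len(leader) - 2):
        if leader[start:start + 3] != "ATG":
            continue
        codons, terminated = 0, False
        for pos in range(start, len(leader) - 2, 3):
            if leader[pos:pos + 3] in STOP_CODONS:
                terminated = True
                break
            codons += 1
        orfs.append((start, codons, terminated))
    return orfs


# Hypothetical toy leader, NOT the AE2c2 sequence.
TOY_LEADER = "GGATGAAATAGCCATGGCC"
print(find_uorfs(TOY_LEADER))  # [(2, 2, True), (13, 2, False)]
```

In the toy example the first ATG opens a two-codon frame that terminates within the leader, whereas the second runs off its 3' end, the situation in which an upstream frame would overlap the start of the main coding sequence.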
FIG. 1. Upper left, DNA size markers and RACE products were fractionated by agarose gel electrophoresis and visualized by ethidium bromide staining. Note the discrete 470- and 290-bp products, which are derived from the AE2a and AE2b mRNAs, respectively. Upper right, the AE2a and AE2b RACE products from the first set of PCR amplifications were isolated, reamplified in separate reactions, and analyzed by agarose gel electrophoresis. Each product was isolated, subcloned, and sequenced. Bottom panel, nucleotide and deduced amino acid sequences of the 5' ends of the AE2a and AE2b mRNAs. The translation initiation codon is underlined, and the first nucleotide of exon 3, which begins the common sequence, is indicated by an arrow.
FIG. 2. PCR cloning of the 5' end of the AE2c mRNA. First strand rat stomach cDNA was prepared, ligated with the anchor oligonucleotide, and analyzed by 5' RACE (see "Experimental Procedures"). Upper left, DNA size markers and RACE products were fractionated by agarose gel electrophoresis and visualized by ethidium bromide staining. Note the discrete 380- and 320-bp products, which are derived from the AE2c and AE2a/b mRNAs, respectively. Upper right, AE2c and AE2a/b RACE products from the first set of amplifications were isolated, reamplified in separate reactions, and analyzed by agarose gel electrophoresis. Each product was isolated, subcloned, and sequenced. Bottom panel, nucleotide and deduced amino acid sequence of 5' end of the AE2c mRNA and corresponding sequences of the AE2a/b mRNAs. The apparent translation initiation codon for AE2c is underlined, and the first nucleotide of exon 6, which begins the common sequence, is indicated with an arrow.
Intron/Exon Organization of the Rat AE2 Gene-A rat cosmid genomic library was screened using an AE2 cDNA probe, and four clones were identified and purified. Because Southern blot analysis indicated that all four cosmids contained the entire gene, only one clone, which contained an insert of over 40 kb, was further characterized. Five EcoRI restriction endonuclease fragments that hybridized with the cDNA probes were subcloned into a plasmid vector, and then these subclones were used for analysis of the intron/exon organization. The structure of the AE2 gene, which spans 16.5 kb and contains 25 exons, is diagrammed in Fig. 4. All intron/exon boundaries (Table I) conform to consensus donor or acceptor splice sites (20). The fully processed AE2a mRNA contains 23 exons and includes two exons from the 5' end of the gene that are not present in the AE2b and AE2c transcription units. The transcription units for the AE2b and AE2c1 mRNAs consist of 22 and 19 exons, respectively. The alternative first exon for AE2b, designated exon 1b, is located in intron 2 of the AE2a transcription unit, with its donor splice site positioned 596 nt upstream of the acceptor site of exon 3. The alternative first exon of AE2c, designated exon 1c, is located in intron 5 of the AE2a transcription unit, with the donor splice site used in the AE2c1 mRNA positioned 330 nt upstream of the acceptor site of exon 6. This 330-nt sequence is excised in the AE2c1 mRNA but retained in the AE2c2 mRNA.
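The exon counts quoted for the three transcription units can be cross-checked against the splicing pattern described above. The following sketch is illustrative: it models each unit as an alternative first exon joined to the AE2a exon at which the common sequence begins, and assumes only that the AE2a exons are numbered 1a, 2, 3, ..., 23. The helper name is hypothetical.

```python
LAST_EXON = 23  # AE2a exons are numbered 1a, 2, 3, ..., 23


def transcription_unit(first_exon: str, joins_at: int) -> list[str]:
    """First exon plus the shared exons from `joins_at` through exon 23."""
    return [first_exon] + [str(n) for n in range(joins_at, LAST_EXON + 1)]


UNITS = {
    "AE2a": transcription_unit("1a", 2),
    "AE2b": transcription_unit("1b", 3),   # exon 1b is spliced to exon 3
    "AE2c1": transcription_unit("1c", 6),  # exon 1c is spliced to exon 6
}

exon_counts = {name: len(exons) for name, exons in UNITS.items()}
distinct_exons = set().union(*UNITS.values())

print(exon_counts)          # {'AE2a': 23, 'AE2b': 22, 'AE2c1': 19}
print(len(distinct_exons))  # 25
```

The model reproduces the reported 23, 22, and 19 exons per transcription unit and the total of 25 distinct exons in the gene.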
Identification of Transcription Initiation Sites for AE2a, AE2b, and AE2c-The 5' end of the AE2a cDNA characterized previously (10) terminated at position -195 relative to the translation initiation site of the mature mRNA, and each of the RACE products analyzed terminated just 3' to this position, suggesting that this region contained the AE2a transcription initiation site. S1 nuclease protection analysis (Fig. 5, left and bottom panels) of rat stomach poly(A)+ RNA was performed using a single-stranded probe spanning the region thought to contain the transcription initiation site. A number of protected fragments were observed, with a prominent cluster (indicated by a bracket and asterisk in left panel of Fig. 5) corresponding to positions -224 to -221 relative to the translation initiation site of the mature mRNA. Because this cluster seems to represent the major transcription initiation site, the first nucleotide of the cluster is designated as position +1 of the AE2a transcript. However, additional protected fragments 5' to this site were also observed, suggesting that transcription might also initiate 5' to the major cluster. S1 nuclease analyses using probes that extended further 5' were unsuccessful; therefore, we employed a 5' RACE protocol to confirm the location of the transcription initiation site.
In designing a RACE strategy to identify the 5' end of the AE2a mRNA (diagrammed in Fig. 5, bottom panel), we reasoned that a 14-nt palindromic sequence at positions +53 to +66 might interfere with reverse transcriptase activity. Therefore, the initial cDNA synthesis was carried out using a P1 primer complementary to a region just 5' to the palindromic sequence. A 32P-labeled P2 primer that extended 11 nt further 5' was used for the RACE reaction. Analysis of the labeled RACE products on a polyacrylamide gel revealed the presence of bands corresponding to products terminating between positions +1 and +20 (Fig. 5, right panel). The products from the regions indicated by brackets and labeled 1-4 were isolated, reamplified, and subcloned. Sequence analysis demonstrated that clusters 2 and 4 consisted of PCR artifacts,² whereas clusters 1 and 3 contained AE2 products with 5' termini at positions ranging from nt -1 to +5.
To avoid missing any longer RACE products present in the region above cluster 1 and to obtain better quantitation of the frequency at which specific transcription initiation sites might be used, a piece of gel containing the region beginning with the bottom of cluster 3 and extending beyond nt -30 (bracketed region 5 in right panel of Fig. 5) was excised, and the DNA was eluted and reamplified. PCR products were subcloned, and colonies were identified using a probe corresponding to nt +7 to +22. Sequence analysis of 48 randomly selected positive clones showed that the majority of the RACE products (34/48) begin within a cluster at nt +1 to +4, corresponding to the major cluster identified by S1 nuclease protection, and a single clone began at nt +6. Most of the remaining clones (10/48) began at nt -1 (two clones), -3, or -4, which correspond to minor protected fragments identified by S1 nuclease protection, and several additional clones (3/48) began at positions -11 or -8.
No products extending beyond this site were identified. These data and the data from S1 nuclease protection analysis indicate that there are no additional exons upstream of exon 1a, and demonstrate that the major site for initiation of the AE2a transcript is in the region designated nt +1 to +4 (see Fig. 7).
On the basis of the gene characterization and RACE experiments, it seemed likely that the transcription initiation sites for AE2b and AE2c were located in introns 2 and 5 of the AE2a transcription unit, respectively. S1 nuclease protection analysis of the AE2b mRNA using a probe derived from sequences in intron 2 revealed a prominent cluster of seven protected fragments (Fig. 6, left panel), with the largest fragment beginning at a G residue located 62 nt upstream of the donor splice site of exon 1b. Because this residue was also the 5'-most nucleotide identified in four of the seven AE2b RACE subclones analyzed, we conclude that it serves as the transcription initiation site. S1 nuclease analysis of the AE2c transcript confirmed that its transcription initiation site is located in intron 5. Two protected fragments were observed (Fig. 6, right panel), with their 5' ends corresponding to C and A residues located 211 and 209 nt upstream of the donor splice site of exon 1c. This result correlates well with the RACE analysis, in which the longest subclone extended 210 nt upstream of the donor site.

² These PCR artifacts consisted of several products with sequences derived entirely from the anchor oligonucleotide and the P2 primer. In some of these products, the P2 primer was partially duplicated.

FIG. 5. S1 nuclease and RACE analysis of the AE2a transcription initiation site. Left panel, nuclease protection was performed using a 5' end-labeled probe, 5 µg of rat stomach mRNA or tRNA control, and 50, 100, or 200 units of S1 nuclease (see "Experimental Procedures"). Samples were analyzed by polyacrylamide gel electrophoresis and autoradiography. Markers in first two lanes are purine (A+G) or pyrimidine (C+T) ladders generated by chemical cleavage sequencing of the probe. Undigested probe is shown in the last lane. Nucleotides are numbered on the left, with +1 indicating the beginning of the major transcription initiation site corresponding to a cluster of four protected fragments (delineated by the bracket and asterisk at the right). Right panel, RACE analysis. cDNA was synthesized using rat stomach mRNA and primer 1, ligated to the anchor oligonucleotide, PCR-amplified using 32P-labeled primer 2 and the anchor primer, and aliquots of the RACE reaction mixture were analyzed by polyacrylamide gel electrophoresis and autoradiography (see "Experimental Procedures"). Numbers on the right indicate the estimated termination site of the extension products, with +1 corresponding to the major initiation site determined by S1 analysis. Size estimates were obtained by running sequencing ladders in adjacent lanes and correcting for the size of the anchor sequence. Products in the bracketed regions labeled 1-5 were excised from the gel, reamplified, subcloned, and sequenced. 70% of the RACE products terminated within the region spanning nt +1 to +4. Bottom panel, S1 nuclease and RACE strategies. The region shown is from the 5' end of the AE2a transcription unit, with # indicating the 5' end of the rat AE2a cDNA (10). The S1 nuclease probe and primers used for RACE analysis are labeled and indicated by lines. The asterisks above and the heavy bars below the sequence indicate major initiation sites identified by S1 nuclease and RACE analysis, respectively. Sequences included in the major transcripts are shown in uppercase letters and 5' flanking sequences are shown in lowercase letters.
FIG. 6. S1 nuclease analysis of the AE2b and AE2c transcription initiation sites. 5' end-labeled probes were hybridized with 2.5 or 5 µg of rat stomach mRNA or 5.0 µg of yeast tRNA, digested with 100 units of S1 nuclease, and analyzed by polyacrylamide gel electrophoresis and autoradiography. Left panel, analysis of AE2b. The ladder shown in the first four lanes was generated by chemical cleavage sequence analysis of the probe. Undigested probe is shown in the last lane. Nucleotides corresponding to protected fragments (sense strand) are indicated on the left. Right panel, analysis of AE2c. The ladder shown in the first four lanes was generated by chain termination sequence analysis of the region corresponding to the probe. Nucleotides (sense strand) corresponding to ends of the protected fragments are indicated on the left.
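The distribution of 5' termini among the 48 sequenced AE2a RACE clones can be tallied directly from the counts reported in the text. This is a minimal bookkeeping sketch; the grouping labels are descriptive only and the variable names are assumptions.

```python
# Clone counts by 5' terminus, as reported for the 48 sequenced RACE clones.
RACE_CLONES = {
    "+1 to +4 (major cluster)": 34,
    "+6": 1,
    "-1, -3, or -4": 10,
    "-11 or -8": 3,
}

total_clones = sum(RACE_CLONES.values())
major_fraction = RACE_CLONES["+1 to +4 (major cluster)"] / total_clones

print(total_clones)             # 48
print(f"{major_fraction:.1%}")  # 70.8%
```

The four groups account for all 48 clones, and the major cluster at nt +1 to +4 represents about 70.8% of them, in line with the 70% stated in the legend to Fig. 5.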

Analysis of Genomic Sequences Flanking the Transcription Initiation Sites-The sequences of the alternative first exons for each of the AE2 mRNAs and the 5'-flanking region for each transcription unit are shown in Fig. 7. The first two exons of the AE2 gene, which are unique to the AE2a mRNA (Fig. 7, upper panel), are separated by a 2-kb intron. The sequence immediately 5' to the AE2a transcription initiation site is GC-rich and contains few sequences resembling known basal promoter elements. It lacks a TATA element (21) but contains several potential CACCC (22, 23) elements. The beginning of the alternative first exon of AE2b is located 1.15 kb downstream of exon 2 (Fig. 7, middle panel). The nucleotide sequence surrounding the AE2b transcription initiation site closely matches a CAP site consensus sequence (21). The promoter region lacks an apparent TATA element, but does contain four potential CCAAT boxes (21), a potential Sp1 binding site (21), and six CACCC elements (22, 23). The transcription initiation site for AE2c is located approximately 550 nt downstream of exon 5 (Fig. 7, bottom panel) and is preceded by an AT-rich sequence at nt -30 to -24 that might serve as a TATA element. The 5'-flanking sequence also contains a potential Sp1 binding site and several potential CACCC and CCAAT elements. The splice donor site for exon 1c and the acceptor site of exon 6, which contains the apparent translation start site, are separated by a 330-nt sequence that is excised in the AE2c1 mRNA but retained in the AE2c2 mRNA.
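Candidate promoter elements of the kind listed above are commonly located by scanning both strands of the 5'-flanking sequence for short core motifs. The sketch below is a generic illustration, not the procedure used in this study. The motif cores (CCAAT, CACCC, and the GC-box core GGGCGG for Sp1) are assumed exact-match patterns, and the test sequence is a made-up toy string, not AE2 genomic sequence.

```python
# Core motifs, matched exactly (an assumption; real searches often allow
# degenerate positions).
MOTIFS = {
    "CCAAT box": "CCAAT",
    "CACCC element": "CACCC",
    "Sp1 site (GC box)": "GGGCGG",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def scan_motifs(seq: str):
    """Return (motif, strand, 0-based start) hits on both strands, by position."""
    seq = seq.upper()
    hits = []
    for name, motif in MOTIFS.items():
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            start = seq.find(pattern)
            while start != -1:
                hits.append((name, strand, start))
                start = seq.find(pattern, start + 1)
    return sorted(hits, key=lambda hit: hit[2])


# Hypothetical toy sequence, NOT taken from the AE2 gene.
TOY_FLANK = "TTCCAATGGGCGGAAGGGTGTT"
for hit in scan_motifs(TOY_FLANK):
    print(hit)
```

For the toy sequence the scan reports a CCAAT box at position 2 and a GC box at position 7 on the plus strand, and a CACCC element at position 15 on the minus strand. Such matches mark only potential sites; functional significance would need to be tested experimentally.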
Tissue Distribution of AE2 mRNAs-When a Northern blot containing poly(A)+ RNAs from rat tissues was hybridized with a cDNA probe corresponding to a common region of the AE2 variants, mRNAs of approximately 4.2-4.4 kb were detected in all tissues, with an additional mRNA of approximately 3.8 kb in stomach (Fig. 8, top panel). When probes specific for each variant were used, the 4.4-kb AE2a mRNA was identified in all tissues, with the highest levels in stomach, large intestine, lung, and uterus. The 4.2-kb AE2b mRNA was detected in a more limited set of tissues, with relatively high levels in stomach, moderate levels in small intestine, large intestine, liver, and kidney, and low levels in lung, brain, and uterus. The 3.8- and 4.1-kb AE2c mRNAs were detected only in stomach.

Exon sequences are shown in uppercase letters; 5' flanking and intron sequences are shown in lowercase letters; amino acids are shown above the corresponding codons. Transcription initiation sites determined by S1 nuclease analysis are marked with an asterisk above the nucleotide, and the 5'-most nucleotides identified by RACE analysis are either underlined with a heavy bar or indicated with a caret under the nucleotide. The 5'-most nucleotide of AE2a identified by cDNA cloning (10) is indicated with a black diamond. Potential transcription factor binding sites (see text) are underlined and labeled. Nucleotides are numbered with +1 corresponding to the major transcription initiation site identified by S1 nuclease and RACE analyses and negative numbers indicating 5'-flanking regions.

Use of Alternative Promoters Leads to Tissue-specific Expression of mRNAs Encoding N-terminal Variants of AE2-Earlier studies of the AE Cl-/HCO3- exchanger gene family revealed the existence of multiple mRNAs for each of the three known genes (9,10,24,25), and more recent analyses of the AE1 and AE3 genes have shown that some of these mRNAs are transcribed from alternative promoters and encode protein variants that differ in their N-terminal sequences (5-8, 11-13, 25-27). In the current study we have shown that similar mechanisms, involving alternative exon and promoter usage, are responsible for the tissue-specific expression of mRNAs encoding three N-terminal variants of AE2.
One of these variants is AE2a, a 1234-amino acid protein encoded by a 4.4-kb mRNA that is transcribed from the 5'-most promoter. Mapping of the transcription start site was hampered by palindromic sequences and the GC-richness of exon 1a and the 5'-flanking sequence. Nevertheless, S1 nuclease and RACE analyses show that the major site of initiation occurs at nt +1 to +4 (Figs. 5 and 7) and that a lower level of initiation occurs within the 11-nt sequence preceding this cluster. Some protection of the full-length S1 nuclease probe at nt -15 was observed; however, none of the 48 RACE products analyzed extended to this site. This suggests that the resistance of this region to S1 nuclease may have been due to formation of a hairpin secondary structure in the small portion of the probe not protected by the AE2a mRNA, although the possibility of a low level of transcription initiation occurring at this site or further 5' cannot be ruled out. The 5'-flanking region of AE2a lacks the more common basal promoter elements, but does contain several CACCC sequences (22,23) in both orientations and, in this respect, is similar to the 5'-flanking region of erythrocyte AE1 (7,28). The AE2a promoter is active in most, if not all, mammalian tissues, in contrast to both the erythrocyte AE1 promoter (28), which is highly tissue-specific, and the brain AE3 promoter, which is moderately tissue-specific (10). The apparently ubiquitous expression of the AE2a variant (Fig. 8) suggests that it serves a housekeeping function in many cell types.
The second variant, AE2b, is a 1220-amino acid protein that contains an alternative 3-amino acid N-terminal sequence that replaces the first 17 amino acids of AE2a. The results of S1 nuclease analysis correlated well with the results of RACE analysis, and demonstrated that the 4.2-kb AE2b mRNA is transcribed from an alternative promoter located in intron 2. The sequence immediately surrounding the transcription initiation site closely matches the consensus CAP signal frequently occurring at initiation sites (21). A potential CCAAT element occurs 65 nt upstream, which is within the preferred region for this element between nt -57 and -212 (21). Additional CCAAT sequences occur further upstream, but their functional significance is questionable as they are outside the preferred region for functional CCAAT elements. CACCC sequences and a potential Sp1 binding site were also observed. The AE2b mRNA is expressed in a more limited set of tissues than AE2a (Fig. 8), with the highest levels in stomach, consistent with the possibility that AE2b serves more specialized physiological functions than the ubiquitous AE2a.
The third variant, AE2c, is a 1035-amino acid protein that is identical to AE2a except that it lacks the first 199 amino acids. It is encoded by two mRNAs, 3.8 and 4.1 kb in length, that are transcribed from a promoter located in intron 5. The 4.1-kb mRNA, which retains the intron sequence between the donor splice site of exon 1c and the acceptor site of exon 6, contains multiple upstream open reading frames, which might interfere with translation of the long open reading frame encoding AE2c. Because of this, it is unclear whether it is a functional mRNA or an incompletely processed mRNA. A number of cases have been reported in which incomplete processing of mRNAs serves as a regulatory mechanism (reviewed in Ref. 29), and the existence of the 4.1-kb AE2c2 mRNA raises the possibility that this mechanism might be used in the regulation of AE2c expression. The results of S1 nuclease and RACE analysis showed that the AE2c transcription start site is located 209-211 nt upstream of the donor splice site of exon 1c. An AT-rich sequence that might serve as a TATA element begins at position -30 relative to the transcription initiation site. This is preceded by a potential CCAAT element at position -188, which is within the preferred region for such elements, and an inverted CACCC sequence at position -59. The AE2c promoter is highly tissue-specific, with expression of the AE2c mRNAs being observed only in stomach (Fig. 8), suggesting that this variant might serve a cell-type or organ-specific function.
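The lengths quoted for the three variants follow from simple bookkeeping on the AE2a sequence, and the sizes of the two AE2c mRNAs are consistent with retention of the 330-nt sequence between exon 1c and exon 6. The short Python sketch below illustrates that arithmetic only; the constant and function names are hypothetical labels chosen for this illustration, and the numbers are those stated in the text.

```python
# Illustrative arithmetic only; all names here are hypothetical labels.
# Values are the ones stated in the text.
AE2A_LENGTH = 1234       # amino acids in AE2a
AE2A_UNIQUE_NTERM = 17   # N-terminal residues unique to AE2a
AE2B_UNIQUE_NTERM = 3    # alternative N-terminal residues of AE2b
AE2C_START_MET = 200     # AE2c initiates at the codon for Met-200 of AE2a


def variant_lengths():
    """Return the predicted length (amino acids) of each AE2 variant."""
    # AE2b: swap the 17 unique AE2a residues for the 3 AE2b residues.
    ae2b = AE2A_LENGTH - AE2A_UNIQUE_NTERM + AE2B_UNIQUE_NTERM
    # AE2c: starting at Met-200 removes residues 1 through 199.
    ae2c = AE2A_LENGTH - (AE2C_START_MET - 1)
    return {"AE2a": AE2A_LENGTH, "AE2b": ae2b, "AE2c": ae2c}


def predicted_long_mrna_kb(short_kb=3.8, retained_nt=330):
    """Size (kb) expected if the short AE2c mRNA retains the 330-nt sequence."""
    return round(short_kb + retained_nt / 1000, 2)


if __name__ == "__main__":
    print(variant_lengths())
    print(predicted_long_mrna_kb())
```

The predicted size of the longer mRNA (3.8 kb plus 0.33 kb) agrees, within the resolution of a Northern blot, with the 4.1-kb estimate for AE2c2.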
Intron/Exon Organization and Possible Evolutionary Relationships among the AE Genes-The intron/exon organization of the AE2 gene is remarkably similar to those of the other AE genes (Fig. 9),³ with the greatest similarity between AE2 and AE3. If the alternative exons transcribed from internal promoters are excluded from consideration, both genes have 23 exons. For each gene the first exon consists of 5'-untranslated sequence and the second exon contains the translation start site and first 17 codons. There are 21 additional exons in each gene, with the last exon containing a conserved C-terminal coding sequence and the 3'-untranslated sequence. The positions of 19

³ The transcription initiation site for the long form of AE3 has not yet been identified. Analysis of sequences extending several hundred bp upstream of the known 5'-most exon of AE3 does not reveal consensus acceptor splice sites (S. Linn and G. E. Shull, unpublished data), and the relative sizes of the brain and cardiac AE3 mRNAs suggest that there is little, if any, additional sequence in the larger mRNA besides that present in the cDNA. Therefore, in the discussion and Fig. 9 we have assumed that the 5'-most exon that was identified in the AE3 gene (11) is exon 1. However, the reader should be aware that the possibility of additional upstream exons has not been rigorously excluded.

When the AE2 (or AE3) exon sequences are aligned with those of the AE1 gene, the positions of the splice junctions for exons 8-23 of AE2 are identical to those of exons 5-20 of AE1, except that the junction between exons 13 and 14 of AE2 is shifted by a few codons relative to the junction between exons 10 and 11 of AE1. Based on the conservation of these junctions and the high degree of similarity between the amino acid sequences (10), it is apparent that the last 16 exons of all three AE genes are homologous.
However, there is little, if any, significant similarity between the first four exons of the AE1 gene and the first seven exons of the AE2 or AE3 genes. This suggests that the 5'-most exons of the AE1 gene may have had a separate evolutionary origin from the corresponding region of the AE2 and AE3 genes. One possibility is that the differences at the 5' ends of these genes might have arisen during the chromosomal rearrangement events that were responsible for dispersing these genes in the mammalian genome. For example, the current structure of either the AE1 gene or the AE2 and AE3 genes might have resulted from a chromosomal rearrangement in which the last 16 exons of an ancestral gene were recombined with the promoter and 5' coding exons of another gene.
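The exon correspondence described above (exons 8-23 of AE2 aligning with exons 5-20 of AE1, preceded by seven and four non-homologous exons, respectively) amounts to a constant offset of three exon numbers. The following Python sketch restates that numbering only; the function and constant names are hypothetical, and it says nothing about sequence similarity itself.

```python
# Illustrative restatement of the exon numbering given in the text;
# the names used here are hypothetical.
AE2_FIRST_CONSERVED = 8    # first AE2 exon with a conserved splice junction
AE2_LAST_EXON = 23
AE1_FIRST_CONSERVED = 5    # corresponding AE1 exon
OFFSET = AE2_FIRST_CONSERVED - AE1_FIRST_CONSERVED  # three exon numbers


def ae2_to_ae1_exon(ae2_exon):
    """Map an AE2 exon number to the corresponding AE1 exon, or None.

    Exons 1-7 of AE2 have no counterpart among the first four AE1 exons.
    Note that the AE2 13/14 junction is shifted by a few codons relative
    to the AE1 10/11 junction, so that pair is homologous but not identical
    in junction position.
    """
    if AE2_FIRST_CONSERVED <= ae2_exon <= AE2_LAST_EXON:
        return ae2_exon - OFFSET
    return None


def conserved_pairs():
    """List (AE2 exon, AE1 exon) pairs for the conserved 3' region."""
    return [(e, ae2_to_ae1_exon(e))
            for e in range(AE2_FIRST_CONSERVED, AE2_LAST_EXON + 1)]


if __name__ == "__main__":
    for ae2_exon, ae1_exon in conserved_pairs():
        print(f"AE2 exon {ae2_exon} <-> AE1 exon {ae1_exon}")
```

The list contains 16 pairs, matching the 16 homologous exons inferred from the conserved splice junctions.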
The patterns of alternative promoter and exon usage of the AE2 gene are analogous to those of the AE1 and AE3 genes (Fig. 9), although the locations of the internal promoters and the consequent variations in protein structure are different. Transcription from the internal promoters leads either to a switch in N-terminal amino acid sequence or to a truncation of the protein, depending on whether the alternative first exon contains coding sequence or consists entirely of untranslated sequence. In the latter case an internal Met codon in a downstream exon serves as the translation initiation site. The internal promoters for AE2b and AE2c are located in introns 2 and 5 of the AE2a transcription unit, respectively, whereas the kidney AE1 promoter is located in intron 3 and the cardiac AE3 promoter is located in intron 6. It is clear from their positions within each gene that the promoters, and associated first exons, for AE2b, AE2c, and cardiac AE3 arose independently of each other during evolution, but it is less apparent whether the promoters and first exons for kidney AE1 and cardiac AE3 arose independently, as they occupy the same positions relative to the 16 highly conserved exons. However, the differences in tissue specificity and the absence of significant similarity between exon 4 of AE1 and exon 7 of AE3, which lie between the alternative first exon and the highly conserved exons, argue against the possibility that these promoters share a common evolutionary origin.
Potential Functions of the Alternative AE2 Promoters-The biological rationale for the evolutionary development of alternative promoters in the AE2 gene has not been established, but a gene regulatory role is suggested by the observed differences in tissue specificity. For example, the high levels of AE2 mRNA in stomach are due, in large part, to expression of the AE2b mRNA at much higher levels in stomach than in other tissues, and to expression of the stomach-specific AE2c mRNAs. Thus, the AE2b and AE2c promoters may have evolved, in part, as a genetic mechanism for producing the high Cl-/HCO3- exchanger levels that are required in stomach.
(4,7,19); AE1k1, kidney AE1 mRNA with alternative first exon spliced to exon 4, corresponding to mouse and human kidney mRNAs and a minor rat kidney mRNA (5,7,8); AE1k2, major rat kidney mRNA in which intron 3 sequences are retained (5,7). AE3b, mRNA encoding brain form of AE3, which is also expressed in some other tissues (9, 10); AE3c, mRNA encoding cardiac form of AE3 (11-13). AE2 mRNAs are described in text. Right column, number of codons in long open-reading frame of each mRNA.

Because use of the alternative AE2 promoters leads to variations in N-terminal amino acid sequences, it is likely that they also serve as a mechanism for producing protein variants with altered, and physiologically relevant, functional characteristics. The generation of the AE2b variant and that of the cardiac AE3 variant are analogous in that each involves a switch in the N-terminal amino acid sequence relative to the long form of the exchanger. The extent of the sequence alterations, however, is quite different. The differences between AE2a and AE2b are restricted to the extreme N-terminal sequence encoded by the first coding exon. In contrast, transcription from the cardiac AE3 promoter results in the replacement of a 270-amino acid N-terminal sequence of brain AE3, which is encoded by five exons, with an alternative 73-amino acid sequence encoded by the cardiac-specific first exon (11-13). The 270-amino acid sequence that is eliminated contains a histidine-rich domain, numerous proline-rich regions, and domains consisting of stretches of acidic or basic residues that have counterparts in AE2a (10,14). Transcription from the AE2c promoter leads to production of an mRNA encoding an N-terminal truncated form of the exchanger in which Met codon 200 serves as the apparent initiation codon. In this respect the AE2c mRNA is analogous to the kidney AE1 mRNA, which also encodes an N-terminal truncated exchanger that utilizes an internal Met codon as a translation start site. At the protein level the differences between AE2c and AE2a are similar to the differences between cardiac AE3 and brain AE3. AE2c is approximately the same size as cardiac AE3, and the truncation of its N-terminal sequence removes the histidine-rich domain (residues 74-88), the proline-rich regions, and the stretches of acidic (residues 122-130) and basic amino acids (residues 94-109) that are homologous to those eliminated in cardiac AE3. Although the functions of these domains have not been determined, it is likely that they play a role in regulating the activity of the exchanger. Replacement of the 17-amino acid N terminus of AE2a with the 3-amino acid N terminus of AE2b is a relatively limited sequence alteration compared with those seen in the other AE variants. However, the unique N terminus of AE2a contains a potential phosphorylation site for cAMP-dependent protein kinase at serine 10. An increase in cAMP is known to inhibit the absorption of NaCl across the apical membranes of intestinal epithelial cells (30). This process involves the coupled activities of a Cl-/HCO3- exchanger, which appears to be the AE2a variant (16), and a Na+/H+ exchanger (16) that is thought to be NHE3 (31). Although we are unaware of any evidence demonstrating that cAMP causes a direct inhibition of AE2 in intestine, cAMP has been shown to inhibit Cl-/HCO3- exchange in osteoblasts (32). The identity of the exchanger in osteoblasts was not determined, but it is conceivable that it is the ubiquitous AE2a.
In stomach, secretion of acid is stimulated by cAMP (33), but cAMP does not alter the transport capacity of the basolateral Cl-/HCO3- exchanger in the parietal cell (34), which is known to be a variant of AE2 (18). If Ser-10 of AE2a does serve as an inhibitory phosphorylation site, then the elimination of this site in AE2b and AE2c could serve an important function during acid secretion.
A second possibility that should be considered is that the variant N-terminal sequences contain sorting signals or cytoskeletal attachment sites that influence the membrane location of the exchanger in polarized epithelial cells. Such a possibility is suggested by the recent demonstration that some variants of chicken AE1 are sorted to the plasma membrane when expressed in human erythroleukemia cells, whereas other variants, which differ in their N-terminal sequences, are retained in intracellular membranes (27). Immunolocalization studies have shown that AE2 is present on apical membranes of villus and crypt enterocytes in ileum (16) and on basolateral membranes of gastric parietal cells (18). The variant that was cloned from ileum is AE2a (16), and previous Northern blot data (10) suggest that in ileum the level of AE2a mRNA is greater than that of AE2b, consistent with the possibility that AE2a is the apical exchanger in ileum. Likewise, AE2b and AE2c mRNAs are abundant in stomach, suggesting that one or both of these variants might mediate Cl-/HCO3- exchange across the basolateral membrane of the parietal cell.
There is at least one known example in which the use of alternative promoters leads to an alteration in the localization of the encoded proteins. It has been shown that leukemia inhibitory factor can exist as a diffusible form or as an extracellular matrix-associated form (35), and that the two proteins, which contain only slightly different N-terminal sequences (MKVLAAG for the diffusible form and MRCR for the matrix-associated form), are encoded by mRNAs that are transcribed from alternative promoters. Thus, it seems reasonable to speculate that the unique 17-amino acid N-terminal sequence of AE2a might localize this variant to the apical membrane in certain polarized epithelial cells, and that elimination of this sequence, as in AE2b and AE2c, might result in sorting to the basolateral membrane. If future experiments prove this hypothesis to be correct, it would not rule out the possibility that the same variant can be sorted to different membranes depending on the cell type in which it is expressed, but it would provide an explanation of at least one mechanism by which AE2 can be localized to either apical or basolateral membranes. Also, it would implicate AE2b as a possible candidate for the Cl-/HCO3- exchanger that functions on the basolateral membranes of epithelial cells of renal proximal tubules (36), the thick ascending limb (37), and β-intercalated cells of the cortical collecting duct (38).