Structure of the Human Sarco/Endoplasmic Reticulum Ca2+-ATPase 3 Gene

Human chromosome 17-specific genomic clones extending over 90 kilobases (kb) of DNA and coding for sarco/endoplasmic reticulum Ca2+-ATPase 3 (SERCA3) were isolated. The presence of the D17S1828 genetic marker in the cosmid contig enabled us to map the SERCA3 gene (ATP2A3) 11 centimorgans from the top of the short arm p of chromosome 17, in the vicinity of the cystinosis gene locus. The SERCA3 gene contains 22 exons spread over 50 kb of genomic DNA. The exon/intron boundaries are well conserved between human SERCA3 and SERCA1 genes, except for the junction between exons 8 and 9 which is found in the SERCA1 gene but not in SERCA3 and SERCA2 genes. The transcription start site (+1) is located 152 nucleotides (nt) upstream of the AUG codon. The 5′-flanking region, including exon 1, is embedded in a 1.5-kb CpG island and is characterized by the absence of a TATA box and by the presence of 14 putative Sp1 sites, 11 CACCC boxes, 5 AP-2-binding motifs, 3 GGCTGGGG motifs, 3 CANNTG boxes, a GATA motif, as well as single sites for Ets-1, c-Myc, and TFIIIc. Functional promoter analysis indicated that the GC-rich region (87% G + C) from −135 to −31 is of critical importance in initiating SERCA3 gene transcription in Jurkat cells. Exon 21 (human, 101 base pairs; mouse, 86 base pairs) can be alternatively excluded, partially included, or totally included, thus generating, respectively, SERCA3a (human and mouse, 999 amino acids (aa)), SERCA3b (human, 1043 aa; mouse, 1038 aa), or SERCA3c (human, 1024 aa; mouse, 1021 aa) isoforms with different C termini. Expression of the mouse SERCA3 isoforms in COS-1 cells demonstrated their ability to function as active pumps, although with different apparent affinities for Ca2+.

Sarco/endoplasmic reticulum Ca2+-ATPases (SERCAs),¹ mediating the uptake of Ca2+ into intracellular stores such as sarcoplasmic and endoplasmic reticulum, are encoded by three distinct genes in higher vertebrates (reviewed in Ref. 1). SERCA1 is expressed only in the fast-twitch skeletal muscle as one of its developmentally spliced variants: the adult SERCA1a (994 aa) or the neonatal SERCA1b (1001 aa). Both isoforms present identical amino acid sequences up to amino acid 993. As a result of retention/excision of the penultimate exon (42 bp), respectively, in the SERCA1a/SERCA1b splice variants, the last amino acid (Gly) in SERCA1a is replaced in SERCA1b by a highly charged octapeptide sequence DPEDERRK (2). COS cell expression studies showed no functional differences between SERCA1a and SERCA1b isoforms. The complete structures of the rabbit (2) and human (3) SERCA1 genes have been elucidated. The SERCA1 gene (ATP2A1) has been mapped to human chromosome 16p12.1 (4) and a deficiency in SERCA1 is responsible for at least one autosomal recessive form of Brody disease (5). Tissue-specific processing of the SERCA2 gene primary transcript generates up to four mRNA classes (6), which code for two isoenzymes as follows: a cardiac/slow-twitch skeletal muscle protein (SERCA2a) and a ubiquitously expressed isoform (SERCA2b). As a result of alternative splicing, the SERCA2a-specific C terminus comprising the sequence AILE (aa 994-997) is replaced by a variant tail of 49 or 50 amino acids in SERCA2b (7-9). This extended tail contains a very hydrophobic stretch, which is suggested to represent a possible 11th transmembrane segment (7-9). The divergence in the C-terminal part is responsible for functional differences between SERCA2a and SERCA2b (10, 11); these differences were recently ascribed to the presence of the last 12 amino acids in SERCA2b (12).
Thus far, the complete structure of a SERCA2 gene is lacking, but partial characterization of the 5′- and/or 3′-ends of the gene has been reported for human (7, 13), rabbit (14), pig (15), and rat (16). The SERCA2 gene (ATP2A2) has been mapped to human chromosome 12q23-q24.1 (17). Structural and functional analyses of the SERCA2 gene promoter in rabbit (18-20), rat (16), and human (13) identified the promoter regions required for transcriptional activity in NIH3T3 fibroblasts, primary cultured rat cardiomyocytes, C2C12 and Sol8 muscle cells. Several putative cis-acting elements have been described, among which Sp1 sites and thyroid-responsive elements have been proven to exert an important role in transcriptional regulation of the SERCA2 gene (20, 21). Unique SERCA genes have also been described in invertebrates, such as the crustacean Artemia franciscana (22) and the insect Drosophila melanogaster (23). The gene primary transcript is alternatively spliced in Artemia, and the expression of the two isoforms is regulated by tissue-specific alternative promoters (24).
The first report describing the cloning of the SERCA3 cDNA from rat kidney (25) indicated a broad expression pattern for its 4.8-kb transcript. Recent studies demonstrated that SERCA3 is always co-expressed along with the ubiquitous SERCA2b isoform (26), and high levels of SERCA3 mRNA have been documented in the hematopoietic cell lineage, arterial endothelial and secretory epithelial cells, as well as in cerebellar Purkinje neurons (27-30). Upon expression in COS-1 cells, SERCA3 presents a much lower apparent affinity for Ca2+ when compared with the other members of the SERCA family (10). We have previously identified the 97-kDa SERCA3 (999 aa) in both human and rat platelets using a set of SERCA3-specific antisera (27). Additionally, we cloned the human SERCA3 cDNA, isolated and partially characterized a genomic clone encoding all but the 5′-end of the gene, and localized the SERCA3 gene (ATP2A3) on human chromosome 17p13.3 (31). Until very recently, there were no indications that the SERCA3 pre-mRNA was subject to alternative splicing. Two mouse nucleotide sequences coding for SERCA3a and SERCA3b have been deposited in the EMBL/GenBank™ data bank.² So far, no indications regarding the alternative splicing mechanism have been published.
We now document the complete exon/intron organization of the human SERCA3 gene. The transcription initiation site and several upstream putative cis-regulatory elements were identified. The functional promoter analysis delineates the minimal promoter region responsible for efficient transcriptional activity and suggests the involvement of the Sp1 transcription factor. We also provide evidence that the human and mouse SERCA3 gene primary transcripts are alternatively spliced, thereby generating not two but three distinct isoforms with altered C termini as follows: SERCA3a, SERCA3b, and SERCA3c. Furthermore, the three mouse SERCA3 isoforms were overexpressed in COS cells and shown to be functionally active but with different apparent affinities for Ca2+.
FIG. 1 (legend, partial). The overlapping restriction maps of the clones (from left to right: ICRFc105-A09183, -G09189, -C10135, -F021, GHS3, ICRFc105-F10124, and -G1035) are shown below the scale line. The enlargement of a 50-kb genomic region (illustrated as a rectangle) encoding ATP2A3 is shown below the ICRFc105-A09183 clone. B*, BamHI; C, ClaI; E, EcoRI; H, HindIII; K*, KasI; and N, NotI. The asterisks denote that the indicated recognition sites are not unique ones in the cosmid contig. The 6.6-kb fragment (black box) flanked by B* and K* sites and used in the functional promoter studies is also indicated. The exon/intron layout of the gene is displayed below the restriction map of the 50-kb genomic region. The splicing pattern and the sizes (not to scale) of the 22 exons are indicated below the gene structure together with the ATG codon and polyadenylation signal (AATAAA). Exon 21 is optional, and when included, it is either 88 or 101 bp long. The inset shows the results of PCR amplifications using specific primers for the genetic marker D17S1828 and as DNA template [...]

MATERIALS AND METHODS
Isolation and Characterization of Genomic Clones-To isolate the entire gene, a human chromosome 17-specific library from Reference Library Data Base, ICRF (32), was screened with the 1482-bp EcoRI insert of the human SERCA3 partial cDNA clone Z8 (31). The EcoRI fragment comprised 6 bp of the 5′-untranslated region and the first 1476 bp of the coding region. Six new positive clones (Fig. 1) were isolated and further characterized according to standard restriction mapping and sequencing protocols. Analysis of repetitive sequences was carried out using the CENSOR server.³ The computer-assisted analysis of the putative transcription factor binding sites was performed using the Wisconsin Package Version 9.0 program from Genetics Computer Group (GCG), Madison, WI.
FIG. 2. Exon/intron boundaries of the human SERCA3 gene. The nucleotide sequence of the exon/intron boundaries is shown, and the determined or estimated (~) intron sizes are indicated in kilobases (kb). Exon sequences are shown in uppercase letters and intron sequences in lowercase letters. The numbers above the exon sequences denote the nucleotides where splicing occurs (numbering relative to the ATG codon). The deduced amino acid sequences at each junction are displayed below the exon sequences. Exon/intron boundaries are indicated by diagonal lines, and the conserved GT and AG nucleotides are shown in boldface. The asterisk in the last intron indicates the presence of an optional exon (exon 21). The sizes of all exons are shown in Fig. 1.
Primer Extension Analysis-Poly(A)+ RNA was isolated from human tonsils (31). Primer extension analysis was essentially performed as described (33). The extension primer used (5′-GAGGCCATGTCCGTGCTGGGAC-3′) corresponds to the inverse complement of nucleotides 25-46 (numbering relative to the determined transcription initiation site; see Fig. 5b). The 35S-labeled sequencing products (used as size markers) of a 5′ genomic fragment primed with the same extension primer and the extension products were separated on a 6% polyacrylamide, 7 M urea sequencing gel.
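The relationship between the extension primer and the two numbering schemes used in this paper (relative to the transcription start site and relative to the ATG codon) can be checked with a short script. This is a minimal sketch: the function names are hypothetical, and it assumes the reported distance of 152 nt between the transcription start site and the ATG codon and a numbering convention without a position 0.

```python
# Minimal sketch; helper names are illustrative and not from the paper.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Reported distance: the transcription start site (+1) lies 152 nt upstream of the ATG.
TSS_UPSTREAM_OF_ATG = 152


def reverse_complement(seq: str) -> str:
    """Return the inverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def tss_to_atg_coordinate(pos_from_tss: int) -> int:
    """Convert TSS-based numbering (+1 = start site) to ATG-based numbering.

    Assumed convention: the A of ATG is +1, the nucleotide just upstream
    is -1, and there is no position 0.
    """
    atg_a_in_tss_numbering = TSS_UPSTREAM_OF_ATG + 1
    offset = pos_from_tss - atg_a_in_tss_numbering
    return offset if offset < 0 else offset + 1


EXTENSION_PRIMER = "GAGGCCATGTCCGTGCTGGGAC"  # 22-nt antisense primer from the text

if __name__ == "__main__":
    # Coding-strand sequence that the primer anneals to (nt 25-46 from the TSS).
    print(reverse_complement(EXTENSION_PRIMER))
    # The same stretch expressed relative to the ATG codon.
    print(tss_to_atg_coordinate(25), tss_to_atg_coordinate(46))
```

Under these assumptions, nucleotides 25-46 from the start site correspond to positions −128 to −107 relative to the ATG, which agrees with the primer location given in the Results.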
Analysis of Promoter Activity-A 6.6-kb BamHI-KasI genomic fragment containing the 5′-flanking region of the human SERCA3 gene (Fig. 1) was subcloned between the BglII and HindIII restriction sites of the luciferase expression vector pGL3 basic (Promega, Madison, WI). The resulting plasmid, p6.6BK, was used as a template for further generation of controlled deletions by making use of restriction sites present within the genomic insert and the luciferase cloning vector (Fig. 5c). The 5′-end of each of the deletion constructs was confirmed by sequencing. The human SERCA3 promoter-luciferase constructs and the unmodified, promoterless pGL3 basic reporter vector were used for transient transfection of the human cell line Jurkat E6.1 by electroporation as described (34). To evaluate transfection efficiencies, the cells were co-transfected with 150 or 500 ng of a pEL1-βgal vector, containing the β-galactosidase reporter gene driven by the elongation factor 1 promoter (pEL1-βgal vector is a gift from Dr. F. Bulens).⁴ Reporter enzyme activities were assayed 40 h after electroporation according to the manufacturer's instructions. The measurements were performed with the MicroLumat LB 96P luminometer (EG & G, Berthold, Bad Wildbad, Germany) and corrected for protein concentration, as determined by the bicinchoninic acid method (Pierce), using bovine serum albumin as standard. Luciferase activities are expressed relative to the β-galactosidase activities and normalized to the value obtained with the promoterless pGL3 basic vector, which is set at 1.
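The normalization described above (luciferase relative to β-galactosidase, then scaled so that the promoterless vector equals 1) can be written out as a small calculation. This is a sketch only: the function name and the readings in the example are hypothetical and are not data from this study.

```python
# Minimal sketch of the reporter normalization; all readings below are invented.

def relative_luciferase(luc: float, bgal: float,
                        luc_basic: float, bgal_basic: float) -> float:
    """Luciferase/beta-gal ratio of a construct, normalized to the promoterless vector.

    luc, bgal             -- reporter readings for the promoter construct
    luc_basic, bgal_basic -- readings for the promoterless pGL3 basic control
    """
    if bgal <= 0 or bgal_basic <= 0 or luc_basic <= 0:
        raise ValueError("control and beta-galactosidase readings must be positive")
    construct_ratio = luc / bgal            # corrects for transfection efficiency
    control_ratio = luc_basic / bgal_basic  # promoterless baseline
    return construct_ratio / control_ratio  # baseline is 1 by construction


if __name__ == "__main__":
    # Hypothetical readings (arbitrary units), for illustration only.
    print(relative_luciferase(luc=450.0, bgal=1.5, luc_basic=100.0, bgal_basic=1.0))
```

Dividing by the β-galactosidase signal first means that differences in electroporation efficiency between cuvettes cancel before constructs are compared with one another.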
Tissue Distribution of Human SERCA3 mRNA-The human RNA Master Blot (CLONTECH, Palo Alto, CA), to which high quality poly(A)+ RNAs from 50 different adult and fetal tissues have been immobilized along with several controls (Fig. 6), was hybridized following the manufacturer's protocol. The synthesis of a 3′-end probe by PCR was described earlier (31). The probe corresponds to the nucleotides 3033-3405 (accession number Z69881) found in the 3′-untranslated region of human SERCA3 cDNA. The blot was analyzed by means of a PhosphorImager model STORM 840 (Molecular Dynamics, Sunnyvale, CA). A common SERCA3b/SERCA3c probe (90 bp long) was PCR-synthesized using the 5′ primer N+ (5′-GCACGGCCTTCTCAGGACAGTCT-3′), the 3′ primer P1 (5′-GGCTCATTTCTTCCGGTGTGGTCTGG-3′), and the GHS3 clone as template DNA; these primers (Fig. 8a) span the exon/intron junctions involved in the alternative splicing. PCR amplification was carried out for 20 cycles, each cycle consisting of 30 s at 94°C, 30 s at 65°C, and 30 s at 72°C.
FIG. 4 (legend, partial). [...] nt-long radioactively labeled oligonucleotide (extension primer) was hybridized to 12 µg of poly(A)+ RNA from human tonsils, and the extension reaction was performed as described (41). The location of the extension primer is indicated in Fig. 5b. The extended products (lane Ext) were analyzed on denaturing polyacrylamide gels. Lanes A, C, G, and T show the migration of the 35S-labeled sequencing products (as size markers) derived from a 406-bp EcoRI-KasI genomic fragment (the restriction sites are indicated in Fig. 5c) primed with the same extension primer. The position of the extended product (46 nt long) is indicated at the right of the autoradiogram together with an enlargement showing the sequence (5′ to 3′) surrounding the transcription start site (+1). The coding strand is boxed.
Reverse Transcriptase-PCR Analyses-Total RNA (0.5 µg) from mouse pancreatic islets (gift from D. L. Eizirik and D. Pipeleers, Department of Metabolism and Endocrinology, Vrije Universiteit, Brussels, Belgium) and 0.5 µg of poly(A)+ RNA from human kidney (CLONTECH) were reverse-transcribed in an oligo(dT)-primed reaction as described (27). The mouse SERCA3 primers used are as follows: a 5′ primer M+1 (5′-GGGGTGGTGCTTCAGATGTCTCTGC-3′) corresponding to nucleotides 2948-2972 in mouse SERCA3a and SERCA3b nucleotide sequences (accession numbers U49394 and U49393, respectively) and a 3′ primer M−1 (5′-GGACAAATGCCTGGATGCTCTCAGT-3′) corresponding to the inverse complement of nucleotide stretches 3086-3110 and 3159-3183 in mouse SERCA3a and SERCA3b cDNA nucleotide sequences, respectively. A specific 3′ primer for the mouse SERCA3c isoform, P3 (5′-CTTCAGGTCCTTTTTTTCCAAGAAGCCAAC-3′), spans the splice boundary between the last exon and an optional exon. PCR amplifications were carried out for 35 cycles, each cycle consisting of 30 s at 94°C, 30 s at 68°C, and 30 s at 72°C for both M+1/M−1 and M+1/P3 pairs. The human SERCA3 primers used are as follows: a common 5′ primer 22+ (5′-CTGCACTTCCTCATCCTGCTCG-3′) corresponding to nucleotides 2833-2854 and a 3′ primer 1− (5′-ATGGGCACCATCAGTCTGAGG-3′) corresponding to the inverse complement of the nucleotide stretch 3040-3060; numbering according to the nucleotide sequence deposited under accession number Z69881. Two additional 3′ primers specific for the human SERCA3b and SERCA3c isoforms were designed as follows: the above mentioned primer P1 and the primer P2 (5′-GGCTCATTTCTTCAAAGAGGCCAAC-3′), respectively. The PCR conditions were the same for the 3 pairs of primers (22+/1−, 22+/P1, and 22+/P2): 35 cycles, each cycle consisting of 30 s at 94°C, 30 s at 65°C, and 30 s at 72°C.
All PCR amplifications were performed using a mixture of Pwo (proofreading activity) and Taq polymerases from Boehringer Mannheim, Brussels, Belgium. M+1, M−1, P2, and P3 primers are also represented in Fig. 8a. PCR fragments were gel-purified and subcloned, and for each fragment several individual clones were sequenced.
Construction and Expression of SERCA3 cDNAs in COS-1 Cells-The entire coding regions of the mouse SERCA3a, SERCA3b, and SERCA3c cDNAs were amplified by PCR from mouse pancreatic islets first-strand cDNA using a common 5′ primer MMLD (5′-AGAAGCGACCTGGACGTCGCGGAC-3′) corresponding to nucleotides 8-31 in mouse SERCA3a and SERCA3b cDNA sequences (numbering according to accession numbers U49394 and U49393, respectively) in combination with either the primer M−1 (for SERCA3a and SERCA3b amplifications) or the SERCA3c-specific primer P3. PCR reactions were carried out for 35 cycles, each cycle consisting of 1 min at 94°C and 4 min at 72°C for both MMLD/M−1 and MMLD/P3 primer pairs. PCR products were separated by 1% agarose gel electrophoresis, gel-purified, blunt-ended, phosphorylated, and transferred into the EcoRI-cut, dephosphorylated, and blunt-ended mammalian expression vector pMT2 (from R. J. Kaufman, Genetics Institute, Boston, MA). The cloning of the pig SERCA2b cDNA in pSV57 expression vector was described earlier (11). COS-1 cell culture and DEAE-dextran-mediated DNA transfections were performed as described (11).
Membrane Preparations, Immunoblotting Analysis, and Ca2+ Transport Assays-Microsomes were isolated from COS-1 cells expressing mouse SERCA3a, SERCA3b, SERCA3c, and pig SERCA2b according to Verboomen et al. (11). Preparation of the N89 anti-SERCA3 antibody, denaturing gel electrophoresis on 0.75-mm-thick 7.5% polyacrylamide slab gels, semi-dry blotting onto Immobilon-P membranes (Millipore, Brussels, Belgium), and immunostaining of the blots were done as reported earlier (27). Oxalate-stimulated Ca2+ uptake was measured by a rapid filtration method in the absence or presence of 5 mM ATP at 27°C as described (12).

Isolation and Characterization of Human SERCA3 Genomic Clones-We have previously described the isolation and partial characterization of the first genomic clone (GHS3, approximately 40 kb in length) specifying the 3′ region of the human SERCA3 gene and localized the gene by fluorescence in situ hybridization to human chromosome 17 (31). Subsequent screening of a human chromosome 17-specific cosmid library from the Reference Library Data Base, ICRF (32) with a probe corresponding to the 5′-coding region of human SERCA3 cDNA resulted in the isolation of six new overlapping cosmid clones, whose restriction maps are illustrated in Fig. 1. The cosmid contig covers a genomic region of about 90 kb. The exon/intron organization of the human SERCA3 gene and the sizes of all exons are shown in Fig. 1. The gene is divided into 22 exons distributed across 50 kb of genomic DNA. Exon 21 is optional and, when retained, consists of 88 or 101 bp due to the use of an internal donor splice site (see below). The positions and sizes of introns were determined by PCR analyses using SERCA3-specific primers derived from exonic sequences homologous to those flanking the exon/intron junctions in human (3) and rabbit SERCA1 (2), for which the complete gene structures are known. The exon sequences obtained by genomic sequencing perfectly match those from the cDNA (accession number Z69881), except for a single polymorphism in which a C replaces a T at position 1361 of the cDNA sequence. This point mutation does not change the amino acid sequence at position 453 (Asn). The nucleotide sequences of the exon/intron junctions as well as the position and size of each intron in relation to the amino acid stretch are shown in Fig. 2. However, the last intron (intron 20, 3.218 kb long) contains an optional exon (exon 21). The inset in Fig. 1 documents the presence of the D17S1828 marker in the cosmid clones ICRFc105-G1035 and -F10124 via PCR analysis.
D17S1828, containing the dinucleotide repeat (CA)22, has been mapped by Genethon 11 cM from the top of the short arm p of chromosome 17. The amplified product (215 bp) was used as a probe in Southern blot hybridization analysis and assigned to a position approximately 20 kb downstream of the SERCA3 gene (Fig. 1).
Comparison of Exon/Intron Boundaries of SERCA Genes-The exon/intron structure of the human SERCA3 gene is compared in Fig. 3a (2). We found that this junction is conserved in both human SERCA1 and SERCA3 (Fig. 3a). The complete analysis of the exon/intron layout of the human SERCA1 and SERCA3 genes indicated that the positions of all junctions are conserved except for one boundary, which is found in the SERCA1 gene between exons 8 (298 bp) and 9 (167 bp), but not in the SERCA3 gene. This boundary is also absent from the Artemia and Drosophila genes. PCR amplifications from both human genomic DNA and human kidney first-strand cDNA with human SERCA2-specific primers (Fig. 3b) documented the absence of an intervening sequence in human SERCA2 at this position, too.
Structural and Functional Analyses of the 5′-End of the Gene-The cosmid contig described in Fig. 1 contains approximately 22 kb of genomic DNA upstream of the translation initiation site, of which the proximal 4447 nt were sequenced. In order to determine the transcription initiation site for the SERCA3 mRNA, primer extension analysis was performed with an antisense 22-nt long extension primer, stretching from −107 to −128 nt upstream of the ATG site, and using poly(A)+ RNA from human tonsils. The result in Fig. 4 shows a single extension product of 46 nucleotides, thus locating the transcription initiation site (referred to as nt +1) at position 152 upstream of the translation initiation site (Fig. 5b). The transcription initiation site determined in the present study is found in the vicinity of the sequence 5′-CCACTGC-3′ (represented as a box in Fig. 5b) matching the consensus initiator (Inr) sequence YYAN(T/A)YY (35), where the A indicates the transcription start site frequently used in other genes.
An 11-kb nucleotide sequence, comprising 4447 bp upstream of the cap site, exon 1 (270 bp), and the first 6500 bp of intron 1, was analyzed in terms of CG content (Fig. 5a). This analysis shows that exon 1 is embedded in a typical "CpG" island, i.e. a GC-rich region (1-2 kb in size), characterized by dense clustering of CpG dinucleotides, and frequently found at the 5′-end of a gene (36). The CpG island associated with the 5′-end of the human SERCA3 gene is 1.5 kb in size. Interestingly, the CpG island is preceded by a series of repetitive sequences belonging to J and S types of the Alu family. The nucleotide sequence (2458 bp long), including exon 1 and its 5′-flanking region and part of intron 1, is shown in Fig. 5b. Further analysis of the sequence upstream of the cap site revealed no TATA box. SERCA3 transcription appears to be driven by a TATA-less GC-rich promoter. Analysis of 1775 nucleotides upstream of the transcription initiation site revealed that its proximal part from −455 to −1 (GC-rich region with 72% G + C) concentrates consensus sequences for several potentially important cis-regulatory elements (Fig. 5b). The Sp1-binding site 5′-GGGCGG-3′, present in constitutively expressed genes (37), and its inverse complementary sequence 5′-CCGCCC-3′ are present 13 times. One Sp1-like element 5′-GGGAGG-3′ was also found at position −44 to −39. The sequence 5′-CACCC-3′ and its inverse complement 5′-GGGTG-3′ occurred 11 times. This motif was reported to be potentially associated with a glucocorticoid receptor binding site (38) and to be functionally important in the human and mouse β-globin promoter, where it can be bound by both an erythroid Krüppel-like protein and the ubiquitous factor Sp1 (39). In contrast, the human SERCA2 promoter (19) is characterized by the presence of only one 5′-CACCC-3′ box. The sequence 5′-CANNTG-3′ (3 times) corresponds to the consensus binding site of muscle-specific transcription factors of the MyoD family (40).
The binding site (5′-CC(C/G)C(A/G)GGC-3′ or 5′-CCCCCCGG-3′) of the general transcription factor AP-2 (41) occurred 4 times. The octamer sequence 5′-GGCTGGGG-3′ (referred to as OCT in Fig. 5b) was found at positions −887 to −880, 447-454, and 454-461. This motif, known to behave as a cis-acting element in the β-globin gene promoter (42), was also present in the 5′-flanking region of SERCA1 (2) and SERCA2 (14) genes. The GATA motif 5′-GATAAG-3′, currently associated with hematopoietic and endothelial expressed genes (43), was found in the SERCA3 gene immediately upstream of the 5′-GGCTGGGG-3′ element at position −886 to −881. The sequence 5′-TCTCTTA-3′ known to bind c-Myc protein (44) was found at position −831 to −825. The sequence 5′-CAGGCGGT-3′, representing the inverse complement of the TFIIIc consensus binding site, was found at position −119 to −112 (45). The Ets-1 cis-element 5′-GAGGAAG-3′ (46) was found at position −1181 to −1175. A trinucleotide repeat (TAA)9 is found at position −1430 to −1404. A poly(dA-dT) stretch (47), 16 bp long, was found within the Alu sequence at position −1634 to −1619. Another poly(dA-dT) stretch, 13 bp long, flanks the 3′-end of the Alu sequence at position −1449 to −1437. Such stretches were also reported in the 5′-flanking region of rabbit SERCA2 (14) and Artemia SERCA (22) genes.
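The kind of sequence survey described above (G + C content, CpG clustering, and counting a consensus motif together with its inverse complement) can be illustrated with a short script. This is a sketch, not the GCG Wisconsin Package procedure used in this work: the function names are hypothetical, the toy sequences in the example are invented, and the observed/expected CpG ratio shown is the commonly used (CpG count × length) / (C count × G count) form, which the paper does not state explicitly.

```python
import re

# Minimal sketch; names and example sequences are illustrative only.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]",
         "Y": "[CT]", "R": "[AG]", "S": "[CG]", "W": "[AT]"}
COMPLEMENT = str.maketrans("ACGTNYRSW", "TGCANRYSW")


def reverse_complement(motif: str) -> str:
    """Inverse complement of a motif written with the IUPAC codes listed above."""
    return motif.upper().translate(COMPLEMENT)[::-1]


def count_sites(seq: str, motif: str) -> int:
    """Count motif occurrences on both strands, allowing overlapping matches.

    A motif identical to its own inverse complement (e.g. CANNTG) is
    counted once per position rather than twice.
    """
    seq = seq.upper()
    total = 0
    for m in {motif.upper(), reverse_complement(motif)}:
        pattern = "(?=" + "".join(IUPAC[base] for base in m) + ")"
        total += len(re.findall(pattern, seq))
    return total


def gc_fraction(seq: str) -> float:
    """Fraction of G + C in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CpG * N) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


if __name__ == "__main__":
    toy = "GGGCGGAAACCGCCC"  # invented sequence, not from the SERCA3 promoter
    print(count_sites(toy, "GGGCGG"), gc_fraction(toy), cpg_obs_exp(toy))
```

Scanning both strands matters here because the text counts each element together with its inverse complement (for example 5′-GGGCGG-3′ and 5′-CCGCCC-3′).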
In order to delineate the core promoter region responsible for human SERCA3 gene expression, six defined 5′ deletions, ranging from position −1313 to −31 relative to the transcription initiation site, were generated from the pGL3 basic-derived plasmid p6.6BK. The various constructs and the promoterless pGL3 basic vector were used to transiently transfect cells of the human Jurkat E6.1 cell line. The results of three independent experiments are shown in Fig. 5c. Significant transcriptional activity was obtained with each construct, except for the SmaI-del. Despite the differences in the relative luciferase activities obtained for each construct among the three experiments, a 3.1-fold induction is obtained with the full promoter construct, p6.6BK. This demonstrates that the 6.6-kb BamHI-KasI genomic fragment contains the required nucleotide sequence information for SERCA3 gene transcription. A maximum level was reached when the region from −6600 to −135 was removed (corresponding to PstI-del construct). Due to experimental fluctuations, the existence of as yet unspecified regulatory elements in the region between −6600 and −135 cannot be ruled out. A total loss of transcriptional activity was observed for the shortest construct, SmaI-del, whose 5′-end extended to −31. Based on this functional analysis, the shortest genomic segment, still carrying all the promoter elements (core promoter region) needed for the efficient transcription of the SERCA3 gene, can now be assigned to the GC-rich (87% G + C) region from −135 to −31.
FIG. 6. Tissue distribution of the human SERCA3 mRNA. a shows the result of a typical dot blot hybridization using a 3′-end probe derived from the 3′-untranslated region of the human SERCA3 cDNA. The autoradiographic exposure time was 16 h at −80°C using a BioMax MS film (Kodak). The radioactive signals from each dot were calculated as percentages of the total radioactivity and expressed as the -fold induction over the percentage of the whole brain (i.e. 0.1% of the total radioactivity). b shows the dot-tissue identification and the tissue-specific relative values for SERCA3 mRNA.
Tissue Distribution of Human SERCA3 mRNA-To determine the relative levels of human SERCA3 mRNA in different tissues, we used a human mRNA Master Blot (dot blot), on which the applied mRNA amounts are normalized for eight housekeeping genes, thus minimizing the tissue-specific variations often related to the expression of any single housekeeping gene. Our dot blot hybridization analysis, using a 3′-end probe, first demonstrated that SERCA3 mRNA is expressed in the human adult and fetal non-muscle tissues shown in Fig. 6 and, second, revealed that the expression levels vary dramatically from tissue to tissue, with high levels in thymus, trachea, salivary gland, spleen, bone marrow, lymph node, peripheral leukocytes, pancreas, and colon and intermediate to low levels in the rest of the tissues.
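The quantification described in the legend to Fig. 6 (each dot as a percentage of the total radioactivity, then expressed as a fold value over whole brain) amounts to two divisions. The following sketch illustrates it; the function name and the signal values are hypothetical and do not reproduce the measured data.

```python
# Minimal sketch of the dot blot quantification; the counts below are invented.

def fold_over_reference(signals: dict[str, float], reference: str) -> dict[str, float]:
    """Express each dot as percent of total signal, then as fold over a reference tissue."""
    total = sum(signals.values())
    if total <= 0 or signals.get(reference, 0) <= 0:
        raise ValueError("total and reference signals must be positive")
    percent = {tissue: 100.0 * value / total for tissue, value in signals.items()}
    return {tissue: pct / percent[reference] for tissue, pct in percent.items()}


if __name__ == "__main__":
    # Hypothetical PhosphorImager counts (arbitrary units).
    counts = {"whole brain": 10.0, "thymus": 150.0, "spleen": 40.0}
    for tissue, fold in fold_over_reference(counts, "whole brain").items():
        print(f"{tissue}: {fold:.1f}-fold")
```

Because every dot is divided by the same total, the fold values equal the ratio of the raw signals; the percentage step mainly makes blots with different overall exposure comparable.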
Alternative Splicing of the SERCA3 Primary Transcript Generates Three Variants in Human and Mouse-Recently, two nucleotide sequences encoding mouse SERCA3a and SERCA3b isoforms were deposited in the EMBL/GenBank™ data base under accession numbers U49394 and U49393, respectively, and RT-PCR analysis indicated that SERCA3b is co-expressed with SERCA3a in mouse pancreatic islets of Langerhans.⁵ Insertion of a 73-bp optional exon in SERCA3b occurs immediately after nt 2980 (relative to the ATG codon) which, interestingly, also represents the point of divergence between the different splice variants in the related SERCA1 and SERCA2 genes. Retention of this additional nucleotide stretch results in a shift in the open reading frame, so that the last 6 amino acids of SERCA3a are replaced by a 45-aa tail in the SERCA3b isoform. We confirmed the existence of the two SERCA3 transcripts in mouse islets by means of RT-PCR using M+1 and M−1 as primers (Fig. 7a, lane 1). Equally intense bands of 163 and 236 bp were detected. Subcloning and subsequent sequencing confirmed that the 163-bp fragment, indeed, corresponded to SERCA3a. Remarkably, the 236-bp band proved to represent a heterogeneous population, consisting of a fragment of 236 bp (SERCA3b-specific) contaminated with a 249-bp long fragment. The latter represented a novel variant, SERCA3c, in which the 3′-end of the SERCA3b optional exon is extended with an additional 13-bp stretch (see also Fig. 8a). The SERCA3c-specific amplification from mouse islets became possible by using a mouse SERCA3c-specific primer, P3 (Fig. 7a, lane 2). Analysis of the genomic sequence of the 3′-end of the human SERCA3 gene indicated that the generation of the three SERCA3 splice variants is theoretically possible.
The hybridization of the human mRNA Master Blot with a common human SERCA3b/SERCA3c probe indicated that SERCA3b and/or SERCA3c are mainly expressed in human kidney, thymus, salivary gland, trachea, and colon but at much lower levels than the predominant SERCA3a mRNA (Fig. 7b). RT-PCR from human kidney performed with the human-specific primers, 22+ and 1− (Fig. 7c, lane 1), encompassing the optional exon(s), could in principle amplify all three SERCA3-specific variants (expected lengths: 228, 316, and 329 bp for SERCA3a, SERCA3b, and SERCA3c, respectively). However, only a 228-bp SERCA3a-specific product was detected; amplification of SERCA3b- and SERCA3c-specific products (Fig. 7c, lanes 2 and 3, respectively) became possible by using primer 22+ in combination with splice variant-specific primers P1 and P2, respectively. Fig. 8a compares a 3218-bp long human genomic fragment spanning the intervening region between exons 20 and 22 (31), with a partial mouse genomic sequence derived from a 3-kb PCR product amplified from mouse genomic DNA with the M+1 and M−1 primers (data not shown). An optional exon (exon 21) is found 334 bp (in human) and 387 bp (in mouse) downstream of the conserved point of divergence (nt 2980). The human exon 21 is 15 nt longer than the mouse one, because an additional 3′ acceptor splice site is found in the human sequence 15 nt upstream of the one used in mouse. Exon 21 contains an internal 5′ donor splice site (designated D1 in Fig. 8, a and b). D1 and D2 are conserved in human and mouse. If D1 is used, the size of exon 21 is 88 or 73 bp in human or mouse, respectively, giving rise to the SERCA3b isoforms. If D2 is used, exon 21 reaches its maximum size in both human (101 bp) and mouse (86 bp), thereby giving rise to SERCA3c. An illustration of how SERCA3 splice variants are generated is shown in the upper part of Fig. 8b.
FIG. 8 (legend, partial). [...] a rectangle with round corners). The 5′ donor sites D1 and D2 are indicated above their boxed sequences. If D1 is used, then the sequence of the exon 21 indicated in bold is joined to exon 22. When D2 is used, then the sequence in italics is included in exon 21 and then joined to exon 22. The human (hA) and mouse (mA) 3′ acceptor sites used are also indicated. The boxes Sa and Sb denote the overlapping human and mouse stop codons used in SERCA3a and SERCA3b, respectively. The human and mouse stop codons for SERCA3c (in italics) are shifted and shown in small boxes. The nucleotide numbering is shown relative to the human ATG codon in bold and italics for SERCA3b and SERCA3c, respectively. The determined or estimated (~) sizes of the human and mouse introns and of the last human exon are also indicated. P1 (thin arrow), P2 (dashed arrow), P3, N+, M−1, and M+1 (thick arrows) primers are [...]
⁵ G. I. Bell, unpublished observations.
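The expected RT-PCR product sizes quoted above follow directly from the exon 21 sizes for the two donor sites. The sketch below recomputes them from the figures given in the text; the class and function names are hypothetical, and the only inputs are the reported SERCA3a product lengths (228 bp human, 163 bp mouse) and the exon 21 sizes for D1 and D2.

```python
from dataclasses import dataclass

# Minimal sketch; names are illustrative, numbers are taken from the text.


@dataclass(frozen=True)
class Exon21Model:
    species: str
    product_3a_bp: int  # RT-PCR product when exon 21 is skipped (SERCA3a)
    exon21_d1_bp: int   # exon 21 size when the internal donor D1 is used (SERCA3b)
    exon21_d2_bp: int   # exon 21 size when the distal donor D2 is used (SERCA3c)


def expected_products(model: Exon21Model) -> dict[str, int]:
    """Expected amplicon length for each splice variant with exon-flanking primers."""
    return {
        "SERCA3a": model.product_3a_bp,
        "SERCA3b": model.product_3a_bp + model.exon21_d1_bp,
        "SERCA3c": model.product_3a_bp + model.exon21_d2_bp,
    }


def frame_offset(inserted_bp: int) -> int:
    """Reading-frame offset caused by an insertion (0 means the frame is kept)."""
    return inserted_bp % 3


HUMAN = Exon21Model("human", product_3a_bp=228, exon21_d1_bp=88, exon21_d2_bp=101)
MOUSE = Exon21Model("mouse", product_3a_bp=163, exon21_d1_bp=73, exon21_d2_bp=86)

if __name__ == "__main__":
    for model in (HUMAN, MOUSE):
        print(model.species, expected_products(model))
        print("  frame offsets:", frame_offset(model.exon21_d1_bp),
              frame_offset(model.exon21_d2_bp))
```

None of the four exon 21 sizes is a multiple of three, which is consistent with the statement that inclusion of the optional exon shifts the open reading frame and thereby changes the C terminus.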
Functional Analyses of the Three SERCA3 Isoforms Transiently Expressed in COS-1 Cells-To characterize the three SERCA3 isoforms functionally, we employed the COS-1 cell expression system and measured the oxalate-stimulated Ca2+ uptake into the microsomal fraction. For this purpose, we PCR-amplified the coding regions of SERCA3a, SERCA3b, and SERCA3c cDNAs from mouse pancreatic islet first-strand cDNA (data not shown) and subcloned them into the expression vector pMT2. Since the higher GC content of the 5′-untranslated region of the human SERCA3 gene relative to the mouse one causes premature termination of the reverse transcription reaction (data not shown), amplification of the complete human SERCA3 cDNAs coding for the corresponding isoforms has not been possible so far. Therefore, COS-1 cells were transfected with each of the mouse SERCA3 constructs and, for comparison, also with the pig SERCA2b construct. Fig. 9a shows a typical immunoblot analysis of the mouse SERCA3 isoforms expressed in microsomes isolated from COS-1 cells transfected with the corresponding SERCA3 cDNAs. The immunoblot was stained with the polyclonal antibody N89, which was raised against an epitope close to the N terminus of rat SERCA3 (27). The epitope amino acid sequence is also conserved in both human and mouse SERCA3 isoforms. The time course of oxalate-stimulated Ca2+ uptake into microsomal vesicles isolated from COS-1 cells transfected with SERCA2b and each of the SERCA3 cDNAs (Fig. 9b) demonstrates the ability of each of the SERCA3 isoforms to function as a Ca2+ pump. The apparent affinities for Ca2+ of the SERCA3 and SERCA2b isoforms were also deduced (Fig. 9c). We confirm that SERCA3a presents, in this COS-1 cell system, a lower Ca2+ affinity than SERCA2b (K1/2 = 2.2 versus K1/2 = 0.19 μM), but the values obtained differ slightly from those reported earlier: K1/2 = 1.1 μM for rat SERCA3 (10) and K1/2 = 0.27 or 0.24 μM for SERCA2b (10, 12).
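To illustrate what a difference in K1/2 means for pump activity at a given free Ca2+ concentration, the sketch below evaluates a Hill-type activation curve for the two K1/2 values reported above (0.19 μM for SERCA2b and 2.2 μM for SERCA3a). It assumes that the "cooperative model of enzyme activation" mentioned in the Fig. 9 legend can be approximated by the Hill equation; the Hill coefficient used here is a placeholder, since no value is given in this section, and the function name is hypothetical.

```python
def hill_fraction(ca_um, k_half_um, hill_n=1.0):
    """Fraction of maximal Ca2+ uptake at free Ca2+ `ca_um` (in micromolar).

    Assumption: activity follows the Hill equation
        v / Vmax = [Ca]^n / (K_half^n + [Ca]^n).
    `hill_n` is a placeholder value, not taken from the paper.
    """
    ratio = (ca_um / k_half_um) ** hill_n
    return ratio / (1.0 + ratio)


# K1/2 values (micromolar) reported in the text for the COS-1 system.
K_HALF_UM = {"SERCA2b": 0.19, "SERCA3a": 2.2}

if __name__ == "__main__":
    for ca in (0.1, 1.0, 10.0):  # example free Ca2+ concentrations, micromolar
        row = ", ".join(
            f"{name}: {hill_fraction(ca, k):.2f}" for name, k in K_HALF_UM.items()
        )
        print(f"[Ca2+] = {ca} uM -> {row}")
```

By construction, each pump reaches half-maximal activity when the free Ca2+ concentration equals its K1/2, whatever Hill coefficient is assumed; at any fixed concentration, the pump with the lower K1/2 runs at the higher fraction of its maximum.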
Interestingly, SERCA3b and SERCA3c show much lower apparent affinities for Ca2+ than SERCA3a. The K1/2 values for SERCA3b and SERCA3c could not be determined, since the saturation plateau was not reached for either isoform under our experimental conditions, and the use of still higher free Ca2+ concentrations was incompatible with the calcium oxalate precipitation technique.

DISCUSSION

We have isolated a total of seven genomic clones spanning a DNA region of 90 kb, of which 50 kb encode the human SERCA3 gene. So far, four other SERCA genes have been completely characterized as follows: the rabbit (23 kb; Ref. . For SERCA2 only a partial exon/intron characterization of the human (7), rabbit (14), rat (16), and pig (15) genes has been reported. The estimated size of the mammalian SERCA2 gene is between 45 and 50 kb, which is comparable to that of human SERCA3. The human SERCA3 gene consists of 22 exons with an average exon size of 219 bp. In comparison, the rabbit and human SERCA1 genes contain 23 exons. In both SERCA3 and SERCA1, the penultimate exon (exon 21 in SERCA3 and exon 22 in SERCA1) is alternatively spliced. Analysis of the exon/intron boundaries showed (Fig. 3a) that all the intron positions are conserved between the SERCA3 and SERCA1 genes, with the exception of one boundary that is present only in SERCA1, between exons 8 and 9. In the SERCA3 gene, the corresponding exonic sequences are joined in one exon, i.e. exon 8. We now provide evidence (Fig. 3b) that this junction is also absent from the human SERCA2 gene. The comparative junction analysis suggests that SERCA2 and SERCA3 diverged through gene duplication from a common ancestor gene prior to the SERCA1 separation. Recent phylogenetic tree analyses based on amino acid sequence comparison of the invertebrate and vertebrate SERCA pumps are in line with this conclusion (48).
The localization of the SERCA3 gene on human chromosome 17 (31) was further confirmed by the isolation of six overlapping genomic clones from a chromosome 17-specific library. In this study (inset in Fig. 1), we show that the genetic marker D17S1828 lies approximately 20 kb downstream of the SERCA3 gene, which means that the gene encoding the SERCA3 pump can now be mapped 11 cM from the top of the short arm p of chromosome 17. The D17S1828 and D17S1798 microsatellite markers have recently been shown to flank a genetic region of 1 cM, which represents the interval to which the cystinosis gene locus has been mapped (49).
It remains an open question whether ATP2A3 is included in the genomic region associated with cystinosis, an autosomal recessive disorder caused by a defect in the transport of cystine from lysosomes to the cytosol.
To investigate the regulation of human SERCA3 gene expression, primer extension, nucleotide sequence, functional promoter, and mRNA dot blot hybridization analyses were performed. The primer extension analysis indicated that the transcription start site (nt +1) is located 152 nt upstream of the AUG codon. No TATA element was found 25-30 bp upstream of the cap site. We have, however, identified a sequence, 5′-CCACTGC-3′, extending from nt +7 to +13 (represented as a box in Fig. 5b) that matches the consensus initiator (Inr) sequence YYAN(T/A)YY (35), where the A represents the transcription start site frequently used in other genes. For SERCA3, the A is found at position +9. It has been previously demonstrated that an Inr element can enhance the promoter strength even if it is shifted a few bases upstream or downstream with respect to the transcription initiation site (50). The SERCA3 promoter falls in the TATA− Inr+ category of promoters (51), whereas the SERCA1 and SERCA2 genes in human and rabbit (2, 3, 13, 14) have TATA+ Inr−-type promoters. It should be noted that almost every Inr element described so far functions in connection with upstream Sp1-binding sites (52). It has also been shown that a distance of 40-50 bp between the Sp1 element(s) and the cap site is optimal for accurate transcription initiation (53). Analysis of the CpG dinucleotide distribution within the first 11 kb of the SERCA3 gene (Fig. 5a) showed that the 5′-end of the gene is embedded in a well defined 1.5-kb CpG island (36). Within the CpG island, a total of 14 putative DNA-binding sites for Sp1 were identified. Eight of them were found immediately upstream of the cap site, in the region between −267 and −39. Moreover, three adjacent Sp1 elements were clustered in the region −57 to −39, i.e. within the optimal distance range with respect to the Inr element. Transient transfections in Jurkat cells via electroporation were performed using seven chimeric promoter constructs.
The results obtained with the PstI- and SmaI-del constructs were the most informative. The PstI-del construct (from +55 to −135) gave the maximum transcriptional activity, whereas a total loss of activity was obtained for the SmaI-del construct (from +55 to −31). One pertinent conclusion from the functional promoter analysis is that the GC-rich region (87% G + C) from −135 to −31 is of critical importance in initiating SERCA3 gene transcription. This region contains six putative Sp1 motifs, an inverse complement of the CACCC box, and single potential binding sites for AP-2 and TFIIIc. The results of our functional analysis are in line with a transcription model in which the Sp1 protein mediates transcription initiation through complex interactions involving the Inr element and the cellular transcription machinery. Besides Sp1, several additional transcription factors are likely to be involved in the modulation of the core promoter activity. In contrast to the SERCA2 genes, no thyroid-responsive elements were identified in the 5′-flanking region of the SERCA3 gene. This suggests that, unlike SERCA2, SERCA3 expression is not under the control of thyroid hormone. We conclude that the existence of a TATA− Inr+ promoter, which seems to be prevalent among the hematopoietic lineage-specific genes in mammals (54), together with the several putative cis-regulatory elements identified in the 5′-flanking region of the SERCA3 gene, might account for the observed tissue-restricted expression pattern of SERCA3 (Fig. 6). In contrast, a TATA+ Inr− promoter (like the one characterizing the SERCA2 gene) might be responsible for a lineage-independent expression (54). Interestingly, the SERCA gene from A. franciscana comprises two promoters (24) as follows: a TATA+ Inr+ promoter, controlling the expression of a housekeeping isoform, and a TATA+ Inr− promoter involved in the expression of the muscle-specific isoform.
We might speculate that the evolution from a unique SERCA gene in invertebrates to the multigene SERCA family in vertebrates was also accompanied by rearrangements of the TATA and Inr promoter elements.
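The G + C content and CpG island statements in the promoter analysis above rest on simple sequence statistics. As a minimal sketch, the functions below compute the G + C fraction and the observed/expected CpG ratio of a DNA string. The thresholds in the comments follow the commonly used Gardiner-Garden and Frommer definition of a CpG island; whether this definition was applied for Fig. 5a is not stated in this section, and the example sequence is a made-up placeholder, not SERCA3 sequence.

```python
def gc_fraction(seq):
    """Fraction of G and C nucleotides in a DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    seq = seq.upper()
    n_c, n_g = seq.count("C"), seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    # "CG" cannot overlap itself, so str.count gives the exact dinucleotide count.
    return seq.count("CG") * len(seq) / (n_c * n_g)


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_ratio=0.6):
    """Commonly used criteria: length >= 200 bp, G+C > 50%, obs/exp CpG > 0.6."""
    return (
        len(seq) >= min_len
        and gc_fraction(seq) > min_gc
        and cpg_obs_exp(seq) > min_ratio
    )


if __name__ == "__main__":
    toy = "GCGCGGCCGCGGGGCGCCCG" * 10  # hypothetical GC-rich sequence, 200 nt
    print(round(gc_fraction(toy), 2), round(cpg_obs_exp(toy), 2),
          looks_like_cpg_island(toy))
```

Applied to the real −135 to −31 region, `gc_fraction` would be expected to return a value near the 87% reported above, although that sequence is not reproduced here.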
The alternative processing of the SERCA3 pre-mRNA can give rise to three SERCA3 isoforms, which, like the other SERCA family members, differ solely in their C-terminal amino acid sequences, found downstream of amino acid 993. As in SERCA1, an optional SERCA3 exon (exon 21) can be skipped, thereby generating the SERCA3a splice variant, but it can also be retained partially (in SERCA3b) or entirely (in SERCA3c) owing to the alternative use of an internal 5′ donor splice site (D1), as is the case for SERCA2. Moreover, the alternative splicing mode of SERCA3 resembles that of the plasma membrane Ca2+-ATPase 1 gene. In the latter case, four isoforms with different C-terminal parts can be generated by alternative exclusion, inclusion, or partial inclusion of a single exon at the 3′-end of the gene.
FIG. 9 (legend fragment). ... After adding 30-35 μg of microsomal protein in 0.5 ml of uptake medium (20 mM MOPS, pH 6.8, as buffer agent), the Ca2+ uptake was stopped at 5, 15, and 30 min. c, oxalate-stimulated Ca2+ uptake was measured at different free Ca2+ concentrations (10^−8, 3.16 × 10^−8, 10^−7, 3.16 × 10^−7, 5.62 × 10^−7, 10^−6, 3.16 × 10^−6, 10^−5, and 3.16 × 10^−5 M) for 10 min into freshly isolated microsomes from COS-1 cells transfected with the indicated mouse SERCA3 isoforms. For comparison, the data obtained from pig SERCA2b are also plotted. Samples of 10-12 μg of microsomal proteins/175 μl of uptake medium were used. The curves represent the best fit of the data to a cooperative model of enzyme activation obtained by general non-linear computer-assisted curve fitting. b and c, the results are the means of the indicated number (n) of separate experiments; error bars represent S.E.

Both human and mouse SERCA3 genes are transcribed and processed in the same way, according to the splicing scheme illustrated in Fig. 8b. In both human and mouse, we have identified in the intron preceding exon 21 a sequence, 5′-CTCTGAC-3′, that matches the consensus branch point sequence 5′-YNYURAC-3′ (complementary to the U2 snRNA sequence), in which the A is involved in the first transesterification reaction of pre-mRNA splicing and lariat RNA formation. However, the human SERCA3b and SERCA3c splice variants differ structurally from their mouse counterparts; this difference is caused by the occurrence of an additional 3′ acceptor splice site in the human genomic nucleotide sequence, located 15 nt upstream of the site used in mouse. This explains why the optional exon (exon 21; 101 bp) in the human gene is 15 bp longer than in mouse (86 bp). Moreover, the 15-nt sequence, representing the 5′-end of exon 21, encodes a new stretch of five amino acids, ACLYP998.
This stretch, present in both human SERCA3b and SERCA3c isoforms, is inserted immediately downstream of amino acid 993, which is encoded by the last constitutively spliced exon (exon 20). In both human and mouse, SERCA3b and SERCA3c, when present, are always co-expressed in the same tissue along with the SERCA3a isoform. However, in mouse pancreatic islets of Langerhans, SERCA3a and SERCA3b are expressed at nearly equal levels, whereas in human kidney SERCA3b is expressed at lower levels than SERCA3a. SERCA3c was also found to be expressed at much lower levels than SERCA3a in all human and mouse tissues examined so far. The tissue expression pattern of SERCA3b and/or SERCA3c mRNAs seems to be much more restricted than that of SERCA3a. As a result of alternative splicing, the SERCA3a-specific C terminus comprising the last six amino acids (aa 994 to 999) is replaced either by a tail of 45 or 50 aa in mouse or human SERCA3b, respectively, or by a stretch of 33 or 36 aa in mouse or human SERCA3c, respectively (Fig. 8b). Hydropathy analysis of the C-terminal primary sequence of SERCA3b did not reveal any hydrophobic stretch that might function as an additional transmembrane domain; this contrasts with the situation in SERCA2b. Earlier studies concerning the structure-function relationship revealed that the divergence in the extreme C terminus is responsible for functional differences between SERCA2a and SERCA2b (10-12). By expressing the corresponding mouse SERCA3 cDNAs in COS-1 cells, we now demonstrate that each of the SERCA3 isoforms is able to function as a Ca2+ pump. We also confirm that SERCA3 (now designated SERCA3a) displays a lower apparent affinity for Ca2+ than SERCA2b.
It has been proposed earlier that the reduced Ca2+ affinity of SERCA3 is consistent with an enzyme (E) in which the equilibrium between the E1 (which binds Ca2+ with high affinity) and E2 (low affinity for Ca2+) conformations is shifted toward the E2 conformational state (10). Structural interactions between the SERCA3-specific nucleotide/hinge and C-terminal transmembrane domains appear to mediate the shift in the E1 → E2 equilibrium and thereby the Ca2+ dependence (55). Finally, we report that SERCA3b and SERCA3c present different apparent affinities for Ca2+, which are even lower than the one observed for SERCA3a (SERCA3a > SERCA3b and SERCA3c). The extended tails of SERCA3b and SERCA3c must, somehow, lower the affinity for Ca2+, possibly by their direct interactions with other cytoplasmic domains of the pump. In turn, these interactions, via long range coupling, may modulate further interactions between the transmembrane domains of the enzyme. As a result, the equilibrium E1 ↔ E2 would be shifted even more toward the E2 state. An example in which the Ca2+ transport activity can be modulated through intramolecular interactions is that of the plasma membrane Ca2+-ATPase, in which the C-terminal calmodulin-binding domain can exert an autoinhibitory effect on the enzyme activity. These experimental observations strengthen the idea that similar effects on the Ca2+ dependence may be mediated in vivo by interactions of the extended tails of SERCA3b and SERCA3c with other regions of the pump. The reduced Ca2+ affinity of SERCA3a and, especially, of the SERCA3b and SERCA3c isoforms raises questions with regard to their physiological significance. In the normal cellular context, they would most likely be inactive, unless they are expressed in a cellular environment characterized by increased Ca2+ concentrations or their Ca2+ transport activities are regulated by as yet unidentified modulators, which are normally not present in COS-1 cells.
Recently, it has been documented that SERCA3a is more resistant to peroxide than SERCA2b (56). Further investigations are needed to determine to what extent SERCA3b and SERCA3c share this resistance to oxidative agents. Another issue to be addressed in the future concerns the functional significance, if any, of the amino acid stretch ACLYP998, which is found only in the human SERCA3b and SERCA3c isoforms.
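The branch point comparison made earlier in this Discussion (5′-CTCTGAC-3′ against the consensus 5′-YNYURAC-3′) can be checked position by position with IUPAC ambiguity codes. The sketch below is one simple way to do this, assuming that the genomic DNA sequence is compared with the RNA consensus by treating U as T. The helper name and the reduced code table are illustrative and cover only the symbols needed here.

```python
# Subset of IUPAC nucleotide codes needed for this comparison.
# U in the RNA consensus is treated as T in the genomic DNA sequence.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "N": "ACGT",  # any nucleotide
}


def matches_consensus(site, consensus):
    """Return True if every base of `site` is allowed by `consensus`."""
    site = site.upper().replace("U", "T")
    consensus = consensus.upper()
    if len(site) != len(consensus):
        return False
    return all(base in IUPAC[code] for base, code in zip(site, consensus))


if __name__ == "__main__":
    # Sequence found in the intron preceding exon 21 (human and mouse).
    print(matches_consensus("CTCTGAC", "YNYURAC"))
```

The same helper returns False if the branch point adenosine is replaced, for example in the hypothetical variant CTCTGGC.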