The tight junction protein ZO-2 contains three PDZ (PSD-95/Discs-Large/ZO-1) domains and an alternatively spliced region.

The complete cDNA sequence for canine ZO-2, a tight junction-specific protein, is presented. A single open reading frame encodes a polypeptide of 1,174 amino acids with a predicted molecular mass of 132,085 daltons. As noted previously, ZO-2 is a member of the membrane-associated guanylate kinase-containing (MAGUK) protein family, a family which includes an additional tight junction-associated protein, ZO-1. These proteins contain a region homologous to guanylate kinase, an SH3 domain, and variable numbers of PSD-95/discs-large/ZO-1 (PDZ) domains, shown to be involved in protein-protein interactions. ZO-2 and ZO-1 contain three PDZ domains in the N-terminal half of the molecule. Between the first and second PDZ domains, ZO-2 displays a basic region (pI = 10.27) containing 22% arginine residues. Both ZO-1 and ZO-2 have proline-rich C-terminal regions that are not homologous to other MAGUK family members. Sequence analysis of multiple ZO-2 cDNAs reveals a 36-amino acid domain in this C-terminal region present in only some of the cDNAs. Overall, ZO-2 is highly homologous to ZO-1, showing 51% amino acid identity; however, the C-terminal ends of the molecules show only 25% amino acid identity. This suggests that the C-terminal ends of ZO-1 and ZO-2 have different functions.

The complete mouse (12) and human (13) protein and cDNA sequences of ZO-1 have been published. A partial sequence for ZO-2 has also been characterized (1). Analyses of these sequences indicate that ZO-1 and ZO-2 share significant homology with each other and with several other proteins, including the lethal(1)discs-large-1 (dlg) tumor suppressor gene product (dlg-A) of Drosophila (14), erythrocyte membrane-associated p55 (15), and PSD-95/SAP90, a protein found at brain presynaptic membranes (16,17). These proteins share several conserved regions, including a region homologous to guanylate kinase (GUK), an enzyme which converts GMP to GDP; a single src homology 3 (SH3) domain, hypothesized to be involved in protein-protein interactions necessary for signal transduction (18,19); and a variable number of N-terminal repeats termed PDZ domains (from PSD-95, discs-large, ZO-1), shown to bind integral plasma membrane proteins (20-23). As all proteins in this group are associated with the plasma membrane, they have collectively been termed the MAGUK family (membrane-associated guanylate kinase-containing) (24).
ZO-1 and ZO-2 are different from other family members but similar to each other in that they display C-terminal acidic and proline-rich regions (1,13). ZO-1 also contains an alternatively spliced region, termed the α motif, in the C-terminal region (25,26). The exact function of this splice domain remains unknown.
Here we present the complete sequence encoding ZO-2. We find that ZO-2, like ZO-1, contains three N-terminal PDZ domains. ZO-2 displays a basic (pI = 10.27), arginine-rich domain between PDZ1 and PDZ2, similar to a previously unnoted region in ZO-1. Finally, ZO-2 also contains an alternatively spliced region near the C terminus which we call the β motif. Amino acid (aa) sequence comparison between ZO-2 and ZO-1 indicates a high degree of homology, although divergence at the C termini of the molecules suggests these proteins will have functional differences.

EXPERIMENTAL PROCEDURES
Library Screening-To extend the cDNA sequence for ZO-2, a hybridization probe corresponding to the 5′-most sequence of the partial ZO-2 cDNA (1) was generated by polymerase chain reaction using oligonucleotides 5′-GCA CGA GAG AGA CGG-3′ and 5′-TTC ATC TTC AGG ACT-3′. The resulting 285-bp product was used to rescreen an oriented oligo(dT)-primed Madin-Darby canine kidney (MDCK) cell cDNA library in Uni-ZAP XR (kindly provided by Marino Zerial). Of ~500,000 plaques analyzed, 10 ZO-2 cDNAs were isolated, the longest of which (pM7.7) was 4.1 kb. Analysis of the 4.1-kb cDNA sequence and the size of mature ZO-2 transcript in Northern blots (about 5.2 kb) indicated that the 4.1-kb clone was not full-length. We then generated an oriented MDCK cDNA library specific for ZO-2 in Uni-ZAP XR (Stratagene). MDCK poly(A)+ RNA was reverse-transcribed with the ZO-2-specific antisense linker-primer 5′-GAG AGA GAG AGA GAG AGA GAA CTA GTC TCG AGG TAG TCA TCG TCA TGG TC-3′ containing an XhoI site. This primer was situated approximately 250 bp downstream of the 5′ end of the known ZO-2 sequence contained in pM7.7. The cDNA was then ligated with EcoRI adaptors, digested with XhoI, size-fractionated, and ligated into the Uni-ZAP XR vector. The new library was screened with an oligonucleotide containing the sequence 5′-TGG CTG CTG CGC CGG CTC CTC TCG CTG TAC CCG CTG CGG GCA CTT CT-3′ end-labeled with [γ-32P]ATP (ICN Pharmaceuticals). This sequence is the complement of a known region of ZO-2 located 5′ of the primer used to create the library. Approximately 400,000 plaques were screened, and positive clones were plaque-purified.
Inserts were sized by restriction digestion, and the 3′ 200-300 bp were sequenced to positively identify five ZO-2 clones among 28 primary picks. The longest clone was chosen for double-stranded sequencing (Sequenase, United States Biochemical, or Amplicycle, Perkin Elmer).
Sequence Analysis-Nucleotide sequences, isoelectric points of various domains, and the overall molecular weight of the protein were analyzed using the University of Wisconsin Genetics Computer Group (GCG) software package. Amino acid sequence comparisons to determine the percent identity and the percent similarity (% identity + % conserved substitutions) were done using the default parameters in the GCG GAP alignment program based on the algorithm of Needleman and Wunsch (27). The amino acid numbers of the compared sequences are shown in Table I.
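The kind of global alignment that underlies a percent identity figure can be sketched in a few lines of Python. This is only an illustration of the Needleman-Wunsch recurrence, not a reimplementation of GCG GAP: the scoring values (match +1, mismatch -1, linear gap -2), the function names, and the choice to count identity over gap-free aligned columns are all assumptions made here, whereas GAP uses a substitution matrix with separate gap creation and extension penalties.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align strings a and b; return the two gapped strings.

    Scoring values are illustrative assumptions, not GCG GAP defaults.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Percent of gap-free aligned columns holding identical residues."""
    x, y = needleman_wunsch(a, b)
    columns = [(p, q) for p, q in zip(x, y) if p != "-" and q != "-"]
    if not columns:
        return 0.0
    return 100.0 * sum(p == q for p, q in columns) / len(columns)


# Toy peptides, not taken from ZO-1 or ZO-2.
print(percent_identity("MKRL", "MKKL"))  # 75.0
```

Because the denominator and the scoring scheme both affect the result, values from a sketch like this would not be expected to reproduce the percentages reported in the tables.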
Splice Domain Analysis-Total RNA from MDCK cells was isolated using Trizol (Life Technologies, Inc.) and subjected to reverse transcription by avian myeloblastosis virus reverse transcriptase (Promega). The cDNA was then used for PCR using forward (5′-CCC CCC GCA TTC AAG CCA G-3′) and reverse (5′-TGT GTT TTG ATT GGA ACT GCA TAG ATG TC-3′) primers from either side of the putative alternatively spliced region (starting at nucleotides 3311 and 3723 of Fig. 1). PCR products were resolved on a 1% agarose gel.

RESULTS
The full-length nucleotide and aa sequence of canine ZO-2 is shown in Fig. 1. This sequence contains a single open reading frame coding for a 1,174-aa polypeptide with a predicted molecular mass of 132,085 daltons, considerably less than the mass of 160 kDa determined by SDS-PAGE (1). The disparity between the predicted molecular mass and that determined by SDS-PAGE could be due to anomalous electrophoretic migration or to post-translational modifications. A full-length sequence for human gene X104 related to the Friedreich ataxia locus was previously published, but no characterization of the gene product was presented (28). A comparison of the X104 aa sequence to canine ZO-2 indicates a high degree of similarity: the canine and human sequences are 87% identical at the aa level (Table II). The similarity is even greater within the characteristic MAGUK and acidic domains, with 92-100% aa identity. This indicates that the X104 gene product is the human homolog of canine ZO-2.
As noted previously (1), ZO-2 is highly homologous to ZO-1, showing an overall aa sequence identity of 51% (Table III). However, as in the comparison of canine and human ZO-2, the similarities between ZO-2 and ZO-1 are greater in the PDZ, SH3, and GUK domains. In addition, both ZO-2 and ZO-1 display acidic regions at the C-terminal end of the GUK domain followed by proline-rich regions. Interestingly, while the acidic domains of ZO-2 and ZO-1 show significant similarity (60% aa identity), the proline-rich, C-terminal regions following the acidic domains are significantly less similar (25% aa identity).
Fig. 1 indicates there are three potential in-frame start codons for the canine ZO-2 sequence. Several criteria indicate that the third (nucleotide 398) site is used as the start site. First, the nucleotide sequence surrounding the third site most closely matches the eukaryotic initiation consensus sequence (Fig. 2) (29). In addition, the 100 aa following this start site closely match both the initial aa sequence of human ZO-1 (70% identity, 80% similarity) and the 100 aa following a methionine at position 24 of the human ZO-2 sequence (99% identity, 100% similarity) (28). Finally, the aa sequence generated from upstream of this third canine ZO-2 initiation codon shows no significant homology with corresponding sequences upstream of the human ZO-1 start site or methionine 24 of human ZO-2. The canine ZO-2 sequence shows no initiation codon corresponding to that published for human X104.
Fig. 1 indicates the relative positions of three N-terminal PDZ repeats, the SH3 and GUK domains, and an acidic region. This arrangement of domains is similar to that found in ZO-1 (13). We further note that the region between PDZ1 and PDZ2 of canine ZO-2 is rich in arginine residues (22%) and is highly basic, with a pI of 10.27.
Although not previously documented, a similar basic region exists in the corresponding region of ZO-1 (12,13).
Fig. 1 also illustrates a 36-aa domain encoded by a 108-bp sequence present in 6 of 10 partial cDNAs from the C-terminal region of canine ZO-2. To confirm that the presence or absence of this sequence resulted from alternatively spliced mRNAs, we designed primers which straddled the 108-bp region and performed RT-PCR using total RNA isolated from MDCK cells. This analysis resulted in two distinct PCR products which ran at the expected sizes (413 and 305 bp; Fig. 3), confirming the presence of at least two unique species of RNA for canine ZO-2. We propose that this alternatively spliced region be called the β motif. It was previously demonstrated that ZO-1 contains an alternatively spliced 80-aa region termed the α motif (25,26). The ZO-1 α motif is larger than the ZO-2 β motif, and an aa comparison of the two motifs shows little similarity (17% aa identity; Table III). The similarity between the canine and human ZO-2 β domains (72% aa identity) is lower than that found over the entire length of these molecules (87% aa identity), but similar to that found over the postacidic, C-terminal domains (68% aa identity; Table II).
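The expected RT-PCR product sizes follow from simple arithmetic on the primer coordinates and the length of the spliced region. The short Python sketch below reproduces that calculation under two assumptions that are not stated explicitly in the text: that nucleotides 3311 and 3723 of Fig. 1 mark the 5′ ends of the forward and reverse primers, respectively, and that the Fig. 1 numbering includes the 108-bp sequence. The function and constant names are illustrative only.

```python
# Coordinates and motif length as given in the text.
FWD_PRIMER_5PRIME = 3311   # assumed 5' end of the forward primer (Fig. 1 numbering)
REV_PRIMER_5PRIME = 3723   # assumed 5' end of the reverse primer (Fig. 1 numbering)
SPLICED_REGION_AA = 36     # length of the alternatively spliced domain in aa


def amplicon_length(fwd_5prime, rev_5prime):
    """Length of a PCR product spanning both primers, ends inclusive."""
    return rev_5prime - fwd_5prime + 1


def expected_products(fwd_5prime, rev_5prime, spliced_aa):
    """Return (size with spliced region, size without it) in bp."""
    spliced_bp = spliced_aa * 3          # 3 nucleotides per codon
    with_region = amplicon_length(fwd_5prime, rev_5prime)
    return with_region, with_region - spliced_bp


print(expected_products(FWD_PRIMER_5PRIME, REV_PRIMER_5PRIME, SPLICED_REGION_AA))
# (413, 305)
```

Under these assumptions the two values agree with the 413- and 305-bp products, and their difference equals the 108 bp encoding the 36-aa domain.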

DISCUSSION
The full-length sequence for canine ZO-2 further confirms that this protein is a member of the MAGUK family. Moreover, ZO-2 shows the same distribution of MAGUK domains as ZO-1, and both ZO-1 and ZO-2 contain unique domains not found in other MAGUK family members, including acidic and proline-rich aa sequences (Fig. 4). This indicates that the tight junction contains at least two proteins with a high degree of similarity.
A comparison of the canine ZO-2 aa sequence presented here with that from a previously published human gene (X104) related to Friedreich ataxia (28) indicates that these proteins are homologs. However, several differences in the sequences of the proteins from these two species are apparent. The canine cDNA sequence contains three potential, in-frame start codons. We believe the third ATG is the most likely initiation site based on 1) the highest degree of similarity to the eukaryotic initiation consensus sequence (Fig. 2) (29) and 2) a comparison of the aa sequences surrounding the start sites of canine and human ZO-2 and human ZO-1. Surprisingly, we found no match for the published start site of human ZO-2 in our canine sequence. In addition, the canine ZO-2 is 69 aa longer than human ZO-2 based on a best-fit amino acid sequence comparison (GCG GAP), and the 35 aa immediately preceding the termination of human ZO-2 show a lower aa similarity (13%) to the corresponding region of canine ZO-2 than is present over the rest of the molecules. Combined, these results suggest species-dependent divergence of the canine and human ZO-2 sequences.
FIG. 2. Comparison of canine (c) and human (h) predicted and rejected translation initiation sites for ZO-2 based on match with eukaryotic initiation consensus sequence (29). Initiation codon of consensus sequence is boxed. The predicted initiation regions match at 7 of 10 nucleotides in the consensus sequence, including the two critical sites (asterisks), and they are identical to each other in all 10 nucleotides. Capital letters indicate a match with the consensus sequence. The number in brackets indicates the order in which the sites occur in the sequence from the 5′ direction. hZO-2(1) is the initiation site published by Duclos et al. (28).
A comparison of the amino acid sequences of canine ZO-2, human ZO-2, and human ZO-1 shows a high degree of similarity between these proteins (Tables II and III).
The similarities are particularly high in the characteristic MAGUK domains, suggesting that these domains are functionally important. Of all the MAGUK protein domains, data demonstrating functional relevance have been provided for only the PDZ region. PDZ domains in PSD-95/SAP90 have been shown to bind to the N-methyl-D-aspartate receptor (20), Shaker-type potassium channels (21) and neuronal type nitric-oxide synthase (nNOS) (22). The PSD-95/SAP90-nNOS interaction appears to be mediated by PDZ domains in each molecule (22). Likewise, PDZ-PDZ domain interactions are implicated in the binding of nNOS to syntrophin, part of the dystrophin protein complex in skeletal muscle (22). The aa sequences of the N-methyl-D-aspartate receptor and Shaker-type potassium channel responsible for interacting with the PDZ domain have been further clarified as a C-terminal (S/T)XV sequence (20,21). The roles of the SH3 and GUK domains in any of the MAGUK proteins are unknown. The SH3 domain is believed to function in protein-protein interactions related to signal transduction (18,19), but no data are available yet in this regard for MAGUK family members. The GUK domain may also be involved in signal transduction as well as in tumor suppression (13,14), but again, no direct information is available.
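As an illustration of what an (S/T)XV sequence means in practice, the Python sketch below tests whether a protein sequence ends in serine or threonine, any residue, then valine. It assumes the motif occupies the last three residues at the extreme C terminus of the binding partner. The function name and the example peptides are hypothetical and are not taken from this work; a match says nothing about whether a given PDZ domain actually binds the sequence.

```python
import re

# (S/T)XV anchored at the end of the sequence: S or T, any residue, then V.
STXV_C_TERMINUS = re.compile(r"[ST].V$")


def has_stxv_c_terminus(protein_seq):
    """Return True if the one-letter aa sequence ends in (S/T)XV."""
    return bool(STXV_C_TERMINUS.search(protein_seq.strip().upper()))


# Hypothetical peptides for illustration only.
for peptide in ("AAKLSSIESDV", "AAVETDV", "AAVEADV", "SDVAAKL"):
    print(peptide, has_stxv_c_terminus(peptide))
```

Anchoring the pattern with `$` is the design choice that matters here: the same three residues at an internal position, as in the last example, are not reported as a match.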
Unlike other MAGUK proteins, both ZO-2 and ZO-1 contain acidic, proline-rich, and alternatively spliced domains at the C-terminal end of the proteins. The similarities between the acidic domains of canine ZO-2, human ZO-2, and human ZO-1 are high (Tables II and III). However, in similar comparisons, the downstream proline-rich aa sequences, including the splice domains, show much lower similarity, suggesting that these regions may play different roles in ZO-1 and ZO-2.
Although no data are shown, alternate splicing in the region of the third PDZ domain is described for human ZO-2 (28). Using primers from either side of this putative splice site in the canine ZO-2 sequence, we detected only one product of the expected size and sequence in RT-PCR analysis of total MDCK cell RNA (data not shown). It is possible that the canine and human versions of ZO-2 differ in splicing. The precise roles of the acidic and splice regions remain to be elucidated.