Identification and cDNA Cloning of a Novel RNA-binding Protein That Interacts with the Cyclic Nucleotide-responsive Sequence in the Type-1 Plasminogen Activator Inhibitor mRNA*

Incubation of HTC rat hepatoma cells with 8-bromo-cAMP results in a 3-fold increase in the rate of degradation of type-1 plasminogen activator inhibitor (PAI-1) mRNA. We have reported previously that the 3′-most 134 nt of the PAI-1 mRNA is able to confer cyclic nucleotide regulation of message stability onto a heterologous transcript. R-EMSA and UV cross-linking experiments have shown that this 134-nt cyclic nucleotide-responsive sequence (CRS) binds HTC cell cytoplasmic proteins ranging in size from 38 to 76 kDa. Mutations in the A-rich region of the CRS both eliminate cyclic nucleotide regulation of mRNA decay and abolish RNA-protein complex formation, suggesting that these RNA-binding proteins may be important regulators of mRNA stability. By sequential R-EMSA and SDS-PAGE we have purified a protein from HTC cell polysomes that binds to the PAI-1 CRS. N-terminal sequence analysis and a search of protein data bases revealed identity with two human sequences of unknown function. We have expressed one of these sequences in E. coli and confirmed that the recombinant protein interacts specifically with the PAI-1 CRS. Mutation of the A-rich portion of the PAI-1 CRS reduces binding by the recombinant PAI-1 RNA-binding protein. The amino acid sequence of this protein includes an RGG box.

Regulation of mRNA stability is an important component of the regulation of gene expression and is known to have a significant role in normal physiology and development (1-5). Our understanding of the regulation of message degradation has been enhanced by the identification of consensus cis-acting sequences that are involved in determining message stability and of some proteins that interact with them (4, 6). Although it is known that many stimuli alter mRNA stability and some of the cis-acting sequences responsible have been identified, in few cases have trans-acting factors been isolated (2, 7-10). In contrast, a broad spectrum of RNA-binding proteins that are involved in RNA processing, cellular localization, and translation have been identified, and structural domains involved in RNA recognition have been described (11, 12). In many cases RNA-binding proteins contain short signature domains that bind RNA and anchor the protein such that functional domains align (13, 14). Much less is known about the mechanisms by which RNA-binding proteins regulate mRNA stability.
Plasminogen activators (PAs)¹ are serine proteases that catalyze the conversion of plasminogen to the broad spectrum protease, plasmin. Plasmin is the major fibrinolytic enzyme in blood and also participates in a number of physiological and pathological processes involving localized proteolysis such as tissue remodeling and tumor cell invasion and metastasis (15, 16). PA activity is regulated in large part by type-1 plasminogen activator inhibitor (PAI-1), a 50-kDa glycoprotein found in plasma, platelets, and a variety of cell types (17). PAI-1 expression is regulated by growth factors, cytokines, and hormones, including agents that regulate cellular cAMP levels (18-20).
HTC rat hepatoma cells synthesize tissue-type plasminogen activator (tPA) and PAI-1. These cells respond to cyclic nucleotides with a dramatic (50-fold) increase in tPA activity secondary to a 90% decrease in PAI-1 activity and mRNA. The decrease in PAI-1 mRNA is due primarily to a 3-fold increase in the rate of mRNA degradation (21). By transfecting HTC cells with chimeric constructs containing the β-globin coding sequence and portions of the PAI-1 3′-UTR, we have shown that sequences in the PAI-1 3′-UTR are able to confer cyclic nucleotide regulation onto the heterologous transcript. Analysis of deletion and insertion constructs demonstrated that the 3′-most 134 nt of the PAI-1 mRNA by itself is able to mediate this response. This cyclic nucleotide-responsive sequence (CRS) includes a 75-nt U-rich region at its 5′ end and a 24-nt A-rich region at its 3′ end (22).
* This work was supported by Public Health Service Grant CA22729 from the National Cancer Institute (to T. D. G.). We also acknowledge National Institutes of Health Grant 5 P60 DK-20572 to The University of Michigan Diabetes Research and Training Center for the support of core services. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18
¹ The abbreviations used are: PA, plasminogen activator; tPA, tissue-type plasminogen activator; PAI-1, type-1 plasminogen activator inhibitor; UTR, untranslated region; CRS, cyclic nucleotide-responsive sequence; RBP, RNA-binding protein; PCR, polymerase chain reaction; nt, nucleotide(s); bp, base pair(s); R-EMSA, RNA electrophoretic mobility shift assay; PAGE, polyacrylamide gel electrophoresis; IPTG, isopropyl-β-D-galactopyranoside; RRM, RNA recognition motif; KH, K homology; CAT, chloramphenicol acetyltransferase; EST, expressed sequence tag.
RNA electrophoretic mobility shift assay (R-EMSA) experiments have shown that the PAI-1 CRS forms complexes with HTC cell cytoplasmic proteins, and UV cross-linking studies have demonstrated binding proteins ranging in size from 38 to 76 kDa. Most of these proteins interact with the A-rich portion of the PAI-1 CRS. Mutations in the A-rich regions abolish both RNA-protein complex formation and cyclic nucleotide regulation of message decay in transfected HTC cells, suggesting that these RNA-binding proteins may be important regulators of mRNA stability (23).
Here we report the isolation and cloning of one of these proteins and demonstrate its specific interaction with PAI-1 mRNA. Detailed analysis of nucleic acid and protein data bases demonstrates that this PAI-1 mRNA-binding protein includes blocks of sequence that are highly conserved in a number of metazoans. Thus, we have identified and cloned a novel RNA-binding protein that reveals a family of proteins with a previously unidentified domain that may define a new RNA-binding motif. Our results suggest that this protein may play a role in regulation of mRNA stability.
Cell Culture and Polysome Preparation-Monolayer cultures of HTC cells were maintained in Eagle's medium with 5% fetal calf serum and 5% bovine serum as described previously (21). Cells were grown to confluence in T-150 flasks and harvested by trypsinization. HTC cell polysomes were isolated as described previously (23).
Preparation of Radiolabeled and Unlabeled RNA-All DNA constructs used as templates to prepare radiolabeled or unlabeled RNA have been described previously (23). ³²P-Labeled sense strand RNA was prepared by published methods (24) using as template either plasmid DNA linearized with the appropriate restriction enzyme or PCR products, including a T3 RNA polymerase site. The DNA template was incubated for 60 min at 37°C with T3 or T7 RNA polymerase in transcription buffer containing RNase inhibitor, ATP, CTP, GTP, and [³²P]UTP (100 μCi of 800 Ci/mmol). The labeled RNA was purified by electrophoresis through a 6% polyacrylamide, 8 M urea gel, eluted from the gel, and ethanol-precipitated. Unlabeled competitor RNA was prepared by in vitro transcription using the Ampliscribe™ T3 transcription kit according to the protocol provided by the manufacturer.
R-EMSA and UV Cross-linking Analysis-R-EMSA and UV cross-linking analyses were carried out essentially as described previously (23). For R-EMSA, HTC cell polysomes or purified recombinant protein were incubated with ³²P-labeled RNA (~200,000 dpm/reaction) for 20 min at room temperature in buffer containing 10 mM Hepes (pH 7.6), 5 mM MgCl₂, 40 mM KCl, 5% glycerol, 1 mM dithiothreitol, 8 units of RNase inhibitor, and 10-25 μg of tRNA. Following consecutive incubations with RNase T1 (1 unit/μl) and heparin (5 μg/μl), electrophoresis was carried out at 4°C in 5% nondenaturing polyacrylamide gels (40:1 acrylamide-bisacrylamide). For UV cross-linking analyses, proteins were incubated with ³²P-labeled RNA (~10⁶ dpm/reaction) as described for R-EMSA. Following the heparin incubation, reactions were exposed to UV light (UV Stratalinker 1800, Stratagene) at a distance of 2.5 cm for 10 min (1.8 J/cm²) and incubated for 10 min with RNase A and RNase T2. Reactions were subjected to SDS-PAGE and visualized by autoradiography. For competition studies, protein was incubated with unlabeled RNA competitor for 10 min at room temperature prior to addition of radiolabeled RNA.
Isolation and Cloning of PAI-1 RNA-binding Protein-HTC cell polysomes (60 μg/reaction) were incubated with ³²P-labeled (~10⁶ dpm/reaction) or unlabeled (~7 ng/reaction) CRS RNA, and R-EMSA was carried out as described above. In each gel, 4 of 16 lanes had R-EMSA reactions with radiolabeled RNA. High molecular weight complexes were located by a 1-h exposure to x-ray film at room temperature. Complexes from labeled and unlabeled reactions were excised and eluted from the gel by overnight incubation in 50 mM Tris (pH 7.9), 0.1 mM EDTA, 5 mM dithiothreitol, 150 mM NaCl, and 0.1% SDS (25). Proteins from 32 such reactions were pooled, concentrated by acetone precipitation, separated on 12% SDS-PAGE, and electrophoretically transferred to a polyvinylidene difluoride membrane (ProBlott, PE Applied Biosystems, Foster City, CA). The membrane was stained with Coomassie Blue and submitted to the Protein and Carbohydrate Structure Facility of The University of Michigan Biomedical Research Core Facility, where N-terminal sequencing was carried out using a PE Applied Biosystems model 494 protein sequencer.
Expression and Purification of Recombinant Protein-The cDNA encoding "hypothetical protein" (DKFZp564M2423, GenBank™ accession number AL080119) was kindly provided by the Resource Center and Primary Database of the German Human Genome Project (Resource Center and Primary Database, Berlin). The cDNA was amplified by PCR using a high fidelity PCR system (Roche Molecular Biochemicals) to include the entire protein coding sequence. The forward primer (DKFZ bp 71-95) was synthesized with a mutation at bp 83-85 to create an NdeI site. The reverse primer (DKFZ bp 1270-1244) has 1 bp altered to create a BamHI site. The 1200-bp product was inserted into the NdeI/BamHI sites of the vector pET-15b, generating the construct pET-15b/DKFZ for expression of "hypothetical protein" with an N-terminal histidine tag and thrombin cleavage site. Ligation products were transformed into E. coli AG-1 cells, and the subclone was verified by PCR and DNA sequencing. E. coli BL21(DE3)pLysS were transformed with purified pET-15b/DKFZ plasmid, and protein expression was induced with 2 mM IPTG. The recombinant protein was purified by Ni²⁺ chelation using the His-binding resin according to the supplier's protocol and was concentrated using Centricon 3 concentrators.
Data Base Search and Protein Sequence Alignment-An iterative search of the nonredundant protein data base with PSI-BLAST (26) was carried out to identify proteins with statistically significant similarity to the protein we have isolated. The data base of expressed sequence tags was searched using the gapped version of TBLASTN. The query sequence was filtered for low complexity regions using SEG (27), since its RGG repeats nonselectively retrieve many unrelated proteins with similar low-complexity regions. Gibbs sampling as implemented in the computer program PROBE (28) was used to identify conserved sequence regions shared by all members of the family. This ungapped multiple alignment provided a seed for another round of BLAST searches in an attempt to identify distantly related sequences. Finally, we used HMMER programs (29) to produce a Hidden Markov Model, which was also used to search the protein data base. Both the BLAST seed and the Hidden Markov Model of this protein family are available to interested researchers upon request. Identified sequences were searched against PFAM (30) and Blocks+ (31) to determine whether they have any previously known sequence motif(s).
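The reason for filtering the query can be illustrated with a small sketch. The code below is not the SEG algorithm; it is a simplified stand-in that masks any sequence window whose Shannon entropy falls below a cutoff. The function names, the window length, the threshold, and the toy query are all assumptions made for illustration.

```python
import math
from collections import Counter


def window_entropy(window: str) -> float:
    """Shannon entropy (bits) of the residue composition of one window."""
    counts = Counter(window)
    total = len(window)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def mask_low_complexity(seq: str, window: int = 12, threshold: float = 2.2) -> str:
    """Replace every residue that lies in a low-entropy window with 'X'.

    window and threshold are illustrative values chosen for this sketch.
    """
    masked = list(seq)
    for start in range(max(len(seq) - window + 1, 0)):
        if window_entropy(seq[start:start + window]) < threshold:
            for i in range(start, start + window):
                masked[i] = "X"
    return "".join(masked)


# Hypothetical toy query: diverse stretch, RGG-repeat stretch, diverse stretch.
toy_query = "MKTAYLQDWHPSE" + "RGGRGGRGGRGGRGG" + "LNVEDKFQTIYPAC"
masked_query = mask_low_complexity(toy_query)
print(masked_query)
```

In this toy case the RGG-repeat stretch is masked while the flanking, compositionally diverse residues are retained, so a similarity search seeded with the masked query would no longer be driven by the repeats.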

Affinity Purification and Identification of PAI-1 mRNA-binding Protein-To isolate proteins that interact with the rat PAI-1 CRS, we have used an RNA affinity approach, using the 134-nt sequence of the rat PAI-1 CRS shown in Fig. 1A. HTC polysomal proteins were incubated with ³²P-labeled or unlabeled CRS and separated by nondenaturing polyacrylamide gel electrophoresis. The high molecular weight complex (Fig. 1B, brackets) was excised from the gel, and the proteins were eluted from the gel slices. Proteins from 32 such reactions, representing ~2 mg of starting polysomal protein, were pooled, concentrated by acetone precipitation, and separated by SDS-PAGE. Fig. 1C shows that three protein bands are visible in the Coomassie Blue-stained gel. In a parallel experiment, proteins from the SDS-PAGE gel were transferred to polyvinylidene difluoride membrane (ProBlott, PE Applied Biosystems, Foster City, CA) and stained with Coomassie Blue. N-terminal amino acid sequencing of the major band (Fig. 1C, arrow) yielded 19 amino acids of N-terminal sequence.
The sequence was submitted to a BLAST search of the nonredundant protein data base, and two entries with 18 identical N-terminal amino acids were found: hypothetical protein (GenBank™ accession number AL080119) and "CGI-55" (GenBank™ accession number AF151813). Both sequences represent human proteins with an as yet unknown function and appear to be splice variants; CGI-55 has a 6-amino acid insertion after position 202 in hypothetical protein (Fig. 1D). In addition, there is a potential protein kinase A phosphorylation site at serine 74, indicating that the protein function could be regulated by cyclic nucleotides.
Expression of PAI-1 RNA-binding Protein in E. coli-The coding region of the cDNA for hypothetical protein was amplified by PCR and subcloned into the pET-15b vector downstream from the sequence coding for a histidine tag and a thrombin cleavage site, and the plasmid was expanded in E. coli AG-1. The subclone was verified by PCR and DNA sequencing and transformed into E. coli BL21(DE3)pLysS. Expression was induced by 2 mM IPTG and the protein partially purified by Ni²⁺ chelation as described under "Experimental Procedures." Proteins in the column eluate were separated by SDS-PAGE, and as shown in Fig. 2 (lane 3), the major product appears as a triplet. To determine which protein band represents hypothetical protein, we carried out a thrombin digestion, which is expected to cleave at the thrombin site and remove the N-terminal His-tag and associated amino acids. Thrombin digestion resulted in approximately a 2000-dalton decrease in the size of each band in the triplet (Fig. 2, lane 4), demonstrating that all three bands represent the hypothetical protein. The different sizes most likely are the result of premature translation termination due to differences in codon usage frequency between bacteria and eukaryotes (24, 33). Western analysis with anti-His antibody confirmed that all three bands carry the His-tag. Northwestern analysis demonstrated that a protein that migrates at the same location binds ³²P-labeled PAI-1 CRS (data not shown).
Binding Activity of Recombinant Hypothetical Protein-To confirm that the recombinant human hypothetical protein binds to the rat PAI-1 CRS, we have carried out both R-EMSA and UV cross-linking experiments. Purified recombinant protein was incubated with ³²P-labeled PAI-1 CRS and R-EMSA carried out as described under "Experimental Procedures." Fig. 3A illustrates that the recombinant protein binds in a concentration-dependent fashion (lanes 1-5). Binding is competed by increasing amounts of identical unlabeled sequence (lanes 6-8) but not by a 100-fold molar excess of the bacterial CAT RNA sequence (lane 11). The U-rich region of the PAI-1 CRS (nt 2926/3024), which by itself does not form the major R-EMSA complex with HTC cell cytoplasmic proteins (23), competes only [...]
Similar results were obtained by UV cross-linking experiments. As shown in Fig. 3B, the RNA-protein complex migrates with an apparent molecular mass of ~50 kDa, similar in size to the major band seen with HTC cytoplasmic proteins (23). Binding is dependent on protein concentration (Fig. 3B, lanes 1-5) and is competed by unlabeled CRS (lanes 6-8), but not by CAT (lane 11) or PAI-1 2125/2296 (lane 10). The presence of faster migrating bands observed in the presence of unlabeled PAI-1 2125/2296 suggests that the apparent decrease in the binding complex is probably caused by cross-linking between the labeled probe and the unlabeled competitor RNA rather than by competition for the same binding protein. The U-rich region of the PAI-1 CRS (nt 2926/3024) at 100-fold molar excess competes somewhat, but less well than the full sequence. These results confirm that the protein isolated from an R-EMSA complex is a PAI-1 RNA-binding protein, which we have named PAI-RBP1. Interestingly, the recombinant protein, which migrates on SDS-PAGE as a triplet, forms only a single UV cross-linked complex with PAI-1 CRS, suggesting that the smaller, C-terminal truncated forms are not able to bind.
Delineation of Sequences Involved in Binding to Recombinant Human PAI-1 RNA-binding Protein-To further define the sequence within the PAI-1 CRS required for binding to PAI-RBP1, we carried out binding experiments with portions of the CRS. Fig. 4A shows diagrammatically the regions of the CRS used to generate radiolabeled probes. In R-EMSA the 3′ portion of the CRS (nt 2926/2966, lane 4) and the entire U-rich portion (nt 2926/3024, lane 6) fail to bind. Mutations in the A-rich region of the CRS, which we have shown eliminate binding to HTC cytoplasmic proteins (23), severely decrease or abolish the ability of the RNA to bind to the recombinant RBP1 (lanes 8 and 10). As expected, neither labeled CAT mRNA nor the PAI 2125/2296 appears to interact with the RBP1. Thus, PAI-RBP1 appears to interact primarily with the A-rich portion of the CRS, confirming that the newly isolated PAI-RBP1 has sequence specificity and binding properties similar to those of the major 50-53-kDa HTC cytoplasmic protein (23).
Protein Sequence Alignment and Identification of a Family of Proteins-The initial search of the protein data base using the 19-amino acid sequence produced only the two sequences with the identical N-terminal amino acids. The data base has grown considerably since, and we have now done an extensive search, as described under "Experimental Procedures," for similarity to the entire hypothetical protein and DKFZp564M2423 cDNA sequence. This search has revealed several other proteins with a high degree of similarity, particularly in the C-terminal portion. These similar proteins are found in nine different species, including Arabidopsis, Drosophila, chicken, mouse, and rat (Fig. 5). The availability of new members of the family permitted us to generate a multiple alignment and to identify their common motifs; the alignment revealed that these proteins share several blocks of conserved sequence (Fig. 5). This domain, which we propose has a function in RNA binding, is located at the C terminus of all the proteins in this family and, apart from a relatively short stretch of RGG repeats, appears to be a compact entity. The C-terminal RGG box is located between the last two sequence blocks, and the RG-rich and Arg-rich regions are N-terminal to the conserved blocks. A number of the other proteins in the family have the RGG box. Thus, PAI-RBP1/hypothetical protein is a member of a family of proteins that share a putative novel RNA binding motif.
DISCUSSION
We report here the identification of a novel rat cytoplasmic protein isolated based on its ability to form R-EMSA complexes with the rat PAI-1 mRNA cyclic nucleotide-responsive sequence. N-terminal sequence data reveal identity with a human sequence, hypothetical protein (GenBank™ accession number AL080119), of unknown function. We have expressed the human sequence in bacteria and analyzed its binding properties. Our results demonstrate that the human protein is a PAI-1 mRNA-binding protein; we now refer to this gene product as PAI-1 RNA-binding protein or PAI-RBP1.
Recombinant human PAI-RBP1 binds the A-rich portion of the rat PAI-1 CRS and fails to interact with the isolated U-rich portion. Mutations of the A-rich sequence severely decrease binding. In addition, the specificity of PAI-RBP1 binding is demonstrated by its failure to form a complex with either the bacterial CAT RNA or an upstream, noncyclic nucleotide-responsive region of the PAI sequence. We have reported previously that the PAI-CRS forms complexes with HTC cell cytoplasmic proteins ranging in size from 38 to 76 kDa (23). The binding specificity and the size of the complex seen in Fig. 3B indicate that PAI-RBP1 is similar to the 50-53-kDa HTC cell cytoplasmic protein and may, in fact, be the human homologue of that rat protein. Because the PAI-1 mRNA binding site is a cyclic nucleotide-responsive sequence, it would be reasonable to expect that binding activity of PAI-RBP1 may be influenced by its phosphorylation state. The recombinant protein was expressed in bacteria and, therefore, not posttranslationally processed as it might be in mammalian cells, possibly explaining the high concentration of protein required for binding.
As seen in Fig. 2, the partially purified protein migrates on SDS-PAGE as a triplet. The size of the smaller products is consistent with a C-terminal truncation at the beginning of or within the RGG box. One of the arginines and four of the glycines in this region are coded by rare codons in bacteria and could cause premature termination (24, 33). Only a single RNA-protein complex is detected in UV cross-linking experiments (Fig. 3B), strongly suggesting that the most C-terminal portion is required for RNA recognition.
FIG. 5. Protein sequence alignment. An iterative search of the nonredundant protein data base was carried out with PSI-BLAST, and the multiple sequence alignment was produced using the Gibbs sampling function of PROBE (28). Some family members were added from an alignment generated by HMMER (29). The figure was made using ALSCRIPT (46). Numbers on the left are GenBank™ identifier codes for proteins. The first and last amino acid residues of the aligned region are numbered preceding the first block and following the last block, respectively. Numbers in parentheses indicate the size of gaps between blocks. Common protein names from their data base annotations are shown at the end of the alignment, followed by species names. The sequence from Rattus norvegicus (clone C426 intestinal epithelium) has question marks in place of amino acid numbers because the sequence is incomplete. Amino acids are colored differently from the background when at least 90% of the residues conform to a consensus. The following color scheme was used: hydrophobic residues (ACFGHIKLMRTVWY) are dark blue; aliphatic residues (ILV) are green; polar residues (CDEHKNQRST) are red; small residues (ACDGNPSTV) are purple; charged residues (DEHKR) are orange. Individual residues with more than 90% identity across the whole alignment are highlighted in yellow.
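The coloring rule stated in the Fig. 5 legend can be sketched per alignment column as follows. The residue classes are copied from the legend; everything else is an assumption for illustration: the function name, the choice to ignore gap characters, the decision to report every class that reaches the cutoff (the legend does not say how overlapping classes are prioritized), and the example columns, which are hypothetical and not taken from Fig. 5.

```python
from collections import Counter

# Residue classes exactly as listed in the Fig. 5 legend.
RESIDUE_CLASSES = {
    "hydrophobic": set("ACFGHIKLMRTVWY"),
    "aliphatic": set("ILV"),
    "polar": set("CDEHKNQRST"),
    "small": set("ACDGNPSTV"),
    "charged": set("DEHKR"),
}


def column_consensus(column: str, threshold: float = 0.9) -> dict:
    """Report which classes and which single residue reach the cutoff in one column."""
    residues = [r for r in column.upper() if r.isalpha()]  # assumption: gaps are ignored
    if not residues:
        return {"classes": [], "identical": None}
    n = len(residues)
    # "at least 90%" for class membership
    classes = [
        name for name, members in RESIDUE_CLASSES.items()
        if sum(r in members for r in residues) / n >= threshold
    ]
    # "more than 90%" for single-residue identity
    residue, count = Counter(residues).most_common(1)[0]
    identical = residue if count / n > threshold else None
    return {"classes": classes, "identical": identical}


# Hypothetical alignment columns (ten sequences each).
print(column_consensus("IIVLLIVLIV"))  # every residue is aliphatic
print(column_consensus("RRRRRRRRRK"))  # 9 of 10 Arg: exactly 90%, so not "more than 90%" identical
```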
A number of proteins that interact with RNA transcripts have been identified and structural domains involved in RNA recognition described (11, 12). While the majority of known RNA-binding proteins function in RNA processing or translation, several proteins that interact with mRNAs and influence transcript stability have been identified. The iron response element-binding proteins were among the first for which a clear relationship between binding and mRNA stability has been demonstrated (10). Members of the Elav family of proteins, including HuR and HuB (HelN1), bind RNA through RRMs and have been reported to stabilize AU-rich element-containing RNAs (8, 34, 35). In contrast, the AUF-1 family of proteins (also called hnRNP D) binds AU-rich element regions and enhances degradation of the transcript (36, 37). AUF-1 proteins have two RRMs, as well as a C-terminal RGG that appears to be important for binding (38-40). αCP-1 and αCP-2 are KH domain (41) poly(rC)-binding proteins and play an important role in regulation of α-globin and tyrosine hydroxylase mRNA stability (2, 42). Vigilin is a KH-domain protein that is induced in Xenopus oocytes by estrogen and binds to and stabilizes vitellogenin mRNA (7).
PAI-RBP1 does not have a recognizable RRM or KH domain. It does, however, have an RGG box at amino acid positions 343-359 (Fig. 1D), as well as an RG-rich (amino acids 126-137) and an Arg-rich (amino acids 163-184) motif, which places it in the general category of RNA-binding proteins (11, 12). It is of particular interest that this protein, which may be involved in a cyclic nucleotide-mediated regulation event, has a potential protein kinase A site (RKES) at serine 74.
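As an illustration of how motifs of this kind can be located in a protein sequence, the sketch below scans for RGG-repeat clusters and for matches to a protein kinase A consensus. The patterns are assumptions, not taken from this work: the RGG box is approximated as two or more RGG tripeptides separated by at most four residues, and the kinase site uses the commonly cited R-[RK]-x-[ST] consensus, which the RKES sequence satisfies. The toy sequence is hypothetical and is not the PAI-RBP1 sequence.

```python
import re

# Assumed patterns for this sketch:
#   RGG box: two or more RGG tripeptides separated by 0-4 other residues
#   PKA site: consensus R-[RK]-x-[ST]; lookahead allows overlapping matches
RGG_BOX = re.compile(r"RGG(?:.{0,4}RGG)+")
PKA_SITE = re.compile(r"(?=(R[RK].[ST]))")


def find_rgg_boxes(seq):
    """Return (start, end) of each RGG-repeat cluster, 1-based inclusive."""
    return [(m.start() + 1, m.end()) for m in RGG_BOX.finditer(seq)]


def find_pka_sites(seq):
    """Return (1-based position of the S/T residue, matched motif)."""
    return [(m.start() + 4, m.group(1)) for m in PKA_SITE.finditer(seq)]


# Hypothetical toy sequence.
toy = "MSDARKESLVQ" + "AP" + "RGGFRGGRGGY" + "EK"
print(find_rgg_boxes(toy))
print(find_pka_sites(toy))
```

A consensus match identifies only a candidate site; whether a given serine is actually phosphorylated has to be established experimentally.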
We have identified several additional proteins from the data base that share significant similarity with PAI-RBP1 in the C-terminal region (Fig. 5). Interestingly, UV cross-linking experiments, which show a single protein-RNA complex, suggest that the most C-terminal region is required for RNA binding. All members of the protein family shown in Fig. 5 were used to search PFAM, a large data base of more than 2000 protein domain families (30), and failed to retrieve any previously known protein domain. Furthermore, these proteins did not show any significant matches when searched against a collection of short sequence motifs contained in Blocks+ (31). Taken together, these results argue that PAI-RBP1 and its homologues comprise a novel family of proteins. Because PAI-RBP1 was isolated based on its RNA binding property, and because most members of the family have a similar RGG box in the C-terminal region and an additional Arg-rich motif, we suggest they constitute a family of novel RNA-binding proteins. In fact, two of the identified proteins are annotated in the protein data base as nuclear RNA-binding proteins.
We also searched the data base of expressed sequence tags (ESTs) using TBLASTN and found evidence that the C-terminal, putative RNA-binding domain is conserved in more than 10 different metazoan species, which include, in addition to those shown in Fig. 5, Xenopus, zebrafish, and pig. A search of the human EST data base shows that the mRNA is expressed in a wide range of tissues, including pancreas, liver, lung, muscle, ovary, and brain. These findings strongly suggest that PAI-RBP1 has a more general biological role involving regulation of mRNA stability or processes requiring interaction with RNA. Thus far homologues of RBP1 have been found only in metazoans, and their role may be related to regulatory mechanisms that were developed after the emergence of multicellular organisms. However, a protein of statistically marginal similarity to PAI-RBP1 is found in yeast. This protein, Stm1p, binds G-quartet DNA and purine motif DNA triplets, but not double-stranded DNA (43). While it remains possible that Stm1p can bind RNA, its limited similarity to PAI-RBP1 is probably a relic of divergent evolution.
PAI-RBP1 appears to have four splice variants (two alternative splice sites). CGI-55 differs from PAI-RBP1 (hypothetical protein) by the insertion of 6 amino acids at position 202. An incomplete sequence for a rat homologue of unknown function from intestinal epithelium (44) (GenBank™ accession number U21718) has also been reported. This cDNA codes for an additional 15-amino acid sequence inserted after amino acid position 226 of PAI-RBP1 (Fig. 1D). Using primers complementary to the human PAI-RBP1, we have cloned the rat homologue from an HTC cell cDNA library (45). This cDNA sequence indicates a variant that includes both the 6-amino acid and 15-amino acid inserts. In addition, a scan of the human genome found ESTs that include the 15-amino acid insertion with and without the 6-amino acid insert. Thus, four variants of the rat and human PAI-RBP1 are possible. Such variations could alter RNA binding properties and/or the function of the RNA-protein complex. Experiments are in progress to construct each of these variants and to analyze their RNA binding properties.
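The count of four variants follows from two independent, optional inserts, each either present or absent. A minimal sketch of that enumeration is given below; the insert sizes come from the text, while the labels and function name are hypothetical.

```python
from itertools import product

# Insert sizes (amino acids) as described in the text; the labels are hypothetical.
OPTIONAL_INSERTS = {"insert_6aa": 6, "insert_15aa": 15}


def enumerate_variants(inserts):
    """List every present/absent combination of the optional inserts."""
    names = list(inserts)
    variants = []
    for flags in product((False, True), repeat=len(names)):
        included = [n for n, on in zip(names, flags) if on]
        variants.append({
            "included": included,
            "extra_residues": sum(inserts[n] for n in included),
        })
    return variants


variants = enumerate_variants(OPTIONAL_INSERTS)
for v in variants:
    print(v["included"] or ["none"], "+%d aa" % v["extra_residues"])
```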