Molecular cloning, mapping to human chromosome 1q21-q23, and cell binding characteristics of Spα, a new member of the scavenger receptor cysteine-rich (SRCR) family of proteins.

CD5 and CD6, two type I cell surface antigens predominantly expressed by T cells and a subset of B cells, have been shown to function as accessory molecules capable of modulating T cell activation. Here we report the cloning of a cDNA encoding Spα, a secreted protein that is highly homologous to CD5 and CD6. Spα has the same domain organization as the extracellular region of CD5 and CD6 and is composed of three SRCR (scavenger receptor cysteine rich) domains. Chromosomal mapping by fluorescence in situ hybridization and radiation hybrid panel analysis indicated that the gene encoding Spα is located on the long arm of human chromosome 1 at q21-q23 within contig WC1.17. RNA transcripts encoding Spα were found in human bone marrow, spleen, lymph node, thymus, and fetal liver but not in non-lymphoid tissues. Cell binding studies with an Spα immunoglobulin (Spα-mIg) fusion protein indicated that Spα is capable of binding to peripheral monocytes but not to T or B cells. Spα-mIg was also found to bind to the monocyte precursor cell lines K-562 and weakly to THP-1 but not to U937. Spα-mIg also bound to the B cell line Raji and weakly to the T cell line HUT-78. These findings indicate that Spα, a novel secreted protein produced in lymphoid tissues, may regulate monocyte activation, function, and/or survival.

Leukocyte function is regulated by a discrete number of cell surface and secreted antigens that govern leukocyte activation, proliferation, survival, cell adhesion and migration, and effector function. Among the proteins that have been shown to regulate leukocyte function are members of the scavenger receptor cysteine-rich (SRCR) family. This family of proteins can be divided into two groups based upon the number of cysteine residues per SRCR domain, intron-exon organization, and domain organization (1). Group B includes the cell surface proteins CD5 (2) and CD6 (3), which are predominantly expressed by thymocytes, mature T cells, and a subset of B cells; WC1 (4,5), which is expressed by γδ T cells in cattle; and M130 (6), which is expressed by activated monocytes. Of these, only CD5 and CD6 have been studied extensively. Monoclonal antibody (mAb) cross-linking studies suggest that both CD5 and CD6 can function as accessory molecules capable of modulating T cell activation (7,8). The role of CD5 and CD6 in the regulation of T cell function is further supported by the finding that following T cell activation, Tyr residues in the cytoplasmic domain of these two proteins are transiently phosphorylated. This provides a molecular mechanism whereby the cytoplasmic domains of both CD5 and CD6 can interact with intracellular SH2-containing proteins involved in signal transduction (9). Furthermore, phenotypic analysis of a CD5-deficient murine strain showed that its T cells are hyper-responsive to stimulation (10,11), suggesting that CD5 expression is required for the normal regulation of T cell receptor (TCR)-mediated T cell activation.
CD5 and CD6 are structurally the most closely related members of the group B SRCR family of proteins (1). They are both type I membrane proteins whose extracellular region is composed of three SRCR-like domains, each containing eight cysteine residues that are thought to form intrachain disulfide bonds. The extracellular domains of CD5 and CD6 are anchored to the cell membrane via a hydrophobic transmembrane domain and a long cytoplasmic domain. It has been reported that CD5 binds to the B cell antigen CD72 (12) and to CD5L (13), an antigen that is transiently expressed by activated B cells and has yet to be fully characterized. CD6 has been shown to bind to the leukocyte activation antigen ALCAM (activated leukocyte cell adhesion molecule). Unlike CD5 and CD6, which are closely related, CD72 and ALCAM are not homologous. CD72 is a type II membrane protein that is homologous to the C-type lectins; however, a carbohydrate binding activity for CD72 has not been reported. ALCAM is a type I membrane protein whose extracellular region is composed of five Ig-like domains (14). The regions of CD5 and CD72 involved in their interaction have not been identified. Studies with truncated forms of both CD6 and ALCAM have shown that the interaction between these two proteins is primarily mediated by the membrane-proximal SRCR domain of CD6 and the amino-terminal Ig-like domain of ALCAM (15,16).
Here we report the cloning, chromosomal mapping, and cell binding properties of Spα, a novel member of the SRCR family of proteins. Spα is expressed in lymphoid tissues and has the same domain organization as the extracellular regions of both CD5 and CD6. Binding studies with an Spα Ig fusion protein were carried out to identify cells expressing a putative receptor for Spα.

Cloning of Spα
An expressed sequence tag (EST) data base screen for potential new SRCR domain-containing genes revealed a novel gene in EST clone number 201340. The partial (Spα) clone from a fetal liver-spleen library was purchased from Research Genetics and used to screen a human spleen library (Clontech HL5011a) by plaque hybridization for full-length cDNAs. Approximately 1 × 10⁶ clones were plated onto twenty 150-mm plates and transferred to Hybond N+ nylon membranes (Amersham Life Science, Inc., rpn132b) as per manufacturer instructions. Membranes were UV cross-linked and hybridized by the method of Church (17). The hybridization probe was a radiolabeled EcoRI fragment digested from the EST clone 201340. The EcoRI fragment contained base pairs 1-1594 and was radiolabeled with [³²P]dCTP (Amersham Life Science, Inc.) using a random labeling kit (Boehringer Mannheim). Membranes were washed at 60°C using high stringency wash buffer and exposed to Kodak x-ray film (X-Omat AR). A subset of positive plaques was then replated and rescreened. After three rounds of screening, ten individual clones were obtained, of which two were full-length. Both of these clones were sequenced in both directions using the dideoxy method.

Northern Blot
One cell line and two tissue Northern blots were purchased from Clontech (Nos. 7757-1, 7766-1, and 7754-1, respectively) and hybridized in 50% formamide at 42°C according to manufacturer instructions. Radiolabeled Northern blot probes were prepared as outlined above. mRNA normalization probes were either GAPDH or β-actin. Positive blots were washed under high stringency conditions. Blots were exposed to Kodak x-ray film (X-Omat AR).

Chromosomal Mapping
Somatic Cell Hybrids and PCR Amplifications of Spα-Human Spα was localized to a human chromosome using a panel of 17 human-Chinese hamster hybrid cell lines derived from several independent fusion experiments (18). The PCR primers used to amplify the human Spα gene sequence were derived from the 3′-untranslated region: 5′-GAGTCTGAACACTGGGCTTATG (forward primer at nucleotides 1231-1252) and 5′-GTAATGGTCTGCACATCTGACC (reverse primer at nucleotides 1431-1452). The PCR conditions were 94°C for 3 min; 35 cycles of 94°C for 30 s, 55°C for 40 s, and 72°C for 1 min; followed by 72°C for 7 min.
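As a quick consistency check on the primer design above, the following minimal Python sketch derives the expected product size from the stated primer coordinates and sums the programmed hold times of the thermal profile. The function names are illustrative, and the sketch assumes the coordinates are 1-based and inclusive; ramp times between steps are ignored.

```python
# Primer sequences and cDNA coordinates as stated in the text
# (assumed here to be 1-based and inclusive).
FORWARD = ("GAGTCTGAACACTGGGCTTATG", 1231, 1252)
REVERSE = ("GTAATGGTCTGCACATCTGACC", 1431, 1452)


def amplicon_length(fwd_start, rev_end):
    """Product length spanning the first forward base to the last reverse base."""
    return rev_end - fwd_start + 1


def program_seconds(initial, cycles, cycle_steps, final):
    """Total programmed hold time in seconds (ramping not included)."""
    return initial + cycles * sum(cycle_steps) + final


length = amplicon_length(FORWARD[1], REVERSE[2])
# 94°C 3 min; 35 x (94°C 30 s, 55°C 40 s, 72°C 1 min); 72°C 7 min
total = program_seconds(180, 35, (30, 40, 60), 420)

print(f"expected product: {length} bp")
print(f"programmed time: {total} s ({total / 60:.1f} min)")
```

The computed length agrees with the 222-bp product reported in the Results, and each primer sequence is 22 nucleotides long, matching its stated coordinate span.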
Two human radiation hybrid (RH) mapping panels, GeneBridge 4 (Whitehead/MIT Genome Center) and Stanford G3 (Stanford Human Genome Center), were used to confirm and further define the localization of the Spα gene. Typing was carried out using the primers and PCR conditions described above.
Fluorescence in Situ Hybridization-The chromosomal location of the human Spα gene was independently determined by fluorescence chromosomal in situ hybridization (FISH) (22). Briefly, a genomic DNA clone containing a 2.4-kbp insert of the human Spα genomic sequence in a TA cloning vector was labeled with biotin-16-dUTP by nick translation using commercial reagents (Boehringer Mannheim). Labeled probe was hybridized at a concentration of 300 ng/μl/slide to pretreated and denatured human lymphocyte metaphase chromosomes. Hybridizations were performed in the presence of salmon sperm DNA and human genomic DNA.
After hybridization at 37°C overnight, the slides were washed in 50% formamide in 2× SSC at 42°C. To detect and amplify specific hybridization signals, slides were reacted with avidin-FITC (Vector Laboratories), washed, and treated with biotinylated goat anti-avidin D antibody (Vector Laboratories) followed by another round of incubation with avidin-FITC. Metaphase chromosomes were analyzed under an Axiophot (Carl Zeiss, Inc.) epifluorescence microscope. Specific hybridization signals were counted only when the fluorescence staining was observed on both chromatids of a chromosome. Digital images were generated using a cooled charge-coupled device camera (Photometrics PM512)/Macintosh computer system, with software supplied by Tim Rand (Yale University). Photographs were produced from PICT files.

Fusion Protein Constructs
DNA corresponding to the translated region of Spα was obtained by PCR using full-length Spα cDNA as template. Primers were designed with restriction sites enabling Spα C-terminal ligation to the hinge, CH2, and CH3 domains of murine IgG2a (see Fig. 4). All constructs were sequenced to verify correct sequence and correct reading frames. Spα-mIg (in the CDM8 expression vector) was transiently expressed in COS cells (23). The soluble Spα-mIg was purified from the COS cell supernatant by protein A column chromatography. Following protein A binding, the column was washed extensively with PBS (pH 7.0) and eluted with 4.0 M imidazole (pH 8.0) containing 1 mM each MgCl₂ and CaCl₂. Proteins were dialyzed extensively with PBS.

Cell Culture
Human cell lines were grown to 0.5-0.9 × 10⁶ cells/ml in Iscove's modified Dulbecco's medium (Life Technologies, Inc.) containing 10% fetal bovine serum. Human peripheral blood T cells, B cells, and monocytes were separated by counterflow centrifugal elutriation.

Flow Cytometry
Approximately 5 × 10⁵ cells were incubated on ice for 1 h in 100 μl of stain buffer (PBS containing 2% bovine serum albumin fraction V, 0.05% sodium azide, 1 mM each MgCl₂ and CaCl₂) containing 20 μg/ml Spα-mIg fusion protein and 200 μg/ml human IgG (Sigma I-8640). Cells were then washed with stain buffer, centrifuged, and aspirated. Following a second wash, cells were incubated on ice for 1 h in 100 μl of stain buffer containing 1:100 diluted FITC-labeled rabbit anti-mouse IgG2a antibody (Zymed Laboratories, Inc. 61-0212). Cells were then washed twice and resuspended in 0.5 ml of stain buffer. Samples were run on a Becton Dickinson FACScan. Prior to running samples, propidium iodide was added to a final concentration of 1 μg/ml. Dead cells were identified as propidium iodide-positive and were gated out of the analysis. Mouse antibodies specific for CD3 (T cells; 64.1, generously donated by Jeff Ledbetter, Ph.D., Bristol-Myers Squibb), CD19 (B cells; IOB4a, Amak 1313), and CD14 (monocytes; MY4, Coulter 6602622) were used to verify elutriated cells. Second step staining for these antibodies was with an FITC-labeled goat anti-mouse IgG (Biosource 4408).
[Table I: informative hybrids and percent discordance for each human chromosome]

RESULTS
Cloning of Spα, a New Member of the SRCR Family of Proteins-We have taken two approaches to isolating novel members of the SRCR family of proteins. The first involves a low stringency DNA hybridization technique, and the other involves screening of the DNA data bases. This latter approach resulted in the identification of a cDNA fragment from the human EST data base that showed extensive sequence homology with members of the SRCR group B proteins, including CD5, CD6, M130, and WC1. The EST sequence (from fetal liver-spleen) was used as a probe to screen a cDNA library prepared from mRNA isolated from a human spleen. This resulted in the isolation of ten cDNA clones. The two longest clones, 1804 and 2152 bp, respectively, were sequenced in both orientations and found to contain a long open reading frame that encoded a 347-amino acid polypeptide, which had the features of a secreted protein (Fig. 1A). This protein was named Spα. Spα contains 19 hydrophobic amino acids at its amino terminus that act as a secretory signal sequence and are removed from the mature protein, as determined by N-terminal sequencing of the Spα immunoglobulin fusion protein produced by COS cells. This secretory signal sequence is followed by three cysteine-rich domains, each approximately 100 amino acids in length. These domains are significantly homologous to the cysteine-rich domains found in the SRCR group B family of proteins (Fig. 1B) (1). The third SRCR domain of Spα is followed by an in-frame stop codon. The predicted amino acid sequence of Spα contained no putative N-linked glycosylation sites. The two clones differ in the length of their 3′-untranslated regions, where one clone is 348 bp longer. The shorter clone has a poly(A) sequence starting 18 bases downstream from a consensus polyadenylation sequence.
The longer clone has two polyadenylation consensus sequences; the first one is identical to the one found in the shorter clone, and the second is located 351 bp downstream from the first site. The longer clone also contains three adenylate/uridylate-rich elements (AREs) in the 3′-untranslated sequence located between the two polyadenylation sites.
Among the SRCR group B members, the SRCR domain organization of Spα most closely resembles that of CD5 and CD6 (Fig. 1C).
The Gene Encoding Spα Is Located on Chromosome 1-Genomic DNAs from a panel of 17 human-Chinese hamster hybrid cell lines were analyzed by PCR using primers that specifically amplified human Spα sequence. The expected 222-bp PCR products containing the 3′-untranslated region sequence were obtained from human control DNA and from hybrid cell lines that had retained human chromosome 1. As shown in Table I, except for chromosome 1, all other human chromosomes were excluded by this panel. These results indicated that the human Spα gene is located on chromosome 1. Fluorescence in situ hybridization confirmed the Spα assignment to chromosome 1 and refined the physical map position. Based on the localization of the signal on R-banded chromosomes in 22 metaphase spreads, the human Spα gene was assigned to human chromosome bands q21-q23 (Fig. 2).
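The exclusion logic behind a somatic cell hybrid panel can be illustrated with a short Python sketch. For each human chromosome, the PCR result in every hybrid is compared with whether that hybrid retains the chromosome; a chromosome is a candidate only if it shows no discordant hybrids. The hybrid calls below are invented purely for illustration and are not the data of Table I; the function name and the use of `None` for uninformative hybrids are likewise assumptions of this sketch.

```python
def discordance(marker_calls, chromosome_calls):
    """Return (informative hybrids, percent discordance) for one chromosome.

    Each list holds one entry per hybrid: True (present), False (absent),
    or None (not scorable, so the hybrid is uninformative for this chromosome).
    """
    informative = [
        (m, c)
        for m, c in zip(marker_calls, chromosome_calls)
        if m is not None and c is not None
    ]
    if not informative:
        raise ValueError("no informative hybrids")
    discordant = sum(1 for m, c in informative if m != c)
    return len(informative), 100.0 * discordant / len(informative)


# Hypothetical six-hybrid panel (illustrative values only).
marker = [True, False, True, False, True, False]
chrom_a = [True, False, True, False, True, None]    # fully concordant
chrom_b = [True, True, False, False, True, False]   # two discordant hybrids

for name, calls in (("chromosome A", chrom_a), ("chromosome B", chrom_b)):
    n, pct = discordance(marker, calls)
    print(f"{name}: {n} informative hybrids, {pct:.0f}% discordance")
```

In this toy panel only "chromosome A" has 0% discordance and would be retained as the candidate location, which mirrors how all chromosomes other than chromosome 1 were excluded.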
To confirm this assignment and to map the Spα locus more precisely, two human RH mapping panels were typed by PCR amplification with the Spα-specific primers. In the Stanford G3 mapping panel, 9 of 83 RH cell lines were positive for the human-specific Spα gene signal. By maximum likelihood analysis, Spα was placed 45.8 centiRays (cR₁₀₀₀₀) from the STS marker D1S3249. In the GeneBridge 4 mapping panel, 30 of 93 RH cell lines were positive, and Spα was placed 3.0 cR₃₀₀₀ and 3.1 cR₃₀₀₀ from the chromosome 1 markers WI-8330 and CHLC.GATA43A04, respectively. The order of markers in this region from centromere to telomere is D1S305-WI-8330-Spα-CHLC.GATA43A04-D1S2635. D1S305, WI-8330, CHLC.GATA43A04, and D1S2635 are known markers in the WC1.17 contig (Whitehead Institute/MIT Center for Genome Research), while D1S3249 and D1S2635 are clustered as chromosome 1 bin 69 in the Stanford Human Genome Center RH map. A more distal marker, D1S196, which is in Stanford Human Genome Center chromosome 1 bin 75 and the WC1.17 contig, was previously mapped to the q22-q23 region (24). These results are consistent with our FISH mapping data that placed Spα at q21-q23. The insertion of Spα into the linkage map will enable the evaluation of this locus as a candidate for genetic disorders.
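The fraction of RH cell lines scoring positive for the marker (its retention frequency) follows directly from the counts reported above. The sketch below simply restates those counts as percentages; the function name is illustrative, and the maximum likelihood placement itself is not reproduced here.

```python
def retention_frequency(positive, total):
    """Fraction of radiation hybrid lines that scored positive for the marker."""
    if total <= 0 or not 0 <= positive <= total:
        raise ValueError("counts must satisfy 0 <= positive <= total and total > 0")
    return positive / total


# Counts as reported in the text.
panels = {"Stanford G3": (9, 83), "GeneBridge 4": (30, 93)}

for name, (positive, total) in panels.items():
    freq = retention_frequency(positive, total)
    print(f"{name}: {positive}/{total} = {freq:.1%}")
```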
Spα Is Produced by Lymphoid Tissues-RNA blot analysis using an Spα cDNA fragment as a probe indicated that mRNA encoding Spα is expressed in the spleen, lymph nodes, thymus, bone marrow, and fetal liver but not in peripheral blood leukocytes (PBL) or appendix (Fig. 3). Bands hybridizing to Spα were also not detected in prostate, testis, uterus, small intestine, and colon (separate blot, data not shown). In all cases, tissues expressing mRNA transcripts encoding Spα expressed three hybridizing transcripts. Three bands in the spleen (Fig. 3) are seen with shorter film exposure. These transcripts are ~2.4, 2.1, and 1.8 kbp in length. The 1.8- and 2.1-kbp transcripts correspond in length to the two longest cDNAs isolated from the spleen cDNA library. Presently there is no information as to the nature of these transcripts; however, the finding that two of the cDNAs that we isolated have sizes consistent with those seen on the RNA blot suggests that they may all encode Spα but differ from one another in the length of their untranslated regions. It should be noted that we cannot exclude the possibility that one or more of these transcripts may encode closely related proteins.
In an effort to determine which cells might produce Spα, we have analyzed several cell lines by Northern blot. The RNA message for Spα was not detected in the following cell lines: HL60, K562, Raji, Molt4, A549, SW480, GA361, HeLa S3, and peripheral blood leukocytes (data not shown).
Binding of Spα-mIg to Myeloid Cell Lines and Monocytes-Previously, we had successfully used an Ig fusion approach to identify cells expressing a CD6 ligand (25). These studies eventually led to the isolation of a cDNA encoding a CD6 ligand named ALCAM (14). The successful application of this technique in the isolation of a CD6 ligand and the characterization of the CD6-ALCAM interaction led us to use the same approach to identify cells that express Spα receptors. We prepared a full-length Spα murine IgG2a (Spα-mIg) fusion protein by transient expression in COS cells (Fig. 4).
We began a systematic examination of the ability of Spα-mIg to bind to human cell lines using flow cytometry. We observed that the myeloid cell line K-562 bound to Spα-mIg but not to a control protein (WC1-mIg) containing the amino-terminal three SRCR domains of bovine WC1 fused to the same constant domain of murine IgG2a (Fig. 5A, panel A).
Binding of Spα to the K-562 cells was concentration-dependent and saturable (Fig. 5B). Spα-mIg also displayed weaker binding to the myeloid cell line THP1 (Fig. 5A, panel B) but not to U-937 cells (Fig. 5A, panel C). Binding of Spα-mIg was also observed on the lymphoma B cell line Raji (Fig. 6, panel A) and on the T cell line Hut78 (Fig. 6, panel C). Binding was not seen with the control protein on these two cell lines (Fig. 6, panels B and D).
These observations led us to examine whether the Spα-mIg fusion protein could bind peripheral blood mononuclear cells. As shown in Fig. 7, Spα-mIg (panels A and D), but not WC1-mIg (panels B and E), bound to peripheral blood monocytes. Binding of Spα-mIg was not seen on elutriated peripheral blood T cells (Fig. 8, panels A and D) nor on elutriated B cells (data not shown). The binding of Spα-mIg to elutriated monocytes from different donors could always be detected but showed some degree of variability (Fig. 7, panels A and D).

DISCUSSION
We have been interested in studying the structure and function of CD5 and CD6 and their regulatory role in the immune system. A large body of in vitro data suggests that these proteins play an important role in regulating T cell activation and, in the case of CD6, T cell development. The isolation and functional characterization of novel proteins that are closely related to CD5 and CD6 might provide further insights into the function and structure of this class of proteins. We screened the human EST data base for cDNA fragments that encoded polypeptides homologous to CD5 and CD6 and identified a cDNA fragment encoding Spα. Analysis of full-length cDNA clones encoding Spα suggests that Spα is a secreted protein that has the same domain organization as the extracellular region of CD5 and CD6. However, a detailed comparison of the amino acid sequence of the SRCR domains of Spα with all members of the SRCR protein family revealed a closer homology to WC1 and M130. This suggests that Spα may be more closely related to WC1 than to CD5 or CD6. Further evidence that points to a more distant evolutionary relationship between Spα and CD5 or CD6 than that between CD5 and CD6 comes from the finding that the genes encoding CD5 and CD6 are found in close proximity on chromosome 11 (26-29), whereas the gene encoding Spα is located on chromosome 1. Presently there is no information on the genomic localization of the human equivalent of WC1 or M130.
The subgroup of SRCR family members that contains CD5, CD6, WC1, and M130 (Group B) can be distinguished from other members of the family based on the number of cysteine residues contained within the SRCR domains and the observation that the extracellular domains of each of these proteins are composed exclusively of SRCR domains. More recently, analysis of the genomic organization of the genes encoding some of the members of this subfamily has indicated that a third distinguishing feature of this group of proteins is that each of its SRCR domains is encoded by a separate exon (27,30,31). This is in contrast to the type I macrophage scavenger receptor and related proteins (Group A). The SRCR domains of group A proteins have fewer Cys residues (six instead of eight), and each SRCR domain is encoded by two exons. Preliminary data on the genomic organization of Spα indicate that the second SRCR domain is encoded by a single exon. Based on these criteria, we propose that Spα be considered a member of the SRCR Group B family of proteins.
RNA blot analysis indicates that transcripts encoding Spα are exclusively expressed in lymphoid tissues. However, it appears that leukocytes do not express this protein. This finding indicates that Spα may be produced by specialized epithelial and/or endothelial cells in lymphoid tissues. The observation that Spα is expressed in bone marrow, thymus, and fetal liver, as well as in the spleen and lymph nodes, implicates this protein in processes responsible for both the development and maintenance of the lymphoid compartment. Studies are currently underway to identify the cells that make this protein and the factors that are involved in regulating its expression. The Northern blot probed with Spα showed three bands. Based on our analysis of two different cDNAs encoding Spα, it appears that at least two of these transcripts correspond to mRNAs encoding Spα and differ in the length of their 3′-untranslated regions. We also observed a significant difference in the 3′-untranslated region of these Spα mRNAs. We found that the longer clone contained three consensus ARE elements (AUUUA). ARE elements are located within the 3′-untranslated region of mRNAs and have been found to be the most common determinant of RNA stability (32,33). Messenger RNAs encoding cytokines and transcription factors, among others, contain these elements, which provide an additional mechanism for the regulation of protein expression by directing the stability and, therefore, the half-life of the mRNA encoding the protein. The finding that at least one of the mRNAs encoding Spα contains ARE motifs suggests that the expression of this protein might be tightly regulated.
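Locating the AUUUA consensus in a 3′-untranslated sequence is a simple pattern search. The Python sketch below is a minimal illustration rather than the procedure used in this study: the function name is hypothetical, the example sequence is invented and is not the Spα 3′-untranslated region, and cDNA input (ATTTA) is converted to RNA letters before matching. A lookahead pattern is used so that overlapping pentamers such as AUUUAUUUA are each reported.

```python
import re


def find_are_motifs(sequence):
    """Return 1-based start positions of every AUUUA pentamer.

    Accepts RNA or cDNA letters; T is converted to U before matching.
    The zero-width lookahead lets overlapping motifs be found separately.
    """
    rna = sequence.upper().replace("T", "U")
    return [match.start() + 1 for match in re.finditer(r"(?=AUUUA)", rna)]


# Invented example sequence (not the actual Spα 3'-untranslated region).
example_utr = "GCATTTAGGCCATTTATTTACCG"

positions = find_are_motifs(example_utr)
print(f"{len(positions)} ARE pentamers at positions {positions}")
```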
Preliminary studies designed to identify cells that bind Spα and are the target of its activity revealed that some resting myeloid cell lines, as well as peripheral blood monocytes, are capable of binding Spα. Spα-mIg was also found to bind to the B cell line Raji and to the T cell line Hut78. These studies were carried out using an Spα immunoglobulin fusion protein, and thus the possibility existed that the interaction between this protein and the myeloid cell lines and monocytes, which are known to express high levels of Fc receptors, was mediated via the Ig portion of the molecule rather than the Spα moiety. This is unlikely for the following reasons: 1) two Ig fusion control proteins, WC1-mIg (SRCR Group B member) and human ALCAM-mIg, showed no binding; and 2) the interaction between Spα-mIg and myeloid cell lines and peripheral blood monocytes was detected in the presence of a vast excess of human IgG (up to 2 mg/ml) in the binding studies.
The isolation of cDNAs encoding Spα, the preparation of Spα immunoglobulin fusion proteins, and the identification of cells that express putative receptors will provide the basis for future studies on the structure and function of this new member of the SRCR family of proteins. The finding that this protein is expressed in lymphoid organs involved in the development of the lymphoid compartment as well as in immune surveillance, in conjunction with the observation that peripheral blood monocytes are capable of binding Spα, suggests that this protein may play an important role in regulating the immune system.