EMILIN, a Component of the Elastic Fiber and a New Member of the C1q/Tumor Necrosis Factor Superfamily of Proteins*

EMILIN (elastin microfibril interface located protein) is an extracellular matrix glycoprotein abundantly expressed in elastin-rich tissues such as blood vessels, skin, heart, and lung. It occurs associated with elastic fibers at the interface between amorphous elastin and microfibrils. Avian EMILIN was extracted from 19-day-old embryonic chick aortas and associated blood vessels and purified by ion-exchange chromatography and gel filtration. Tryptic peptides were generated from EMILIN and sequenced, and degenerate inosine-containing oligonucleotide primers were designed from some peptides. A set of primers allowed the amplification of a 360-base pair reverse transcription polymerase chain reaction product from chick aorta mRNA. A probe based on a human homologue selected by comparison of the chick sequence with an EST data base was used to select overlapping clones from both human aorta and kidney cDNA libraries. Here we present the cDNA sequence of the entire coding region of human EMILIN encompassing an open reading frame of 1016 amino acid residues. There was a high degree of homology (76% identity and 88% similarity) between the chick C terminus and the human sequence as well as between the N terminus of the mature chick protein, where 10 of 12 residues, as determined by N-terminal sequencing, were identical or similar to the deduced N terminus of human EMILIN. The domain organization of human EMILIN includes a C1q-like globular domain at the C terminus, a collagenous stalk, and a longer segment in which at least four heptad repeats and a leucine zipper can be identified with a high potential for forming coiled-coil α helices. At the N terminus there is a cysteine-rich sequence stretch similar to a region of multimerin, a platelet and endothelial cell component, containing a partial epidermal growth factor-like motif.
The native state of the recombinantly expressed EMILIN C1q-like domain to be used in cell adhesion was determined by CD spectra analysis, which indicated a high value of β-sheet conformation. The EMILIN C1q-like domain promoted a high cell adhesion of the leiomyosarcoma cell line SK-UT-1, whereas the fibrosarcoma cell line HT1080 was negative.


Elastic fibers are major constituents of the extracellular matrix (ECM),¹ confer to connective tissues the properties of resilience and elastic recoil, and are remarkable for the diverse range of tissues in which they are found. Elastic fibers can be identified in the ECM of many tissues as solid branching and unbranching fine and thick rod-like fibers (in elastic ligaments) or as concentric sheets of lamellae (in blood vessels) or in three-dimensional meshworks of fine fibrils (in elastic cartilage) or as a combination of these (in skin and lung) (1). Electron microscopy has provided additional insights into the structure of elastic fibers, which are composed of two morphologically distinguishable components: an amorphous core lacking any apparent regular or repeating structure (2) and a microfibrillar component (3) consisting of fibrils of 12-13 nm in diameter that are located primarily around the periphery of the amorphous core but to some extent are also interspersed within it.
We had originally isolated from chick aorta a novel glycoprotein component associated with the ECM of blood vessels, gp115 (18), later christened EMILIN (19). The major characteristics of this protein were the following. EMILIN was preferentially extracted from tissues using buffers containing guanidine HCl and reducing agents; it formed a fibrillar network in the ECM of aorta; the amino acid composition was characterized by a high content of glutamic acid and arginine (18). Subsequent studies have established that (i) EMILIN is broadly expressed in connective tissues and is particularly abundant in blood vessels, skin, heart, lung, kidney, and cornea, whereas it is undetectable in the serum (20-22); (ii) the protein is synthesized by aortic smooth muscle cells and by tendon fibroblasts, and it is deposited extracellularly as a fine network (18,23); (iii) soon after secretion, EMILIN undergoes intermolecular cross-linking by disulfide bonds, giving rise to high molecular weight aggregates (23); (iv) EMILIN is a component of elastic fibers and is localized mainly at the interface between amorphous elastin and microfibrils (19); (v) finally, and most important for the functional significance of EMILIN, the process of elastin deposition in vitro is perturbed by the addition of anti-EMILIN antibodies to the culture medium (19). Therefore, given the close co-distribution of elastin and EMILIN, the fine localization at the interface between elastin and microfibrils, and the interference with the deposition of elastin in vitro, it is likely that EMILIN also plays a fundamental role in the process of elastogenesis in vivo. To begin addressing the question of the functional role of EMILIN directly, we have sought to clone its gene.
We report here the cDNA sequence and the analysis of the deduced amino acid sequence of human EMILIN. This glycoprotein, a new member of the C1q/TNF superfamily of proteins, is characterized by a gC1q-like C-terminal domain, a short collagenous domain, two leucine zippers, and an extended coiled-coil region that is uniquely shared with another member of this superfamily, multimerin (24). At the N terminus of these two members of the superfamily there is a short region of homology including a partial epidermal growth factor-like motif. In addition, the isolated recombinantly produced EMILIN gC1q-like C-terminal domain is able to support cell adhesion.

EXPERIMENTAL PROCEDURES
Purification of Avian EMILIN and Peptide Sequences-EMILIN was purified as described previously (18). Briefly, aortas and associated blood vessels were excised from 19-day-old chick embryos and dropped into ice-cold 20 mM sodium phosphate, pH 7.4, 150 mM NaCl containing the following protease inhibitors: 25 mM EDTA, 2 mM phenylmethylsulfonyl fluoride, 5 mM N-ethylmaleimide, and 1 mM p-aminobenzamidine hydrochloride. All subsequent procedures were carried out at 4°C. The tissues were homogenized, and the pellet obtained after centrifugation at 12,000 × g for 45 min was re-extracted twice in the same buffer and twice in 0.1 M Tris-HCl, pH 7.5, containing 6 M guanidine HCl and protease inhibitors (guanidine buffer). The last pellet was extracted twice in guanidine buffer containing 25 mM dithioerythritol. The last two supernatants were extensively dialyzed against distilled water and lyophilized. The lyophilized extract was reduced and alkylated, and it was then fractionated by DEAE-cellulose chromatography followed by agarose gel filtration. The EMILIN-containing fractions, as assessed by an enzyme-linked immunosorbent assay and by immunoblotting with a specific monoclonal antibody, 147H11 (21), were pooled, dialyzed against distilled water, and lyophilized. An aliquot of the pooled fractions was analyzed by SDS-polyacrylamide gel electrophoresis to check for the presence of contaminating proteins.
EMILIN was then resolved on a 4-10% SDS gradient gel. The Coomassie-stained band corresponding to EMILIN was excised, cut into small pieces, and washed with 0.2 M NH4HCO3 followed by 0.2 M NH4HCO3/acetonitrile (1:1). This procedure was repeated twice. The gel pieces were lyophilized for 2 h and then rehydrated by adding three portions of 0.2 M NH4HCO3 containing 0.002% Tween 20 at 5-min intervals. The first aliquot added contained 1 mg of trypsin/100-μl gel pieces, and only as much liquid was used as was necessary to restore the original gel volume. Protease digestion was performed overnight at 37°C, and the peptides were extracted twice with 5% trifluoroacetic acid and once with 2.5% trifluoroacetic acid in 50% aqueous acetonitrile. Alternatively, the gel pieces were repeatedly washed with 50 mM Tris buffer, pH 9.0, and 50 mM Tris buffer/acetonitrile (1:1), and cleavage was carried out with a lysine-specific protease from Achromobacter (Wako) at pH 9.0 and 30°C. The peptides were then separated by high performance liquid chromatography on a reversed phase Nucleosil-120 C18 column using a linear gradient of 0-70% aqueous acetonitrile in 20 mM ammonium acetate buffer. Sequence analysis was done on a Procise protein sequencing system (Applied Biosystems) according to the manufacturer's instructions. The amino acid sequences obtained were used to search the Swiss-Prot protein sequence data base (Geneva University Hospital and University of Geneva, Geneva, Switzerland).
Primer Design-Four of the nine chick EMILIN peptides sequenced appeared suitable for the design of degenerate primers. A set of degenerate inosine-containing oligodeoxynucleotides based on peptides 4, 6, 8, and 9 was synthesized on an ABI-381A synthesizer (Applied Biosystems) in both the sense and the antisense orientations (Fig. 1). The primers were purified by native polyacrylamide gel electrophoresis (20% gel) using a standard sequencing apparatus.
PCR-based Cloning Strategy and Sequencing of Chick and Human EMILIN-Total RNA from aortas of 19-day-old chick embryos was isolated using RNA fast (Molecular System, San Diego, CA). The poly(A)+ RNA fraction was purified from total RNA with the use of the Oligotex kit (Qiagen GmbH, Germany). The first-strand cDNA was synthesized starting from 1 μg of poly(A)+ RNA primed with hexanucleotides and reverse-transcribed with 20 units of AMV-RT (Promega Corp., Madison, WI). PCR was performed on a Robocycler Gradient Apparatus (Stratagene, La Jolla, CA) with the degenerate inosine-containing oligonucleotides used as primers in all the possible different combinations and at different annealing temperatures ranging from 48 to 72°C with 2°C steps. The amplification products were cloned into the pGEM-T vector (Promega Corp.) and sequenced by the dideoxynucleotide chain termination method using the modified T7 polymerase (Sequenase, Amersham Pharmacia Biotech).
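As a rough illustration of what "degenerate" means in practice, the following sketch counts how many distinct oligonucleotide species a degenerate primer represents, using the symbol code given in the legend to Fig. 1, and lists the annealing temperatures of the 48-72°C gradient. The example primer is hypothetical and is not one of the primers used in this work; treating inosine as a single species is an assumption of the sketch.

```python
from math import prod

# Symbol code as listed in the Fig. 1 legend (lower-case letters).
# Assumption: inosine ("i") counts as one species because it pairs broadly.
DEGENERACY = {
    "a": 1, "c": 1, "g": 1, "t": 1,
    "y": 2,  # c/t
    "r": 2,  # a/g
    "w": 2,  # t/a
    "s": 2,  # g/c
    "h": 3,  # a/c/t
    "n": 4,  # a/g/c/t
    "i": 1,  # inosine
}


def primer_degeneracy(primer: str) -> int:
    """Return the number of distinct sequences encoded by a degenerate primer."""
    return prod(DEGENERACY[base] for base in primer.lower())


def annealing_temperatures(start: int = 48, stop: int = 72, step: int = 2) -> list:
    """Return the annealing temperatures (in degrees C) of the gradient, inclusive."""
    return list(range(start, stop + 1, step))


if __name__ == "__main__":
    hypothetical_primer = "gcigaytayathcc"  # illustrative only
    print(primer_degeneracy(hypothetical_primer))
    print(annealing_temperatures())
```

Replacing four-fold degenerate positions with inosine keeps this number low, which is presumably why inosine-containing primers were chosen.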
To clone the human EMILIN cDNA, the FASTA computer program (25) located at the EBI server (26) was used to search the GenBank™ data base for expressed sequence tags (ESTs) containing sequences homologous to those of chick EMILIN. Several human and mouse entries showed a significant homology to the 3′ end of the chick EMILIN. Two primers, ESTH-N (5′-ATTATGATCCAGAGACAGGC-3′) and ESTH-R (5′-CCGAGTGCGCCAGCTGCCCC-3′), were designed based on the entry HSAA1823 and used in PCR to obtain a human EMILIN-specific probe. The reaction was performed on a template constituted by total kidney RNA primed with hexanucleotides and reverse-transcribed with 20 units of avian myeloblastosis virus reverse transcriptase. The amplification product was cloned into the pGEM-T vector (EST-H clone) and sequenced to confirm its identity. The EST-H insert (290 bp) was labeled by the random primer method with the multiprime kit (Amersham Pharmacia Biotech) and utilized to screen, by the plaque hybridization method, about 300,000 clones of a human kidney cDNA library in the λgt10 vector (CLONTECH Laboratories Inc., Palo Alto, CA). The longest insert of the five positive plaques identified, K1, was used to rescreen the library. The second screening yielded only one positive clone, K2, constituted by a fusion between a specific EMILIN cDNA and an unrelated cDNA. Three additional rounds of screening of a human aorta cDNA library resulted in the isolation of clones A1, A2, and A3. In this library, too, several clones appeared to be cloning artifacts carrying short EMILIN-specific sequences fused to unrelated cDNAs. The human clones were sequenced using the Big Dye terminator cycle sequencing kit and a model 310 DNA sequencing system (Perkin-Elmer Applied Biosystems). To correct for possible Taq polymerase errors, all sequences were determined from both strands and were repeated on clones obtained from independent PCR products.
All human cDNA sequences were confirmed by an independent sequence analysis of a BAC clone that was used to characterize the EMILIN gene.²
Rapid Amplification of cDNA Ends (RACE)-To determine the sequence extending toward the 3′ end, the RACE method, using the 5′-3′ RACE kit (Roche Molecular Biochemicals), was applied. Reverse transcription of chick aorta poly(A)+ RNA was performed with the use of hexanucleotides, and the product was subjected to PCR using the EMILIN-specific sense primer 5′-GGAGCCGCTCACCATCTTCAGCGGGGCCC-3′ in combination with the anchor-poly(dT) from the kit.
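The basic properties of the two EST-derived primers quoted in the text, ESTH-N and ESTH-R, can be checked with a few lines of code. The sketch below computes length, GC fraction, reverse complement, and a rule-of-thumb melting temperature. The Wallace rule (2°C per A/T, 4°C per G/C) is a swapped-in approximation for short oligonucleotides; the source does not state how, or whether, melting temperatures were estimated.

```python
# Primer sequences as quoted in the text (5' to 3').
PRIMERS = {
    "ESTH-N": "ATTATGATCCAGAGACAGGC",
    "ESTH-R": "CCGAGTGCGCCAGCTGCCCC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def wallace_tm(seq: str) -> int:
    """Rule-of-thumb Tm in degrees C: 2*(A+T) + 4*(G+C). Approximation only."""
    gc = sum(base in "GC" for base in seq)
    return 2 * (len(seq) - gc) + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), round(gc_fraction(seq), 2),
              wallace_tm(seq), reverse_complement(seq))
```

The very different GC content of the two primers is worth noting when choosing a common annealing temperature.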
Production of Recombinant Prokaryotic gC1q-like Domain of Human EMILIN-The EMILIN gC1q-like domain was amplified by PCR from clone K1, ligated in-frame in the 6-His-tagged pQE-30 expression vector (Qiagen GmbH), and grown in M15 cells. To check for errors generated by PCR, all the cloned fragments were sequenced in both directions. M15 cells were centrifuged at 4000 × g for 20 min, and the cell pellet was resuspended in sonication buffer (50 mM sodium phosphate, pH 8.0, 0.3 M NaCl) at 2-5 volumes/g of wet weight. The sample was frozen in a dry ice/ethanol bath, thawed in cold water, and sonicated on ice (1-min bursts/1-min cooling/2-300 watts), and cell breakage was monitored by measuring the release of nucleic acids at A260 nm. The cell lysate was centrifuged at 10,000 × g for 20 min, the supernatant was collected, and purification of the EMILIN C1q-like domain was performed by affinity chromatography on nickel nitrilotriacetic acid resin (Qiagen GmbH) under native conditions (sonication buffer). The recombinant protein was eluted from the affinity column in sonication buffer, pH 6.0, containing 10% glycerol and 0.2 M imidazole. After dialysis against cold phosphate-buffered saline, the C1q-like domain was used for CD spectra analysis and cell adhesion assays.
Circular Dichroism Spectroscopy-The purified polyhistidine EMILIN gC1q-like peptide was used for CD spectroscopy analysis. A Jasco J-600 CD/ORD spectrophotometer interfaced to an Olidata computer for data collection was used for all the measurements. Calibration of the instrument was performed with D(+)-10-camphorsulfonic acid at 290 nm. Standard conditions were 25 mM Na3PO4, 150 mM NaCl, pH 6.0, 10°C, using a 0.2-cm path-length cuvette. The temperature was controlled by a water bath. UV circular dichroism spectra are presented as millidegrees of ellipticity. The reported results are the smoothed average over 10 measurements. To calculate the secondary structural content, data were transformed in terms of mean residue molecular ellipticity (θ) (deg × cm² × dmol⁻¹), based on a mean residue weight of 104.3. Then, the spectra were analyzed using the Menendez-Arias program (27).
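The transformation from observed ellipticity in millidegrees to mean residue ellipticity is conventionally written as [θ] = θ_obs(mdeg) × MRW / (10 × l(cm) × c(mg/ml)). The sketch below applies this standard relation with the path length (0.2 cm) and mean residue weight (104.3) stated above. The protein concentration is not given in the text, so the value used in the example is purely hypothetical, and the function name is an illustrative choice.

```python
def mean_residue_ellipticity(theta_mdeg: float,
                             conc_mg_per_ml: float,
                             path_cm: float = 0.2,
                             mrw: float = 104.3) -> float:
    """Convert observed ellipticity (millidegrees) to mean residue
    ellipticity in deg x cm^2 x dmol^-1.

    Standard relation: [theta] = theta_mdeg * MRW / (10 * path_cm * conc_mg_per_ml)
    """
    if conc_mg_per_ml <= 0 or path_cm <= 0:
        raise ValueError("concentration and path length must be positive")
    return theta_mdeg * mrw / (10.0 * path_cm * conc_mg_per_ml)


if __name__ == "__main__":
    # Hypothetical reading: -10 mdeg at 217 nm, 0.1 mg/ml protein (assumed).
    print(mean_residue_ellipticity(-10.0, conc_mg_per_ml=0.1))
```

Because the result scales inversely with concentration, an accurate protein determination matters more for the estimated β-sheet content than any smoothing of the spectrum.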
Cell Adhesion-The capability of isolated recombinant human EMILIN gC1q-like domain to support cell attachment was evaluated using several cell lines and the centrifugal assay for fluorescence-based cell adhesion (CAFCA) (28,29). Briefly, specifically devised six-well strips of flexible polyvinyl chloride (CAFCA miniplates; TECAN Polyfiltronics, Inc., Boston, MA) covered with double-sided tape (bottom miniplates) were coated overnight with recombinant EMILIN gC1q-like fragment and subsequently incubated with 1% heat-denatured bovine serum albumin for 2-4 h at room temperature to block uncovered areas of the plastic. Two human smooth muscle cell lines (SK-UT-1 and SK-LMS-1 leiomyosarcomas) and a human fibroblastic cell line (HT-1080 fibrosarcoma) were fluorescently labeled by incubation with the vital fluorochrome calcein AM (Molecular Probes Europe BV, Leiden, The Netherlands), rinsed extensively in Ca2+- and Mg2+-free phosphate-buffered saline, and then aliquoted into the bottom CAFCA miniplates at a density of 1-3 × 10⁴ cells/well. Cell adhesion to substrates was assayed in phosphate-buffered saline containing 0.1% bovine serum albumin, 1 mM MgCl2 and CaCl2, and 2% India ink as a fluorescence quencher. CAFCA miniplates were placed on specifically devised hard plastic holders (TECAN Polyfiltronics, Inc.) and centrifuged at 142 × g for 5 min at 37°C to synchronize contact of the cells with the substrate. The miniplates were incubated for 30 min at 37°C and then mounted together with a similar CAFCA miniplate lacking double-sided tape (top miniplate) so as to create communicating chambers to be reverse-centrifuged. The relative number of cells bound to the substrate (i.e. remaining bound to the bottom miniplates) and unbound cells (in wells of the top miniplates) was estimated by top/bottom fluorescence detection in a computer-interfaced SPECTRAFluor microplate fluorometer (TECAN Polyfiltronics, Inc.). Fluorescence values were analyzed by custom CAFCA software (TECAN Polyfiltronics, Inc.) to determine the percentage of adherent cells in the total cell population analyzed according to a previously published formula (28,29).
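The published CAFCA formula (28,29) is not reproduced in this text. A minimal sketch of the underlying idea, assuming that the percentage of adherent cells is simply the bottom-plate fluorescence divided by the sum of bottom and top fluorescence, could look as follows. The optional background subtraction and all names are assumptions of this sketch and not the actual CAFCA software.

```python
def percent_adherent(bottom_signal: float,
                     top_signal: float,
                     background: float = 0.0) -> float:
    """Estimate the percentage of adherent cells in one well.

    bottom_signal: fluorescence of cells still bound to the bottom miniplate
    top_signal:    fluorescence of cells displaced into the top miniplate
    background:    blank reading subtracted from both signals (assumed step)
    """
    bound = max(bottom_signal - background, 0.0)
    unbound = max(top_signal - background, 0.0)
    total = bound + unbound
    if total == 0:
        raise ValueError("no cell-derived fluorescence detected")
    return 100.0 * bound / total


if __name__ == "__main__":
    # Hypothetical readings in arbitrary fluorescence units.
    print(percent_adherent(300.0, 100.0))
```

Expressing adhesion as a fraction of the total signal in each well makes the result independent of the exact number of cells seeded, which varied between 1 and 3 × 10⁴ per well.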

RESULTS
Purification and Peptide Sequences of Chick EMILIN-A chick aorta EMILIN preparation obtained following differential extraction procedures, DEAE ion-exchange, and size exclusion chromatography was analyzed by SDS-polyacrylamide gel electrophoresis under reducing conditions and found to be composed of a single major band with an estimated Mr of about 115,000. This finding was in accord with previous results (18), and the few low Mr contaminants represented less than 10% of the stained material as judged by quantitative scanning of the gel (data not shown). The protein was cleaved by trypsin and by a lysine-specific protease, the peptides were separated by reverse phase chromatography on a Nucleosil-120 C18 column, and several components were selected for sequencing (Fig. 1).
Cloning Strategy-Based on the peptide sequences, several degenerate oligonucleotides in the sense and antisense orientations were synthesized and used to amplify fragments from poly(A)+ RNA of chick aortas. Several amplification products of different lengths were obtained using the various combinations of oligonucleotides. The sequence of the cloned products revealed that all but one contained stop codons in all the possible frames. In contrast, the 384-bp-long amplification product obtained with the use of the primers corresponding to peptides 9 (sense orientation) and 6 (antisense orientation) encompassed an open reading frame (clone D1) that contained peptide 1, thus suggesting that the amplified sequence corresponded to that of genuine EMILIN protein (Fig. 1).
After the initial successful search of GenBank™ for ESTs containing sequences homologous to those of clone D1 of chick EMILIN, five partly overlapping cDNA clones were sequentially isolated from λgt10 human kidney and aorta libraries using a human-specific PCR-derived probe based on the EST entries and probes derived from subclones at the 5′ end of each successively isolated clone (Fig. 2). Remarkably, walking toward the 5′ end of the human cDNA was hampered by the surprisingly high number of fused clones isolated from both libraries.
FIG. 1. Sequences of chick EMILIN peptides. Peptides were generated as described under "Experimental Procedures." Amino acid sequences are indicated in 1-letter code, with X representing unidentified residues; the sequences of degenerate and inosine-containing oligonucleotides derived from peptides 4, 6, 8, and 9 are in small letters and underlined; y, c/t; h, a/c/t; w, t/a; n, a/g/c/t; r, a/g; s, g/c; i, inosine.
Nucleotide and Predicted Amino Acid Sequences-The partial chick cDNA corresponds to an open reading frame of 128 amino acids (data not shown). The coding sequence and the deduced amino acid sequence of the human EMILIN cDNA (GenBank™ accession number AF088916) are shown in Fig. 3. Several partially overlapping EST entries (HSAA23367, HSAA1823, HS1269888) showed an almost perfect match with the 3′ end of the human EMILIN transcript, including the last 465 bp of the coding sequence and the entire 3′-untranslated region. However, numerous substitutions and gaps have also been detected: C2772-GAP; C2823G; C2826G; G2827T; C2857A; C2877T; C2883-GAP; G2895-GAP. Independent sequencing of BAC clones confirmed the present sequence.² The open reading frame of the human EMILIN begins with a Met codon whose surrounding sequences fit into the eukaryotic translation start sites (30) and is preceded by an in-frame stop codon (data not shown). The human cDNA spans about 3400 bp and has an open reading frame of 1016 amino acids. The predictions with the highest probabilities for the initial residue of the mature protein are between position 2 (Ser, Y value of 0.508) and 3 (Tyr, Y value of 0.580) of the present sequence. Although the best prediction for the cleavage site of the signal peptide is at position 3, alignment between the N-terminal peptide sequence of the chick mature EMILIN and the deduced N terminus of the human EMILIN (see below) indicates position 1 (Ala) as the most likely candidate for the initiation of the mature protein.
Therefore, residues −21/−1 correspond most likely to a signal peptide, because the sequence agrees very well with the classical consensus sequence (31) and ends with a consensus signal cleavage site (32). Thus, the calculated molecular mass for the mature protein is 104.5 kDa. The human EMILIN contains 7 potential N-glycosylation sites and 20 cysteines with a number of them clustered as doublets, separated by none, one, or two residues that could be involved in intramolecular disulfide bonding. By applying the 3′-RACE, 383 additional bp corresponding to the 3′-untranslated sequence could be obtained. The observation that the most 3′ end EST entry terminates 37 bp before the end of the present RACE product, together with a potential polyadenylation sequence 16 bp before the end of our sequence, strongly suggests that our RACE product includes the entire 3′-untranslated sequence.
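Potential N-glycosylation sites and cysteine residues of the kind counted above are usually located by scanning the deduced sequence for the consensus sequon Asn-Xaa-Ser/Thr, where Xaa is any residue except Pro. A minimal sketch of such a scan is given below. The test sequence is a made-up peptide and not part of EMILIN, and the source does not state which tool was actually used.

```python
import re

# Asn followed by any residue except Pro, then Ser or Thr.
# The lookahead allows overlapping sequons to be reported.
_SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(protein: str) -> list:
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != P)."""
    return [match.start() + 1 for match in _SEQUON.finditer(protein.upper())]


def count_cysteines(protein: str) -> int:
    """Return the number of cysteine residues in the sequence."""
    return protein.upper().count("C")


if __name__ == "__main__":
    toy_peptide = "MNASNPTQNGTCC"  # hypothetical sequence for illustration
    print(find_sequons(toy_peptide))
    print(count_cysteines(toy_peptide))
```

A sequon is only a potential site; whether it is actually glycosylated has to be established experimentally.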
Comparison between Chick and Human Sequences-The alignment of the 128-residue-long stretch of chick EMILIN and the C-terminal region of human EMILIN showed a high degree of homology (Fig. 4, panel C); the overall degree of amino acid sequence identity and of identities plus similarities in this domain was calculated to be about 76 and 88%, respectively. Furthermore, a partial sequence of the N terminus of chick EMILIN compares very well (9 residues are identical, and 2 are similar) with the deduced N terminus of the human EMILIN (Fig. 4, panel A). The high level of identities and similarities between chick and human EMILIN at both C and N termini indicates that the two sequences identify the same protein in these species.
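Percentages of identity and of identity plus similarity such as those quoted above are derived from a pairwise alignment by classifying each aligned column. The sketch below shows one way to do this. The grouping of "similar" residues is a hypothetical conservative-substitution scheme chosen for illustration, since the scoring scheme used for Fig. 4 is not stated in the text, and the aligned fragments are invented.

```python
# Hypothetical groups of residues regarded as "similar" (assumption).
SIMILAR_GROUPS = ["ILVM", "FYW", "KRH", "DE", "ST", "NQ", "AG"]
_GROUP_OF = {res: idx for idx, group in enumerate(SIMILAR_GROUPS) for res in group}


def identity_and_similarity(aligned_a: str, aligned_b: str):
    """Return (% identity, % identity + similarity) over gap-free columns.

    Both inputs must be rows of one alignment, with '-' marking gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # columns with a gap are not scored
        compared += 1
        if a == b:
            identical += 1
        elif a in _GROUP_OF and _GROUP_OF.get(b) == _GROUP_OF[a]:
            similar += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return (100.0 * identical / compared,
            100.0 * (identical + similar) / compared)


if __name__ == "__main__":
    # Invented aligned fragments, not taken from Fig. 4.
    print(identity_and_similarity("ACDEKLST", "ACEEKIT-"))
```

Both the choice of similarity groups and the treatment of gapped columns shift the reported percentages, so values from different papers are comparable only when these conventions match.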
Domain Structure-The predicted domain structure of human EMILIN is composed, starting from the C terminus, of a globular domain (gC1q-like), an uninterrupted stretch of 17 Gly-Xaa-Yaa triplets, indicating that EMILIN possesses a straight collagen stalk, and a 641-amino acid-long sequence in which there are several heptad repeats separated by unrelated sequences with the potential for forming coiled-coil α-helices (see below). A 91-residue-long sequence that includes two sequences corresponding to structures referred to as "leucine zippers," which are typical of several gene regulatory proteins (33,34), is located between the coiled-coil region and the collagenous stretch. This finding is rather unusual, especially for an ECM protein, as there are only a few precedents in the literature for the presence of leucine zippers in the extranuclear compartments: the Drosophila ECM protein pollux (35) and the cytoplasmic protein dystrophin, whose leucine zipper has been reported to interact with troponin (36). Because the leucine zipper pattern is far from being specific, it is not clear at the moment what might be the significance of its presence in the context of EMILIN. Finally, a search of the NCBI data bank indicated the existence of a region of homology between the N-terminal end of EMILIN and the platelet and endothelial specific protein, multimerin (24), spanning amino acids 33 to 108 (Fig. 4, panel B). This region contains several identical residues, including three conserved cysteines and a partial epidermal growth factor-like consensus sequence.
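An uninterrupted collagenous stretch such as the 17 Gly-Xaa-Yaa triplets described above can be located in a protein sequence with a simple pattern search. The following sketch reports the start position and triplet count of the longest run of consecutive Gly-Xaa-Yaa triplets. It is an illustration on a made-up peptide, not the procedure used for Fig. 3.

```python
import re

# Lookahead so that runs starting at every glycine are examined,
# including runs in overlapping reading frames.
_GXY_RUN = re.compile(r"(?=((?:G[A-Z]{2})+))")


def longest_gxy_run(protein: str):
    """Return (1-based start, number of triplets) of the longest
    uninterrupted Gly-Xaa-Yaa run, or (0, 0) if there is none."""
    best_start, best_triplets = 0, 0
    for match in _GXY_RUN.finditer(protein.upper()):
        triplets = len(match.group(1)) // 3
        if triplets > best_triplets:
            best_start, best_triplets = match.start() + 1, triplets
    return best_start, best_triplets


if __name__ == "__main__":
    toy_peptide = "AAGPPGAPGKDGEAA"  # hypothetical sequence
    print(longest_gxy_run(toy_peptide))
```

A minimum run length would normally be imposed before calling a stretch collagenous, because isolated Gly-Xaa-Yaa triplets occur by chance in most proteins.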
Coiled-coil Prediction-The 3-4-3-4 spacing of hydrophobic residues predicts that the region of human EMILIN spanning residues 152 to 792 (Figs. 2 and 3) will form an α-helical coiled-coil. When analyzed by different algorithms (37,38), the probability for coiled-coil formation differed among the various heptad sequences. Using the PairCoil program (38), the highest probability score (around 1.0) was located at the level of the first heptad repeat (Fig. 5), and similar results were obtained when applying the Multicoil program (data not shown); furthermore, at least two other heptad repeats (positions 497-539 and 615-684) displayed a high probability of coiled-coil structures, and a fourth one had intermediate values.
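The 3-4-3-4 spacing mentioned above corresponds to hydrophobic residues occupying the a and d positions of a seven-residue (abcdefg) repeat. The sketch below is a deliberately crude illustration of that idea: for each of the seven possible registers it computes the fraction of a and d positions occupied by hydrophobic residues. It is not the PairCoil or Multicoil algorithm, the hydrophobic residue set is an assumption, and the test peptide is an idealized made-up repeat.

```python
HYDROPHOBIC = set("AILMFVWY")  # assumed definition of "hydrophobic"


def ad_hydrophobic_fraction(seq: str, register: int) -> float:
    """Fraction of a/d heptad positions that are hydrophobic when
    position 'a' falls on indices congruent to `register` modulo 7."""
    core = [res for i, res in enumerate(seq.upper())
            if (i - register) % 7 in (0, 3)]  # 0 = a position, 3 = d position
    if not core:
        return 0.0
    return sum(res in HYDROPHOBIC for res in core) / len(core)


def best_register(seq: str):
    """Return (register, fraction) for the heptad frame with the most
    hydrophobic a/d positions; ties go to the lowest register."""
    scores = [(ad_hydrophobic_fraction(seq, r), r) for r in range(7)]
    fraction, register = max(scores, key=lambda item: item[0])
    return register, fraction


if __name__ == "__main__":
    ideal_repeat = "LKKLEEK" * 4  # hypothetical idealized heptads
    print(best_register(ideal_repeat))
```

Real predictors additionally weigh residue pair correlations and the polar b, c, e, f, and g positions, which is why their scores differ among heptad stretches that look similar by this simple measure.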
Alignment of Sequences at the C-terminal Domain-The C terminus of EMILIN exhibits a striking homology to a gC1q-like domain of a number of proteins including the A, B, and C chains of human and mouse complement C1q protein (39,40), the α1 and α2 chains of type VIII (41,42), and the α1 chain of type X (43) collagens, precerebellin (44), multimerin (24), ACRP-30/AdipoQ (45,46), the HP-27 protein from Siberian chipmunks (47), and a sunfish saccular collagen (48). The length of this domain ranges between 131 (gC1q-C) and 151 (EMILIN) residues, and the protein sequence comparison of the C1q-like domain of EMILIN with the known similar domains indicates a high level of conservation of several hydrophobic and uncharged residues (Fig. 6). To obtain the best sequence alignment of EMILIN with the other members of the superfamily, it was necessary to allow for the insertion of a 10-residue sequence that is unique for EMILIN and is missing in all the other members. Nineteen residues are conserved in all 12 sequences analyzed, and 44 are conserved in at least half of the members. Initially, Fourier transform infrared spectroscopy and structure prediction (49) of 15 gC1q-like sequences suggested a β-sheet secondary structure for this domain. This prediction has been recently confirmed by the analysis of the ACRP-30/AdipoQ crystal structure (50).
Adhesion-promoting Activity of the Recombinant C1q-Domain of Human EMILIN-The gC1q domain of C1q complement component has been shown to promote cell attachment of mononuclear cells (51), platelets (52), and endothelial cells (53). Therefore, the potential cell adhesive capacity of EMILIN gC1q-like domain in comparison with that of fibronectin, a prototype adhesive molecule, was assessed using the smooth muscle cell lines SK-UT-1 and SK-LMS-1 and the fibroblast cell line HT1080. The EMILIN gC1q-like domain was cloned in the pQE30 expression vector, and the protein was purified under native conditions and coated at different concentrations. The native state of the EMILIN gC1q-like domain was examined by CD spectra analysis. The peptide (Fig. 7) exhibited a classic β-sheet CD spectrum with a minimum at 217 nm. The evaluation of the content in β-sheet conformation for the peptide using the Menendez-Arias program (27) gives a value higher than 70%, therefore confirming that the EMILIN gC1q-like domain also fits very well with the structure determined for ACRP-30/AdipoQ (50). Cell adhesion to fibronectin was high and comparable for all cell lines with a cell binding of about 90%; the EMILIN gC1q-like domain promoted a high level of adhesion of SK-UT-1 cells, whereas HT1080 were negative, and SK-LMS-1 bound with a low percentage (Fig. 8A). However, although adhesion to fibronectin induced a strong cell flattening, cell adhesion to the EMILIN gC1q-like domain was not followed by a significant level of cell spreading (Fig. 8B). The SK-UT-1 cells were for the most part round with small blebs or short projections with only a very low percentage of cells displaying a flat morphology. These results suggest either that cell adhesion to the EMILIN gC1q-like domain uses different mechanisms/receptors than adhesion to fibronectin and is not followed by cytoskeletal rearrangements or, less likely, that the standard cell adhesion conditions of time, temperature, and medium composition used to assay adhesion to fibronectin are not appropriate to measure cell adhesion to the C1q-like domain of EMILIN.
FIG. 3. Nucleotide and predicted amino acid sequence of human EMILIN. First line, nucleotide sequence; second line, deduced amino acid sequence. Plain and bold numbers on the right indicate nucleotides and amino acids, respectively. Amino acids are numbered starting at the predicted beginning of the putative mature sequence. The presumed N terminus of the mature protein is marked by a closed arrow, and the UAG stop codon is indicated by a star. The polyadenylation signal is bold and underlined. Potential N-attachment sites for oligosaccharides are boxed, and cysteine residues are circled. Several structural features are highlighted; the partial epidermal growth factor-like motif is double-underlined; the coiled-coil sequences are underlined by a broken line with the residues in the a-position marked by a dot; the leucines in the d-position of potential leucine zippers are indicated in reverse types; the glycines (G) of the collagenous domain are shown within triangles; the C1q-like C-terminal domain is boxed.

DISCUSSION
The results provided in this report concern a new human cDNA whose major structural elements were a gC1q-like C-terminal domain, a short uninterrupted collagenous domain, and an extended domain containing sequences with the potential of forming amphipathic coiled-coil α-helices. The determination of the primary structure through cDNA cloning and the assignment of this novel sequence to human EMILIN was made possible by the very close identity at the C terminus between the chick EMILIN sequence and that of the corresponding human cDNA. In fact, the finding that the deduced sequence of clone D1 of chick EMILIN contained peptide 1 of the tissue-purified EMILIN confirmed that the sequence amplified from chick aorta mRNA corresponded to that of the genuine EMILIN described by us (18-23) as an elastin-associated protein with undefined function. Thus, the sequence similarity at the gC1q-like domain, the near identity between the N-terminal residues of the mature chick protein, and the deduced amino acids of the presumed mature human protein support the conclusion that we are dealing with the same ECM constituent in the two species.
Evidence that chick EMILIN stained strongly with PAS (18) and that, in biosynthetic studies, treatment with tunicamycin reduced the apparent molecular mass by about 20-25 kDa (23) indicated that chick EMILIN was highly glycosylated. The present identification of seven potential N-glycosylation sites in the human EMILIN sequence is in accord with the previous experimental data using the chick system (18,23). Similarly, the presence of 20 cysteine residues with a high potential for intermolecular S-S bonding is also in accord with the finding that newly synthesized and secreted chick EMILIN migrated as a monomer in SDS gels under reducing conditions but was present as a large aggregate that did not enter the gel in the absence of reducing agents (23).
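As an illustration of how potential N-glycosylation sites and cysteine residues can be tallied from a deduced amino acid sequence, the following minimal Python sketch scans for the standard Asn-Xaa-Ser/Thr sequon (Xaa not Pro). This is an assumption about the rule applied; the text above does not state how the seven sites were identified. The function names and the toy peptide are hypothetical, and the peptide is not the EMILIN sequence.

```python
import re

# Standard N-glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# A zero-width lookahead is used so that overlapping sequons are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(seq: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq.upper())]


def count_cysteines(seq: str) -> int:
    """Count cysteine residues in a one-letter-code protein sequence."""
    return seq.upper().count("C")


# Hypothetical toy peptide for illustration only (not EMILIN).
toy = "MACNGTLLNPSCKNNTSAC"
print(find_sequons(toy))     # [4, 14, 15]; N-P-S at position 9 is excluded
print(count_cysteines(toy))  # 3
```

A sequon match indicates only a potential site; whether it is actually glycosylated has to be established experimentally, as with the tunicamycin data cited above.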
The domain organization of EMILIN is unique. It shares features, such as the gC1q-like domain and a collagenous domain, with several other members of the C1q/TNF superfamily (Fig. 9), i.e. C1q (A, B, C), collagens VIII and X, saccular collagen, ACRP-30/AdipoQ, and HP-27, but EMILIN also displays an extended, discontinuous, and potentially coiled-coil region that is absent from all the other members of the C1q/TNF superfamily except multimerin, a large soluble glycoprotein found in platelet α-granules and endothelial Weibel-Palade bodies. Multimerin forms disulfide-linked homomultimers of variable sizes (55) and interacts with factor V, which is stored complexed with multimerin in the α-granules (56). There is experimental evidence that several members of the C1q/TNF superfamily trimerize to form either heterotrimeric collagen triple helices, expressed as soluble plasma proteins or type II membrane-bound molecules, as in C1q (A, B, C) (57,58), or homotrimers, as in collagen X (59) and ACRP-30/AdipoQ (46). EMILIN is also likely to form similar trimers; it possesses a gC1q-like domain, which is highly homologous to those of the other members of the superfamily, and an uninterrupted collagenous domain, which can form a collagen-like stalk region. Curiously, among the members of the family, only EMILIN, ACRP-30/AdipoQ, and Hib27 possess an uninterrupted collagenous domain. The EMILIN gC1q-like domain, when compared with the other gC1q-like domains, has a much longer F β-strand because of a 10-residue insertion. However, the residues conserved throughout both the C1q and TNF families of proteins and important in the packing of the hydrophobic core of the individual monomer (50) are present in the EMILIN gC1q-like domain in the same relative positions, and this appears sufficient to predict a similar trimeric and spatial organization also for the domain of EMILIN.
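The distinction between an uninterrupted and an interrupted collagenous domain can be made concrete with a short Python sketch that reports runs of consecutive Gly-Xaa-Yaa triplets. The function name, the `min_triplets` threshold, and the toy sequences are hypothetical choices for illustration, not the procedure used in this study. The scan is a simple heuristic: it reports the leftmost qualifying reading frame and does not search for the longest possible run.

```python
import re


def collagenous_runs(seq: str, min_triplets: int = 3) -> list[tuple[int, int, int]]:
    """Find runs of consecutive Gly-Xaa-Yaa triplets.

    Returns (start, end, n_triplets) for each run, with 1-based inclusive
    coordinates. A residue that breaks the Gly-in-every-third-position
    register ends the current run.
    """
    pattern = re.compile(r"(?:G[A-Z]{2}){%d,}" % min_triplets)
    return [
        (m.start() + 1, m.end(), (m.end() - m.start()) // 3)
        for m in pattern.finditer(seq.upper())
    ]


# Hypothetical toy sequences (not taken from any real protein).
uninterrupted = "MKAGPPGAPGKPGEQLL"  # four consecutive G-X-Y triplets
interrupted = "GPPGAPAGKPGEQ"        # an extra Ala breaks the register

print(collagenous_runs(uninterrupted))                # [(4, 15, 4)]
print(collagenous_runs(interrupted))                  # []
print(collagenous_runs(interrupted, min_triplets=2))  # [(1, 6, 2), (8, 13, 2)]
```

With the default threshold, the interrupted toy sequence yields no run, whereas lowering the threshold shows it as two short segments separated by the imperfection.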
The potential structural homology between the TNF family of growth factors and the gC1q-like domains of the C1q family of proteins suggested that these diverse members might derive from ancestral elements with close functional activity (50). Likely targets for the proteins containing C1q/TNF domains are cell surface receptors; these are well studied in TNF, but initial data are also available for the C1q complement component (51-53). In fact, several cell types are endowed with the capability to attach to the C1q complement component via cell surface binding sites; two types of structures have been described, a binding protein that recognizes the collagenous domain (60) and another component that binds to the gC1q domain (51). However, more recently the actual nature of this second type of binding molecule has been disputed (61), and further studies are necessary. The finding that the EMILIN gC1q-like domain displayed cell pro-adhesive capacity for some smooth muscle cells but seemed to be much less reactive for fibroblastic cells is consistent with the above evidence and suggests that cell recognition of this domain might be exerted via specific cell surface receptors. The adhesion was high for SK-UT-1 cells but, within the time frame of the cell adhesion assay, was not followed by consistent spreading. Thus, it is possible that "receptors" distinct from classical integrins, such as those that recognize fibronectin, are involved here. Neither the physiological significance of the observed adhesion nor whether it plays a primary or an auxiliary role is yet clear. Close contacts between amorphous elastin and smooth muscle cells in the aorta of 16-day embryos have been reported (62), and the ultrastructural localization of EMILIN (19) does not exclude that the interaction between the elastin amorphous core and the cells could also take place via an EMILIN intermediate.
As EMILIN was detected in early stages of aorta development, in association with a network of thin fibrils likely representing maturing microfibrils (19), EMILIN deposition can be considered an early event in elastogenesis, and this conclusion is reinforced by the observation that the process of elastic fiber formation in vitro was greatly affected by the addition of anti-EMILIN antibodies (19). Whether the process of elastin deposition and elastic fiber formation is regulated through cell adhesion via the EMILIN gC1q-like domain remains to be seen.
… Fig. 2. Bars within the collagenous domains (COL) indicate short interruptions or imperfections in the Gly-Xaa-Yaa sequence. The order in which the different members are depicted highlights that EMILIN, in addition to the C1q-like domain common to all the family members, shares a short collagenous domain with only a few of the members and the coiled-coil-containing region only with multimerin. EGF, epidermal growth factor.
EMILIN, like multimerin (55), is heavily disulfide-linked and thus can be found as large aggregates in the culture medium of aorta smooth muscle cells (23). The ability to form coiled-coil α-helices could further amplify its potential to associate into even larger aggregates. In fact, one of the heptad repeat sequences has a probability of forming α-helices near 1.0, and in two other regions the probability is above 0.6. Although formal proof that these heptad repeats can form trimers is still lacking, the chances are high given that the presumed trimerization process can initiate from the C1q-like domain at the C terminus, proceeding then through the collagenous domain next to it as in collagen X (59) and ACRP-30/AdipoQ (46). Further studies are required to define how EMILIN subunits are assembled into the large disulfide-linked multimers, i.e. whether the EMILIN gC1q-like domain is a likely site for initial interchain association and whether the heptad repeats associate only intramolecularly into trimers or can also associate intermolecularly, i.e. with other EMILIN trimers. To investigate these possibilities, the preparation of full-length, truncated, and point-mutated EMILIN recombinant molecules in eukaryotic cells is in progress.
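The heptad periodicity underlying a potential leucine zipper can be illustrated with a simple motif scan for four leucines spaced seven residues apart (L-x(6)-L-x(6)-L-x(6)-L). This minimal Python sketch is not the coiled-coil probability calculation referred to above, whose method is not described in this section; the function name and toy sequence are hypothetical.

```python
import re

# Four leucines at seven-residue spacing: L-x(6)-L-x(6)-L-x(6)-L.
# The lookahead makes the scan report overlapping candidates as well.
ZIPPER = re.compile(r"(?=L.{6}L.{6}L.{6}L)")


def leucine_zipper_starts(seq: str) -> list[int]:
    """Return 1-based positions of the first Leu of each candidate zipper."""
    return [m.start() + 1 for m in ZIPPER.finditer(seq.upper())]


# Hypothetical toy sequence: three idealized heptads plus a closing Leu.
toy = "MK" + "LEEKAKQ" * 3 + "LKK"
print(leucine_zipper_starts(toy))                 # [3]
print(leucine_zipper_starts("MKLEEKAKQLEEKAKQ"))  # [] (only two spaced leucines)
```

A pattern match of this kind flags candidates only; it says nothing about helical propensity or oligomeric state, which is why the trimerization of the heptad repeats still requires the formal proof mentioned above.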