The Collagens of Hydra Provide Insight into the Evolution of Metazoan Extracellular Matrices*

A collagen-based extracellular matrix is one defining feature of all Metazoa. The thick sheet-like extracellular matrix (mesoglea) of the diploblast, hydra, has characteristics of both a basement membrane and an interstitial matrix. Several genes associated with mesoglea have been cloned including a basement membrane and fibrillar collagen and an A and B chain of laminin. Here we report the characterization of a further three fibrillar collagen genes (Hcol2, Hcol3, and Hcol5) and the partial sequence of a collagen gene with a unique structural organization consisting of multiple von Willebrand factor A domains interspersed with interrupted collagenous triple helices (Hcol6) from Hydra vulgaris. Hcol2 and -5 have major collagenous domains of classical length (∼1020 amino acid residues), whereas the equivalent domain in Hcol3 is shorter (969 residues). The N-propeptide of Hcol2 contains a whey acid protein four-cysteine repeat (WAP) domain, and the equivalent domain of Hcol3 contains two WAP and two von Willebrand factor A domains. Phylogenetic analyses reveal that the hydra fibrillar collagen genes form a distinct clade that appears related to the protostome/deuterostome A clade of fibrillar collagens. Data base searches reveal Hcol2, -5, and -6 are highly conserved in Hydra magnipapillata, which also provided preliminary evidence for the expression of a B-clade fibrillar collagen. All four of the H. vulgaris collagens are expressed specifically by the ectoderm. The expression pattern for Hcol2 is similar to that previously reported for Hcol1 (Deutzmann, R., Fowler, S., Zhang, X., Boone, K., Dexter, S., Boot-Handford, R. P., Rachel, R., and Sarras, M. P., Jr. (2000) Development 127, 4669-4680) but distinct from the pattern shared by Hcol3 and Hcol5.
The characterization of multiple collagen genes in relatively simple diploblastic organisms provides new insights into the molecular evolution of collagens and the origins of the collagen-based extracellular matrix found throughout the multicellular animal kingdom.

Collagen genes that are associated with basement membranes and interstitial matrices in vertebrates (type IV and fibrillar collagens) are found throughout the metazoan kingdom (1,2). However, the extracellular matrices (ECM) with which these collagen genes are associated, namely basement membranes and interstitial matrices, only become apparent as distinct entities in triploblasts, three cell-layered organisms (3). The simplest metazoan forms of life (Parazoa such as the sponges) exhibit both fibrillar and basement membrane collagens, and some sponges exhibit thin fibrils and evidence of a basement membrane-like structure (4,5). Diploblasts, such as the cnidarian, hydra, are intermediate between the simplest metazoans and the triploblasts (protostomes and deuterostomes) that constitute the majority of the multicellular animal kingdom. The thick sheet-like ECM of hydra (mesoglea) appears to be a "composite" matrix having characteristics of both a basement membrane and an interstitial matrix (6). The mesoglea may represent an intermediate form of matrix bridging the gap between the relatively unorganized matrices secreted by sponges and the large variety of highly structured matrices found in triploblasts, i.e. basement membranes and interstitial matrices containing large, well developed fibrils.
Previously, we have cloned and characterized several matrix genes from hydra associated with ECM including a type IV collagen, Hcol-IV, hereafter referred to as Hcol4 (7), a fibrillar collagen Hcol-I, hereafter referred to as Hcol1 (8), and laminin A and B chains (9,10). Immunohistochemical and ultrastructural analyses have indicated that the mesoglea is organized as two sub-epithelial thin layers of basal lamina sandwiching a central interstitial zone (11). Expression studies have revealed that the collagen genes (Hcol1 and Hcol4) appear to be expressed exclusively by ectoderm (7,8), whereas laminin genes are expressed by endoderm (9,10). When decapitated, hydra is capable of regenerating the missing head and tentacles. The regeneration involves cell migration to cover the wound and de novo ECM synthesis at the regenerating tip (12). The ECM re-synthesis and assembly at the regenerating tip provides an attractive and simple multicellular model for studies of ECM structure and function in vivo in the laboratory setting.
Here we report the characterization of a further three hydra fibrillar collagen genes (Hcol2, Hcol3, and Hcol5). In addition, we report the partial characterization of a novel hydra collagen gene (Hcol6) with a unique organization consisting of vWF A domains interspersed with interrupted triple-helical domains. We demonstrate by phylogenetic analyses that the hydra fibrillar collagen genes form a distinct clade that appears related to the protostome/deuterostome A clade of fibrillar collagens. We demonstrate by searching EST databases that most of these genes are highly conserved in the related Hydra magnipapillata and provide the first evidence for the existence of B-clade fibrillar collagens in diploblasts. Each collagen gene is expressed in highly specific developmentally regulated patterns solely within the ectoderm. These four novel Hydra vulgaris collagen genes provide new insights into the molecular evolution of collagens and the organization and structure of metazoan extracellular matrices.

EXPERIMENTAL PROCEDURES
Animal Maintenance-H. vulgaris, L2 strain, was used for in situ hybridization experiments. Animals were cultured in M solution (13) and fed with freshly hatched Artemia nauplii three times a week. All experiments were carried out 24-48 h after feeding.
cDNA Cloning and Sequencing-The isolation of the original cDNA clones for the different collagen genes by differential screening of regenerating hydra cDNA libraries and the subsequent cloning and sequencing of overlapping cDNAs for each gene were performed as described previously (7,8).
Phylogenetic Analysis-The novel fibrillar collagen Hcol2, -3, and -5 genes identified in H. vulgaris were aligned with representative metazoan fibrillar collagens and the previously identified Hydra Hcol1 gene (8) using ClustalX (14). Gap-containing sites were removed from each alignment, and a Neighbor Joining tree was inferred. The topology of this tree was tested using four independent methods. Neighbor Joining and Maximum Parsimony bootstrap replicates were obtained using PROTDIST, NEIGHBOR, and PROTPARS from the PHYLIP package (15). Maximum Likelihood trees were inferred using PROML from the PHYLIP package (15). The JTT model of amino acid substitutions was used with global rearrangements and correction for rate heterogeneity (α value was obtained from TREEPUZZLE (16)). Bayesian tree inference values were produced from the MrBayes program (17).
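The gap-removal step mentioned above can be illustrated with a minimal Python sketch. It is not the procedure actually run with ClustalX and PHYLIP; the function name `strip_gap_columns`, the dictionary representation of the alignment, and the toy sequences are assumptions made only for illustration.

```python
def strip_gap_columns(alignment, gap_chars="-."):
    """Remove every alignment column in which at least one sequence has a gap.

    `alignment` maps a sequence name to its aligned sequence (hypothetical
    representation); all aligned sequences must have the same length.
    """
    seqs = list(alignment.values())
    if not seqs:
        return {}
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must all have the same length")
    # Keep only the column indices that are gap-free in every sequence.
    keep = [i for i in range(length)
            if all(s[i] not in gap_chars for s in seqs)]
    return {name: "".join(seq[i] for i in keep)
            for name, seq in alignment.items()}


# Toy alignment (invented residues, not taken from the hydra sequences).
toy = {"seqA": "GP-GK", "seqB": "GPAG-", "seqC": "GPAGK"}
print(strip_gap_columns(toy))
```

Only columns that are gap-free in all sequences survive, so every sequence in the trimmed alignment contributes a residue at each site used for tree inference.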
Identification of vWF A Domains and WAP Domains-vWF A and WAP domains were originally identified using the rpsblast and cdart protocols (www.ncbi.nlm.nih.gov/BLAST). Alignments were performed using Clustal programs as indicated in the legends to the appropriate figures. vWF A domain MIDAS motifs, α helices, and β strands were predicted based upon Jackson et al. (18).
FIGURE 1. Domain structure of the hydra collagens. A, schematic drawing of the hydra Hcol2 (accession number DQ679807), Hcol3 (accession number DQ679808), and Hcol5 (accession number DQ679809) domain structures, shown with the previously identified Hcol1 sequence (8) and representative human fibrillar collagen genes from the A, B, and C clades (collagens Iα1, Vα1, and XXVIIα1). B, schematic drawing of the Hcol6 (accession number DQ679810) domain structure. The asterisk indicates the sequence is incomplete.
In Situ Hybridization-Digoxigenin-labeled riboprobes were obtained through the following method. PCR primers were synthesized according to the sequences of hydra collagens. The primer sequences are: Hcol2, accacctggtgtcaaaaggag and aagccccaattcttccttgt; Hcol3, ttcctggtgagagaggtgct and taccttgagcgccttgttct; Hcol5, accacctggtgtcaaaaggag and aagccccaattcttccttgt; Hcol6, accacctggtgtcaaaaggag and aagccccaattcttccttgt. PCR fragments were cloned into pCR II vector (Invitrogen) and linearized with appropriate restriction enzymes. In vitro transcription was carried out using the "Maxiscript" kit (Ambion) and Dig labeling mix (Roche Applied Science) according to the manufacturers' protocols. Both wild type adult polyps and polyps that were decapitated and followed during head regeneration for different times were used for in situ hybridization. The whole-mount in situ hybridization reaction was carried out as described previously (7-10). Briefly, hydra polyps were "relaxed" in 2% urethane and subsequently fixed with freshly prepared 4% paraformaldehyde. Samples were treated with ethanol and proteinase K, re-fixed with 4% paraformaldehyde, and hybridized with probes in hybridization solution (50% formamide, 5× SSC (1× SSC = 0.15 M NaCl and 0.015 M sodium citrate), 1× Denhardt's solution, 200 mg/ml tRNA, 100 mg/ml heparin, 0.1% Chaps, and 0.1% Tween 20). Hybridization was carried out for 65 h followed by washing steps using SSC. Anti-digoxigenin antibody (AP-conjugated, 1:2000 dilution, Roche Applied Science) was used to detect probe labeling, and BM Purple (Roche Applied Science) was used as an alkaline phosphatase substrate for color reaction. Some samples were embedded in JB4 after the color reaction and sectioned (20-μm thickness) with a glass knife for microscopic examination.
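As a small worked example of the SSC notation used in the hybridization solution, the following Python sketch scales the stated 1× SSC composition (0.15 M NaCl and 0.015 M sodium citrate) to other strengths. The names `SSC_1X` and `ssc_molarity` are hypothetical helpers introduced here, not part of any laboratory software.

```python
# 1x SSC composition as defined in the text, in mol/L.
SSC_1X = {"NaCl": 0.15, "sodium citrate": 0.015}


def ssc_molarity(fold):
    """Return the component molarities (mol/L) of SSC at `fold`x strength."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    # Each component scales linearly with the stated strength.
    return {name: fold * conc for name, conc in SSC_1X.items()}


# 5x SSC, as used in the hybridization solution.
for name, conc in ssc_molarity(5).items():
    print(f"{name}: {conc:.3f} M")
```

Under this definition, the 5× SSC in the hybridization solution corresponds to 0.75 M NaCl and 0.075 M sodium citrate.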

RESULTS
Four Novel Collagen Genes Identified in H. vulgaris-Differential screening of a cDNA library using a combination of nonstringent and stringent conditions resulted in the identification of four novel collagen genes in addition to the previously described fibrillar Hcol1 (8) and basement membrane Hcol4 (7) genes.

Characterization of the Three Novel Fibrillar Collagen Genes-Three of the novel genes (Hcol2, Hcol3, and Hcol5) are members of the fibrillar collagen family based on their conserved C-terminal non-collagenous and uninterrupted collagenous domain structures (Figs. 1 and 2; Table 1). Fibrillar collagen NC1 domains have 8 cysteine residues (19), and 6 of these (numbers 1 and 4-8) are conserved in all of the hydra fibrillar collagens (Fig. 2). Cysteine 2 is present in Hcol2, and cysteine 3 is in Hcol1 (Fig. 2). All of the hydra fibrillar collagens have the short chain selectivity sequence (Fig. 2) exhibited by invertebrate fibrillar collagen chains (3).
FIGURE 2. Sequence alignments of hydra and representative vertebrate fibrillar collagens. Alignment of the major triple helix and non-collagenous NC1 domains of hydra Hcol2, Hcol3, and Hcol5 with hydra Hcol1 (8) and representative human fibrillar collagen sequences from the A, B, and C clades (collagens Iα1, Vα1, and XXVIIα1). The boundary between the collagenous and C-terminal NC1 domain is indicated. Cysteine residues conserved across the NC1 domains (numbers 1 and 4-8) are indicated by an arrow, whereas the partially conserved cysteines 2 and 3 are indicated by a broken arrow with a diamond head. The chain selectivity sequence is boxed. Levels of sequence conservation are indicated (red, >50% identical; blue, conservative substitutions).
The major helical domains of Hcol2 and Hcol5 are 1020-1023 amino acid residues in length, which is very similar to most other fibrillar collagens, whereas the same domain in Hcol3 is significantly shorter (969 residues) (Table 1, Fig. 2). The three novel hydra fibrillar collagen genes contain uninterrupted N-terminal minor triple helical domains separated from their respective major collagenous domains by telopeptide-like sequences.
By amino acid composition, all of the hydra fibrillar collagenous domains have ∼33% glycine content (Table 2). However, the hydra chains (which appear related to the A-clade; see below) contain lower levels of alanine and higher levels of lysine than most A clade fibrillar collagen domains (Table 2). Proline levels in the hydra chains are considerably lower than those in the classical vertebrate collagens (types I-III, V, and XI) but are typical for invertebrate fibrillar collagens (Table 2).
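The kind of composition figures summarized in Table 2 can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function names are hypothetical, the input is assumed to be an uninterrupted Gly-X-Y repeat that starts in frame at a glycine, and the toy sequence is invented rather than taken from any hydra collagen.

```python
from collections import Counter


def percent_composition(seq):
    """Return the percentage of each residue type in `seq`."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {aa: 100.0 * n / len(seq) for aa, n in counts.items()}


def y_position_lysines(seq):
    """Count lysines at the Y position of an in-frame Gly-X-Y repeat.

    Assumes the sequence starts at the glycine of an uninterrupted repeat,
    so Y positions are indices 2, 5, 8, ...
    """
    seq = seq.upper()
    return sum(1 for i in range(2, len(seq), 3) if seq[i] == "K")


# Invented toy repeat: three Gly-X-Y triplets.
toy = "GPKGAPGEK"
comp = percent_composition(toy)
print(round(comp["G"], 1), round(comp.get("A", 0.0), 1), y_position_lysines(toy))
```

Because glycine occupies every third position of an uninterrupted Gly-X-Y repeat, a glycine content close to one-third is the expected value for such a domain; alanine and Y-position lysine are the quantities that vary between chains.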
N-terminal Non-collagenous Domains of the Novel Fibrillar Collagens-The sequence for Hcol2 is full-length, and the N-terminal non-collagenous domain consists of a signal peptide followed by a whey acidic protein 4 disulfide core (WAP) domain (Fig. 3). The sequence of Hcol3 N-terminal to the minor collagenous domain consists of a tandem repeat of a vWF A and WAP domain (Fig. 1A, Table 1); this sequence is not full-length as we have yet to identify a start codon followed by a signal peptide. The Hcol2 and Hcol3_1 WAP domains have complete 8 cysteine complements with the position of cysteine 2 being least constrained as seen in other WAP domains (Fig. 3). The more C-terminal Hcol3_2 WAP domain is missing the cysteine at position 7 of the 8-cysteine motif (Fig. 3). The vWF A domains of Hcol3 are discussed in detail below. The N-terminal non-collagenous region of Hcol5 has yet to be cloned.
Phylogenetic Analysis-The major triple helical and C-terminal noncollagenous domain sequences of the four known hydra fibrillar collagen sequences (Hcol1-3 and -5) were aligned with representative protostome and deuterostome orthologues as listed in Table 3. A condensed version of this alignment containing hydra and representative human fibrillar collagen sequences is presented in Fig. 2. These alignments were then used for phylogenetic analyses to determine the evolutionary relationships of the hydra genes. The results indicate that the four hydra fibrillar collagen sequences cluster weakly with the A rather than B or C clade genes in all four of the different phylogenetic methods used in this analysis (Fig. 4). The clustering of the hydra genes in the deepest region of the A clade indicates that these genes have evolved within the hydra lineage and are not directly related to individual fibrillar collagen genes in the vertebrate lineage. The analyses provide very strong support for the contention that Hcol3 and -5 and Hcol1 and -2 are paralogous pairs that have evolved after the divergence of diploblasts such as hydra and triploblasts such as arthropods (apis), echinoderms (urchin), or chordates (ciona and vertebrates) (Fig. 4).
Characterization of a Fourth Novel Collagen Gene-The fourth novel hydra collagen gene, Hcol6, exhibits a unique domain structure (Fig. 1). To date we have characterized over 7 kilobases of open reading frame encoding 2439 amino acid residues (Fig. 5) but have not yet located either the start or termination codons. The gene has a unique structure encoding five vWF A domains interspersed with interrupted collagenous domains (Fig. 1B and Fig. 5).
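The relationship between the stated residue count and the length of open reading frame follows from three nucleotides per codon. A minimal Python sketch of that arithmetic is given below; the function name `coding_length_nt` is a hypothetical helper introduced only for this illustration.

```python
def coding_length_nt(residues):
    """Return the number of nucleotides needed to encode `residues` amino acids."""
    if residues < 0:
        raise ValueError("residue count must be non-negative")
    return 3 * residues  # one codon (three nucleotides) per residue


# 2439 residues of Hcol6 open reading frame characterized to date.
print(coding_length_nt(2439))
```

For the 2439 residues reported here this gives 7317 nucleotides, consistent with the description of "over 7 kilobases" of open reading frame.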
Multiple vWF A Domains Are Found in Hcol3 and Hcol6-Approximate A domain boundaries in the N-propeptide of Hcol3 and in Hcol6 were determined by sequence alignment with vWF A domains from zebrafish matrilin-1 and human matrilin-3. The sequence alignment (Fig. 6) was produced using ClustalW with manual adjustment and is more reliable in the N-terminal portion of the domain because of the relative lack of sequence conservation toward the C terminus. The hydra sequences all show the characteristic pattern of alternating β strands and α helices predicted for vWF A domains (Fig. 6). The alignment reveals that the complete set of residues required for MIDAS cation-coordination are conserved in the vWF A domain repeats Hcol6-1, Hcol6-4, and Hcol6-5 (Fig. 6).

Localization of Fibrillar Collagen Gene Expression in Hydra during Development and during Tissue Regeneration-The expression patterns of the novel fibrillar collagens, Hcol2, -3, and -5, were examined in adult and budding hydra as well as during tissue regeneration after decapitation (Figs. 7-9, respectively).
Relatively high expression of Hcol2 was apparent throughout the adult polyp (Fig. 7, A and B). In buds Hcol2 expression was marginally increased compared with adults and also widely distributed throughout the developing organism (Fig. 7C). Cross-sections of the body tube reveal the expression of Hcol2 to be concentrated in the ectoderm (Fig. 7D). Hcol2 expression is strongly up-regulated during tissue regeneration within 6 h of decapitation and remains high throughout the regeneration process (Fig. 7, E-H).
The expression of Hcol3 was focused at the base of the tentacles and the peduncle region in adult polyps and in buds (Fig. 8A). A much weaker signal was apparent throughout the body column. As with Hcol2, the expression of Hcol3 appears concentrated in the ectoderm (Fig. 8B). Hcol3 appears only weakly expressed during tissue regeneration, reaching detectable levels 12 h after decapitation (Fig. 8, C and D).
The expression pattern of Hcol5 was similar to that of Hcol3 being focused upon tentacles and the peduncle region in adult and through developing buds (Fig. 9, A-D). Expression of Hcol5 is up-regulated during tissue regeneration within 6 h of decapitation (Fig. 9, E and F).
MARCH 2, 2007 • VOLUME 282 • NUMBER 9

JOURNAL OF BIOLOGICAL CHEMISTRY 6797
Expression of the Novel vWF A Domain-containing Hcol6-The expression pattern of Hcol6 is limited to the base of the tentacles in adults and to developing buds (Fig. 10, A-D). The expression level for Hcol6 appears lower than that detected for the fibrillar collagens (Figs. 7-9). The expression of Hcol6 could not be detected during the first 12 h of tissue regeneration after decapitation, with trace levels detectable as the later stages of head and tentacle regeneration became apparent (Fig. 10, E-H).

DISCUSSION
Hydra is a relatively simple, diploblastic organism with a body tube and tentacles composed of two cell layers, ectoderm and endoderm, separated by a thick sheet-like extracellular matrix called the mesoglea. Previous analyses of the structure and molecular composition of the mesoglea have revealed that this extracellular matrix has features of both the basement membrane and interstitial matrices seen in higher order Metazoa (20). We reported previously the characterization of a fibrillar (Hcol1) (8) and a basement membrane collagen gene (Hcol4) (7). These hydra genes were originally named using Roman numerals. However, we now propose to switch to Arabic numerals for hydra collagen genes to avoid generating the false inference that the invertebrate genes are the direct equivalents of their vertebrate homologues (which are named using Roman numerals). Using differential screening techniques, we describe four novel hydra collagen genes: three encoding fibrillar collagens, with the fourth gene encoding a unique combination of collagenous helices interspersed with multiple vWF A domains (Fig. 1).
Fibrillar Collagen Genes of Hydra-We have now identified a total of four fibrillar collagen genes in H. vulgaris (Fig. 2, Table 1), and because these studies have involved cDNA cloning rather than genome sequencing, we cannot be certain that there are not more members of this class of collagen in hydra or related diploblasts. To complement our cloning studies, we have searched two extensive collections of EST sequences (H. magnipapillata, Hydra EST Database; Nematostella vectensis Genomic Database) to find further sequence data for the collagen genes described above or evidence for additional novel collagen genes. BLAST searches of the N. vectensis data base yielded no significant hits. However, the H. magnipapillata data base contained EST sequences that were virtually identical to Hcol1, -2, -5 (and -6) at the amino acid level but no hits for Hcol3. Unfortunately, the H. magnipapillata sequences did not extend any of the incomplete terminal sequences for H. vulgaris collagens described above. However, BLAST searching with the known hydra fibrillar collagen sequences yielded two overlapping clones from the H. magnipapillata data base (ace11729 and tdb11n19) encoding 539 bases that translated to give a novel collagen sequence (Hcol7). Hcol7 consists of an N-terminal 39-amino acid collagenous (Gly-X-Y) repeat followed by 128 amino acids of a fibrillar collagen NC1 domain. This sequence appears most similar to B clade collagen chains (supplemental Fig. S1). Although too short for any meaningful phylogenetic analysis, Hcol7 provides preliminary evidence for the presence of B clade fibrillar collagens in diploblasts.
Amino Acid Composition of the Collagenous Domain and Fibril Diameter-One possible explanation for the thin fibrils formed in the hydra mesoglea is that the N-propeptide (of Hcol1) is not cleaved during assembly into the matrix (8). It is unknown whether the bulkier N-propeptides of the newly described hydra collagens (see below) are processed. Nevertheless, a second possible factor that may influence fibril diameter is amino acid composition of the collagenous domain. The hydra fibrillar collagenous domains contain lower levels of alanine and higher levels of lysine than comparable A clade α chains (Table 2). The collagen chains included in Table 2 have been organized into a left and right group based on the α-chain propensity to form relatively thinner or thicker fibrils. Chains to the left of the table on average form thinner fibrils than their equivalent chains on the right hand side. For instance, the hydra collagens are known to form thin, 10-nm non-striated fibrils (8,11). Similarly, we have recently immunolocalized collagen XXVII to 10-nm non-striated fibrils, and we assume this to be a characteristic of other C-clade collagens. The vertebrate minor fibrillar collagens (types V and XI) are predominantly composed of B-clade α chains and assemble to form significantly thinner fibrils than their A-clade collagen counterparts such as type I or II collagen (21). When arranged in this order, all of the chains to the left in Table 2 have reduced levels of alanine (the smallest amino acid normally occupying the X or Y position of the Gly-X-Y repeat) in comparison with the chains to the right (with the exception of the Apis A clade chain). Likewise, all of the chains on the left have much higher lysine contents (in particular, lysines located at the Y position of the Gly-X-Y repeat, which are the substrate for forming hydroxylysine; Table 2) than those on the right (again with the exception of the Apis A clade chain).
Fibril thickness has previously been inversely related to the abundance of bulky hydrophobic amino acids and glycosylated hydroxylysine (21). Assuming that the level of lysine in the Y position of the collagen repeat is related to the final level of lysine hydroxylation and glycosylation, it would appear that relatively high levels of alanine and low levels of lysine may be a predictor of the capacity to form thick fibrils and vice versa for thin fibrils. Furthermore, it is hypothesized that, based on their alanine and lysine contents, the fibrils formed by both A and B clade collagens in the bee (Apis) are likely to be more akin to the thin fibrils formed by hydra collagens than the well developed cross-striated fibrils of vertebrates.
Novel N-propeptide Domains in Hydra Fibrillar Collagens-As reported previously, Hcol1 has little sequence N-terminal to its minor collagenous helix other than the signal peptide and start codon (8), and in this respect Hcol1 is reminiscent of the vertebrate proα2(I) collagen chain and the Strongylocentrotus purpuratus a1 chain (22). However, it is of interest to note that the N-terminal propeptides of Hcol2 and Hcol3 have completely novel arrangements of previously characterized protein domains. Both share WAP domains, which have not previously been reported associated with collagen genes (Fig. 1). WAP domains consist of about 50 amino acid motifs containing eight highly conserved cysteine residues that form four disulfide bridges (Fig. 3 and Ref. 23). WAP domains in some proteins have protease inhibitory activity, and others have antibacterial activities contributing to innate immunity (24), particularly important in invertebrates, which have no acquired immunity. It is an intriguing possibility that the N-propeptides of Hcol2 and Hcol3 function as part of the hydra innate immune system, providing antibacterial activities if and when they are released during collagen assembly. In addition, Hcol3 has vWF A domains in its N terminus (Figs. 1 and 6), and these domains could serve to concentrate the released N-propeptides to the mesoglea by way of interaction with the collagenous helices (see Ref. 29).
How Do Hydra Fibrillar Collagens Relate to the Fibrillar Collagen Family-Detailed phylogenetic analyses (Fig. 4) indicated that the four H. vulgaris genes described in the results are closely related to each other in two paralogous pairs (Hcol3 and -5 and Hcol1 and -2) and that these pairs cluster with the A rather than B or C clade. The phylogenetic analyses coupled with the fact that Hcol2 and -3 share, uniquely to date compared with all other known fibrillar collagens, a WAP domain in their N-propeptides raise the possibility that these hydra fibrillar collagens arose by duplication from a common progenitor after the divergence of the diploblast and triploblast lineages. The N-propeptide organizations of these hydra fibrillar collagen genes are in contrast to those of the protostome/deuterostome A-, B-, and C-clade N termini that are characterized by either vWF C domains (A clade) or a thrombospondin N-terminal and variable domain (B and C clade) (3, 26-28). The lack of homology in the N-propeptide of hydra fibrillar collagens in comparison with homologous genes from higher order Metazoa raises the possibility that diploblasts diverged before the establishment of the A and B clades, which are found in both protostome and deuterostome lineages (28). However, the discovery of a short fibrillar collagen sequence in H. magnipapillata that appears most similar to B-clade α chains (Hcol7; see above and supplemental Fig. S1) raises the alternative possibility that the A and B clades of fibrillar collagens were already established in diploblasts and subsequently rearranged their N-propeptide domains as lineages diverged.

vWF A Domains Have Combined with Collagenous Domains on Several Different Occasions-The hydra Hcol6 has an unusual domain organization consisting of a string of vWF A domains separated by interrupted Gly-X-Y repeats (Fig. 1). This combination of multiple vWF A domains and collagenous sequences is most reminiscent of the vertebrate-specific type VI collagen. However, whereas vWF A domains flank the single collagenous domain in each of the collagen VI α chains (29), in hydra Hcol6 the collagenous domains flank the A domains. A homotrimeric Hcol6 would, therefore, consist of short stretches of collagen triple helix flanking three vWF A domains (one from each α chain) followed by more triple helices and further clusters of three A domains and so on. Three of the five vWF A domains characterized in Hcol6, but neither of the A domains characterized in the N-propeptide of Hcol3, retain complete MIDAS sequences (Fig. 6) capable of metal ion chelation in their own right (25). It will be of interest to determine whether the hydra Hcol6 forms a microfibrillar structure similar to that of the vertebrate-specific type VI collagen.
The discovery that two hydra collagens (Hcol3 and Hcol6) contain vWF A domains provides new insight into how the collagen-based ECMs of Metazoa have evolved. Based on phylogenetic and gene structure data, it seems that on separate occasions during metazoan evolution (probably twice on the hydra lineage to produce Hcol3 and Hcol6 and, likewise, twice on the deuterostome lineage to produce the progenitor vWF A domain-containing FACIT and type VI collagen genes) collagenous domains have become combined with vWF A domains to produce novel A domain-containing collagens (25,30). One can speculate, therefore, that the combining of these two types of domain is particularly advantageous in an evolutionary sense, producing apparently unrelated yet versatile collagens capable of novel homotypic or heterotypic interactions. These data suggest that there is a natural propensity for vWF A and collagenous domains to become associated, presumably where gene duplication provides redundant genetic material to allow recombination without detriment to the host. Furthermore, this trait of combining collagenous and vWF A domains, which was previously thought to be a characteristic of chordate evolution (25), now appears to be far more widespread in the animal kingdom.
Hydra Collagen Genes Are Expressed in Distinct Developmentally Controlled Patterns-Based on the model proposed by Sarras and Deutzmann (20), hydra mesoglea is organized into two sub-epithelial basal lamina zones and a central (acellular) interstitial matrix. This interstitial component occupies 80-90% of the thickness of the mesoglea and is composed of fibrous molecules (31). The three new hydra fibrillar collagens described here are candidates to participate in the formation of the interstitial component of the mesoglea. Using a monoclonal antibody, we have previously shown that Hcol1 is localized to 10-nm microfibrils in the interstitial component of the mesoglea (8), and more recent work shows it forms a grid-like fibrous network. The expression pattern of Hcol2 (Fig. 7) is very similar to that previously reported for Hcol1. The possibility that Hcol1 and Hcol2 could co-assemble in a heterotypic trimer or co-assemble into heterotypic fibrils must await further biochemical and immunochemical characterization of the mesoglea. It is of interest to note, however, that all of the collagenous peptide sequences generated from a biochemical characterization of collagen in the mesoglea of adult polyps originated from either the Hcol1 or Hcol4 genes (8). It is, therefore, possible that Hcol2 may have a lower relative abundance in the mesoglea of adults than Hcol1 despite its widespread and apparently strong expression throughout the polyp (Fig. 7).
The other two fibrillar collagens, Hcol3 and Hcol5, present different expression patterns from Hcol1 and Hcol2 but similar expression patterns to each other (Figs. 7-9). Both Hcol3 and Hcol5 are expressed more intensely on tentacles and buds, where mesoglea is thinner, and tissue extension/cell migration is undertaken and at much lower levels in the body column (Figs. 8 and 9). The restricted distribution of these two collagens perhaps explains why no evidence for their existence was revealed by the previous biochemical characterization of mesoglea collagens (8). It seems unlikely that Hcol3 and Hcol5 would co-trimerize since the collagenous domain of Hcol3 is significantly shorter than that of Hcol5 (Table 1).
During head regeneration, mesoglea is synthesized at the regenerating tip de novo (12). Hcol2 shows stronger expression than Hcol3 and Hcol5, indicating Hcol2 (together with Hcol1 (8)) is more likely used for the initial assembly of the mesoglea, with Hcol3 and Hcol5 participating later, perhaps as tentacle regeneration is initiated. The hydra fibrillar collagen genes characterized to date therefore seem to fall into two groups. One group, represented by Hcol1 and -2, is expressed throughout the polyp and is probably responsible for forming the main fibrous framework of the mesoglea. The other group, represented by Hcol3 and Hcol5, is probably less involved in the general process of mesoglea synthesis but more involved in modifying the characteristics of the mesoglea where it needs to be thickened such as at the base of the tentacles.
Hydra Collagens Are Expressed Exclusively by the Ectoderm-A remarkable feature of collagen synthesis in hydra is that the mRNAs for all six of the collagen genes characterized to date are expressed exclusively by the ectoderm (Figs. 7-10; Refs. 7 and 8). This is in contrast to the expression of, for instance, laminin mRNAs, which are localized to the endoderm (9,10). Although much has been learnt recently about the role of epithelial cell plasma membrane receptors and laminin in controlling the assembly of basement membranes (32), it will be of interest to examine how the assembly of the mesoglea is controlled given that the interstitial component is sandwiched between two basement membrane-like layers of matrix and the collagenous and noncollagenous components of the mesoglea are secreted from opposite sides.
There can be little doubt, given the level of sequence conservation between ECM genes in hydra and the rest of the animal kingdom, that common principles govern the assembly, structure, and function of ECMs in the metazoan kingdom. Invertebrate matrices offer alternative model systems that can help highlight and unravel these general principles. Examples such as the independent generation of collagens with multiple vWF A domains on different metazoan lineages serve to highlight general principles, such as the requirements for complex multidomain proteins capable of multiple interactions that underlie the evolution, structure, and function of collagen-based ECMs.