The Immunoglobulin Heavy Chain Locus of the Duck: GENOMIC ORGANIZATION AND EXPRESSION OF D, J, AND C REGION GENES

The region of the duck IgH locus extending from upstream of the proximal diversity (D) segment to downstream of the constant gene cluster has been cloned and mapped. A sequence contig of 48,796 base pairs established that the organization of the genes is D-JH-μ-α-υ. No evidence for a functional homologue (or remnant) of a δ gene was found. The α gene is in inverted transcriptional orientation; class switch to IgA expression thus requires inversion of the ~27-kilobase pair region that includes both μ and α genes. The secreted forms of duck α and μ are each encoded by 4 constant region exons, and the hydrophobic C-terminal regions of the membrane receptor forms of α and μ are encoded by one and two transmembrane exons, respectively. Putative switch (S) regions were identified for duck μ and υ by comparison with chicken Sμ and Sυ sequences and for duck α by comparison with mouse Sα. The duck IgH locus is rich in complex variable number tandem repeats, which occupy ~60% of the sequenced region, and occur at a much higher frequency in the IgH locus than in other sequenced regions of the duck genome.

Antibodies, found only in the vertebrates (1), are molecules that show enormous diversity in structure. The greatest diversity is associated with their almost limitless repertoire of antigen-binding sites, which are encoded by the V, (D), and J genes. However, the constant (C) regions of the Ig molecules also show diversity within an individual. The antibody heavy chains specify the functions associated with the different classes of antibody, such as complement activation, recognition by phagocytic cells, and secretion across mucous membranes. Within the vertebrates, the V, (D), J, and C genes are arranged in many different patterns. In cartilaginous fishes, the genes encoding both heavy and light chains are arranged in multiple small clusters, each of which contains a single V, one or two D elements (in the case of the heavy chain), a single J, and a single C gene. In contrast, in bony fishes, amphibia, and mammals the IgH loci typically show the so-called translocon arrangement, in which groups of V, D, J, and C genes occur sequentially, from 5′ to 3′ within the locus (reviewed in Ref. 2).
The birds are, apart from the mammals, the most highly evolved vertebrate lineage, and their Ig genes show some of the most unusual arrangements and forms of expression. The IgL and IgH loci of chicken each possess only a single functional V and J segment but have up to 100 V region pseudogenes located upstream of the functional V gene (3,4). The major mechanism creating the large and effective repertoire of the chicken antibody molecule is gene conversion, from the upstream pseudogene segments into the functional V gene. Gene conversion occurs both before B cells encounter antigen and during antigen-induced diversification of the binding site, a process in which point mutation events are also involved (5,6). Birds possess homologues of IgM and IgA (7-9) and a third class of antibody, IgY (10-12). Avian IgY (sometimes termed IgG) shows homologies with both IgG and IgE of mammals. Extant avian IgY is likely descended from the evolutionary precursor of both IgG and IgE (10,13).
Ducks and their relatives have unusual antibody structure and expression, with consequences for their function. Their immune responses are often ineffectual (14), a feature that is explained, in part, by the properties of their antibodies. Although ducks produce a typical avian IgY, they can also generate large amounts of a truncated IgY, termed IgY(ΔFc) because it is missing the two C-terminal domains of its H (υ) chains (11,12). This structural abnormality of the duck IgY(ΔFc) would result in the loss of biological effector functions (such as complement activation) associated with the Fc region. The IgA-dependent mucosal immune response of ducks is also problematic, being delayed in its development following hatching (9,15), as compared with the chicken, in which IgA secretion develops more rapidly (16).
The search for a genetic basis to the inept antibody response of the duck has been informative. The IgY(ΔFc) molecule results from the utilization (by alternative pathways of RNA processing) of a novel, small terminal exon within the υ gene (11,12). Furthermore, the υ and α genes have been shown to be in head-to-head configuration within the IgH locus (17). This arrangement would require that class switching to produce either IgA or IgY in the duck involves the inversion of a segment of the IgH locus, rather than the usual deletional mechanism of class switching (18). [...] organization, recombination, and expression is to be achieved. Presented here are the results of a study to establish the structure of the IgH locus in the duck (Anas platyrhynchos), including the organization of the D, JH, and C region genes, the mechanism of expression of the membrane receptor forms of IgM and IgA, and the genetic basis for class switching.

EXPERIMENTAL PROCEDURES
Library Screening-Approximately 5 × 10^5 plaque-forming units of an unamplified recombinant duck genomic library (12), constructed in λDASHII from erythrocyte DNA from one Super M strain duck (duck number 5, Cherry Valley Farms, Rothwell, Lincolnshire, UK) (12), were plated on Escherichia coli strain XL1-Blue MRA (Stratagene, La Jolla, CA). The library was lifted onto Nytran filters (Schleicher & Schuell) and hybridized with probes for the C regions of duck μ, α, and υ (9, 12). Probes were generated by PCR amplification and labeled with [α-32P]dATP (19). Following hybridization, filters were washed 3 times for 20 min at 52°C in 1× SSC and 0.1% SDS and dried. X-Omat™ AR films (Eastman Kodak Co.) were exposed to the filters for ~60 h at −80°C and developed.
Mapping and Sequencing of Recombinant Duck Genomic Clones-DNA was isolated from plaque-purified recombinant genomic clones containing Ig C region genes (12,17) and subjected to restriction enzyme digestion. Fragments were separated by agarose gel electrophoresis and subcloned into pBluescript (Stratagene, La Jolla, CA). The sequencing strategy involved cloning overlapping restriction fragments from the phage inserts. Subclones were completely sequenced (Biotechnology Resource Laboratory, Medical University of South Carolina) on both strands using a combination of double-stranded nested deletions (Nested Deletion Kit, Amersham Pharmacia Biotech) and transposon-mediated sequencing (Primer Island System, Applied Biosystems, Inc., Foster City, CA). Sequences were assembled into a contig using the SeqMan program (DNAstar Inc., Madison, WI). The exons encoding the constant domains of the secreted and transmembrane (TM) forms were identified by comparison with cDNA sequences. The exon encoding the αTM region was identified through a blastx search (www.ncbi.nlm.nih.gov/blast/blast.cgi) of the region of the locus 3′ of the Cα4 exon and confirmed by RT-PCR analyses.
Genomic PCR-The Cμ4 to μTM1 intron was amplified from genomic DNA (prepared from the erythrocytes of Super M strain ducks by Dr. David Higgins, Hong Kong University) (12) using specific primers (G-1411, Cμ4 forward, 5′-CAGCTCAACGCCCACGAGA-3′, and G-1530, μTM1 reverse, 5′-CTTGATCAAGGTGACGGTGG-3′) with the Advantage GC Genomic PCR kit (CLONTECH, Palo Alto, CA). The Cυ2 to υT intron was amplified from genomic DNA using the forward primer G-1283, 5′-GGTGCGTCGCCGGAGGTGAACCAA-3′, and the reverse primer G-1284, 5′-GGAGGACAACAAAGGTGGTCAGAA-3′. The PCRs were performed using 100 ng of genomic duck DNA as template, with initial denaturation at 94°C for 5 min, 30 cycles of 94°C for 25 s, 65°C for 1 min, and 68°C for 8 min, and a final step of 68°C for 15 min. The amplified fragments were gel-purified and directly sequenced (Cυ2 to υT intron) or ligated into pGemT-Easy vector (Promega, Madison, WI) (Cμ4 to μTM1 intron) and subjected to sequencing as described above.
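As a rough planning aid, the genomic PCR program stated above can be totaled in a few lines. This is only a sketch: it sums the programmed hold times given in the text and ignores ramping between temperatures, which depends on the thermal cycler. The function name `program_hold_time_s` is an illustrative choice, not part of any instrument software.

```python
# Hold times of the genomic PCR program as stated in the text.
INITIAL_DENATURATION_S = 5 * 60        # 94 °C for 5 min
CYCLES = 30
CYCLE_STEPS_S = [25, 60, 8 * 60]       # 94 °C 25 s, 65 °C 1 min, 68 °C 8 min
FINAL_EXTENSION_S = 15 * 60            # 68 °C for 15 min


def program_hold_time_s(initial_s, cycles, step_times_s, final_s):
    """Return the sum of programmed hold times in seconds (ramping ignored)."""
    return initial_s + cycles * sum(step_times_s) + final_s


total_s = program_hold_time_s(
    INITIAL_DENATURATION_S, CYCLES, CYCLE_STEPS_S, FINAL_EXTENSION_S
)
print(f"{total_s} s = {total_s / 3600:.2f} h")
```

The total comes to 18,150 s, a little over 5 h, almost all of it contributed by the 8-min extension step used to span the long introns.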
Reverse Transcription and PCR Analysis-Total RNA from the spleen and duodenum of two 6-week-old Super M ducklings was prepared (9) by Dr. David Higgins, Hong Kong University. To detect the duck αm message, the RNA was reverse-transcribed using the SMART™ IV oligonucleotide and PowerScript™ (CLONTECH, Palo Alto, CA). Amplification of the first strand cDNA was then carried out using a two-step protocol (20 cycles of 94°C for 1 min and 68°C for 6 min, following an initial denaturation at 94°C for 3 min) using the 5′-PCR and CDSIII/3′-PCR primers (CLONTECH, Palo Alto, CA). A PCR to specifically amplify regions of the αm sequence was then carried out on the amplified product, with an initial denaturation at 94°C for 3 min followed by 25 cycles of 94°C for 30 s, 55°C for 30 s, and 68°C for 1 min. One reaction used primers G-778 (forward, 5′-GTGACTTGGACCCAGCAG-3′) and G-1752 (reverse, 5′-AGGGTGACGCCGGTGCTGTA-3′), and the second reaction used primers G-1805 (forward, 5′-CACGGTTTCCCCAGAATGC-3′) and G-1819 (reverse, 5′-ATCAGAGGACCTGTGGAGACACC-3′). To detect the duck μm sequence, 3′-RACE (20) was performed. The primer used for reverse transcription was G-413 (5′-TCTGAATTCTCGAGTCGACATC(T)17-3′), the anchor primer was G-414 (5′-TCTGAATTCTCGAGTCGACATC-3′), and the gene-specific primer was G-1411 (5′-CAGCTCAACGCCCACGAGA-3′). PCR products were separated by electrophoresis in a 1.2% agarose gel, purified using the Nucleospin kit (CLONTECH, Palo Alto, CA), and directly sequenced.

Analysis of Sequences and Prediction of Class Switch Regions-An initial analysis of repeat sequences within the duck IgH locus and in the duck δ2-crystallin gene (Ref. 21, GenBank™ accession number U06050) and the adjacent duck S-acyl fatty-acid synthase thioesterase and acyl-CoA-binding protein genes (Refs. 22 and 23, GenBank™ accession numbers M21635 and S73733) was carried out using DotPlot within the Megalign (DNAstar™) program. In addition, the duck IgH sequence was compared with the switch regions of chicken μ and υ genes (Ref. 24, GenBank™ accession numbers AB029075 and AB029077, respectively) and with the switch region of the mouse α gene (Ref. 25, GenBank™ accession number D11468). The nature of repeat sequences within the putative duck switch regions was also analyzed using a local copy of the EMBOSS program etandem (www.uk.embnet.org/Software/EMBOSS/). The significance of differences in the degree of allelic polymorphism in the duck α gene and in other immune-related and nonimmune-related genes was assessed as follows. The coding sequences of alleles of duck α, μ, υ, interferon-γ, Mx, S-acyl fatty-acid synthase thioesterase, and serum amyloid A type B (GenBank™ accession numbers AJ314754, U27222, U27213, X65218, X65219, X78355, X78356, AF087134, AF100929, Z21549, Z21550, M12101, M21635, U59909, and U64985) were aligned using Megalign (DNAstar™). The aligned sequences for each gene were then randomly subsampled by the SeqBoot program in PHYLIP (Ref. 26, evolution.genetics.washington.edu/phylip.html) to generate 1000 bootstrap replicates of the original alignments. The data were then analyzed for DNA distances using the DNADIST program in PHYLIP. The Kimura two-parameter model option was employed with a transition/transversion ratio of 2.0. The resulting 1000 distance estimates were then used to compare the average distance between the α alleles and those of other duck genes by paired t tests.
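The bootstrap and distance steps described above can be illustrated with a minimal sketch. It is not the PHYLIP implementation: it uses the textbook closed-form Kimura two-parameter estimate, which derives the transition and transversion proportions from the data, whereas DNADIST was run with a fixed transition/transversion ratio of 2.0. The function names and the two short allele sequences are hypothetical placeholders, and the paired t tests are omitted.

```python
import math
import random

TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def k2p_distance(seq1, seq2):
    """Closed-form Kimura two-parameter distance for two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only columns where both sequences have an unambiguous base.
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    n = len(pairs)
    p = sum(1 for pair in pairs if pair in TRANSITIONS) / n           # transitions
    q = sum(1 for a, b in pairs
            if a != b and (a, b) not in TRANSITIONS) / n              # transversions
    if 1 - 2 * p - q <= 0 or 1 - 2 * q <= 0:
        raise ValueError("sequences too divergent for the K2P formula")
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)


def bootstrap_distances(seq1, seq2, replicates=1000, seed=0):
    """Resample alignment columns with replacement; one distance per replicate."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    rng = random.Random(seed)
    length = len(seq1)
    distances = []
    for _ in range(replicates):
        cols = [rng.randrange(length) for _ in range(length)]
        distances.append(k2p_distance("".join(seq1[i] for i in cols),
                                      "".join(seq2[i] for i in cols)))
    return distances


# Hypothetical aligned alleles (placeholders, not duck sequences).
allele_a = "ATGGCCAAGCTGACCGGTTA" * 3
allele_b = "ATGGCTAAGCTGACCGGCTA" * 3
estimates = bootstrap_distances(allele_a, allele_b)
print(f"mean bootstrap distance: {sum(estimates) / len(estimates):.4f}")
```

Each replicate yields one distance estimate, so two genes analyzed this way produce two lists of 1000 values whose means can then be compared statistically.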

Organization of D, JH, and Constant Region Genes-The physical map of the IgH locus of the duck extending from, at the 5′ end, the proximal D segment to downstream of the υ gene is shown in Fig. 1. Nucleotide sequences of recombinant clones 13.1, 3.2, 2.1, and 12.2 formed a contig that included the υ and α genes and the 3′ region of the μ gene but which did not overlap with the sequence of clone 5.1, which included the D and JH segments and the 4 exons encoding the secreted form of μ. The 3.8-kb fragment 00-106, generated by genomic PCR between exons μ4 and μTM1, filled the gap and linked these sequences (Fig. 1). The sequenced region spans more than 48 kb of the duck IgH locus and includes, in addition to a D and a JH segment, the μ gene, an inverted α gene, and the exons that encode the ΔFc form of the υ chain. The region downstream of the terminal (υT) exon of the υ gene showed frequent recombinations and deletions upon attempted subcloning, and a reliable sequence could not be determined. The intron between the Cυ2 and υT exons was also unstable upon cloning, and its sequence was confirmed by direct sequencing of an 850-bp fragment derived by genomic PCR. The 36-nt-long D segment is open in all three reading frames (Fig. 2). [...] The splice boundaries of the exons of the secreted forms of duck μ, α (9), and υ (the ΔFc splice variant (11)) were identified (Fig. 3) by comparisons to corresponding cDNA sequences (GenBank™ U27213, U27222, and X65218, respectively). Although many of the splice-donor sites (Fig. 3) show substantial divergence from the consensus (AG↓GTGAG), in all cases the GT/AG splicing rule is observed. The identification of the exons encoding the μm and αm forms (which also revealed the cryptic donor splice sites for the membrane receptor forms of μ and α) is described below.
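The statement that the D segment is open in all three reading frames amounts to the absence of an in-frame stop codon in each forward frame. A minimal sketch of that check follows; the function name `open_frames` is illustrative, and the 36-nt sequence is a made-up placeholder, since the duck D segment sequence is given only in Fig. 2.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def open_frames(segment):
    """Map each forward frame (0, 1, 2) to True if it has no in-frame stop codon."""
    seq = segment.upper()
    result = {}
    for frame in range(3):
        # Split the segment into complete codons starting at this frame offset.
        codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        result[frame] = not any(codon in STOP_CODONS for codon in codons)
    return result


# Hypothetical 36-nt placeholder, NOT the duck D segment.
placeholder_d = "GGTTACGGTAGCGGTTACTGTGGTGCTGGTTACGGT"
print(open_frames(placeholder_d))
```

A segment that is open in all three frames would return True for frames 0, 1, and 2, meaning that junctional variation in V-D joining can place the D segment in any frame without introducing a premature stop.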
A Duck Homologue of δ?-In primates, rodents, and teleost fish a δ gene, encoding the H chain of IgD, is found immediately downstream of the μ gene. The distance between the cleavage/polyadenylation sequences of the duck μ and α genes is 2535 bp, which is a sufficient distance to include an additional C region gene. Extensive homology searches (using blastx) and open reading frame analyses did not give any indications of a δ gene (or remnants of a δ gene) in the μTM2/αTM intergenic region, the Cα1/Cυ1 intergenic region, or elsewhere within the sequenced contig established in this study.
Identification of μm and αm Transcripts-The membrane receptor forms of duck μ and α were identified by RT-PCR approaches. The μm form was readily identified by 3′-RACE using a forward Cμ4-specific primer (G-1411) and an anchor primer (G-414) with duck spleen cDNA. The 547-bp product was subjected to direct sequencing (Fig. 4A), identifying the cryptic splice donor site in Cμ4 and the μTM1 and μTM2 exons (Fig. 3). The inferred amino acid sequence of the μTM (Fig. 4A) showed an extracellular connecting peptide, a hydrophobic transmembrane region, and a cytoplasmic tail with the classical -KVK motif at the C terminus. The conserved CART motif (27), involved in signal transduction through protein-protein interactions with the CD79a/b complex, was present in the hydrophobic membrane-spanning region.
Attempts to detect the TM form of the duck α message using 3′-RACE with forward primers in the Cα4 domain were unsuccessful; the secreted form of the message was the only PCR product detected. Open reading frames that could encode an αTM segment were then sought by analysis of the genomic sequence 3′ of Cα4. A candidate sequence was identified between bases 22,680 and 22,852. To determine whether this sequence was expressed, RT-PCR was performed on duck duodenum mRNA using a forward primer specific for Cα4 (G-778) and a reverse primer within the putative TM exon (G-1752, Fig. 4B). A product of ~350 bp was amplified and sequenced. This sequence confirmed that the genomic region putatively identified as encoding the αTM exon was expressed and identified the cryptic donor splice site within Cα4 (Fig. 3). It was then possible to amplify the 3′ end of the αTM message in RT-PCR by using a forward primer overlapping the Cα4/TM splice site (G-1805) and a reverse primer 3′ of the termination codon (G-1819), confirming that the duck αTM segment is encoded by a single exon (Fig. 4B). The duck αTM exon is shorter than the αTM exons of mouse and human in both the cytoplasmic and extracellular regions. However, the highly conserved residues of the CART motif are present in the membrane-spanning region (Fig. 4B).
Identification of Switch Regions-The switch from expression of IgM antibodies to the production of IgY or IgA involves chromosomal recombination at switch (S) regions typically characterized by long (several kb) regions of complex VNTR-like sequences. A DotPlot analysis of the duck IgH locus gave a surprising result (Fig. 5A); the locus is very rich in repeats of the VNTR type, which account for ~60% of the sequence even as assessed at relatively high stringency (≥90% identity). The repeats also show strong local clustering. The expected locations of functional switch regions would be in the JH/Cμ1 intron for Sμ and in the Cα1/Cυ1 intergenic region for both Sυ and (because of the reverse orientation of the α gene) Sα. The three largest blocks of VNTRs in the locus occur immediately upstream of exons Cμ1, Cα1, and Cυ1. The functional Sμ and Sυ regions have been identified in the chicken (24), and DotPlot comparison of the duck IgH sequence with the Sμ and Sυ sequences of the chicken (Fig. 6, A and B, respectively) strongly suggests that the large blocks of VNTRs immediately upstream of Cμ1 and Cυ1 in the duck (Fig. 6, A and B) are candidates for functional S regions. The Sα of chicken is not known. Although DotPlot comparisons of mouse Sα with the duck IgH sequence (Fig. 6C) did not yield clear results, the heaviest density of similarities was seen in two sites within the α to υ intergenic region. One of these sites was already identified as the likely Sυ region (Fig. 6B). The second site was the large block of VNTR immediately upstream of α (Fig. 6C) and is a candidate for the duck Sα region. An analysis of the putative switch regions, using the EMBOSS program etandem, identified a number of repeated motifs for each region (Table I). The arrangement of these motifs within each putative S region is summarized in Fig. 6D.
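The idea behind a dot-plot comparison at a given stringency can be sketched as a sliding-window identity scan. This is a deliberately naive illustration, not the algorithm used by DotPlot in Megalign: only the ≥90% identity threshold comes from the text, while the window length, the function names, and the coverage measure in `repeat_fraction` are assumptions made for the example.

```python
def dot_plot(seq_a, seq_b, window=20, min_identity=0.90):
    """Return (i, j) window starts where the two windows match at >= min_identity."""
    a, b = seq_a.upper(), seq_b.upper()
    hits = []
    for i in range(len(a) - window + 1):
        window_a = a[i:i + window]
        for j in range(len(b) - window + 1):
            matches = sum(x == y for x, y in zip(window_a, b[j:j + window]))
            if matches >= min_identity * window:
                hits.append((i, j))
    return hits


def repeat_fraction(seq, window=20, min_identity=0.90):
    """Fraction of bases lying in a window that matches another window of seq."""
    if not seq:
        return 0.0
    covered = [False] * len(seq)
    for i, j in dot_plot(seq, seq, window, min_identity):
        if i != j:  # ignore the trivial main diagonal of a self-comparison
            for k in range(i, i + window):
                covered[k] = True
    return sum(covered) / len(seq)


# Hypothetical toy sequences: a perfect tandem repeat and a non-repetitive string.
tandem = "ACGTTGCA" * 5
unique = "ACGTTGCAGGATCCTA"
print(repeat_fraction(tandem, window=8, min_identity=1.0))
print(repeat_fraction(unique, window=8, min_identity=1.0))
```

The scan costs time proportional to the product of the two sequence lengths and the window size, so a contig of tens of kilobases would call for an indexed or word-based method rather than this brute-force loop.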
The presence of large numbers of complex VNTR in the duck IgH locus raises the question of whether this is a general feature of duck genes or might be restricted to the IgH locus. Few duck genes have been sequenced. The three genomic sequences that have been deposited in GenBank™ and are of substantial length are the δ2-crystallin gene (5,069 bp) and the adjacent S-acyl fatty-acid synthase thioesterase and acyl-CoA-binding protein genes (together forming a contig of 12,800 bp). Analyses of these sequences by DotPlot (Fig. 5B), with parameters identical to those used to examine the IgH locus, showed, in contrast to the IgH locus (Fig. 5A), a very low prevalence of VNTRs.
Allelic Variations-Comparisons of the previously published cDNA sequences of duck μ, α, and υ clones and the genomic sequence permitted an analysis of sites of allelic polymorphisms (Table II). This analysis showed an unexpectedly high concentration of polymorphic sites in the α gene: 37 sites of substitution were observed in the α gene, as compared with 10 and 7 in the μ and υ genes, respectively (Table II). The substitutions in the α gene are found in all 5 exons but are concentrated in the first part of the Cα2 exon, where 11 of the 37 substitutions are found within a 30-bp region. The overall rate of allelic polymorphism observed in the duck α gene (2.6%) was determined (as described under "Experimental Procedures") to be significantly greater (p < 0.001) than that calculated for all duck Ig genes (0.67%) or for the duck non-Ig sequences (0.29%) that have been described to date. Thus, whereas Ig genes are known to evolve relatively rapidly (28), the duck α gene seems to be evolving at a particularly accelerated rate.

DISCUSSION
The IgH locus can be subject to four processes that modify its coding sequence in the course of B cell development and the immune response as follows: site-specific recombination (of V/D/J), point mutations, gene conversions (in some species), and region-specific recombination (of C region genes). Knowledge of the structure and expression of the IgH locus in ducks sheds light on the genetic basis of their poorly functional antibody response and on the evolution of this complex locus in the vertebrates. The observations on the duck IgH locus made in this study include the following: 1) the unusual organization (μ-α-υ) of the C region genes, 2) the inversion that has accompanied the apparent transposition of the α gene in the locus, 3) the absence of a duck δ gene, and 4) the remarkable prevalence of VNTR sequences in the locus.
These observations, in total, point to a unique structure for this locus and have significance for the expression of a functional antibody response in the duck. The genes mapped and sequenced in this study are in the order D-JH-μ-α-υ, suggesting strongly that the duck, like the chicken, possesses a single JH segment (4). The inverted transcriptional orientation of the duck α gene, while definitively shown here, was the only logical interpretation of previous mapping analyses (17) and, interestingly, may be widespread in the birds, as PCR-based approaches have indicated a similar arrangement in the chicken (29). The current position and orientation of the duck α gene is most readily explained by an ancient translocation event that also inverted the α gene as it was inserted into its present position. This follows from the observation that mammalian α genes are found as the 3′-most C region genes and in the same transcriptional orientation as the other C region genes (30). The translocation and inversion of α must have occurred in a common ancestor of chickens and ducks, but whether this feature is restricted to the galloanserine lineage (31) or shared by all birds is unknown. The position immediately downstream of the μ gene is the site in which all known δ genes are found (32,33). The insertion of the α gene in this position downstream of μ may have disrupted the δ gene and accounted for the apparent absence of δ from the present day IgH locus of the duck. No evidence has been found to support the presence of a δ gene, or the discernible remnant of one, in the duck IgH locus. The absence of an IgD from ducks provides further evidence for the functional redundancy of this class of antibody (34).
All Igs can be expressed, by alternative RNA processing, in either secreted forms or as membrane-bound receptors for antigen on B cells. Typically, the hydrophobic transmembrane tail of the receptor form of Igs is encoded by 2 exons (TM1 and TM2) that splice into a cryptic site in the terminal secreted C region exon (35). The membrane receptor form of IgA has been studied previously only in mammals, where the transmembrane region has been shown, uniquely, to be encoded by only a single exon (36). The results presented here show, similarly, a single TM exon in the duck α gene. Thus, the single TM exon in the vertebrate α gene must have developed prior to the divergence of the lineages that would give rise to birds and mammals. In the case of the mammalian α gene, the low frequency of the αm message has been suggested to reflect, at least in part, the long (~2.5 kb) intron separating the TM exon from Cα3 (37). The homologous Cα4/TM intron in the duck is close to 5 kb long and may, by the same reasoning, be responsible for the low frequency of αm message and in part explain the difficulty in detecting it, even in RT-PCR. However, undefined cis-acting elements present in the mammalian Cα3/TM intron also appear to influence the regulation of α mRNA processing (38). Thus, the principal difference between the IgA of mammals and that of birds (only ducks and chickens have been examined in detail) is that in mammals the α chain is shorter by one C region domain, reflecting the loss of the original Cα2 exon (39), which has been replaced by a flexible hinge region.
The information presented here allows further examination of the likelihood that the structure of the IgH locus is the cause of the "inept" antibody response in the duck. In one instance it is clear that the structure of the IgH locus leads to expression of a deficient antibody. The IgY(ΔFc) antibody, which lacks the functionally important Fc region, results from an alternative pathway of processing of the primary transcript from the υ gene, in which the small terminal exon between the exons encoding υ2 and υ3 (Fig. 1) is used (12). In a second instance, the inverted position of the α gene in the duck raises the possibility that its orientation is related to the delayed production of IgA observed in ducks (9,15). This is because the expression of α requires, of necessity, an inversional mechanism of class switching, as opposed to the typical deletional rearrangement. The frequency of inversions during Ig class switching in the IgH locus has been shown, in a mouse cell line, to be lower than that of deletion events (40). Inversions occurred in ~23% of the rearrangements, indicating that deletions are apparently, in mammals, the favored outcome of class switch events at the IgH locus. However, the simple correlation of an inverted α gene with a delayed switch to IgA production is not supported by the evidence from the chicken. The α gene in chickens also appears, from indirect evidence based on PCR approaches (29), to be inverted. However, IgA production in chickens develops rapidly after hatching (16), indicating that an inverted α gene is not, per se, linked to inefficiencies of expression. Thus, delayed IgA expression in the duckling must result from other causes, such as the cytokine control of the mechanisms driving class switching to IgA.
Birds are considered to have a condensed genome, about one-third the size of that of mammals (41). Although the IgH locus in ducks is shorter overall than in mammals, each duck C region gene is considerably larger than its mammalian homologue. For example, the mouse μ gene covers ~4 kb (42), whereas the duck μ gene measures close to 10 kb in length, a difference that is attributable to differences in intron length. In the case of the α gene, lengths are ~6 kb in the mouse versus 11 kb in the duck, a difference attributable both to longer introns and to the presence of an additional exon in the duck gene. Whereas the overall condensation of the avian genome is generally considered to have been accompanied by a loss of repetitive DNA (43,44), the data presented here show that the duck IgH locus is, unexpectedly, very rich in VNTRs, which account for ~60% of its sequence (Fig. 5A), a much higher value than seen in mammalian IgH loci. Whereas other sequenced regions of the duck genome (Fig. 5B) contain much lower numbers of VNTRs than the IgH locus, the interpretation of these comparisons is complicated by the fact that VNTRs are asymmetrically distributed on chromosomes. For example, a telomeric bias in VNTR distribution has been reported on human chromosome 22 and chromosome 1 of Caenorhabditis elegans, but a centromeric bias is present in chromosome 4 of Arabidopsis thaliana (45). As only a small proportion of the duck genome has been sequenced, it is not possible to conclude definitively that a high content of VNTRs is unique to the IgH locus. However, VNTRs are often associated with sites of recombination (46), and the VNTRs in the duck IgH locus may, in addition to a role in the Ig class switch mechanism, have facilitated the inversion and translocation of the α gene that is a prominent feature of this locus.