INTRODUCTION
A large number of interactions in the cell are mediated by
families of protein binding modules that are found repeatedly and in
different combinations in several proteins. Typically these modules
mediate protein-protein interactions through recognition of short
peptides in the target protein (1). Several approaches, based upon the
screening of repertoires of combinatorial peptides, have been developed
to investigate the recognition specificity of these domain families.
Phage display of small peptides of random sequence has been
successfully used for the characterization of binding domains such as
SH2, SH3, WW, EH, etc. (reviewed in Ref. 2). PDZ domains (identified as
conserved elements in postsynaptic density protein PSD-95, Disc-large
tumor suppressor Dlg, Zonula occludens protein ZO-1) differ from the
remaining domains since they bind to specific carboxyl-terminal
sequences of target proteins and/or dimerize with other PDZ domains
(reviewed in Ref. 3). This peculiarity has limited the possibility of
using "classical" peptide repertoires displayed by fusion to M13
coat proteins, since these display systems present random peptides by
fusing them to the amino terminus of pIII or pVIII coat proteins. As a
consequence, PDZ specificity has been studied using repertoires of
chemically synthesized random peptides (4, 5) or alternative display
systems, such as fusion to the carboxyl terminus of Lac repressor (6).
Cyclic peptides may also be forced into conformations that mimic
carboxyl termini (7). More recently Fuh et al.
(8) showed that the M13 pVIII protein can tolerate peptide extensions at
its carboxyl terminus and have assembled and exploited a repertoire of
random carboxyl-terminal peptides to study the binding specificity of
two PDZ domains of the protein MAGI (membrane-associated guanylate kinase with inverted orientation). This approach yielded a family of
specific ligands for PDZ-2, whereas only one ligand for the other
domain (PDZ-3) could be identified. This result is probably a
consequence of the low copy number of the displayed recombinant peptides, because even when the display vector was modified to increase
the display density by 10-fold, the majority of the phage coat was made
up of wild type pVIII molecules supplied by the helper phage (8, 9).
Furthermore, because of the topology of the assembled pVIII coat
protein, whose carboxyl terminus is buried in the capsid in contact
with the genomic single-stranded DNA (10), it is possible that only a
fraction of peptides of random sequence can be tolerated in this
display system without causing steric hindrance.
To overcome these limitations we have designed a new type of
combinatorial library, where random peptides are displayed on the
surface of λ phage by fusion to the carboxyl terminus of the D-capsid
protein. In this system about 95% of the D proteins are recombinant,
and because the carboxyl terminus of this protein does not appear to be
involved in head formation (11), subunits containing short peptide
extensions are expected to be assembled at the same high density
irrespective of their sequence.
To investigate the binding specificity of different PDZ domains, we
have panned such a carboxyl-terminal library with each of the PDZ
modules of hINADL,1 a protein
homologous to dINAD of Drosophila melanogaster
belonging to the family of multi-PDZ proteins (12). dINAD is the
prototypical member of this family characterized by the exclusive
presence of multiple copies of PDZ domains without any catalytic or
other known binding domain (13). The five PDZ domains of dINAD
participate in distinct structural and signaling functions by binding
to different protein partners to assemble a macromolecular transduction
complex that enables high speed signaling in Drosophila
photoreceptors (3).
The hINADL gene was isolated because of its sequence homology to dINAD
and was found to be expressed in different spliced forms in various
organs and tissues (12). Unlike dINAD, hINADL encodes a
protein that contains seven PDZ domains. Although no natural partner
has been characterized to date, the identification of PDZ domains
repeated in tandem suggests that hINADL also functions as a molecular
scaffold for organizing protein complexes. In this manuscript we
analyze the binding specificity of each of the seven PDZ domains of
hINADL, and we implement a computational procedure to infer the binding
probability of different peptide-domain complexes.
EXPERIMENTAL PROCEDURES
Materials--
Bacterial strains used were Escherichia coli BL21(DE3), hsdS gal (λcIts857 ind1 Sam7 nin5 lacUV5-T7 gene 1) and BB4, supF58 supE44 hsdR514 galK2 galT22 trpR55 metB1 tonA ΔlacU169 F' [proAB+ lacIq lacZΔM15 Tn10 (tetr)].
Bovine serum albumin and Tween 20 were from Sigma. pGEX-2TK, glutathione-Sepharose, and anti-glutathione S-transferase
(GST) antibody were from Amersham Pharmacia Biotech. SpeI
and NotI restriction enzymes and T4 ligase were from
Biolabs. Dynabeads were from DYNAL; Gigapack® III Gold packaging
extract was from Stratagene.
Library Construction--
The λ display vector, λDsplay1, is
described in detail in Castagnoli et al. (14). It contains
an additional D gene under the control of the pTRC promoter
followed by SpeI and NotI unique cloning sites.
To assemble the carboxyl-terminal peptide library, the oligonucleotide
R384 (5'-GTCGAATTCTTAGCGGCCGCATTA-3') was used as a primer for
oligonucleotide R383
(5'-TCATGCCATGGAGACTAGT(NNK)9TAATGCGGCCGCTAAGAATTCGAC-3') (where K = G or T) containing nine degenerate codons
followed by the stop codon TAA (in bold). The restriction sites
SpeI and NotI are underlined. The two
oligonucleotides were biotinylated at their 5' end and annealed by
mixing 200 pmol of each in 200 µl, heating at 65 °C for 5 min, and
slowly cooling to room temperature. The annealed DNA was made
double-stranded by incubating for 2 h at 37 °C after the
addition of dNTPs (500 µM) and 50 units of the Klenow
fragment of DNA polymerase I. After digestion with SpeI and
NotI, the biotinylated terminal fragments were removed using
streptavidin-Dynabeads. The SpeI/NotI DNA
fragments were passed through a Sephadex G-50 column, ligated into
SpeI/NotI-digested λ vector, and incorporated
into phage particles by in vitro packaging. Bacteriophages
were propagated in BB4 cells on L agar plates.
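The consequences of the NNK randomization scheme used for oligonucleotide R383 can be checked with a short calculation. The sketch below is illustrative only: it enumerates the 32 NNK codons against the standard genetic code and estimates the fraction of nonapeptide inserts free of internal stop codons. It assumes that all NNK codons are equally likely and that every TAG acts as a terminator, ignoring any suppression by the host strain; the function names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ..., GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def nnk_codons():
    """Return all codons matching NNK (N = any base, K = G or T)."""
    return [a + b + k for a in BASES for b in BASES for k in "GT"]


def nnk_summary(n_codons=9):
    """Summarize what a run of n_codons NNK codons can encode.

    Assumes equiprobable codons and no suppression of stop codons.
    """
    codons = nnk_codons()
    stops = [c for c in codons if CODON_TABLE[c] == "*"]
    amino_acids = sorted({CODON_TABLE[c] for c in codons} - {"*"})
    p_stop = len(stops) / len(codons)
    return {
        "n_codons": len(codons),
        "stop_codons": stops,
        "amino_acids": amino_acids,
        # Probability that none of the n_codons positions is a stop codon.
        "p_full_length": (1 - p_stop) ** n_codons,
    }


summary = nnk_summary()
print(summary["n_codons"], summary["stop_codons"], len(summary["amino_acids"]))
print(round(summary["p_full_length"], 3))
```

Under these assumptions the scheme covers all 20 amino acids with a single stop codon, so roughly three quarters of the inserts would encode full-length nonapeptides.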
GST Fusion Proteins--
cDNA sequences coding for the seven
INADL PDZ domains (accession number AJ224747) were amplified by
polymerase chain reaction from plasmid hINADL, kindly provided by S. Philipp and V. Flockerzi (12), using oligonucleotides designed to
hybridize to the DNA regions flanking the PDZ domains coding sequences.
The borders of the domains are indicated in Fig. 1. The amplified
fragments were inserted in-frame into pGEX-2TK (Amersham Pharmacia
Biotech) using the BamHI and EcoRI restriction
sites. GST fusion proteins were expressed in E. coli
BL21(DE3) and purified according to manufacturer's instructions.
Panning--
Affinity selection of the peptide displaying
phages was performed as described in Zucconi et al. (15)
with minor modifications. Glutathione-Sepharose 4B beads (30 µl of
slurry) were coated with ~10 µg of GST fusion proteins and
pre-incubated in phosphate-buffered saline with 3% bovine serum
albumin for 4 h at 4 °C. The library was then added (about
2 × 10^9 plaque-forming units), and the incubation was
performed for 1 h at 4 °C in SM buffer (150 mM
NaCl, 10 mM MgSO4, 50 mM Tris HCl, pH 7.5). The Sepharose beads were then collected by centrifugation and
washed 10 times with ice-cold phosphate-buffered saline containing 0.05% Tween 20.
The adsorbed (A) and non-adsorbed (N) phages were titered by counting
plaques on a BB4 bacterial lawn. The adsorbed phages were propagated by
plate lysate, eluted, concentrated by polyethylene glycol
precipitation, and subjected to the next panning cycle. After three
selection cycles, phage clones were purified and tested by solid phase
immunoassay against their bait proteins.
Solid Phase Immunoassay, Phage Enzyme-linked Immunosorbent
Assay--
Microtiter wells were coated overnight at 4 °C with
anti-GST antibody (10 µg/ml), and after washing, PDZ-GST fusion
proteins (or control GST) were added at a concentration of 20 µg/ml
phosphate-buffered saline. After washing, about 10^7 phage
particles of each selected clone were added to the appropriate well and
incubated overnight at 4 °C. The bound phage particles were revealed
using a purified IgG fraction of anti-λ serum (rabbit) diluted 1:500
from a 170 µg/ml stock (16) and a secondary, alkaline phosphatase-conjugated, anti-rabbit goat antibody (Sigma A-8025). The
chromogenic reaction was developed for 1 h at 37 °C by adding p-nitrophenyl phosphate substrate (Sigma 1047), and
the reading was performed at 405-620-nm dual wavelength.
Sequencing--
The DNA inserts of selected phage clones were
amplified by polymerase chain reaction, purified on QIAquick columns
(Qiagen), and sequenced using an ABI Prism™ 310 genetic analyzer
(PerkinElmer Life Sciences).
Mutagenesis--
Site-specific mutagenesis was performed using
the unique site mutagenesis kit (Amersham Pharmacia Biotech)
that utilizes a two-primer system to generate site-specific mutations
in double-stranded plasmids. pGEX-PDZ7 was used as a template.
The mutagenic oligonucleotides were R722,
5'-CCTTCTTCATAGACTGAACGGATAACTATAGCATTC-3' (for PDZ7RS), and R723,
5'-GGCTGTGATGGCTTCTGCCAGGCTGGAGTTCCTCAG-3' (for PDZ7LA).
Homology Modeling--
The model structures of the PDZ7 wild
type, LA, and RS mutant domains of hINADL were built using the PSD-95-3
PDZ domain (Protein Data Bank code: 1be9) as template structure
(17). Direct alignment of the two sequences gives 61% overall similarity
and 47% identity. Modeling was carried out on a Silicon Graphics O2 work station using the program InsightII (18).
The model was subjected to limited energy refinement (program Discover, steepest descent algorithm, Molecular Simulations S.a.r.l.). The SearchLoop Protein command was used to find suitable geometries for
residue insertion in one loop of the defined model. The SURFNET procedure (19) was used to measure the volumes of the cavities with
minimum and maximum gap sphere radii equal to 1 and 4 Å, respectively.
Computational Analysis--
Details about the SPOT procedure
were essentially as previously described (20). The contact matrix shown
in Fig. 3 was derived from the analysis of the three-dimensional
structures of PDZ domain-ligand complexes solved by x-ray
crystallography, Protein Data Bank entry codes 1qav (syntrophin-nNOS)
(21), 1kwa (Cask-LIN-2) (22), 1be9 (PSD95-3) (23). The residue-residue
interaction data base was constructed starting from a total of 311 peptides selected from 10 different PDZ domains: 90 ligands of nNOS (6,
24); 8 of Af6 (25, 26); 8 of PSD95-2 (27, 28); 13 of PSD95-3 (6,
29)2; 6 of α1-syntrophin
(6, 7, 21, 30); 5 of Cask (31, 32); 24 of dINAD1 (33, 34); 33 of
Na+/H+ exchanger regulatory factor (NHERF)-1
and 16 of NHERF-2 (35); 10 of MAGI-2 (8); plus 97 ligands of the 7 PDZ
domains of INADL described in this paper (Fig. 1). The frequency of the
residues in the contacting positions was deduced from the alignment of the carboxyl-terminal portions of the ligand peptides of each PDZ domain.
RESULTS
Construction of a Carboxyl-terminal Random Peptide Library
Displayed on λ Phage--
We have assembled a new type of peptide
repertoire in which nonapeptides of random sequence are fused to the
carboxyl terminus of the D protein and are efficiently displayed on the
surface of the λ phage capsid. To this end we have used a modified
λ vector, λDsplay1 (14), derived from the λpRH825 phage constructed by
Sternberg and Hoess (36). λDsplay1 drives abundant and stable
expression of recombinant D proteins, because one of the loxP sites
flanking the D coding sequence and causing genetic instability in the
original vector has been deleted. In assembled λ heads, ~95% of
the total D protein is chimeric, and the remaining 5% is synthesized
by a second wild type gene (14).
The carboxyl-terminal peptide library that we have
constructed contains 10^7 independent clones, each
displaying a different nonapeptide. This complexity is sufficient to
ensure that all the possible tetrapeptides (20^4 = 1.6 × 10^5) are represented in the library. Since the four
carboxyl-terminal residues in the target peptide make the main contacts
with the PDZ domain (21-23), this repertoire represents a powerful
tool to characterize PDZ peptide recognition specificity. The
heterogeneity of the displayed peptides was verified by sequencing the
3' ends of the D gene in randomly isolated phage clones. A minority of displayed peptides, whose degenerate sequence contains internal stop
codons, were shorter than nine residues. To test the effectiveness of
the approach, we performed a pilot panning experiment using as bait a
PDZ domain whose recognition specificity has already been determined.
We overexpressed the third PDZ domain of PSD-95 protein (37) as fusion
to GST, and we used it to select ligands from the λ-displayed
carboxyl-terminal repertoire. All selected peptides matched the
carboxyl-terminal sequence of the natural partners (xT/SxV) (6, 29,
38).
Selection of INADL PDZ Binding Phages--
To identify the
preferred ligands of each of the seven PDZ domains of the hINADL
protein (12), we constructed a panel of GST fusion proteins, each
expressing a different PDZ domain. Domain borders, determined on the
basis of the Pfam v5.5 (39) sequence alignment, are indicated in
Fig. 1, top.

Fig. 1.
Peptides selected by hINADL PDZ domains.
Top, schematic representation of hINADL-PDZ domains. The
numbers above the PDZ domains indicate the first and last
residue relative to human INADL sequence (GenBank™
accession number AJ224747) that were included in our GST-PDZ fusion
proteins. Bottom, lists of PDZ-binding peptides. The
numbers on the left are the names of the phage
clones. Clones that were independently isolated more than once are
underlined. Peptides shorter than nine residues derive from
internal stop codons. The single-letter code for amino acids is used;
conserved residues are in bold; asterisks
indicate stop codons. and represent residues with hydrophobic
or aromatic side chains respectively. Residues that are conserved in
more than 50% of the peptides are represented in the
consensus.
Purified fusion proteins bound to glutathione-Sepharose were used to
pan the λ-displayed carboxyl-terminal library. Three selection cycles
were sufficient to enrich the number of binding clones by at least 2 orders of magnitude. Phages sorted by panning experiments were further
tested by solid phase immunoassay. The selected clones were subjected
to DNA sequencing to deduce the amino acid sequence of the exposed
peptides; the sequences were aligned at the carboxyl terminus, and a
consensus motif representing the ligands of each domain was derived
(Fig. 1). In agreement with previous observations, conserved residues
(in bold) were only found in the three/four positions
preceding the carboxyl terminus (4). Ligand peptides that share a
carboxyl-terminal motif containing Ser or Thr at P-2 were defined as
class I ligands, whereas class
II ligands have hydrophobic or aromatic residues at that position (4).
According to this criterion INADL-PDZ domains 1, 2, 3, and 4 were found to bind preferentially to class II ligands, whereas PDZ domains 5, 6, and 7 bound to class I ligands.
None of the INADL-PDZ domains showed a preference for class III
peptides, characterized by acidic residues at P-2 (6).
PDZ5 and PDZ6, however, besides the preferred Ser/Thr, also
tolerate some acidic or hydrophobic residues at P-2.
The PDZ3 domain was assigned to a new specificity class, which we
designated as class IV because it has not been described before. PDZ3
displays a preference for acidic residues at position P0 in
a subgroup of its ligands (the other subgroup conforming to class II).
Peptides belonging to either class were found to bind to the PDZ3
domain with comparable affinity, as judged by solid phase immunoassay
(Fig. 2: peptides 3.16 and 3.17). In this
respect PDZ3 is unique among the PDZ domains characterized so far,
since most members of this domain family prefer hydrophobic residues at
the P0 position.
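The consensus derivation and class assignment described above can be sketched as follows. This is a minimal illustration, not the procedure actually used for Fig. 1: peptides are aligned at the carboxyl terminus, a residue enters the consensus only when it occurs in more than 50% of the peptides at that position, and the class is read from the residue at P-2. The residue groupings, the function names, and the example peptides are assumptions introduced here, and class IV (defined by the residue at P0) is not handled.

```python
from collections import Counter

# Residue groupings are assumptions made for this sketch.
HYDROPHOBIC_OR_AROMATIC = set("AVLIMFWY")
ACIDIC = set("DE")


def cterm_consensus(peptides, n_positions=4, threshold=0.5):
    """Consensus for positions P-(n-1) .. P0 of C-terminally aligned peptides.

    A residue is reported only if its frequency exceeds `threshold`;
    otherwise the position is written as 'x'.
    """
    consensus = []
    for offset in range(n_positions, 0, -1):  # offset 1 is P0
        column = [p[-offset] for p in peptides if len(p) >= offset]
        symbol = "x"
        if column:
            residue, count = Counter(column).most_common(1)[0]
            if count / len(column) > threshold:
                symbol = residue
        consensus.append(symbol)
    return "".join(consensus)


def ligand_class(peptide):
    """Assign class I, II, or III from the residue at P-2."""
    p_minus_2 = peptide[-3]
    if p_minus_2 in "ST":
        return "I"
    if p_minus_2 in HYDROPHOBIC_OR_AROMATIC:
        return "II"
    if p_minus_2 in ACIDIC:
        return "III"
    return "unclassified"


# Hypothetical peptides of different lengths, aligned at the C terminus.
example_peptides = ["GGRSSV", "AETSV", "LKSWV", "QSDV"]
print(cterm_consensus(example_peptides))
print([ligand_class(p) for p in example_peptides])
```

Aligning from the carboxyl terminus means that peptides shortened by internal stop codons can be pooled with full-length ones without any gap handling.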

Fig. 2.
Cross-reactivity of PDZ domains. Two
phage clones that display peptides representative of those selected by
panning with each of the seven INADL PDZ domains were challenged by
solid phase immunoassay with the seven GST-INADL-PDZ domains,
immobilized on a multi-well plate. Only the four carboxyl-terminal
residues of the displayed peptides are reported in the first column
together with the clone number. GST protein was used as control. The
values (A405-620 nm) are the average of
three independent experiments.
Close inspection of the peptide sequences aligned in Fig. 1
indicates that, aside from the simplified distinction into classes, the
seven domains display unique preferences. Peptides that bind to PDZ1
often have basic residues such as Arg or Lys at position P-3 (consensus [V/R] W *) ( stands for aromatic,
and stands for hydrophobic residue). PDZ2 prefers Phe as the
carboxyl-terminal residue and often selects acidic residues at position
P-1 or P-3 (consensus E DF*). The peptides
selected by PDZ3, by contrast, can be divided into two families; a
subset of peptides conforms to class II with preference for Asp at
P-1 (consensus D *), and the remaining peptides
define the novel class IV (consensus [D/E]*). PDZ4 prefers
aromatic, hydrophobic, and acidic residues at positions
P-1, P-2, and P-3, respectively
(consensus E V*). PDZ5 has a strong preference for Trp at
P-1 (consensus [S/T]W[VL]*) and can bind peptides with
Glu at P-2. The PDZ6 binding consensus is
similar to that of PDZ5 but shows a higher variability both at
P-1 and P-2. At P-3 hydrophobic
residues are preferred (consensus S V*). Finally, PDZ7 is a
canonical class I domain that has a marked preference for Ser at
P-2 and Val at P0 and very little selectivity,
if any, at P-1 and P-3 (consensus [S]x[V]*).
Cross-reactivity of PDZ Domains--
To test the cross-reactivity
of the different PDZ domains, we performed a solid phase immunoassay
where two representative phage clones for each domain were challenged
with all the remaining INADL-PDZ domains immobilized on a multi-well
plate (Fig. 2). The analysis revealed that some peptides have a high
specificity and only bind to one domain; others were found to be more
promiscuous and reacted also with domains different from the one that
originally selected them in the panning experiment. In most cases,
cross-reacting domains, as defined by this test, belong to the same
domain class. As expected, PDZ7 displayed the lowest specificity,
accepting any peptide containing Ser or Thr at position
P-2. By contrast, PDZ1 and PDZ4 are characterized by the
highest selectivity and do not bind to any of the peptides selected by the other domains.
Site-directed Mutagenesis of Contact Positions and Homology
Modeling--
To contribute to the characterization of the molecular
basis of the binding specificity mediated by PDZ domains, we generated site-directed mutations in the PDZ7 domain, with the aim of altering its ligand preference and converting it into a domain with class II
binding specificity.
The contact matrix in Fig. 3a
was derived from the analysis of the three-dimensional structures of
PDZ domains crystallized with their targets (21, 22, 23). The four rows
represent the four carboxyl-terminal positions of a ligand peptide,
whereas the 23 columns represent the PDZ residues that contact the
target peptide in at least one of the complexes of known structure. Two residues are defined as being in contact when, in any of the three complexes of known crystallographic structure, the shortest distance between their atoms is less than the sum of their van der Waals radii
(r) + 3 Å (Fig. 3, gray cells). When the
distance is shorter than r + 0.6 Å, the cell at the
intersect is shown in black. To identify residues that may
be involved in target recognition in hINADL-PDZ domains, we aligned
their primary sequences with those of the three crystallized domains.
Several residues in the contact positions (in bold) are
conserved (Fig. 3b). The first residue of helix αB,
corresponding to position 16 in the contact matrix, was found to
influence the preference for specific residues at P-2 in
the ligand peptide (4, 6, 40). His at that location correlates with
preference for Ser/Thr at P-2 (class I), whereas PDZ domains
containing a hydrophobic residue preferentially bind to class II
ligands (4, 23, 41). The results that we have obtained with the PDZ
domains of hINADL are only partially in accord with these observations.
The three class I-PDZ domains of hINADL have His at contact position 16 (shaded in Fig. 3), but among the four class II-PDZ
domains, PDZ1 has His and only PDZ4 has a hydrophobic amino acid (Leu)
at position 16.
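The distance criterion used to fill the contact matrix can be expressed as a small classifier. The sketch below is one reading of that criterion, not the code used to produce Fig. 3: for every atom pair between two residues it computes the distance in excess of the summed van der Waals radii, then shades the cell black if the smallest excess is below 0.6 Å, gray if it is below 3 Å, and leaves it empty otherwise. The radii are Bondi values chosen here as an assumption, since the radii actually used are not stated, and the function name and input format are hypothetical.

```python
import math

# Van der Waals radii in Å (Bondi values); an assumption of this sketch.
VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}


def contact_shade(atoms_a, atoms_b, gray_margin=3.0, black_margin=0.6):
    """Classify the contact between two residues.

    Each residue is a list of (element, (x, y, z)) tuples. Returns
    'black', 'gray', or None, depending on how far the closest atom
    pair lies beyond the sum of its van der Waals radii.
    """
    smallest_excess = min(
        math.dist(xyz_a, xyz_b) - (VDW_RADII[el_a] + VDW_RADII[el_b])
        for el_a, xyz_a in atoms_a
        for el_b, xyz_b in atoms_b
    )
    if smallest_excess < black_margin:
        return "black"
    if smallest_excess < gray_margin:
        return "gray"
    return None


# Hypothetical single-atom residues placed along the x axis.
carbon = [("C", (0.0, 0.0, 0.0))]
print(contact_shade(carbon, [("O", (3.5, 0.0, 0.0))]))
print(contact_shade(carbon, [("O", (5.0, 0.0, 0.0))]))
print(contact_shade(carbon, [("O", (7.0, 0.0, 0.0))]))
```

To follow the text, a matrix cell would be shaded if the criterion is met in any of the three crystallographic complexes, which amounts to taking the strongest shade over the structures.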

Fig. 3.
PDZ-specific contact matrix. A contact
point matrix was derived from the three-dimensional structures of PDZ
domains crystallized with their targets (see "Experimental
Procedures"). The rows represent the four
carboxyl-terminal positions of a ligand peptide, and the
columns represent the 23 positions of the multiple alignment
of PDZ sequences that contact the target peptide. The lines
leading from the columns extend to residues in the PDZ sequence that
contact one or more residues in the target. Cells at the intersect are
shaded according to the distance between the interacting
atoms; gray or black colors correspond to
distances shorter than the sum of the van der Waals radii
(r) + 3 Å or r + 0.6 Å, respectively.
b, multiple alignment of the sequences of PDZ domains whose
structure in complex with a target peptide has been solved. The PDZ
residues in contact with residues of the peptide are shown in
bold. c, multiple alignment of the hINADL-PDZ
domain sequences. Positions 12-13 and 16-17, which were altered by
site-directed mutagenesis in this work, are shaded.
d, multiple alignment of the sequences of the other PDZ
domains, whose peptide ligands were inserted in the SPOT data
base.
To convert the class specificity of PDZ7, we changed, by site-directed
mutagenesis, the dipeptide H16E17 in PDZ7
(class I) into L16A17, as in the corresponding
helix of PDZ4 (class II). The ligand preferences of the mutated domain
PDZ7LA were analyzed by panning the λ-displayed carboxyl-terminal
library (Fig. 4). Phages selected by
PDZ7LA do not bind to wild type PDZ7 in solid phase immunoassay. On the
other hand, we did not observe a clear shift from class I to class II
specificity but rather a decrease in selectivity of the mutant domain,
since no preference for a specific residue at P-2 was
detected. Both class II and class I ligands were represented among the
selected peptides. Thus the dipeptide L16A17
was not sufficient to graft onto PDZ7 the striking preference for a
hydrophobic residue at P-2, as observed for PDZ4.

Fig. 4.
Peptides selected by PDZ7 mutant
domains. Numbers and symbols are defined as
in the legend of Fig. 1.
To rationalize this result, we used homology modeling to build and
compare the three-dimensional models of mutant PDZ7LA domains in
complex with peptide LA11 carboxyl-terminal residues (RVSV*) and
wild-type PDZ7 in complex with one of its target peptides, 7.16 (RSSV*). When the mutant and the wild type models are compared, the
most prominent difference is the larger cavity observed in the mutant
binding site (Fig. 5). We used SURFNET, a
procedure for visualizing molecular surfaces, cavities, and
intermolecular interactions (19) to measure the volume of the two
clefts; the difference between the mutant and the wild type cavities
(shown in magenta in Fig. 5) is about 60 Å^3.
This structural feature is consistent with the experimental observation
that the binding pocket has become less selective and tolerates
residues with large side chains at P-2.

Fig. 5.
Model complex of PDZ7 and PDZ7LA domains with
their peptide ligands. Surface representation of domain-ligand
complexes obtained with the WebLab ViewerLite 3.20 software by
Molecular Simulations Inc.
(www.msi.com/life/products/weblab/index.html).
Residues that form the binding pocket that hosts the side chain at
P-2 in the ligand peptide are indicated with white
letters and purple. In the wild type PDZ7 the shortest
distance between His-16 and Arg-23 is 7.85 Å; in the LA mutant the
distance between Leu-16 and Arg-23 is 9.09 Å.
PDZ7 accepts any residue at position P-1, where, in
contrast, PDZ4 prefers aromatic residues. Several contact positions
might influence this preference. Residues at
βB2 (position 7 in the contact matrix in Fig. 3) and
βC5 (position 13) have been suggested to be the main determinants of side chain preference at
P-1 (38). Since PDZ7 and PDZ4 have the same residue
(Ser) at position 7, we decided to exchange residues of the βC strand
that in the matrix are predicted to contact the ligand at
P-1 and P-3. Amino acids
H12E13 of PDZ7 were substituted with
R12S13 as in PDZ4. The ligand preference of
mutant PDZ7RS was determined by panning phage-displayed peptide
repertoires. The consensus sequence (S[ /D]V*), derived from the
selected peptides (Fig. 4), identifies an acquired preference for
peptides carrying either an aromatic residue or an Asp at
P-1. Whereas the aromatic residue at P-1 is
consistent with the PDZ4 consensus, the preference for Asp was not
anticipated. On the other hand, we have observed that the two domains
displaying specificity for acidic residues at P-1, PDZ2
and PDZ3, have similar residues at the positions that we have mutated,
(R12T13) and (K12S13), respectively.
All peptides selected by PDZ7RS are class I ligands; they also bind to
wild type PDZ7 when tested by solid phase immunoassay (data not shown). We
compared the homology models of PDZ7 and PDZ7RS domains, in complex
with the carboxyl-terminal residues of peptide RS14 (ETDV*) (Fig.
6). In the PDZ7RS·RS14 model
complex, the residue at position 12 forms an additional hydrogen bond
with the side chain of the residue at position P-1 of the
ligand. After energy minimization, the lowest energy level for this
complex was reached with the Asp (P-1) side chain pointing
toward the side chain of the mutated Arg12. In contrast,
the Asp (P-1) side chain of the peptide is solvent-exposed
in the wild-type/peptide model.

Fig. 6.
Model complexes of PDZ7 and PDZ7RS domains
with the peptide ligand RS14 (ETDV*). Ribbon diagram of the
domain/ligand complexes obtained with the Swiss-PdbViewer software
v3.7b2 and with the POV-Ray™ software (www.povray.org)
(48). The side chains of the residues that were mutated, and the
peptide ligands are shown. The hydrogen bonds (yellow
sticks) formed by the residues at position 12 in the models are
highlighted. According to the model derived from the
structure of PSD95-3 complexed with its ligand (23), the residue at
position 12 in PDZ7 wild type forms one hydrogen bond with the side
chain at P-3 in the peptide. In contrast, the Arg at
position 12 of PDZ7RS forms a second hydrogen bond with position
P-1.
In principle the identity of the side chains at positions 12 and 13 should also influence the amino acid preference at P-3,
where PDZ4, unlike PDZ7, shows a preference for acidic side
chains. However, exchange of the residues at positions 12 and 13 was
not sufficient to transfer this specificity from PDZ4 to PDZ7, since
only one of the selected peptides (RS14) has a Glu at P-3.
In both the three-dimensional models shown in Fig. 6, the side chain of
Glu at P-3 of the RS14 peptide points toward Arg or His at
position 12.
Application of the SPOT Algorithm to PDZ Domains--
By
changing some residues in the PDZ peptide binding pocket, we have been
able to modulate the recognition specificity of this domain. However,
unlike previous reports, our experiments have not revealed
simple rules that permit inference of the preferred ligand of PDZ
domains. This suggests that several contacts may influence the
preferred amino acid at the different ligand positions in a way that is
often difficult to predict. To approach a similar problem related to
recognition specificity mediated by SH3 domains, the algorithm SPOT
(Specificity Prediction of Target) was recently developed (20). This is
based on a statistical method that, by taking into account the
frequency with which residue X in the domain binding surface
faces residue Y in a collection of ligand peptides at any of
the contact positions, permits the evaluation of the likelihood
that any SH3 domain binds to any peptide.
The applicability of the approach depends on the availability of
crystal structures of at least one domain-peptide complex to identify
the contact positions and of a collection, as large as possible, of
experimentally determined ligands for a variety of domains of the same
family. Furthermore, the domain family and the ligand peptides should
be sufficiently homogeneous to permit their confident alignment in the
binding region, to allow a correct identification of the residues in
the defined contacting positions. The sequence identity between the PDZ
domains (shown in Fig. 3b) and the INADL PDZ domains (Fig.
3c) allows a fairly reliable alignment of their sequences,
and the ligand peptides can be unambiguously aligned since they all
bind the PDZ domains through their carboxyl-terminal end. The results
of the phage display screening described above substantially enrich the
collection of specific ligands of PDZ domains so far available and may
permit the extension of the SPOT algorithm to the PDZ domain.
The matrix shown in Fig. 3 defines the PDZ/peptide contacts. Each of
these contact positions is associated with a 20 × 20 matrix that
contains the frequencies of occurrence of the residues observed in
those positions in PDZ domains and peptides able to form a stable
complex. This data base was constructed using available experimental
data deriving from the screenings of combinatorial repertoires with PDZ
domains or from reports where multiple ligands for the same PDZ domain
were described (see "Experimental Procedures").
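The data structure described above, one 20 × 20 residue-pair frequency table per contact position, can be illustrated with a toy scoring scheme. The sketch below is not the published SPOT scoring function, whose details are given in Ref. 20: it simply counts how often each domain residue faces each peptide residue among known complexes and scores a new pair as the sum of log frequencies with a pseudocount. The contact list, the toy sequences, the pseudocount, and all function names are assumptions introduced for illustration.

```python
import math
from collections import defaultdict

N_PAIRS = 20 * 20  # possible residue pairs at each contact position


def build_tables(contacts, known_pairs):
    """Count residue pairs at each contact.

    contacts: list of (domain_index, peptide_index) tuples.
    known_pairs: list of (domain_contact_residues, peptide) strings
    for experimentally determined complexes.
    """
    tables = {contact: defaultdict(int) for contact in contacts}
    for domain, peptide in known_pairs:
        for d_idx, p_idx in contacts:
            tables[(d_idx, p_idx)][(domain[d_idx], peptide[p_idx])] += 1
    return tables


def score(domain, peptide, contacts, tables, pseudocount=1.0):
    """Sum of log pair frequencies over all contacts (higher is better)."""
    total = 0.0
    for d_idx, p_idx in contacts:
        counts = tables[(d_idx, p_idx)]
        n_observed = sum(counts.values())
        pair = (domain[d_idx], peptide[p_idx])
        freq = (counts.get(pair, 0) + pseudocount) / (
            n_observed + pseudocount * N_PAIRS
        )
        total += math.log(freq)
    return total


def rank_domains(domains, peptide, contacts, tables):
    """Order domain names from most to least likely binder of `peptide`."""
    return sorted(
        domains,
        key=lambda name: score(domains[name], peptide, contacts, tables),
        reverse=True,
    )


# Toy data: two contact positions; peptides are written P-3..P0.
contacts = [(0, 1), (1, 3)]  # domain residue 0 faces P-2, residue 1 faces P0
known = [("HL", "ASAV"), ("HL", "ETDV"), ("LL", "AVWV"), ("LL", "EFWV")]
tables = build_tables(contacts, known)
domains = {"domainA": "HL", "domainB": "LL"}
print(rank_domains(domains, "RSSV", contacts, tables))
```

In this toy setting the domain carrying His at the first contact ranks first for a peptide with Ser at P-2, because that pairing was observed among the known complexes. The same ranking step corresponds to ordering several PDZ domains for a given carboxyl-terminal peptide.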
SPOT permits the ranking of a collection of peptides according to their
propensity to bind a specific domain or to infer the sequence of a
consensus ligand by comparing the amino acid sequence of the peptides
that obtain the highest scores. This can in turn be matched to the
experimentally determined consensus. When this procedure is applied to
the 7 PDZ domains of INADL, in all cases the SPOT consensus compares
well with that experimentally determined.
We then questioned whether the information provided to the algorithm
was sufficient to infer the interaction of ligands that were not
included in the interaction data base. For this purpose we chose to
investigate the performance of SPOT when tested on MUPP1, a protein of
the multi PDZ family containing the highest number of PDZ domains.
MUPP1 was identified in three independent laboratories on the basis of
its ability to bind to the carboxyl terminus of the serotonin (5-HT2C)
receptor (13), of the 9ORF1 viral transforming protein (42), and of NG2
proteoglycan (43). Although the MUPP1 domains that are responsible for
target recognition have been experimentally determined, the binding
specificity of the different MUPP1 PDZ domains was never
investigated in detail. We have used the SPOT algorithm to
establish which of the 13 PDZ domains is most likely to be responsible
for binding to the carboxyl terminus of the proteins that were found to
form a complex with MUPP1. The carboxyl-terminal residues of each MUPP1
protein ligand were ranked by SPOT against the 13 PDZ domains (Table
I). The size of the data base clearly
influences the performance of the algorithm, since the addition of the
INADL ligand peptides significantly improved the prediction results.
The SPOT inferred and the experimentally determined binding domains
compared rather well; PDZ1, the main binder for NG2 proteoglycan (43),
is ranked first by SPOT. PDZ10 obtains the highest score when tested
with the carboxyl-terminal peptides of the 5-HT2A, -2B, and -2C receptors, in
agreement with the results of two-hybrid binding assays (44). Finally,
PDZ13 and PDZ11 are predicted to be the best binders of the 9ORF1
peptide, with PDZ10 ranking third, whereas experimental data indicate
that PDZ10 and PDZ7 are the receptors of the carboxyl terminus of 9ORF1
(45). The predictive reliability of the method will increase with the
enrichment of the PDZ-specific matrix with interaction data derived
from more comprehensive lists of peptide ligands. In this respect, the
approach that we have developed, based on the screening of
λ-displayed carboxyl-terminal peptide libraries, is likely to facilitate the rapid accumulation of new binding information.
Table I
Application of the SPOT algorithm to the MUPP1 PDZ domains and their
ligands
PDZ domains are numbered from 1 to 13 and ordered according to the SPOT
score. The experimentally defined binding domains are indicated in
bold. Binding of NG2 to MUPP1-PDZ1 and of 5-HT2 receptors to MUPP1-PDZ10
was determined by two-hybrid analysis and pull-down and
immunoprecipitation assays (43, 44). Binding of E4 orf1 to MUPP1
(9BP-1) was identified by screening a λgt11 cDNA expression
library (42); pull-down and immunoprecipitation assays allowed
definition of PDZ7 and PDZ10 as preferential binding regions (45).
 |
DISCUSSION |
PDZ domains are frequently found in proteins associated with the
cellular membrane, where they coordinate the assembly of trans-membrane
and cytosolic components into multiprotein complexes. Multiple PDZ
domains, often in association with other modules such as SH3, guanylate
kinase, etc. may be found in a single polypeptide, suggesting that
these modules have been utilized in evolution to assemble protein
adapters that work as scaffolds to cluster and regulate the activity of
various proteins (1, 4, 5). Some PDZ domains promote co-localization of
target proteins to different sub-cellular compartments (3). Schneider
et al. (46) elegantly show that it is possible to modulate
this activity in vivo by exploiting artificial PDZ domains.
Understanding the principles whereby distinct binding domains
recognize their substrates provides the rationale to infer the
recognition specificity of a domain from its primary sequence. The
rules underlying the recognition specificity of PDZ domains are only
partially understood (3, 46). Unlike other protein binding
modules, PDZ domains show a preference for binding to the free carboxyl
terminus of target proteins. In some cases they may also dimerize with
other PDZ-containing proteins by binding to an internal region folded
in a β-hairpin finger (21, 40). This structure mimics a free
carboxyl-terminal peptide, which can be fitted into the binding pocket
of the receptor domain (21). All PDZ domains whose three-dimensional
structures have been solved contain a core of five or six β-strands
(βA-βF) and two α-helices (αA and αB). The ligand fits into a
hydrophobic pocket created by the α-helix (αB), the second β-strand
(βB), and the conserved GLGF loop that connects the βA and βB strands.
Depending on the consensus sequence of preferred ligands, PDZ domains
have been grouped into classes. Often domains that belong to the same
class share conserved residues in crucial contact positions of the
binding pocket (4). Class I PDZ domains bind to peptides containing the
(S/T)XV motif and have a conserved His as the first residue of
α-helix B (αB1). Class II PDZ domains favor a Φ/ΨXΦ
motif and have a hydrophobic residue at αB1
(4, 22). Class III is represented by Mint PDZ, which binds to peptides with the (E/D)XW(C/S) consensus sequence (47), and by nNOS PDZ,
which has a preference for ligands with negatively charged amino acids at P−2 (6, 40). Stricker et al.
(6) show that substitutions of the αB1-αB2 residues change the binding
specificity of nNOS to class I type. Finally, positions
P−3 and P−1, previously considered irrelevant
for binding because they are solvent-exposed in the PSD95-3/peptide
structure (23), turned out to be important in determining the binding
of other domains (8, 38).
These simplified rules are not always sufficient to explain the
binding preferences of distinct PDZ domains. For instance, INADL PDZ1,
which has a His at αB1 (position 16 in the matrix in Fig. 3),
should be classified as a class I domain according to these rules. On the
other hand, PDZ1 and PDZ4, which have a Ser at βB2 (position 7), would
be predicted to bind to peptides containing Asp at P−1 (38).
We defined the binding preference of the seven PDZ domains of INADL by
screening a repertoire of random peptides displayed at high density on
the capsid of bacteriophage λ. The high display density guarantees
that, because of avidity effects, relatively low affinity ligands (10 µM) are not at a disadvantage in the binding step. This
is particularly important since sometimes the physiologically relevant
PDZ ligands are not the ones that bind with the highest affinity (8).
Therefore, it was important to identify most of the residues that are
accepted at each peptide position by each PDZ domain of INADL
irrespective of the relative affinity of the single peptides.
A different consensus binding sequence was defined for each INADL PDZ
domain. In some cases the result was in contrast with the predictions
based upon the rules cited above. The seven PDZ domains are arranged
into two blocks of class II (PDZ1-4) and class I (PDZ5-7) domains.
The question of whether this ordered topographical distribution has any
physiological relevance is likely to be answered by the identification
of the natural INADL-interacting proteins. At the moment, INADL is an
"orphan" adapter protein, whose natural targets are unknown.
The PDZ3 domain was found to bind to two distinct classes of peptides.
One family is a typical class II, whereas the second family is defined
by an Asp or Glu as the carboxyl-terminal residue. With the exception of
Mint-1 (class III), which binds to peptides ending with Cys or Ser, all
the other known PDZ domains bind to peptides characterized by
hydrophobic or aromatic terminal residues. Therefore, INADL-PDZ3
represents a novel class (class IV) of PDZ domains.
In principle, the peptide binding consensus sequences can be used as
patterns to search protein data bases for proteins whose carboxyl-terminal residues match the consensus. This approach was
shown to be successful in the case of several SH3 domains (2) and of
a few PDZ domains (5). The consensus sequences that characterize the ligands of PDZ
domains, however, are poorly selective, and these pattern searches
yield far too many candidate partners to be useful as a
hint to guide more complex biological experiments. For instance, an
attempt to scan the TrEMBL/Swiss-Prot data base with the class IV
consensus for potential PDZ3-interacting proteins yielded multiple
matches, including several ion channels, tyrosine kinase substrates,
and viral proteins, whose physiological binding relevance remains to be
experimentally tested.
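A pattern search of this kind can be sketched as follows. This is only an illustration under stated assumptions: the helper names `consensus_to_regex` and `scan` and the toy sequences are hypothetical, and the consensus strings used are the class I and class III motifs quoted above rather than the class IV consensus. A consensus such as (S/T)XV is translated into a regular expression anchored at the carboxyl terminus and matched against each sequence.

```python
import re


def consensus_to_regex(consensus):
    """Translate a consensus like '(S/T)XV' into a C-terminal regex."""
    parts = []
    # Tokens are either a parenthesized alternative, e.g. (S/T), or one letter.
    for token in re.findall(r"\([A-Z/]+\)|[A-Z]", consensus):
        if token == "X":
            parts.append("[A-Z]")  # X stands for any residue
        elif token.startswith("("):
            parts.append("[" + token[1:-1].replace("/", "") + "]")
        else:
            parts.append(token)
    # '$' anchors the match at the free carboxyl terminus.
    return re.compile("".join(parts) + "$")


def scan(proteins, consensus):
    """Return names of proteins whose C terminus matches the consensus."""
    pattern = consensus_to_regex(consensus)
    return [name for name, seq in proteins.items() if pattern.search(seq)]


if __name__ == "__main__":
    # Toy sequences invented for this example, not data base entries.
    toy = {"a": "MKLETSV", "b": "MKLEFYV", "c": "MMDHWC"}
    print(scan(toy, "(S/T)XV"))
    print(scan(toy, "(E/D)XW(C/S)"))
```

Because such a pattern constrains only three or four residues, it matches many unrelated carboxyl termini in a large data base, which is the poor selectivity noted above.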
Each hINADL-PDZ domain, with the exception of PDZ7, showed specific
preferences not only for residues at position P−2
(defining the class) but also for positions P−3 and
P−1. These further differences in the binding specificity
might enhance the combinatorial possibilities for assembling different
proteins into a complex.
Mutagenesis experiments aimed at altering the preference for one of the
four ligand positions are very informative and help in the design of
artificial PDZ domains with desired properties (46) or in the
prediction of putative ligands for a given PDZ. Substitution of the
second residue of the βB strand of PSD-95 PDZ3 from Gln to Ser
shifted its preference toward ligands carrying Asp or Glu at
P−1 (38), whereas changing the dipeptide
Y16D17 of nNOS PDZ into H16E17 (the first two residues of the αB helix)
altered its preference for the residue at P−2 (6). In this
paper we have shown that substitution of amino acids
H12E13 of PDZ7 with R12S13 (as in PDZ4) induced a
preference for peptides carrying either an aromatic residue or an
aspartate at P−1. On the other hand, substitution of the
dipeptide H16E17 of hINADL-PDZ7 with L16A17 was not sufficient to convert the class
I specificity of PDZ7 into class II, as one would have expected from
previous analyses (6).
Altogether, the experiments described in this manuscript substantially
enrich the information on PDZ recognition specificity so far available
and permit the extension of the applications of the SPOT algorithm to
the PDZ domain family. However, the ability to compute a complete
PDZ-specific matrix depends upon the availability of interaction data
from as many different families of PDZ domains as possible.
Because the procedure is based on the assumption that the interaction
between two proteins can be described in a first approximation as the
sum of independent interactions between their contacting residues (see
"Experimental Procedures"), a reliable prediction can be obtained
by the SPOT procedure only if the matrix contains data about PDZ
domains sharing at least some sequence identity with the query PDZ
domains. The results shown in Table I demonstrate that the INADL
interaction data obtained in the experiments described in this work
substantially helped in the construction of a more complete
PDZ-specific matrix. Specifically, the new interaction data of the
INADL PDZ domains added information about the preferred interactions of
residues in the binding pocket of the MUPP1-PDZ domains, which were
previously missing from the PDZ-specific matrix. The good correlation
between experimental results (43, 44, 45) and SPOT predictions
confirmed the validity of the approach. However, further experimental
results about the binding specificity of more PDZ domains need to be
added to the PDZ-specific matrix before inferring with confidence the specificity of any member of the PDZ family.