
Crystal structure of the collagen prolyl 4-hydroxylase (C-P4H) catalytic domain complexed with PDI: Toward a model of the C-P4H α2β2 tetramer

Open Access. Published: October 17, 2022. DOI: https://doi.org/10.1016/j.jbc.2022.102614
      Collagen prolyl 4-hydroxylases (C-P4H) are α2β2 tetramers, which catalyze the prolyl 4-hydroxylation of procollagen, allowing for the formation of the stable triple-helical collagen structure in the endoplasmic reticulum. The C-P4H α-subunit provides the N-terminal dimerization domain, the middle peptide-substrate-binding (PSB) domain, and the C-terminal catalytic (CAT) domain, whereas the β-subunit is identical to the enzyme protein disulfide isomerase (PDI). The structure of the N-terminal part of the α-subunit (N-terminal region and PSB domain) is known, but the structures of the PSB-CAT linker region and the CAT domain as well as its mode of assembly with the β/PDI subunit, are unknown. Here, we report the crystal structure of the CAT domain of human C-P4H-II complexed with the intact β/PDI subunit, at 3.8 Å resolution. The CAT domain interacts with the a, b’, and a’ domains of the β/PDI subunit, such that the CAT active site is facing bulk solvent. The structure also shows that the C-P4H-II CAT domain has a unique N-terminal extension, consisting of α-helices and a β-strand, which is the edge strand of its major antiparallel β-sheet. This extra region of the CAT domain interacts tightly with the β/PDI subunit, showing that the CAT-PDI interface includes an intersubunit disulfide bridge with the a’ domain and tight hydrophobic interactions with the b’ domain. Using this new information, the structure of the mature C-P4H-II α2β2 tetramer is predicted. The model suggests that the CAT active-site properties are modulated by α-helices of the N-terminal dimerization domains of both subunits of the α2-dimer.

      Abbreviations:

      BCO (Biocenter Oulu), CAT (catalytic domain of the α-subunit of C-P4H), C-P4H (collagen prolyl 4-hydroxylase), Cr-P4H (Chlamydomonas reinhardtii P4H isoform 1), DSBH (double-stranded β-helix), HIF-P4H-2 (hypoxia-inducible factor P4H, type-2), LSSR (local structural similarity restraint), MALS (multi-angle light scattering), MS (mass spectrometry), MTP (microsomal triglyceride transfer protein), NEM (N-ethylmaleimide), PDB (Protein Data Bank), PDI (protein disulfide isomerase), P4H-TM (transmembrane P4H), PSB (peptide-substrate-binding domain), SEC (size-exclusion chromatography)
      Collagen prolyl 4-hydroxylases (C-P4Hs; Enzyme Commission no.: 1.14.11.2), localized in the lumen of the endoplasmic reticulum, catalyze a vital cotranslational and post-translational modification of procollagen polypeptides required for the assembly of the collagen triple helix (
      • Myllyharju J.
      Prolyl 4-hydroxylases, the key enzymes of collagen biosynthesis.
      ,
      • Myllyharju J.
      Prolyl 4–hydroxylases, key enzymes in the synthesis of collagens and regulation of the response to hypoxia, and their roles as treatment targets.
      ,
      • Gorres K.L.
      • Raines R.T.
      Prolyl 4-hydroxylase.
      ,
      • Salo A.M.
      • Myllyharju J.
      Prolyl and lysyl hydroxylases in collagen synthesis.
      ). C-P4Hs are Fe(II)- and 2-oxoglutarate-dependent dioxygenases, which use molecular oxygen to hydroxylate proline residues at the Y position of the -X-Y-Gly- repeats of the procollagen chain. The Fe(II) ion is bound in the active site by a fully conserved His-X-Asp...His sequence motif and is important for activating the O2 molecule. The catalysis involves oxidative decarboxylation of 2-oxoglutarate to succinate and carbon dioxide and the hydroxylation of procollagen peptidyl prolines into 4-hydroxyprolines. This modification is needed to provide stability for the triple helical collagen molecules. Subsequently, the collagens are secreted into the extracellular matrix where they are further modified and assembled into various supramolecular structures, and where they are involved in, for example, cell adhesion, cell migration, and cell remodeling during growth, differentiation, and wound healing (
      • Myllyharju J.
      Prolyl 4–hydroxylases, key enzymes in the synthesis of collagens and regulation of the response to hypoxia, and their roles as treatment targets.
      ,
      • Salo A.M.
      • Myllyharju J.
      Prolyl and lysyl hydroxylases in collagen synthesis.
      ,
      • Shoulders M.D.
      • Raines R.T.
      Collagen structure and stability.
      ,
      • Rappu P.
      • Salo A.M.
      • Myllyharju J.M.
      • Heino J.
      Role of prolyl hydroxylation in the molecular interactions of collagens.
      ). Collagens also play important roles in many pathological states, such as fibrosis and cancer (
      • Myllyharju J.
      Prolyl 4–hydroxylases, key enzymes in the synthesis of collagens and regulation of the response to hypoxia, and their roles as treatment targets.
      ,
      • Franklin T.J.
      Therapeutic approaches to organ fibrosis.
      ,
      • Vasta J.D.
      • Raines R.T.
      Collagen Prolyl 4-hydroxylase as a therapeutic target.
      ,
      • Shi R.
      • Gao S.
      • Zhang J.
      • Xu J.
      • Graham L.M.
      • Yang X.
      • et al.
      Collagen prolyl 4-hydroxylases modify tumor progression.
      ). C-P4Hs are α2β2 heterotetramers (
      • Myllyharju J.
      Prolyl 4-hydroxylases, the key enzymes of collagen biosynthesis.
      ,
      • Myllyharju J.
      Prolyl 4–hydroxylases, key enzymes in the synthesis of collagens and regulation of the response to hypoxia, and their roles as treatment targets.
      ,
      • Gorres K.L.
      • Raines R.T.
      Prolyl 4-hydroxylase.
      ). The α-subunit contains the catalytic site for the C-P4H activity (
      • Myllyharju J.
      • Kivirikko K.I.
      Characterization of the iron- and 2-oxoglutarate-binding sites of human prolyl 4-hydroxylase.
      ), and the β-subunit is identical to protein disulfide isomerase (PDI, Enzyme Commission no.: 5.3.4.1), which is an enzyme and chaperone that functions in protein folding in the endoplasmic reticulum (
      • Ellgaard L.
      • Ruddock L.W.
      The human disulphide isomerase family: substrate interactions and functional properties.
      ,
      • Hatahet F.
      • Ruddock L.W.
      Protein disulfide isomerase: a critical evaluation of its function in disulfide bond formation.
      ,
      • Khan H.A.
      • Mutus B.
      Protein disulfide isomerase a multifunctional protein with multiple physiological roles.
      ). In mammals, there are three isoforms of the α-subunit (Fig. 1) (
      • Helaakoski T.
      • Annunen P.
      • Vuori K.
      • MacNeil I.A.
      • Pihlajaniemi T.
      • Kivirikko K.I.
      Cloning, baculovirus expression, and characterization of a second mouse prolyl 4-hydroxylase α-subunit isoform: formation of an α2β2 tetramer with the protein disulfide-isomerase/β subunit.
      ,
      • Kukkola L.
      • Hieta R.
      • Kivirikko K.I.
      • Myllyharju J.
      Identification and characterization of a third human, rat, and mouse collagen prolyl 4-hydroxylase isoenzyme.
      ), which are complexed with the same β-subunit, giving rise to three C-P4H tetramers, referred to as C-P4H-I, C-P4H-II, and C-P4H-III according to their α-subunit. Two splice variants have been characterized for both α(I) and α(II) (
      • Helaakoski T.
      • Veijola J.
      • Vuori K.
      • Rehn M.
      • Chow L.T.
      • Taillon-Miller P.
      • et al.
      Structure and expression of the human gene for the α subunit of prolyl 4-hydroxylase. The two alternatively spliced types of mRNA correspond to two homologous exons the sequences of which are expressed in a variety of tissues.
      ,
      • Nokelainen M.
      • Nissi R.
      • Kukkola L.
      • Helaakoski T.
      • Myllyharju J.
      Characterization of the human and mouse genes for the alpha subunit of type II prolyl 4-hydroxylase. Identification of a previously unknown alternatively spliced exon and its expression in various tissues.
      ) (Fig. S1). These splice variants have small sequence differences in the catalytic domain. C-P4H-I is the major isoenzyme, and its inactivation in mouse leads to early death during embryonic development, whereas mice lacking C-P4H-II are almost normal, but in combination with reduced amounts of C-P4H-I, show specific phenotypic abnormalities (
      • Holster T.
      • Pakkanen O.
      • Soininen R.
      • Sormunen R.
      • Nokelainen M.
      • Kivirikko K.I.
      • et al.
      Loss of assembly of the main basement membrane collagen, type IV, but not fibril-forming collagens and embryonic death in collagen prolyl 4-hydroxylase-I null mice.
      ,
      • Tolonen J.-P.
      • Salo A.M.
      • Finnilä M.
      • Aro E.
      • Karjalainen E.
      • Ronkainen V.-P.
      • et al.
      Reduced bone mass in collagen prolyl 4-hydroxylase P4ha1+/-;P4ha2-/- compound mutant mice.
      ). C-P4H-I and C-P4H-II have different affinities for procollagen-like substrate peptides (
      • Annunen P.
      • Helaakoski T.
      • Myllyharju J.
      • Veijola J.
      • Pihlajaniemi T.
      • Kivirikko K.I.
      Cloning of the human prolyl 4-hydroxylase alpha subunit isoform α(II) and characterization of the type II enzyme tetramer. The α(I) and α(II) subunits do not form a mixed α(I)α(II)β2 tetramer.
      ,
      • Myllyharju J.
      • Kivirikko K.I.
      Identification of a novel proline-rich peptide-binding domain in prolyl 4-hydroxylase.
      ). C-P4H-III is far less characterized than the first two isoforms (
      • Kukkola L.
      • Hieta R.
      • Kivirikko K.I.
      • Myllyharju J.
      Identification and characterization of a third human, rat, and mouse collagen prolyl 4-hydroxylase isoenzyme.
      ), and its in vivo role is as yet unknown.
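      For reference, the reaction described at the start of this section (oxidative decarboxylation of 2-oxoglutarate coupled to proline hydroxylation, with Fe(II) bound in the active site) can be summarized in one line. This is only a schematic restatement of the text above; the labels "peptidyl-Pro" and "peptidyl-4-Hyp" are shorthand introduced here, and the arrow macro assumes the amsmath package.

```latex
\[
\text{peptidyl-Pro} + \text{2-oxoglutarate} + \mathrm{O_2}
\;\xrightarrow{\;\mathrm{Fe(II)}\;}\;
\text{peptidyl-4-Hyp} + \text{succinate} + \mathrm{CO_2}
\]
```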
      Figure 1. Sequence alignment of the α-subunits of C-P4H-II, C-P4H-I, and C-P4H-III (labeled as Hs-α(II), Hs-α(I), and Hs-α(III)) as well as of Cr-P4H. The C-P4H sequences include the endoplasmic reticulum–targeting sequences (this region is shown with a dashed line above the sequences). Cr-P4H concerns the algal Chlamydomonas reinhardtii isoform-1 prolyl 4-hydroxylase. Helices and β-strands are indicated by cylinders and arrows. The annotations above and below the sequences concern, respectively, Hs-α(II) and Cr-P4H. The nomenclature of the secondary structure elements of Cr-P4H is the same as used previously (
      • Koski M.K.
      • Hieta R.
      • Böllner C.
      • Kivirikko K.I.
      • Myllyharju J.
      • Wierenga R.K.
      The active site of an algal prolyl 4-hydroxylase has a large structural plasticity.
      ). V29 marks the beginning of the Cr-P4H construct, and W38 identifies the first ordered residue of the Cr-P4H structure (Protein Data Bank entry: 3GZE). The secondary structure annotation of Hs-α(II) refers to the structure of the double-domain construct (
      • Anantharajan J.
      • Koski M.K.
      • Kursula P.
      • Hieta R.
      • Bergmann U.
      • Myllyharju J.
      • et al.
      The structural motifs for the substrate binding and dimerization of the α subunit of collagen prolyl 4-hydroxylase.
      ) and of the C-P4H-II–Δ281 complex (this study). The dotted secondary structure elements of the Hs-α(II) hairpin loop (which is disordered in the structure of the C-P4H-II–Δ281 complex) are assigned as observed in the Cr-P4H structure. The residues highlighted above the Hs-α(II) sequence refer to important residues mentioned in the text. D141, D282, R305, and A325 identify the N-terminal residue of the C-P4H-II-Δ140, C-P4H-II-Δ281, C-P4H-II-Δ304, and C-P4H-II-Δ324 constructs, respectively, which have been investigated in these studies.
      The α-subunits of human C-P4H-I and C-P4H-II share a sequence identity of about 65%, whereas α(I) and α(II) share about 35% sequence identity with α(III). The C-P4H α-subunit of each of the isoforms consists of four parts: the N-terminal (dimerization) domain, the peptide-substrate-binding (PSB) domain (middle domain), a linker region (L, of unknown structure and function), and the C-terminal catalytic domain (Fig. 2) (
      • Myllyharju J.
      • Kivirikko K.I.
      Identification of a novel proline-rich peptide-binding domain in prolyl 4-hydroxylase.
      ,
      • Anantharajan J.
      • Koski M.K.
      • Kursula P.
      • Hieta R.
      • Bergmann U.
      • Myllyharju J.
      • et al.
      The structural motifs for the substrate binding and dimerization of the α subunit of collagen prolyl 4-hydroxylase.
      ,
      • Koski M.K.
      • Anantharajan J.
      • Kursula P.
      • Dhavala P.
      • Murthy A.V.
      • Bergmann U.
      • et al.
      Assembly of the elongated collagen prolyl 4-hydroxylase α2β2 heterotetramer around a central α2 dimer.
      ). Crystal structures of the PSB domains of human C-P4H-I and C-P4H-II (
      • Pekkala M.
      • Hieta R.
      • Bergmann U.
      • Kivirikko K.I.
      • Wierenga R.K.
      • Myllyharju J.
      The peptide-substrate-binding domain of collagen prolyl 4-hydroxylases is a tetratricopeptide repeat domain with functional aromatic residues.
      ,
      • Murthy A.V.
      • Sulu R.
      • Koski M.K.
      • Tu H.
      • Anantharajan J.
      • Sah-Teli S.K.
      • et al.
      Structural enzymology binding studies of the peptide-substrate-binding domain of human collagen prolyl 4-hydroxylase (type-II): high affinity peptides have a PxGP sequence motif.
      ), with and without bound proline-rich peptides, show the mode of binding of these peptides in a groove lined by highly conserved tyrosine residues. Furthermore, a crystal structure of a construct consisting of the N-terminal (dimerization) domain and the PSB domain, which is referred to as the double-domain construct (Fig. 2), shows how the N-terminal domain forms a protein–protein dimer interface via a coiled-coil helical dimerization motif, whereas the PSB domains point away from this interface (
      • Anantharajan J.
      • Koski M.K.
      • Kursula P.
      • Hieta R.
      • Bergmann U.
      • Myllyharju J.
      • et al.
      The structural motifs for the substrate binding and dimerization of the α subunit of collagen prolyl 4-hydroxylase.
      ).
      Figure 2. Schematic diagrams of the domain organization of the α and β/PDI subunits of C-P4H. Top, domain structure of the α-subunit. The construct that includes the N-terminal dimerization domain and the middle peptide-substrate-binding (PSB) domain is referred to as the double-domain construct of which the structure is known (
      • Anantharajan J.
      • Koski M.K.
      • Kursula P.
      • Hieta R.
      • Bergmann U.
      • Myllyharju J.
      • et al.
      The structural motifs for the substrate binding and dimerization of the α subunit of collagen prolyl 4-hydroxylase.
      ). The truncation variants of the α-subunit studied here are identified below the schematic figure. The structure and function of the linker region, residues 258 to 328, labeled L, are unknown from previous studies, but the studies reported here show that its C-terminal region, residues 284 to 328, forms an integral part of the catalytic (CAT) domain. The catalytic residues of the CAT domain are identified above the schematic figure. These residues are fully conserved in the P4H family of enzymes (). Bottom, domain structure of the β/PDI subunit. The a, b, b’, and a’ domains adopt the thioredoxin fold. The a and a’ domains are also referred to as the catalytic domains, and their catalytic cysteines are identified. The proximal cysteines are Cys53 and Cys397. C-P4H, collagen prolyl 4-hydroxylase; PDI, protein disulfide isomerase.
      The structure of the human C-P4H CAT domain is as yet unknown, but crystal structures of a monomeric Chlamydomonas reinhardtii P4H isoform 1 (Cr-P4H) (about 35% sequence identity with the human C-P4H-I CAT domain (Fig. 1)), both in its unliganded form (without peptide) and complexed with a peptide substrate, (Ser-Pro)5, have been solved (
      • Koski M.K.
      • Hieta R.
      • Böllner C.
      • Kivirikko K.I.
      • Myllyharju J.
      • Wierenga R.K.
      The active site of an algal prolyl 4-hydroxylase has a large structural plasticity.
      ,
      • Koski M.K.
      • Hieta R.
      • Hirsilä M.
      • Rönkä A.
      • Myllyharju J.
      • Wierenga R.K.
      The crystal structure of an algal prolyl 4-hydroxylase complexed with a proline-rich peptide reveals a novel buried tripeptide binding motif.
      ). Like in all 2-oxoglutarate-dependent dioxygenases, the core of the fold consists of eight antiparallel β-strands (labeled as βI to βVIII), having the spiral topology of the double-stranded β-helix fold (the DSBH-fold) (
      • Dunwell J.M.
      • Purvis A.
      • Khuri S.
      Cupins: the most functionally diverse protein superfamily?.
      ,
      • Islam Md.S.
      • Leissing T.M.
      • Chowdhury R.
      • Hopkins R.J.
      • Schofield C.J.
      2-oxoglutarate-dependent oxygenases.
      ) and forming a major and a minor β-sheet. In the 2-oxoglutarate-dependent dioxygenases, the cofactor 2-oxoglutarate is bound in a deeply buried cavity between these two sheets, shielded from bulk solvent by the bound substrate molecule (
      • Dunwell J.M.
      • Purvis A.
      • Khuri S.
      Cupins: the most functionally diverse protein superfamily?.
      ,
      • Islam Md.S.
      • Leissing T.M.
      • Chowdhury R.
      • Hopkins R.J.
      • Schofield C.J.
      2-oxoglutarate-dependent oxygenases.
      ). In the structure of Cr-P4H complexed with its proline-rich substrate peptide, the peptide is bound in a tunnel shaped by two substrate-binding loops, being the hairpin loop and the βII-βIII loop (
      • Koski M.K.
      • Hieta R.
      • Hirsilä M.
      • Rönkä A.
      • Myllyharju J.
      • Wierenga R.K.
      The crystal structure of an algal prolyl 4-hydroxylase complexed with a proline-rich peptide reveals a novel buried tripeptide binding motif.
      ). In the unliganded Cr-P4H structure, the substrate-binding loops are disordered and/or adopt different conformations (
      • Koski M.K.
      • Hieta R.
      • Böllner C.
      • Kivirikko K.I.
      • Myllyharju J.
      • Wierenga R.K.
      The active site of an algal prolyl 4-hydroxylase has a large structural plasticity.
      ). In addition, the βI-βII catalytic loop (residues Tyr134 to Tyr140) adopts different conformations in the liganded and unliganded structures. The conformational flexibility of the two substrate-binding loops and the catalytic loop is important for the catalytic function of this enzyme (
      • Koski M.K.
      • Hieta R.
      • Hirsilä M.
      • Rönkä A.
      • Myllyharju J.
      • Wierenga R.K.
      The crystal structure of an algal prolyl 4-hydroxylase complexed with a proline-rich peptide reveals a novel buried tripeptide binding motif.
      ).
      PDI has a four-domain structure, referred to as the a, b, b’, and a’ domains (Fig. 2) (
      • Ellgaard L.
      • Ruddock L.W.
      The human disulphide isomerase family: substrate interactions and functional properties.
      ). These domains have a thioredoxin fold. The a domain and a’ domain of PDI are also referred to as its catalytic domains, as they both have the conserved CGHC-sequence motif (Figs. 2 and S2). The cysteines of this motif are the catalytic residues, needed to catalyze the formation, breaking, and isomerization of disulfide bridges of the PDI substrate proteins in their folding pathway. Studies have shown that the β/PDI subunit is required for providing the soluble and catalytically competent conformation of the α-subunit, but it is not understood how it is assembled with the α-subunit, how it is involved in intersubunit disulfide bridges, and what its role is in the procollagen hydroxylation reaction mechanism of the mature C-P4H α2β2 tetramer (
      • John D.C.
      • Grant M.E.
      • Bulleid N.J.
      Cell-free synthesis and assembly of prolyl 4-hydroxylase: the role of the beta-subunit (PDI) in preventing misfolding and aggregation of the alpha-subunit.
      ,
      • Vuori K.
      • Pihlajaniemi T.
      • Myllylä R.
      • Kivirikko K.I.
      Site-directed mutagenesis of human protein disulphide isomerase: effect on the assembly, activity and endoplasmic reticulum retention of human prolyl 4-hydroxylase in Spodoptera frugiperda insect cells.
      ,
      • Koivunen P.
      • Pirneskoski A.
      • Karvonen P.
      • Ljung J.
      • Helaakoski T.
      • Notbohm H.
      • et al.
      The acidic C-terminal domain of protein disulfide isomerase is not critical for the enzyme subunit function or for the chaperone or disulfide isomerase activities of the polypeptide.
      ,
      • Pirneskoski A.
      • Ruddock L.W.
      • Klappa P.
      • Freedman R.B.
      • Kivirikko K.I.
      • Koivunen P.J.
      Domains b' and a' of protein disulfide isomerase fulfill the minimum requirement for function as a subunit of prolyl 4-hydroxylase. The N-terminal domains a and b enhances this function and can be substituted in part by those of ERp57.
      ,
      • Koivunen P.
      • Salo K.E.
      • Myllyharju J.
      • Ruddock L.W.
      Three binding sites in protein-disulfide isomerase cooperate in collagen prolyl 4-hydroxylase tetramer assembly.
      ). The crystal structure of human PDI is known in oxidized (having a disulfide bond between the two cysteines of the CGHC-motif) and reduced (the cysteines of the CGHC motif are reduced) forms (
      • Wang C.
      • Li W.
      • Ren J.
      • Fang J.
      • Ke H.
      • Gong W.
      • et al.
      Structural insights into the redox-regulated dynamic conformations of human protein disulfide isomerase.
      ). Also known is the structure of the human heterodimer microsomal triglyceride transfer protein (MTP), which includes PDI as its β-subunit (
      • Biterova E.I.
      • Isupov M.N.
      • Keegan R.M.
      • Lebedev A.A.
      • Sohail A.A.
      • Liaqat I.
      • et al.
      The crystal structure of human microsomal triglyceride transfer protein.
      ). The α-subunit of MTP is a lipid-binding protein, and its amino acid sequence and fold are not related to the α-subunit of C-P4H. In addition, the crystal structure of ERp57 (a PDI family member, present in the endoplasmic reticulum), complexed with tapasin (not related to the α-subunits of C-P4H and MTP), is known (
      • Dong G.
      • Wearsch P.A.
      • Peaper D.R.
      • Cresswell P.
      • Reinisch K.M.
      Insights into MHC class I peptide loading from the structure of the tapasin-ERp57 thiol oxidoreductase heterodimer.
      ).
      Small-angle X-ray scattering studies of mature C-P4H-I suggest that in the mature α2β2 C-P4H-I tetramer, the CAT domains of the two α-subunits point away from the α2-dimer interface (formed by the two N-terminal dimerization domains) and are capped by the β/PDI subunit, positioned at both ends of the elongated tetramer, suggesting that the β/PDI subunit interacts solely with the CAT domain (
      • Koski M.K.
      • Anantharajan J.
      • Kursula P.
      • Dhavala P.
      • Murthy A.V.
      • Bergmann U.
      • et al.
      Assembly of the elongated collagen prolyl 4-hydroxylase α2β2 heterotetramer around a central α2 dimer.
      ). These small-angle X-ray scattering studies have also provided shape information on proteolytically truncated forms of C-P4H-I, which were copurified together with the mature recombinant C-P4H-I. One of these truncated forms was a heterodimer complex of a truncated α-chain (consisting of the PSB and CAT domains but lacking the N-terminal dimerization domain) complexed with the intact β/PDI subunit. The truncation site of this heterodimer is near residue Asp139 of the α-subunit of C-P4H-I (just before the PSB domain) (Fig. 1) (
      • Koski M.K.
      • Anantharajan J.
      • Kursula P.
      • Dhavala P.
      • Murthy A.V.
      • Bergmann U.
      • et al.
      Assembly of the elongated collagen prolyl 4-hydroxylase α2β2 heterotetramer around a central α2 dimer.
      ). The more detailed characterization of such a complex can provide key information on the structure of the CAT domain and its assembly with the β/PDI subunit, which is currently completely lacking. However, the optimal α-subunit truncation site for obtaining a heterodimer that has only the CAT domain of the α-subunit, complexed with the intact β/PDI subunit, is not clear, as it depends on the unknown structural and functional role of the linker region (Fig. 2).
      Here, we report on the characterization of two truncated human C-P4H-II variants in which the α-subunit is lacking either the N-terminal dimerization domain (C-P4H-II-Δ140) or the dimerization domain and the PSB domain (C-P4H-II-Δ281). These truncated variants of C-P4H-II are stable and soluble heterodimer complexes. The C-P4H-II-Δ281 complex could be crystallized, and its crystal structure reveals the structure of the CAT domain as well as its interactions with the β/PDI subunit. This crystal structure, together with the AlphaFold2 (DeepMind company and EMBL-EBI) (
      • Jumper J.
      • Evans R.
      • Pritzel A.
      • Green T.
      • Figurnov M.
      • Ronneberger O.
      • et al.
      Highly accurate protein structure prediction with AlphaFold.
      ) model of the complete α-subunit dimer, enabled the prediction of the mode of assembly of the C-P4H-II α2β2 heterotetramer, which is also discussed.

      Results and discussion

      Expression, purification, and properties of mature C-P4H-II and two truncated C-P4H-II variants

      Considerable proteolytic degradation has been a problem in previous experiments on recombinant production of human C-P4H-I in the Escherichia coli Origami strain (
      • Koski M.K.
      • Anantharajan J.
      • Kursula P.
      • Dhavala P.
      • Murthy A.V.
      • Bergmann U.
      • et al.
      Assembly of the elongated collagen prolyl 4-hydroxylase α2β2 heterotetramer around a central α2 dimer.
      ). Therefore, the E. coli CyDisCo expression system (
      • Hatahet F.
      • Nguyen V.D.
      • Salo K.E.
      • Ruddock L.W.
      Disruption of reducing pathways is not essential for efficient disulfide bond formation in the cytoplasm of E. coli.
      ,
      • Nguyen V.D.
      • Hatahet F.
      • Salo K.E.
      • Enlund E.
      • Zhang C.
      • Ruddock L.W.
      Pre-expression of a sulfhydryl oxidase significantly increases the yields of eukaryotic disulfide bond containing proteins expressed in the cytoplasm of E.coli.
      ), which has been used for the expression of MTP for crystallization (
      • Biterova E.I.
      • Isupov M.N.
      • Keegan R.M.
      • Lebedev A.A.
      • Sohail A.A.
      • Liaqat I.
      • et al.
      The crystal structure of human microsomal triglyceride transfer protein.
      ), was tested for production of human C-P4H-I and C-P4H-II. For this purpose, codon-optimized constructs of the α-subunit of C-P4H-I and C-P4H-II (Table S1) were coexpressed with β/PDI, being part of the CyDisCo expression vector. The complexes were expressed and purified as described in the Experimental procedures section. The C-P4H-I and C-P4H-II samples were pure after the size-exclusion chromatography (SEC) step. The yield of C-P4H-II was better, and proteolytic degradation was less than for C-P4H-I. Mass spectrometric (MS) peptide mapping of C-P4H-II confirmed that the purified C-P4H-II includes the N-terminal and C-terminal peptides of both subunits (Table S2). Therefore, subsequent experiments, aimed at finding the best possible truncation sites of the α-subunit that would allow for the recombinant generation and purification of a complex of the CAT domain assembled with β/PDI, were done with C-P4H-II. Four truncated constructs of the α-subunit were tested, referred to as C-P4H-II-Δ140, C-P4H-II-Δ281, C-P4H-II-Δ304, and C-P4H-II-Δ324, in which the first 140, 281, 304, and 324 residues, respectively, were deleted (Fig. 2). The truncated α-subunits were expressed with a 6× His tag at the N terminus, whereas the coexpressed β/PDI subunit was without a tag, identical to that used for the expression of the mature C-P4H-I and C-P4H-II complexes (Table S1). Coexpression of PDI with the C-P4H-II-Δ140 construct (encoding for PSB and CAT, Fig. 2) as well as with the C-P4H-II-Δ281 construct (encoding for CAT only, Fig. 2) resulted in soluble complexes, whereas the C-P4H-II-Δ304 and C-P4H-II-Δ324 constructs were only expressed as insoluble proteins, which could not be purified. Apparently, the residues between 281 and 304 include amino acids that are required to form a soluble complex of the CAT domain, assembled with β/PDI. The C-P4H-II-Δ140 and C-P4H-II-Δ281 complexes were purified using the same protocols as used for mature C-P4H-I and C-P4H-II. 
The SDS-PAGE gel analysis showed that these samples are highly pure, and truncated α and intact β-subunits are present, which was also confirmed by MS characterization. Characterization by multiangle light scattering (MALS) (Fig. S3 and Table 1) confirmed that the purified C-P4H-II-Δ140 and C-P4H-II-Δ281 complexes are heterodimers. The CD spectra (Fig. S4A) show that wildtype C-P4H-II and its two truncated variants have the expected secondary structure properties. The CD melting curves of the truncated complexes (Fig. S4B) show Tm values of 50.2 and 53.2 °C for, respectively, C-P4H-II-Δ140 and C-P4H-II-Δ281, slightly higher than the Tm of mature C-P4H-II (Tm = 46.5 °C) (Table 1), which confirms that the truncated complexes indeed are stable proteins. Activity assays show that the two truncated variants have C-P4H activity, although to a lesser extent than mature C-P4H-II (Table 1). Subsequently, crystallization experiments were carried out with these purified complexes, which resulted in diffraction quality crystals of the C-P4H-II-Δ281 complex, as outlined in the next section.
      Table 1. The molecular weights, the Tm values, and the activity assay results of mature C-P4H-II and its truncated variants

      Construct        Molecular weight   Tm (CD) (b)   Indirect assay (c)         Direct assay (d)
                       (SEC–MALS) (a)                                              Experiment 1   Experiment 2
                       (kDa)              (°C)          DPM/(nmole active site)    DPM/(nmole active site)
      C-P4H-II         230                46.5 ± 0.2    320,500 ± 20,900           83,700         114,500
      C-P4H-II–Δ140    101                50.2 ± 0.2    44,300 ± 3100              7600           5800
      C-P4H-II–Δ281    85                 53.2 ± 0.2    25,700 ± 1900              1000           800

      Indirect assay: with 2-oxo[1-14C]glutarate and (PPG)10 substrate, measuring the hydroxylation-coupled amount of formed 14CO2.
      Direct assay: with [14C]proline-labeled procollagen substrate, measuring the amount of formed 4-hydroxy[14C]proline.
      a The SEC–MALS graphs are shown in Fig. S3.
      b The CD melting curves are shown in Fig. S4B.
      c The mean values of the results of three experiments are provided.
      d The results of two experiments are reported. The listed activity values obtained by the indirect and direct assays cannot be compared directly.
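      The statement that the truncated variants are more thermostable but less active than mature C-P4H-II can be made concrete with the Table 1 numbers. The short sketch below is only an illustration: the values are transcribed from Table 1 (Tm and the mean indirect-assay signal), while the dictionary layout and the function name `relative_to_mature` are hypothetical choices made here, not part of the study.

```python
# Values transcribed from Table 1: Tm (deg C) and mean indirect-assay signal
# (DPM per nmole active site). Layout and names are illustrative only.
table1 = {
    "C-P4H-II":      {"tm_c": 46.5, "indirect_dpm": 320_500},
    "C-P4H-II-Δ140": {"tm_c": 50.2, "indirect_dpm": 44_300},
    "C-P4H-II-Δ281": {"tm_c": 53.2, "indirect_dpm": 25_700},
}


def relative_to_mature(table, reference="C-P4H-II"):
    """Return the Tm shift and the activity as a percentage of the reference."""
    ref = table[reference]
    summary = {}
    for name, row in table.items():
        summary[name] = {
            "delta_tm_c": round(row["tm_c"] - ref["tm_c"], 1),
            "activity_percent": round(
                100 * row["indirect_dpm"] / ref["indirect_dpm"], 1
            ),
        }
    return summary


summary = relative_to_mature(table1)
for name, row in summary.items():
    print(f"{name}: dTm = {row['delta_tm_c']:+.1f} C, "
          f"indirect activity = {row['activity_percent']:.1f}% of mature")
```

      With these inputs, the Δ140 and Δ281 heterodimers melt 3.7 and 6.7 °C above the mature tetramer and retain about 14% and 8% of its indirect-assay signal. Only the indirect assay is used because, as noted under Table 1, the two assays cannot be compared directly.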

      The determination of the crystal structure of the C-P4H-II-Δ281 variant

      The C-P4H-II-Δ281 complex was crystallized in space group R3. A complete dataset was collected at the P14 beamline (of PETRA III, operated by the EMBL outstation at DESY, Hamburg) at 3.8 Å resolution (Table 2). The structure was solved by molecular replacement using as search models the crystal structures of the monomeric algal Cr-P4H and of the a, b’, and a’ domains of human PDI, as described in the Experimental procedures section. The asymmetric unit consists of two C-P4H-II-Δ281 complexes, which are formed by the CAT domain of the α subunit, complexed with the complete β/PDI subunit (Fig. 3). This complex is also referred to as the CAT-PDI complex. At the end of the refinement protocol, the main-chain and side-chain conformations of both copies of the CAT-PDI complex are well defined by the electron density map (Figs. S5–S7), and there are no significant structural differences between these two CAT-PDI complexes. The CAT domains are better ordered than the β/PDI subunits. At the N terminus of both α-subunits, the model starts at residue Leu284 and the β/PDI subunits start at Glu21. Not included in the model of the CAT domain are also the C-terminal residues of the α-subunit (residues 534–535) and residues of the hairpin loop (known from the Cr-P4H studies to be important for substrate binding, Fig. 1). The highly acidic C-terminal region of PDI (residues 477–498) is also completely disordered and not seen in the electron density maps. The latter finding agrees with the notion that the PDI C-terminal KDEL sequence functions as the endoplasmic reticulum retention signal (
      • Vuori K.
      • Pihlajaniemi T.
      • Myllylä R.
      • Kivirikko K.I.
      Site-directed mutagenesis of human protein disulphide isomerase: effect on the assembly, activity and endoplasmic reticulum retention of human prolyl 4-hydroxylase in Spodoptera frugiperda insect cells.
      ) and that the C-terminal region of PDI is not needed for the tetramer assembly (
      • Koivunen P.
      • Pirneskoski A.
      • Karvonen P.
      • Ljung J.
      • Helaakoski T.
      • Notbohm H.
      • et al.
      The acidic C-terminal domain of protein disulfide isomerase is not critical for the enzyme subunit function or for the chaperone or disulfide isomerase activities of the polypeptide.
      ).
      Table 2. Data collection and statistics of the data processing and refinement of the structure of the C-P4H-II–Δ281 complex
      Data collection
        Beamline: P14, PETRA III (DESY, EMBL Hamburg)
        Wavelength (Å): 0.9763
        Detector: DECTRIS EIGER X 16M
        Temperature (K): 100
        Data processing software: XDS/Aimless
      Data processing statistics
        Unit cell parameters (Å, °): a = 252.9, b = 252.9, c = 89.4; α, β = 90, γ = 120
        Space group: R3
        Resolution range (Å)ᵃ: 46.69–3.85 (3.99–3.85)
        Vm (Å3/Da): 3.2
        Molecules per asymmetric unit: Two heterodimer complexes
        Number of observations: 107,514 (10,863)
        Multiplicity: 5.3 (5.5)
        Completeness (%): 99.9 (100)
        I/σ(I): 8.2 (0.7)
        Rmerge (%): 6.2 (167.3)
        CC(1/2): 0.996 (0.495)
        Rpim (%): 4.7 (123.5)
      Refinement statistics
        Resolution (Å): 46.69–3.85
        Rwork (%): 24.4
        Rfree (%): 27.8
        Number of unique reflections: 20,135
        Number of atomsᵇ
          All protein atoms: 11,088
          α-subunit 1 (chain A): 1920
          α-subunit 2 (chain B): 1897
          β-subunit 1 (chain C): 3631
          β-subunit 2 (chain D): 3640
          Sulfate (SO4): 10
        Rmsd bond length (Å): 0.011
        Rmsd bond angle (°): 1.6
        Average B-factors (Å2)
          All atoms: 73.2
          α-subunit 1 (chain A): 49.1
          α-subunit 2 (chain B): 41.7
          β-subunit 1 (chain C): 89.3
          β-subunit 2 (chain D): 86.4
          The two sulfate solutes: 24.5
        Ramachandran plotᶜ
          Favored (%): 96.5
          Allowed (%): 2.7
          Outliers (%): 0.8
        PDB entry: 7ZSC
      a The numbers in parentheses refer to the outer shell.
      b Nonhydrogen atoms.
      c Calculated by MolProbity.
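      The crystal-packing values in Table 2 can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the reported analysis: it computes the unit cell volume from the listed cell parameters and the protein mass per complex implied by the listed Vm. It assumes that R3 is described in the hexagonal setting (as the γ = 120° cell suggests) with nine asymmetric units per cell, and it uses Matthews' usual approximation for the solvent fraction. All function and constant names are illustrative.

```python
import math

# Values copied from Table 2.
CELL_A = 252.9          # Å (a = b)
CELL_C = 89.4           # Å
VM = 3.2                # Å^3/Da, Matthews coefficient
COMPLEXES_PER_ASU = 2   # two CAT-PDI heterodimers

# Assumption: R3 in the hexagonal setting (γ = 120°) has 9 asymmetric
# units per cell (3-fold rotation × 3 rhombohedral centering translations).
ASU_PER_CELL = 9


def hexagonal_cell_volume(a: float, c: float) -> float:
    """Volume in Å^3 of a cell with a = b, α = β = 90° and γ = 120°."""
    return a * a * c * math.sin(math.radians(120.0))


def mass_per_asu(volume: float, vm: float, asu_per_cell: int) -> float:
    """Protein mass (Da) per asymmetric unit implied by Vm = V / (Z * M)."""
    return volume / (asu_per_cell * vm)


def solvent_fraction(vm: float) -> float:
    """Matthews' approximation, assuming a protein density near 1.35 g/cm^3."""
    return 1.0 - 1.23 / vm


volume = hexagonal_cell_volume(CELL_A, CELL_C)
mass_complex = mass_per_asu(volume, VM, ASU_PER_CELL) / COMPLEXES_PER_ASU

print(f"cell volume: {volume:.3e} Å^3")
print(f"implied mass per complex: {mass_complex / 1000:.0f} kDa")
print(f"approximate solvent fraction: {solvent_fraction(VM):.2f}")
```

      Under these assumptions, the listed Vm corresponds to roughly 86 kDa of protein per CAT-PDI complex, a figure that can be compared with the mass calculated from the sequence of the construct (not given in this section).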
      Figure 3. The structure of the C-P4H-II-Δ281 complex. A, standard view, showing that the CAT active site (brown, near the βI-βII loop) faces the bulk solvent. In this view, the β/PDI subunit (blue) is located at the back side of the CAT domain. B, top view, rotated by 90° around the horizontal with respect to panel (A), visualizing the mode of assembly of the CAT domain with the domains of the β/PDI subunit. The sulfur atoms of the intersubunit and intrasubunit disulfide bridges are shown in yellow. C, ribbon diagram of the fold of the CAT domain of the C-P4H-II-Δ281 complex (standard view). The DSBH-fold is formed by β-strands βI to βVIII; the minor sheet is shown on the left side, and the major sheet is shown on the right side. The star identifies the catalytic site. The ribbon diagram is color ramped from the N terminus (dark blue) to the C terminus (red). The substrate-binding loops (hairpin loop and βII-βIII loop) and the catalytic loop (βI-βII loop) are labeled. The secondary structure elements α12, α13, β1, and α14 form a unique N-terminal extension of the C-P4H CAT domain, not present in previously reported structures of P4Hs. SS identifies a disulfide bridge. D, schematic visualization of the C-P4H-II-Δ281 construct, also showing the disulfide bridges observed in the crystal structure of this complex, which is also referred to as the CAT-PDI complex. CAT, catalytic domain of the α-subunit of C-P4H; C-P4H, collagen prolyl 4-hydroxylase; DSBH, double-stranded β-helix; PDI, protein disulfide isomerase.

      The overall structure of the CAT domain and its mode of assembly with the β/PDI subunit

      The CAT domain is folded into the typical DSBH-fold (Fig. 3), as also seen in other P4H structures, such as Cr-P4H (being the closest homolog) (
      • Koski M.K.
      • Hieta R.
      • Böllner C.
      • Kivirikko K.I.
      • Myllyharju J.
      • Wierenga R.K.
      The active site of an algal prolyl 4-hydroxylase has a large structural plasticity.
      ,
      • Koski M.K.
      • Hieta R.
      • Hirsilä M.
      • Rönkä A.
      • Myllyharju J.
      • Wierenga R.K.
      The crystal structure of an algal prolyl 4-hydroxylase complexed with a proline-rich peptide reveals a novel buried tripeptide binding motif.
      ), transmembrane P4H (P4H-TM) (
      • Myllykoski M.
      • Sutinen A.
      • Koski M.K.
      • Kallio J.P.
      • Raasakka A.
      • Myllyharju J.
      • et al.
      Structure of transmembrane prolyl 4-hydroxylase reveals unique organization of EF and dioxygenase domains.
      ), hypoxia-inducible factor P4H, type-2 (HIF-P4H-2) (
      • Chowdhury R.
      • McDonough M.A.
      • Mecinovic J.
      • Loenarz C.
      • Flashman E.
      • Hewitson K.S.
      • et al.
      Structural basis for binding of hypoxia-inducible factor to the oxygen-sensing prolyl hydroxylases.
      ), hypoxia-inducible factor P4H, type-1 (
      • Ahmed S.
      • Ayscough A.
      • Barker G.R.
      • Canning H.E.
      • Davenport R.
      • Downham R.
      • et al.
      1,2,4-Triazolo-[1,5-a]pyridine HIF prolylhydroxylase domain-1 (PHD-1) inhibitors with a novel monodentate binding interaction.
      ), viral P4H (
      • Longbotham J.E.
      • Levy C.
      • Johannissen L.O.
      • Tarhonskaya H.
      • Jiang S.
      • Loenarz C.
      • et al.
      Structure and mechanism of a viral collagen prolyl hydroxylase.
      ), and bacterial P4H (
      • Culpepper M.A.
      • Scott E.E.
      • Limburg J.
      Crystal structure of prolyl 4-hydroxylase from Bacillus anthracis.
      ). The N-terminal region of the α-subunit of the C-P4H-II-Δ281 complex, residues 284–328, which was previously proposed to be part of the linker region between the PSB and CAT domains (Fig. 2) (
      • Koski M.K.
      • Anantharajan J.
      • Kursula P.
      • Dhavala P.
      • Murthy A.V.
      • Bergmann U.
      • et al.
      Assembly of the elongated collagen prolyl 4-hydroxylase α2β2 heterotetramer around a central α2 dimer.
      ), is actually an integral part of the CAT domain (Fig. 3). This unique N-terminal extension of the CAT domain, not seen in any other P4H family member (Fig. S1), forms two short α-helices (α12 and α13) followed by an extra β-strand (β1, including Cys311) of the major sheet of the DSBH-fold and another short α-helix (α14, including Ile324), before continuing into the secondary structure elements known from the other P4H structures (Fig. S1). Also the C terminus is more extended in the C-P4H-II-Δ281 structure, as compared with Cr-P4H, by eight residues, and this extension includes Cys529 (Fig. 1). Cys529, as well as Cys311 of the N-terminal extension, is conserved in the C-P4H enzymes (Fig. S1).
      The structure of the C-P4H-II-Δ281 complex shows that in the CAT domain there is an intrasubunit disulfide bridge between two conserved cysteines, Cys294 (of the N-terminal extension of the CAT domain, in the α12-helix) and Cys504, located in β-strand βVII of the DSBH-fold (Fig. 3). The latter β-strand provides a side chain (His501) that sequesters the active site Fe(II) ion. MS peptide mapping of C-P4H-II showed that treatment with DTT reduces these two cysteines to free thiols. However, corresponding disulfide-linked peptides were not detected in the native sample despite extensive searching of the MS data, possibly because of low ionization efficiency. These peptides were also not detected in the N-ethylmaleimide (NEM)-labeled sample, in agreement with the presence of a disulfide bridge between Cys294 and Cys504 (Tables S2 and S3). An intrasubunit disulfide bridge in the α-subunit has been suggested by previous experimental studies with C-P4H-I, where the importance of the conserved Cys294 and Cys504 (Fig. S1) for the assembly and function of C-P4H-I has been noted (
      • John D.C.
      • Bulleid N.J.
      Prolyl 4-hydroxylase: Defective assembly of alpha-subunit mutants indicates that assembled alpha-subunits are intramolecularly disulfide bonded.
      ,
      • Lumb R.A.
      • Bulleid N.J.
      Is protein disulfide isomerase a redox-dependent molecular chaperone?.
      ,
      • Lamberg A.
      • Pihlajaniemi T.
      • Kivirikko K.I.
      Site-directed mutagenesis of the alpha subunit of human prolyl 4-hydroxylase. Identification of three histidine residues critical for catalytic activity.
      ). This intrasubunit disulfide bridge in CAT is unique for this subfamily of P4Hs because Cys294 is located in the unique N-terminal extension of CAT. However, in some other P4H enzymes, Cys504 is conserved (Fig. S1) and in fact also engaged in an intrasubunit disulfide bridge, for example, in P4H-TM (
      • Myllykoski M.
      • Sutinen A.
      • Koski M.K.
      • Kallio J.P.
      • Raasakka A.
      • Myllyharju J.
      • et al.
      Structure of transmembrane prolyl 4-hydroxylase reveals unique organization of EF and dioxygenase domains.
      ) and Cr-P4H (
      • Koski M.K.
      • Hieta R.
      • Böllner C.
      • Kivirikko K.I.
      • Myllyharju J.
      • Wierenga R.K.
      The active site of an algal prolyl 4-hydroxylase has a large structural plasticity.
      ) but not in HIF-P4Hs (
      • Chowdhury R.
      • McDonough M.A.
      • Mecinovic J.
      • Loenarz C.
      • Flashman E.
      • Hewitson K.S.
      • et al.
      Structural basis for binding of hypoxia-inducible factor to the oxygen-sensing prolyl hydroxylases.
      ,
      • Ahmed S.
      • Ayscough A.
      • Barker G.R.
      • Canning H.E.
      • Davenport R.
      • Downham R.
      • et al.
      1,2,4-Triazolo-[1,5-a]pyridine HIF prolylhydroxylase domain-1 (PHD-1) inhibitors with a novel monodentate binding interaction.
      ).
      The assembly of the CAT-PDI heterodimer complex is stabilized by interactions of the unique N-terminal and C-terminal extensions of the CAT domain with the a’, b’, and a domains of PDI, whereas there are no interactions between the CAT domain and the b domain (Fig. 3). Two regions of the CAT N-terminal extension are involved in these interactions, located around Cys311 (of the β1 β-strand) with the a’ domain and around Ile324 (α14-helix) with the b’ domain. A third main interaction region is near Cys529 (near the C terminus of the CAT domain), which interacts with the a domain of the β/PDI subunit (Fig. 3). These CAT-PDI interactions are at the “back” side of the CAT domain, in such a way that the active site of the CAT domain points to bulk solvent, and the presence of the β/PDI subunit does not affect the conformational flexibility of its three active-site loops.

      The interactions of the CAT domain with the β/PDI subunit

      The CAT-PDI protein–protein interfaces near Cys311 and Cys529 (of the CAT domain) include disulfide bridges with Cys397 and Cys53 (of PDI), respectively, which are the proximal cysteines (Fig. 2) of the CGHC motifs of the a’ and a domains (Figs. 4, S6, and S7). The CGHC motif is located at the N-terminal end of helix H2 of the thioredoxin fold (Fig. S8). The presence of these intersubunit disulfide bridges was validated by calculating omit maps and maps obtained from models of structures in which the disulfide bridges have been switched from an intersubunit disulfide bridge to an intrasubunit disulfide bridge between the proximal and distal cysteines of the CGHC motifs (Fig. S9). This switch requires only minimal structural changes. The obtained electron density maps suggest that the Cys311 disulfide bridge has high occupancy, whereas the Cys529 disulfide bridge could be partially intrasubunit, between Cys53 and Cys56 of the PDI a domain CGHC motif, and partially intersubunit, with the latter geometry occurring predominantly (Fig. S9).
      Figure 4. The interactions of the CAT domain with the a’ and a domains of the β/PDI subunit. A, Cys311 of the β1 strand of the CAT domain forms an intersubunit disulfide bridge with the proximal Cys397 of the a’ domain. B, Cys529 of the extended C-terminal tail of the CAT domain forms a disulfide bridge with the proximal Cys53 of the a domain. The a’ and a domains are shown in light blue ribbon and sticks. The CAT peptide is shown in red, and the sulfur atoms are shown in yellow. The sequences of the peptides near Cys311 (KRLFCRYHH) and Cys529 (FLRPCGSTE) are very different; N and C identify the N-terminal and C-terminal regions of these peptides. The dotted lines highlight hydrogen-bond interactions, including those of OE1(Glu330) and O(Ser492) of the α-subunit with the main chain peptide NH of the glycine of the CGHC motif in a’ (A) and a (B), respectively. The orientation of the a’ and a domains in panels (A) and (B) is similar to the orientation of the thioredoxin fold depicted in the schematic diagram in the supporting information. Stereo images of these two interaction sites, including the corresponding electron density maps, are provided in the supporting information. CAT, catalytic domain of the α-subunit of C-P4H; PDI, protein disulfide isomerase.
      The sequences around Cys311 (β1 β-strand of the CAT domain) and Cys529 (C terminus of the CAT domain) that interact with the PDI domains are very different (Figs. 4 and S10), being KRLFCRYHH and FLRPCGSTE, respectively. However, residues Cys311 and Cys529 have the same main chain hydrogen-bond interactions with Phe440 (of the a’ domain) and Tyr99 (of the a domain), respectively, which are possible because of the conserved cis-(Phe440-Pro441) and cis-(Tyr99-Pro100) peptide bonds (Figs. S6 and S7). The interactions of Cys311 and Cys529 with the a’ and a domains are complemented by interactions of the β2-region (near Glu330) and the βVI–βVII loop region (near Ser492) of the CAT domain with the CGHC loop of the a’ and a domains of PDI, respectively (Fig. S10). Cys311 and Cys529 are conserved cysteines in the C-P4H enzymes (Fig. S1), and their presence at the CAT-PDI interfaces is in agreement with earlier work on C-P4H-I (
      • John D.C.
      • Bulleid N.J.
      Prolyl 4-hydroxylase: Defective assembly of alpha-subunit mutants indicates that assembled alpha-subunits are intramolecularly disulfide bonded.
      ,
      • Lamberg A.
      • Pihlajaniemi T.
      • Kivirikko K.I.
      Site-directed mutagenesis of the alpha subunit of human prolyl 4-hydroxylase. Identification of three histidine residues critical for catalytic activity.
      ) that highlighted the importance of these two cysteines for the formation of the functional tetrameric C-P4H enzyme. In the two failed constructs of this study, C-P4H-II-Δ304 and C-P4H-II-Δ324, the truncation site is just before or just after Cys311, and therefore, the key structural elements present in this region of the structure are disrupted, resulting in insoluble proteins that could not be purified.
      It is possible that the intersubunit disulfide bridges formed between CAT and PDI are transient disulfide bonds, which are easily converted into intrasubunit disulfide bridges (between the two cysteines of the CGHC motif) by nucleophilic attack of the distal cysteine of PDI, as suggested by other studies (
      • Dong G.
      • Wearsch P.A.
      • Peaper D.R.
      • Cresswell P.
      • Reinisch K.M.
      Insights into MHC class I peptide loading from the structure of the tapasin-ERp57 thiol oxidoreductase heterodimer.
      ,
      • Walker K.W.
      • Gilbert H.F.
      Scanning and escape during protein-disulfide isomerase-assisted protein folding.
      ). This breaks the intersubunit disulfide bridge and generates an oxidized β/PDI subunit on dissociation of the complex, as observed experimentally for C-P4H-I (
      • Lumb R.A.
      • Bulleid N.J.
      Is protein disulfide isomerase a redox-dependent molecular chaperone?.
      ). This is also in agreement with the MS peptide mapping data of the mature C-P4H-II complex (Tables S2 and S3), which show that the sample treatment dissociates the complex, generating oxidized β/PDI active sites with a disulfide bridge between the catalytic cysteines.
      The geometries of the intersubunit disulfide bridges formed between Cys311(CAT) and Cys397(PDI) and between Cys529(CAT) and Cys53(PDI) are very similar, as shown when superimposing the a and a’ domains (Fig. 5A), and this geometry is within acceptable ranges of disulfide bridges when compared with the geometry of disulfide bridges in other proteins (
      • Wijma H.J.
      • Floor R.J.
      • Jekel P.A.
      • Baker D.
      • Marrink S.J.
      • Janssen D.B.
      Computationally designed libraries for rapid enzyme stabilization.
      ). This geometry is also preserved in the structure of the ERp57–tapasin complex when considering the interface of the a domain of ERp57 and tapasin (Fig. 5B). The PDI geometry is also conserved in the structure of the MTP complex when considering the interface of the a’ domain of PDI and the MTP α-subunit (Fig. 5C). In the ERp57–tapasin complex (the assembly is visualized in Fig. S11), the distal cysteine is mutated into an alanine, facilitating the trapping of a complex in which there is a stable covalent disulfide bond between the proximal cysteine of the ERp57 CGHC motif and the cysteine of tapasin (
      • Dong G.
      • Wearsch P.A.
      • Peaper D.R.
      • Cresswell P.
      • Reinisch K.M.
      Insights into MHC class I peptide loading from the structure of the tapasin-ERp57 thiol oxidoreductase heterodimer.
      ). In the MTP complex, the β/PDI cysteines are both reduced, and a covalent complex is not observed in this structure. In Figure 5D, the comparison of the structures of the a’ domain of the C-P4H-II-Δ281 complex (with the intersubunit disulfide bond) with the structure of the oxidized a’ domain of the uncomplexed PDI (with a disulfide bond between Cys397 and Cys400) shows no structural differences between these two states. Clearly, the various oxidation states of the PDI domains can be accommodated with minor structural changes.
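      The kind of geometric comparison described above can be illustrated with a short calculation. The sketch below is not the procedure used in this study: it computes the Sγ–Sγ distance and the Cβ–Sγ–Sγ–Cβ dihedral angle (χ3) from four atomic positions and compares them with commonly quoted disulfide values (about 2.04 Å and about ±90°). The coordinates are invented for illustration and are not taken from PDB entry 7ZSC, and the tolerances are arbitrary assumptions.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral_deg(p0, p1, p2, p3):
    """Dihedral angle in degrees about the p1-p2 bond."""
    b0, b1, b2 = _sub(p0, p1), _sub(p2, p1), _sub(p3, p2)
    n = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / n for x in b1)
    # Components of b0 and b2 perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def disulfide_geometry(cb1, sg1, sg2, cb2):
    """Return the S-S distance (Å) and the chi3 dihedral (degrees)."""
    return {"ss": math.dist(sg1, sg2), "chi3": dihedral_deg(cb1, sg1, sg2, cb2)}


def looks_like_disulfide(geom, ss_ideal=2.04, ss_tol=0.2,
                         chi3_ideal=90.0, chi3_tol=30.0):
    """Hypothetical acceptance test; the tolerances are illustrative only."""
    return (abs(geom["ss"] - ss_ideal) <= ss_tol
            and abs(abs(geom["chi3"]) - chi3_ideal) <= chi3_tol)


# Invented coordinates (Å) for Cβ and Sγ of two bridged cysteines.
cb_a, sg_a = (1.7, 0.0, -0.6), (0.0, 0.0, 0.0)
sg_b, cb_b = (0.0, 0.0, 2.04), (0.0, 1.7, 2.64)

geom = disulfide_geometry(cb_a, sg_a, sg_b, cb_b)
print(geom, looks_like_disulfide(geom))
```

      Applying the same function to the Cys311–Cys397 and Cys529–Cys53 pairs would require the deposited coordinates. At 3.8 Å resolution, such values largely reflect the refinement restraints, so they are better read as a consistency check than as independent evidence.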
      Figure 5. Comparison of the geometry of the CGHC interaction sites of human PDI (in various oxidation states and complexes) and human ERp57. Similar view as for and . The a’ interaction site with the CAT domain of the C-P4H-II–Δ281 complex is used as a reference structure, and the Cα-atoms of β-strand β2 and helix H2 of the PDI domains have been used for the superpositioning. The a’ domain is in light blue, the CAT region is in red, and the sulfur atoms are shown in yellow. A, comparison of the a’ and a (light red) interaction sites of the C-P4H-II–Δ281 structure. B, comparison of the a’ and a (cyan) interaction sites of the complexes of C-P4H-II–Δ281 and ERp57–tapasin (Protein Data Bank [PDB] entry: 3F8U), respectively. In the latter complex, the distal cysteine of the a domain of ERp57 is mutated into an alanine. C, comparison of the a’ interaction sites of the complexes of C-P4H-II–Δ281 and MTP (PDB entry: 6I7S) (yellow). In the MTP complex, the CGHC cysteines of PDI are reduced. D, comparison of the geometry of the a’ interaction site of the C-P4H-II–Δ281 complex with the geometry of the unliganded and oxidized structure (with disulfide bridge) (magenta) of a’ of human PDI (PDB entry: 4EL1). CAT, catalytic domain; MTP, microsomal triglyceride transfer protein; PDI, protein disulfide isomerase.
      The third important interface between the CAT domain and the β/PDI subunit concerns the hydrophobic interactions of the Ile324 side chain of the CAT domain with the hydrophobic pocket of the PDI b’ domain. Various studies on the interactions of PDI with peptides have highlighted the importance of the b’ hydrophobic pocket for the PDI peptide interactions (
      • Klappa P.
      • Ruddock L.W.
      • Darby N.J.
      • Freedman R.B.
      The b' domain provides the principal peptide-binding site of protein disulfide isomerase but all domains contribute to binding of misfolded proteins.
      ,
      • Nguyen V.D.
      • Wallis K.
      • Howard M.J.
      • Haapalainen A.M.
      • Salo K.E.
      • Saaranen M.J.
      • et al.
      Alternative conformations of the x region of human protein disulphide-isomerase modulate exposure of the substrate binding b' domain.
      ). This interaction site is formed by a region of the thioredoxin fold, which is different from the a and a’ domain interaction sites, being shaped by residues of helix H1 and helix H3 (Figs. 6A and S12). The rim of this pocket is formed by residues Phe240, Phe249 (of helix H1), Ile301, Phe304, Phe305 (of helix H3), and Ile318. The bottom of this hydrophobic pocket is formed by the side chains of residues Leu258, Ile289, and Ile291, which protrude out of the core β-strands. The importance of Ile289 for the peptide-binding properties of PDI has been reported (
      • Koivunen P.
      • Salo K.E.
      • Myllyharju J.
      • Ruddock L.W.
      Three binding sites in protein-disulfide isomerase cooperate in collagen prolyl 4-hydroxylase tetramer assembly.
      ). Ile324 of the LLIA sequence of the α14-helix of the CAT domain points into this hydrophobic pocket. The hydrophobicity of the LLIA region is highly conserved in the C-P4H family (Fig. S1). In the MTP crystal structure, the mode of assembly of the α-subunit with the β/PDI subunit is very different (Fig. S11); however, the same hydrophobic peptide-binding pocket of the b’ domain also interacts with the MTP α-subunit (
      • Biterova E.I.
      • Isupov M.N.
      • Keegan R.M.
      • Lebedev A.A.
      • Sohail A.A.
      • Liaqat I.
      • et al.
      The crystal structure of human microsomal triglyceride transfer protein.
      ). In this complex, Tyr605 of the α-subunit points into this hydrophobic pocket with its aromatic ring stacked between Leu258 and Phe249, and the hydroxyl moiety of Tyr605 has a hydrogen bond with the side chain of His256 (Fig. 6B). There is no further sequence or structural similarity between the regions of the α-subunits of the C-P4H-II-Δ281 and MTP complexes that interact with the b’ domain; nevertheless, there are no conformational differences between the structural elements of the b’ domain that form the binding pocket in these two complexes. This is consistent with the promiscuity of substrate binding that this pocket must have as part of PDI’s role in oxidative folding. The structure of the C-P4H-II-Δ281 complex suggests that the a’ and b’ interfaces contribute substantially to the stability of the complex, consistent with previous mutational studies showing that the a’ and b’ domains are most important for the assembly of the mature complex (
      • Pirneskoski A.
      • Ruddock L.W.
      • Klappa P.
      • Freedman R.B.
      • Kivirikko K.I.
      • Koivunen P.J.
      Domains b' and a' of protein disulfide isomerase fulfill the minimum requirement for function as a subunit of prolyl 4-hydroxylase. The N-terminal domains a and b enhances this function and can be substituted in part by those of ERp57.
      ,
      • Koivunen P.
      • Salo K.E.
      • Myllyharju J.
      • Ruddock L.W.
      Three binding sites in protein-disulfide isomerase cooperate in collagen prolyl 4-hydroxylase tetramer assembly.
      ).
      Figure 6. The interactions of the CAT domain with the hydrophobic pocket of the b’ domain of the β/PDI subunit. The viewing direction is similar as for and . A, the hydrophobic and conserved α14 region (including Ile324) of the CAT domain (red ribbon and sticks) interacts with the hydrophobic peptide-binding pocket of b’ of the β/PDI subunit (light blue ribbon and sticks). The key side chains of the residues shaping the hydrophobic peptide-binding pocket of the b’ domain are also shown. The stereo view is given in the supporting information. B, the b’ domains of the superimposed structures of the C-P4H-II-Δ281 complex (light blue) and the MTP complex (Protein Data Bank entry: 6I7S, shown as ribbons and sticks, colored light magenta). Small adjustments of the side chains of the b’ hydrophobic pocket allow the side chains of Ile324 (C-P4H-II-Δ281 complex, red) and Tyr605 (MTP complex, yellow) to bind in the same hydrophobic peptide-binding pocket, formed between helices H1 and H3. In the MTP complex, the hydroxyl group of Tyr605 is hydrogen bonded to the side chain of His256 (marked with a dotted line). CAT, catalytic domain of the α-subunit of C-P4H; MTP, microsomal triglyceride transfer protein; PDI, protein disulfide isomerase.

      The structure of the active site of the CAT domain

      The CAT active site is in its apo form, without a bound Fe(II) ion, which is consistent with the presence of EDTA in the protein buffer of the crystallization experiment. The main chain and side chains of the fully conserved His-X-Asp----His motif (Fig. S1) (residues His430, Asp432, and His501 in C-P4H-II) known to bind the Fe(II) ion (
      • Myllyharju J.
      • Kivirikko K.I.
      Characterization of the iron- and 2-oxoglutarate-binding sites of human prolyl 4-hydroxylase.
      ) superimpose well on the structure of the corresponding region of the Cr-P4H complex (Figs. 7 and S13) known to bind the active-site Fe(II) ion (
      • Koski M.K.
      • Hieta R.
      • Böllner C.
      • Kivirikko K.I.
      • Myllyharju J.
      • Wierenga R.K.
      The active site of an algal prolyl 4-hydroxylase has a large structural plasticity.
      ). Also, the main-chain and side-chain atoms of the conserved residues Tyr421, Thr467, and Lys511, which are proposed to interact with the buried 2-oxoglutarate carboxylate group (as seen in the Cr-P4H complex structure with a 2-oxoglutarate analog (
      • Koski M.K.
      • Hieta R.
      • Böllner C.
      • Kivirikko K.I.
      • Myllyharju J.
      • Wierenga R.K.
      The active site of an algal prolyl 4-hydroxylase has a large structural plasticity.
      )), superimpose well on the corresponding residues of the Cr-P4H complex.
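      The His-X-Asp----His pattern lends itself to a simple sequence search. The sketch below is only an illustration of the motif as described here: it scans a sequence for His-X-Asp followed by a downstream histidine within a window. The window limits are hypothetical (chosen to include the 69-residue spacing between Asp432 and His501), and the toy sequence is invented. In practice, motif residues are assigned from structure-based alignments such as Fig. S1, not from a pattern match alone.

```python
import re


def find_hxd_h(seq: str, min_gap: int = 40, max_gap: int = 120):
    """Return 1-based (His, Asp, His) positions for H-X-D ... H candidates.

    The gap is counted from the Asp to the second His; the default window
    is an illustrative assumption, not a property of the enzyme family.
    """
    hits = []
    for match in re.finditer(r"(?=H.D)", seq):  # lookahead keeps overlaps
        his1 = match.start()
        asp = his1 + 2
        stop = min(len(seq), asp + max_gap + 1)
        for j in range(asp + min_gap, stop):
            if seq[j] == "H":
                hits.append((his1 + 1, asp + 1, j + 1))
    return hits


# Invented toy sequence: H-G-D at positions 10-12, second His 69 residues
# after the Asp, mimicking the His430/Asp432/His501 spacing.
toy = "A" * 9 + "HGD" + "A" * 68 + "H" + "A" * 5

for his1, asp, his2 in find_hxd_h(toy):
    print(f"His{his1}-X-Asp{asp} ... His{his2} (gap {his2 - asp})")
```

      A search of this kind typically returns several candidates in a real sequence, so the result is a starting point for inspection, not an assignment.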
      Figure 7. Comparison of the structures of the CAT domains of C-P4H-II-Δ281 and Cr-P4H. A, superimposed Cα traces of the CAT domains of the C-P4H-II-Δ281 complex (light brown) and Cr-P4H (light blue, Protein Data Bank [PDB] entry: 3GZE, molecule A) in complex with Zn(II) (gray sphere) (the Zn(II) ion has replaced the Fe(II) active site ion, resulting in an incompetent active site) and the (Ser-Pro)5 peptide substrate (magenta sticks). The α12-helix, the α13-helix, and β-strand β1 are additional structural elements of C-P4H-II-Δ281, missing in Cr-P4H. Also the C terminus (labeled with “C”) is more extended in C-P4H-II-Δ281. The loop regions, which are important for peptide substrate binding (the hairpin and βII–βIII loop), are labeled. The hairpin loop is disordered in the structure of the C-P4H-II-Δ281 complex, and it is in the “closed” conformation in the Cr-P4H peptide complex. The side chains of the three important catalytic amino acids, two histidines and one aspartate, coordinating the bound Zn(II) ion in the 3GZE structure, are shown. Also the intrasubunit disulfide bond (labeled as S-S) of the conserved cysteine of the βVII-strand (Cys504 in the C-P4H-II-Δ281 complex and Cys230 in Cr-P4H) with Cys294 (α12-helix in the N-terminal region of the CAT domain of C-P4H-II-Δ281) and Cys195 (extended βIV–βV loop of Cr-P4H), respectively, is shown. B, zoomed in view of panel (A) of the active site of the CAT domain, highlighting the βI-βII loop region (Gly422-Pro429) and the following His-X-Asp----His motif (residues His430, Asp432, and His501), which are shown as sticks, like the side chain of Tyr427. Included is also the superimposed βI-βII loop region as seen (i) in the structure of the Cr-P4H enzyme–substrate complex (PDB entry: 3GZE, molecule A, as in panel (A), light blue) and (ii) in the structure of Cr-P4H without bound peptide (molecule B, PDB entry: 2JIG, yellow), highlighting the two different conformations of Tyr140. In the structure of the Cr-P4H enzyme–substrate complex, the Tyr140 side chain is hydrogen bonded to the bound (Ser-Pro)5 peptide substrate, as shown with a dashed line. The proline residue to be hydroxylated by the catalytic site is identified with an arrow. Zn identifies the Zn(II) ion–binding site in the catalytic site of the Cr-P4H enzyme–substrate complex. In the structure of Cr-P4H without bound peptide, the Tyr140 side chain is rotated outward (yellow). The latter conformation is not possible in the C-P4H-II-Δ281 structure, as its corresponding tyrosine (Tyr427) would clash with Tyr290 of the α12-helix. Stereo views are given in the supporting information. CAT, catalytic domain of the α-subunit of C-P4H; C-P4H, collagen prolyl 4-hydroxylase; Cr-P4H, Chlamydomonas reinhardtii P4H isoform 1.
      From the crystal structures of Cr-P4H, it is known that the two active-site substrate-binding loops, the hairpin loop (residues Ser77 to Thr94 in Cr-P4H) and the βII-βIII loop (residues His148 to Gly159 in Cr-P4H), sequester its (Ser-Pro)5 substrate peptide so that the peptide binds in a tunnel, with residue Pro5 of this peptide pointing toward the catalytic metal ion (
      • Koski M.K.
      • Hieta R.
      • Hirsilä M.
      • Rönkä A.
      • Myllyharju J.
      • Wierenga R.K.
      The crystal structure of an algal prolyl 4-hydroxylase complexed with a proline-rich peptide reveals a novel buried tripeptide binding motif.
      ). It can be noted that also in HIF-P4H-2 (Protein Data Bank [PDB] entry: 3HQR) (
      • Chowdhury R.
      • McDonough M.A.
      • Mecinovic J.
      • Loenarz C.
      • Flashman E.
      • Hewitson K.S.
      • et al.
      Structural basis for binding of hypoxia-inducible factor to the oxygen-sensing prolyl hydroxylases.
      ), the hairpin loop is important for peptide binding. In the structure of the C-P4H-II-Δ281 complex, the active site is unliganded and the hairpin loop (residues Arg363 to Val376) is disordered. The βII-βIII region (residues Arg435 to Gly448) is included in the model of the CAT-PDI complex, but it has been built in weak density. Importantly, these two loops are not near the interaction site of the β/PDI subunit and therefore can adopt other conformations, for example, in the presence of bound substrate peptide. The sequences of the hairpin loop and the βII-βIII loop are not well conserved when comparing the sequences of human C-P4H-I, C-P4H-II, and C-P4H-III (Fig. 1). Large sequence divergence is also apparent when taking into account the sequences of other P4H enzymes (Fig. S1). For example, in P4H-TM, the hairpin loop is much more extensive, forming even a separate domain, known as the Ca(II)-binding EF domain (Fig. S1) (
      • Myllykoski M.
      • Sutinen A.
      • Koski M.K.
      • Kallio J.P.
      • Raasakka A.
      • Myllyharju J.
      • et al.
      Structure of transmembrane prolyl 4-hydroxylase reveals unique organization of EF and dioxygenase domains.
      ). In this respect, it is also interesting to note that the splice variants of C-P4H-I affect the sequence of the hairpin loop (Fig. S1), whereas those of C-P4H-II change the sequence and length of the βII-βIII loop (Fig. S1).
      Previous structural studies of Cr-P4H have shown that conformational flexibility of the βI-βII catalytic loop (residues Tyr134 to Tyr140 in Cr-P4H, Fig. 7B) (
      • Koski M.K.
      • Hieta R.
      • Böllner C.
      • Kivirikko K.I.
      • Myllyharju J.
      • Wierenga R.K.
      The active site of an algal prolyl 4-hydroxylase has a large structural plasticity.
      ,
      • Koski M.K.
      • Hieta R.
      • Hirsilä M.
      • Rönkä A.
      • Myllyharju J.
      • Wierenga R.K.
      The crystal structure of an algal prolyl 4-hydroxylase complexed with a proline-rich peptide reveals a novel buried tripeptide binding motif.
      ) is important for the function. In the structure of Cr-P4H complexed with its substrate (Ser-Pro)5, the side chain of the conserved Tyr140 of the βI-βII region is pointing inward and contacts Pro5 of the (Ser-Pro)5 peptide that is hydroxylated in the catalytic cycle. This tyrosine, which is Tyr427 in C-P4H-II, is highly conserved in P4H sequences (Fig. S1), and indeed, in the structure of the HIF-P4H-2 peptide complex, the corresponding tyrosine (Tyr310) also interacts with the bound peptide. Structure analysis of the complex of Cr-P4H with bound peptide suggests that the product of the hydroxylation reaction, which is the hydroxylated proline, will clash with the side chain of Tyr140 (
      • Koski M.K.
      • Hieta R.
      • Hirsilä M.
      • Rönkä A.
      • Myllyharju J.
      • Wierenga R.K.
      The crystal structure of an algal prolyl 4-hydroxylase complexed with a proline-rich peptide reveals a novel buried tripeptide binding motif.
      ). This clash could then cause the tyrosine side chain to flip out, coupled to an induced conformational switch of the βI-βII region. Such an out-conformation is captured in the unliganded Cr-P4H structure, as shown in Figure 7B. In the C-P4H-II-Δ281 structure, the βI-βII region near Tyr427 adopts a conformation that differs from both the catalytically competent in-conformation and the unliganded out-conformation of Cr-P4H (Fig. 7B). That the βI-βII region of the C-P4H-II-Δ281 complex is not in the conformation competent for catalysis is consistent with the observation that this complex has low catalytic activity (Table 1). A flipped-out conformation of Tyr427 would clash with Tyr290 of the α12-helix (as shown in Fig. 7B), so the flipped-out conformation observed in the Cr-P4H structure is not possible in the assembly captured in this crystal form of the C-P4H-II-Δ281 complex.

      The predicted structure of the mature C-P4H-II α2β2 tetramer

      The AlphaFold2 structure prediction protocol (
      • Jumper J.
      • Evans R.
      • Pritzel A.
      • Green T.
      • Figurnov M.
      • Ronneberger O.
      • et al.
      Highly accurate protein structure prediction with AlphaFold.
      ), as implemented in a Colab notebook (AlphaFold2_advanced.ipynb), was used to predict the structure of the α2-dimer, using the sequence of the mature α-subunit, starting at Glu22 (Fig. 1). Subsequently, the CAT domain of the C-P4H-II-Δ281 complex structure was superimposed on each of the two CAT domains of the predicted α2-dimer, thereby placing the β/PDI subunits and providing the predicted structure of the mature C-P4H α2β2 complex (Fig. 8). The fold of the α-subunit as predicted for the α2-dimer is essentially the same as seen in the model obtained for the monomeric α-subunit using either Robetta (
      • Baek M.
      • DiMaio F.
      • Anishchenko I.
      • Dauparas J.
      • Ovchinnikov S.
      • Lee G.R.
      • et al.
      Accurate prediction of protein structures and interactions using a three-track neural network.
      ) or AlphaFold2 (
      • Jumper J.
      • Evans R.
      • Pritzel A.
      • Green T.
      • Figurnov M.
      • Ronneberger O.
      • et al.
      Highly accurate protein structure prediction with AlphaFold.
      ), using the protocols as described in the Experimental procedures section. The confidence score (the pLDDT score) of the AlphaFold2 model of the α2-dimer is high for most regions of the model (Fig. S14). The lowest scores are for the sequence region just after the PSB domain (and before the α12-helix of the C-P4H-II-Δ281 complex) and for the sequence regions of the two substrate-binding loops (the hairpin loop and the βII-βIII loop) of the CAT domain. The latter two loops are disordered (the hairpin loop) or built in weak density (the βII-βIII loop) in the C-P4H-II-Δ281 structure. The structure of the N-terminal double-domain region of the model agrees very well with its crystal structure (
      • Anantharajan J.
      • Koski M.K.
      • Kursula P.
      • Hieta R.
      • Bergmann U.
      • Myllyharju J.
      • et al.
      The structural motifs for the substrate binding and dimerization of the α subunit of collagen prolyl 4-hydroxylase.
      ). In the model, the CAT domain folds over the double-domain region in such a way that the minor sheet of the DSBH-fold is closely interacting with the α5-helix of the same α-subunit. The βI-βII region of the CAT domain is interacting with (i) the α5-helix and (ii) the N terminus of the same α-subunit as well as with (iii) the α2-helix of the other α-subunit of the α2-dimer (Fig. 8). The residues of the α5-helix are highly conserved in sequence alignments (Figs.1 and S1), as was noted already when discussing the structural properties of the double-domain dimer crystal structure (
      • Anantharajan J.
      • Koski M.K.
      • Kursula P.
      • Hieta R.
      • Bergmann U.
      • Myllyharju J.
      • et al.
      The structural motifs for the substrate binding and dimerization of the α subunit of collagen prolyl 4-hydroxylase.
      ). The sequence alignment (Figs. 1 and S1) shows that Phe24 (at the N terminus of the α-subunit) and Pro83 (near the C terminus of the α2-helix) are also conserved. These conserved residues are in the model close to the βI-βII region. The predicted structure of the βI-βII region is the same as in the structure of the Cr-P4H peptide complex (
      • Koski M.K.
      • Hieta R.
      • Hirsilä M.
      • Rönkä A.
      • Myllyharju J.
      • Wierenga R.K.
      The crystal structure of an algal prolyl 4-hydroxylase complexed with a proline-rich peptide reveals a novel buried tripeptide binding motif.
      ). The sequence alignment (Fig. S1) shows that this region is highly conserved in the C-P4H sequences (Gly-(V/I/M)-Gly-Gly) but differs from the sequences of other P4Hs.
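      The per-residue confidence discussed above can be inspected directly from a predicted model file. The following is a minimal sketch, assuming that the per-residue pLDDT is stored in the B-factor column of the model PDB file (the usual AlphaFold2 output convention) and that a cutoff of 70 marks low-confidence residues; the function name and the cutoff are illustrative assumptions and are not taken from this study.

```python
def low_confidence_residues(pdb_lines, cutoff=70.0):
    """Return (chain, residue_number, plddt) for CA atoms with pLDDT below cutoff.

    Assumes fixed-column PDB ATOM records in which the B-factor field
    (columns 61-66) holds the per-residue pLDDT score.
    """
    flagged = []
    for line in pdb_lines:
        # Only standard ATOM records are considered
        if not line.startswith("ATOM"):
            continue
        # One score per residue: use the C-alpha atom only
        if line[12:16].strip() != "CA":
            continue
        chain = line[21]
        resnum = int(line[22:26])
        plddt = float(line[60:66])
        if plddt < cutoff:
            flagged.append((chain, resnum, plddt))
    return flagged
```

      Applied to a model of the α2-dimer, such a scan would be expected to flag mainly the linker after the PSB domain and the two substrate-binding loops, in line with the description above.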
      Figure 8. The predicted structure of the mature C-P4H-II α2β2 tetramer. A, the mature tetramer. The α-subunits are in orange and salmon colors, respectively, and the β/PDI subunits (at the outer edge of the α2-dimer) are colored light blue. The view is down the twofold axis (identified by a black oval) of the α2-dimer. The locations of the peptide-binding sites of the PSB and CAT domains are visualized with bound peptides, as predicted by using the structures of the Cr-P4H complexed with its substrate (Ser-Pro)5 (Protein Data Bank [PDB] entry: 3GZE), superimposed on the CAT domains (red and magenta peptides) and the PSB-II-peptide complex (PDB entry: 6EVN), superimposed on the PSB domains (green and blue peptides). The upper and lower CAT active sites, identified by the red and magenta colored peptides, respectively, are related by the dimer twofold axis. The arrows show the N-to-C direction of the bound peptides. “L” identifies the linker region between the PSB domain and the CAT domain. B, zoomed-in view of the PSB and CAT domains of the mature C-P4H-II tetramer, shown in the upper part of A. The CAT catalytic site is marked by a star. The βI-βII loop (black ribbon) is close to the α5-helix and the N terminus (labeled “N”) of the same α-subunit as well as to the α2-helix of the other α-subunit of the α2-dimer (colored in salmon). CAT, catalytic domain of the α-subunit of C-P4H; C-P4H, collagen prolyl 4-hydroxylase; Cr-P4H, Chlamydomonas reinhardtii P4H isoform 1; PDI, protein disulfide isomerase; PSB, peptide-substrate–binding domain.
      In the model of the C-P4H α2β2 complex, the PSB domain is a protrusion of the core of this complex (Fig. 8), which is consistent with the observation that a soluble construct, containing only the PSB domain, can be expressed and purified (
      • Myllyharju J.
      • Kivirikko K.I.
      Identification of a novel proline-rich peptide-binding domain in prolyl 4-hydroxylase.
      ), and the purified protein has functional peptide-binding properties, as described for the PSB domains of C-P4H-I (
      • Pekkala M.
      • Hieta R.
      • Bergmann U.
      • Kivirikko K.I.
      • Wierenga R.K.
      • Myllyharju J.
      The peptide-substrate-binding domain of collagen prolyl 4-hydroxylases is a tetratricopeptide repeat domain with functional aromatic residues.
      ,
      • Hieta R.
      • Kukkola L.
      • Permi P.
      • Pirilä P.
      • Kivirikko K.I.
      • Kilpeläinen I.
      • et al.
      The peptide-substrate-binding domain of human collagen prolyl 4-hydroxylases. Backbone assignments, secondary structure, and binding of proline-rich peptides.
      ) as well as for C-P4H-II (
      • Murthy A.V.
      • Sulu R.
      • Koski M.K.
      • Tu H.
      • Anantharajan J.
      • Sah-Teli S.K.
      • et al.
      Structural enzymology binding studies of the peptide-substrate-binding domain of human collagen prolyl 4-hydroxylase (type-II): high affinity peptides have a PxGP sequence motif.
      ). Superpositioning of the structures of the PSB–peptide complex (
      • Murthy A.V.
      • Sulu R.
      • Koski M.K.
      • Tu H.
      • Anantharajan J.
      • Sah-Teli S.K.
      • et al.
      Structural enzymology binding studies of the peptide-substrate-binding domain of human collagen prolyl 4-hydroxylase (type-II): high affinity peptides have a PxGP sequence motif.
      ) and Cr-P4H-peptide complex (
      • Koski M.K.
      • Hieta R.
      • Hirsilä M.
      • Rönkä A.
      • Myllyharju J.
      • Wierenga R.K.
      The crystal structure of an algal prolyl 4-hydroxylase complexed with a proline-rich peptide reveals a novel buried tripeptide binding motif.
      ) on the model of C-P4H-II shows that the peptide-binding grooves of the PSB domain and the CAT domain are aligned, such that the C-terminal end of the peptide bound to the PSB domain points toward the N terminus of the peptide bound in the CAT active site (Fig. 8). However, there is a gap of about 32 Å between these two peptides, being the Cα–Cα distance between the C-terminal end of the PSB-bound peptide and the N-terminal end of the CAT domain–bound peptide. It is unclear if structural rearrangements occur in the C-P4H complex during the catalytic cycle that would align these grooves closer to each other. In any case, the conformational switch observed for Tyr140 of Cr-P4H (Fig. 7) is not possible in the current C-P4H-II model of the tetramer, as the flipped-out conformation of the corresponding residue (Tyr427) would overlap with the side chain of the conserved Tyr290 (located in the middle of the α12-helix). Therefore, the conformational flexibility properties of the βI-βII region of Cr-P4H suggest that structural rearrangements of the C-P4H complex during the catalytic cycle might be important, and the structure of the C-P4H-II-Δ281 complex represents just a snapshot of a structure that occurs during the catalytic cycle. More structural snapshots are required to fully understand the complicated catalytic cycle and the reaction mechanism of C-P4H.

      Concluding remarks

      The structural studies of the C-P4H-II-Δ281 complex reveal the structure of the C-P4H CAT domain and its mode of interaction with the β/PDI subunit. This structure also provides important insight into the structural properties of the mature α2β2 C-P4H complex. The proposed model of the C-P4H tetramer is based on the crystal structure of the C-P4H-II-Δ281 complex, together with the AlphaFold2 prediction of the structure of the α2-dimer. In this model, the CAT active site and its three active site loops face bulk solvent, whereas the β/PDI subunit is located on the backside of the CAT domain, having no interactions with any of the other domains. The PSB peptide-binding site of C-P4H is near the CAT peptide-binding site, suggesting that the procollagen chain can remain bound near the catalytic site, even if the substrate-binding tunnel of the catalytic site opens up at the end of the catalytic cycle, allowing for the hydroxylated peptide to be released from the catalytic site. In the catalytic cycle, the cofactor 2-oxoglutarate is converted into succinate, and opening up of the catalytic site (with the release of the hydroxylated peptide) is required for exchanging the succinate with 2-oxoglutarate, after which the next reaction cycle can start. The notion that the procollagen substrate remains bound at the nearby PSB domain at the end of the catalytic cycle, and is not released into bulk solvent, is in line with the observed processivity properties of this enzyme (
      • de Jong L.
      • van der Kraan I.
      • de Waal A.
      The kinetics of the hydroxylation of procollagen by prolyl 4-hydroxylase. Proposal for a processive mechanism of binding of the dimeric hydroxylating enzyme in relation to the high kcat/Km ratio and a conformational requirement for hydroxylation of -X-Pro-Gly- sequences.
      ). The model of the mature C-P4H tetramer shows that the active site of C-P4H is shaped by regions of each of the three domains of the α-subunit (including the α5-helix) as well as by a region (the α2-helix) of the other α-subunit of the α2-dimer. Therefore, our studies on the catalytic mechanism of C-P4H will now focus on understanding the structural enzymological properties of the mature α2β2 C-P4H complex.

      Experimental procedures

      Cloning and construct preparation of the mature and truncated C-P4H variants

      Two splice variants occur for the α-subunit of C-P4H-I (gene name: P4HA1_HUMAN) and C-P4H-II (gene name: P4HA2_HUMAN). These studies were done with the P13674-1 (UniProt) isoform-1 splice variant of C-P4H-I (in which exon 10 is expressed) (
      • Helaakoski T.
      • Veijola J.
      • Vuori K.
      • Rehn M.
      • Chow L.T.
      • Taillon-Miller P.
      • et al.
      Structure and expression of the human gene for the α subunit of prolyl 4-hydroxylase. The two alternatively spliced types of mRNA correspond to two homologous exons the sequences of which are expressed in a variety of tissues.
      ) and with the O15460-1 (UniProt) isoform-IIb splice variant of C-P4H-II (in which exon 12b is expressed (
      • Nokelainen M.
      • Nissi R.
      • Kukkola L.
      • Helaakoski T.
      • Myllyharju J.
      Characterization of the human and mouse genes for the alpha subunit of type II prolyl 4-hydroxylase. Identification of a previously unknown alternatively spliced exon and its expression in various tissues.
      )). Codon-optimized genes of these variants were synthesized commercially (GenScript). The complementary DNA was inserted into the NdeI and BamHI cleavage sites. They were then cloned into a polycistronic CyDisCo plasmid already containing the codon-optimized PDI gene (the β-subunit of C-P4Hs, P07237 [UniProt]) and the yeast Erv1 gene (sulfhydryl oxidase) (
      • Hatahet F.
      • Nguyen V.D.
      • Salo K.E.
      • Ruddock L.W.
      Disruption of reducing pathways is not essential for efficient disulfide bond formation in the cytoplasm of E. coli.
      ,
      • Nguyen V.D.
      • Hatahet F.
      • Salo K.E.
      • Enlund E.
      • Zhang C.
      • Ruddock L.W.
      Pre-expression of a sulfhydryl oxidase significantly increases the yields of eukaryotic disulfide bond containing proteins expressed in the cytoplasm of E.coli.
      ). The N-terminal and C-terminal sequences of the expressed subunits are given in Table S1. The complete sequences of the expressed α-subunit and β-subunit are provided in Figure 1 and Figure S2, respectively. In addition, truncated C-P4H-II constructs lacking the N-terminal region of the α-subunit and starting at Asp141, Asp282, Arg305, or Ala325 (Table S1 and Fig. 1) were cloned into the polycistronic CyDisCo plasmid by using the corresponding codon-optimized α-subunit as a template and the primers shown in Table S1. These truncated C-P4H-II variants are referred to as C-P4H-II-Δ140, C-P4H-II-Δ281, C-P4H-II-Δ304, and C-P4H-II-Δ324, respectively. In all the constructs, the α-subunit was cloned to contain an N-terminal 6× His tag for convenient purification, as listed in Table S1. All cloning procedures were carried out using standard double digestion and ligation protocols. The positive colonies were screened by checking the size of the insert on an agarose gel, and the obtained clones were confirmed by sequencing at the Biocenter Oulu (BCO) DNA sequencing core facility.

      Expression and purification

      The expression of the recombinant proteins was initiated by transforming the CyDisCo plasmid, which encodes the α-subunit, the β/PDI subunit, and Erv1, into the K12 E. coli expression strain. In the protocol, on day 2, a single bacterial colony was first inoculated in Luria–Bertani medium containing 100 μg/ml ampicillin and grown overnight at 37 °C in a shaker. The overnight culture was diluted on day 3 (100-fold) in 1 l of Terrific broth medium supplemented with 100 μg/ml ampicillin and incubated at 30 °C with vigorous shaking (250 rpm) until the absorbance reached 0.6 to 0.7 at 600 nm. Then the culture was cooled to the induction temperature (20 °C), expression was induced by adding 0.5 mM IPTG, and growth was continued overnight (∼17 h) at 20 °C (250 rpm). After expression, on day 4, the cells were harvested by centrifugation (10,800g, 30 min), frozen in liquid nitrogen, and stored at −70 °C for later use.
      The protein purification was carried out at 4 °C. The frozen pellets, containing the recombinant proteins with the N-terminal 6× His tag in the α-subunit, were suspended in lysis buffer containing 50 mM Tris, pH 7.8, 100 mM NaCl, and 100 mM glycine, supplemented with SIGMAFAST protease inhibitor cocktail tablets (one tablet per 100 ml). The cell suspension was sonicated in an ice bath at 36% amplitude for 7 min with a 2 s ON, 4 s OFF pulse, and the supernatant was loaded onto a Talon immobilized metal-ion affinity chromatography column, washed with the lysis buffer, and eluted with the same buffer containing also 250 mM imidazole. The protein was then passed through a PD10 column pre-equilibrated with a buffer containing 10 mM Tris, pH 7.8, 50 mM NaCl, and 100 mM glycine. After this, the sample was subjected to anion exchange chromatography (HiTrap Q; GE Healthcare) connected to a BioLogic DuoFlow purification device (Bio-Rad Laboratories, Inc) with the same buffer and eluted by using an NaCl gradient from 50 to 1000 mM. All fractions containing the protein (analyzed by using SDS-PAGE) were then pooled and concentrated with 50 kDa molecular weight cutoff Amicon Ultra centrifugal filters (Merck Millipore) before loading them onto a Superdex 200 10/300 GL column (bed volume 24 ml; GE Healthcare) pre-equilibrated with SEC buffer containing 50 mM Tris, pH 7.8, 50 mM NaCl, and 100 mM glycine. All fractions containing the protein were pooled and concentrated to about 3 to 10 mg/ml. Mostly, samples were frozen with liquid nitrogen in 100 μl aliquots (in SEC buffer) before being used for the subsequent experiments. The sample quality was monitored by SDS-PAGE analysis.
      Peptide MS fingerprints and selected MS/MS spectra were used to confirm the presence of the protein loaded on the SDS-PAGE gels after the purifications. Bands were excised, reduced and alkylated (DTT/iodoacetamide), and trypsinized using standard procedures. Peptides were eluted from the gel and measured on an UltrafleXtreme MALDI-TOF mass spectrometer (Bruker). Raw data were processed with BioTools and subjected to database search using Mascot (Matrix Science).

      SEC-MALS

      The SEC-MALS analysis of the purified proteins (in SEC buffer) was carried out using a miniDAWN detector from Wyatt Technologies in on-line mode (connected to the Shimadzu HPLC system). Before entering the MALS detector, the sample flows through an Optilab refractive index detector from Wyatt Technologies. This is used to measure the concentration of the protein sample. Molecular mass and polydispersity measurements were carried out using the ASTRA software from Wyatt Technologies. About 30 to 50 μl of the purified (and frozen-thawed) sample was filtered and loaded with flow rate 500 μl/min onto the Superdex 200 Increase 10/300 GL column pre-equilibrated with SEC buffer (0.1 μm filtered and degassed).

      CD

      Far-UV CD spectra of the purified proteins were recorded using a Chirascan CD spectrophotometer (Applied Photophysics Ltd) in a cuvette with 1 mm path length, using a wavelength range of 190 to 280 nm. The purified protein samples (in SEC buffer) were diluted to 0.1 mg/ml using Milli-Q water, and the final protein concentrations were determined from the absorbance values at 214 nm. The CD data were acquired every 1 nm with 0.5 s as an integration time and repeated three times with baseline correction. The samples were then heated from 22 to 94 °C using a Peltier temperature controller at a rate of 1 °C/min for obtaining a CD melting curve, recording at each temperature the CD spectrum using the same wavelength range as mentioned previously. The data collected between 72 and 94 °C were not used for the Tm calculations as the proteins started to aggregate. Data analysis was carried out using the Chirascan Pro-Data Viewer (Applied Photophysics) and CDNN (http://www.xn--gerald-bhm-lcb.de/download/cdnn). The direct CD measurements (θ; mdeg) were converted into mean residue molar ellipticity ([θ]MR) by Pro-Data Viewer. The Tm was calculated with the Global 3 software (Applied Photophysics) using a one-transition model and using the same wavelength range as aforementioned.
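      The conversion of the measured ellipticity into mean residue ellipticity can be illustrated with the standard relation [θ]MR = θ × MRW / (10 × l × c), where θ is in mdeg, MRW is the mean residue weight (molecular mass divided by the number of peptide bonds), l is the path length in cm, and c is the protein concentration in mg/ml. The sketch below is only an illustration of this relation; how Pro-Data Viewer implements the conversion is not described here, and the function name and the example numbers are hypothetical.

```python
def mean_residue_ellipticity(theta_mdeg, molar_mass_da, n_residues,
                             conc_mg_per_ml, path_cm):
    """Convert ellipticity (mdeg) to mean residue ellipticity (deg cm^2 dmol^-1).

    The mean residue weight is taken as molar mass / (number of residues - 1),
    i.e. the mass per peptide bond.
    """
    mean_residue_weight = molar_mass_da / (n_residues - 1)
    return theta_mdeg * mean_residue_weight / (10.0 * path_cm * conc_mg_per_ml)


# Hypothetical example using the cuvette (1 mm = 0.1 cm) and the nominal
# concentration (0.1 mg/ml) from the text; mass and length are made up.
example = mean_residue_ellipticity(-10.0, 11000.0, 101, 0.1, 0.1)
```

      Because the result scales inversely with concentration, the accuracy of the concentration estimate (here from the absorbance at 214 nm) directly limits the accuracy of [θ]MR.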

      Activity assays

      Two assays were used to characterize the enzymatic properties of C-P4H-II and its truncated variants. First, the catalytic activity of the purified enzymes was measured by an indirect activity assay based on the hydroxylation-coupled decarboxylation of 2-oxo[1-14C]glutarate (PerkinElmer) (
      • Kivirikko K.I.
      • Myllylä R.
      Posttranslational enzymes in the biosynthesis of collagen: intracellular enzymes.
      ). About 100 μM (PPG)10 peptide was used as a substrate. In addition, the activities were measured with a direct assay (
      • Kivirikko K.I.
      • Myllylä R.
      Posttranslational enzymes in the biosynthesis of collagen: intracellular enzymes.
      ), based on measurement of the 4-hydroxy[14C]proline formation from [14C]proline-labeled procollagen substrate, consisting of nonhydroxylated α-chain of chicken type-I procollagen. The radioactivities were determined using a Tri-Carb 2900TR (PerkinElmer) liquid scintillation counter. The results were calculated as disintegrations per minute per enzyme active site, taking into account that the number of C-P4H active sites is two per mature C-P4H-II tetramer and one per heterodimer for the truncated variants.
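      The normalization per active site described above can be sketched as follows. This is a minimal illustration, assuming that the amount of enzyme assembly in the assay is known; the function name, the dictionary, and the unit of the enzyme amount are hypothetical and are not taken from the original protocol.

```python
# Active sites per assembly, as stated in the text: two per mature
# tetramer and one per heterodimer of the truncated variants.
ACTIVE_SITES_PER_ASSEMBLY = {"tetramer": 2, "heterodimer": 1}


def dpm_per_active_site(dpm, assembly_amount, assembly_type):
    """Normalize measured radioactivity (dpm) by the amount of active sites.

    assembly_amount is the amount of enzyme assembly in the assay
    (hypothetical unit, e.g. pmol of tetramer or heterodimer).
    """
    sites = ACTIVE_SITES_PER_ASSEMBLY[assembly_type]
    return dpm / (assembly_amount * sites)
```

      With this normalization, equal molar amounts of tetramer and heterodimer that give the same raw dpm value correspond to a twofold lower activity per active site for the tetramer.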

      Peptide mapping MS experiments of mature C-P4H-II

      The MS analyses were performed with native (untreated), DTT-reduced, and NEM-labeled samples of the mature C-P4H-II (in SEC buffer). The treatment with DTT reduces disulfide bridges, and the treatment with NEM will modify free SH groups. First, the protein solution was buffer-exchanged to 100 mM ammonium acetate (pH 6.8) by using a PD-10 column (Cytiva). In the next step, three 50-μl aliquots were taken from the C-P4H-II sample (3.5 μM), the first being the native sample, the second reduced with DTT (1 mM, 1 h at room temperature), and the third treated with NEM (1 mM, 1 h at room temperature). In the third step, each sample was diluted (1:1, v/v) with ammonium acetate (pH 2.6) (to reduce pH to around 4.1 to facilitate on-line pepsin digestion and to prevent cysteine oxidation). The on-line pepsin digestion tubing was prepared as follows (
      • Marie G.
      • Serani L.
      • Laprévote O.
      An on-line protein digestion device for rapid peptide mapping by electrospray mass spectrometry.
      ): a 50-μl PEEK tubing was first washed with 1% formic acid, followed by filling the tubing with a pepsin solution (1 mg/ml pepsin in 1% formic acid). After an overnight incubation at 4 °C, the tubing was washed once with a small volume of water. The tubing, with the immobilized pepsin, was then connected directly to the ion source, and the protein solutions were infused through the tubing at room temperature at a flow rate of 2 μl/min using a syringe pump, resulting in an approximately 25 min digestion time (50 μl tubing volume at 2 μl/min). The mass spectra were measured using a 12-T Bruker solariX XR Fourier transform ion cyclotron resonance MS (Bruker Daltonics GmbH), equipped with a dynamically harmonized Paracell and an Apollo-II electrospray ionization source, operated in the positive-ion mode. The ion source temperature was set to 200 °C, and the nebulizing and drying gases were set to 1 bar and 4 l/min, respectively. The data were acquired with ftmsControl 2.1 (Bruker Daltonics GmbH) and further processed using DataAnalysis 4.4 software (Bruker Daltonics GmbH). The mass spectra were acquired using a 1 MWord dataset size with 256 scans summed for the final spectra using an m/z range of 387 to 5000. The mass spectra were further internally recalibrated using monoisotopic masses of the selected peptides, whose sequences were confirmed by MS/MS (collision-induced dissociation) measurements. The peptide monoisotopic masses were obtained by using a built-in SNAP2 peak picking algorithm and were then searched against the C-P4H-II sequence with GPMAW 9.2 software (Lighthouse Data) and ProteinProspector (https://prospector.ucsf.edu/prospector/mshome). Because of the broad cleavage specificity of pepsin, the peptide mapping was performed as follows. The peptides with a single hit were assigned first, and the observed cleavage sites of these peptides were then used to assign the other peptides with multiple possible locations. No restrictions were applied to the digestion sites, that is, nonspecific cleavage was used, and only the peptides with less than 10 ppm mass error were considered. All possible disulfide linkages, including intersubunit and intrasubunit ones, were initially considered in the peptide search by searching the α- and β-subunits simultaneously. The search was then repeated by limiting the disulfide pairing to those observed in the crystal structure. The identified peptides as well as the sequence coverages in each of the three experiments are listed in Table S2. A summary is provided in Table S3.
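      The mass-accuracy criterion used for accepting peptide assignments can be written out explicitly. The sketch below shows the usual definition of the relative mass error in parts per million and a filter with the 10 ppm limit quoted above; the function names and the example masses are hypothetical, and the sketch does not reproduce the search logic of GPMAW or ProteinProspector.

```python
def ppm_error(observed_mass, theoretical_mass):
    """Relative mass error in parts per million (signed)."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def passes_mass_filter(observed_mass, theoretical_mass, tolerance_ppm=10.0):
    """True if the absolute mass error is below the tolerance (default 10 ppm)."""
    return abs(ppm_error(observed_mass, theoretical_mass)) < tolerance_ppm


# Hypothetical candidates: (observed, theoretical) monoisotopic masses in Da
candidates = [(1000.005, 1000.0), (1000.020, 1000.0)]
accepted = [pair for pair in candidates if passes_mass_filter(*pair)]
```

      In this toy example only the first candidate (about 5 ppm error) is retained, whereas the second (about 20 ppm) is rejected.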

      Protein crystallography studies

      Extensive crystallization screening of the C-P4H-II-Δ281 complex was performed using the equipment of the BCO, University of Oulu Structural Biology core facility. After SEC, the protein (in SEC buffer) was concentrated to 11 mg/ml by using a 50 K Amicon Ultra-15 Centrifugal Filter concentrator (Merck). Initial screening for crystallization was carried out by the sitting drop vapor diffusion method using in-house screens in IQ 96-well triple sitting drop plates (TTP Labtech) at two different temperatures (22 and 4 °C). The protein and precipitant solutions were mixed in three volume ratios (1:2, 1:1, and 2:1) in a final volume of 300 nl with the help of the Mosquito LCP nanodispenser (TTP Labtech). The plates were imaged using Formulatrix Rock Imagers RI54 (at 22 °C) and RI27 (at 4 °C). The crystallization results were monitored with the IceBear software expert system (
      • Daniel E.
      • Maksimainen M.M.
      • Smith N.
      • Ratas V.
      • Biterova E.
      • Murthy S.N.
      • et al.
      IceBear: an intuitive and versatile web application for research-data tracking from crystallization experiment to PDB deposition.
      ).
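      The drop compositions used in the initial screening follow directly from the volume ratios and the final drop volume. The short sketch below works this out, assuming that the ratios are given as protein:precipitant (the order is an assumption here); the function name is hypothetical.

```python
def drop_volumes(total_nl, protein_parts, precipitant_parts):
    """Split a drop of total_nl into (protein_nl, precipitant_nl) for a given ratio."""
    parts = protein_parts + precipitant_parts
    protein_nl = total_nl * protein_parts / parts
    precipitant_nl = total_nl * precipitant_parts / parts
    return protein_nl, precipitant_nl


# The three ratios from the text, each in a final volume of 300 nl
for ratio in [(1, 2), (1, 1), (2, 1)]:
    protein_nl, precipitant_nl = drop_volumes(300, *ratio)
    print(f"{ratio[0]}:{ratio[1]} -> {protein_nl:.0f} nl protein + {precipitant_nl:.0f} nl precipitant")
```

      Under this assumption, the three drops per condition contain 100, 150, and 200 nl of protein solution, respectively.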
      The first, poorly diffracting, crystals of the C-P4H-II-Δ281 complex were obtained at 4 °C using a well solution of 0.1 M MES, pH 6.0, containing 20% (w/v) PEG4000, 0.2 M lithium sulfate (ProPlex screen HT-96; Molecular Dimensions) at a volume ratio of 1:2 for the protein and well solutions, respectively. In subsequent purifications, the SEC buffer of the last step of the purification protocol was supplemented with 1 mM EDTA, and in further crystallization experiments, several additive screens were tested to improve the crystal quality. The best crystals were eventually obtained at 4 °C by mixing equal volumes of a protein solution of 16.5 mg/ml in 50 mM Tris–HCl, pH 7.8 (at 4 °C), 50 mM NaCl, 100 mM glycine, 1 mM EDTA, with a well solution of 0.1 M MES, pH 6.0 (at 4 °C), 20% (w/v) PEG4000, 0.2 M lithium sulfate, and 3% methanol. The initial crystal testing was done in the BCO Structural Biology core facility/Biocenter Finland national data collection core facility, using the in-house Microstar X8 Proteum X-ray generator (Bruker). Before cryocooling by immersion in liquid nitrogen, crystals were cryoprotected by quick transfer (a few seconds of equilibration) into a cryobuffer (0.1 M MES, pH 6.0 buffer with 15% ethylene glycol, 20% [w/v] PEG4000, and 0.2 M lithium sulfate). All the crystal fishing and cryocooling protocols were performed in the cold room at 4 °C. Many crystals were tested with the home source X-ray generator, and the best crystals (showing diffraction between 6 Å and 9 Å resolution) were stored and shipped to synchrotrons (ESRF [beamline ID29], PETRA III [DESY, EMBL Hamburg, beamline P14], and Diamond Light Source, beamlines I03, I04, and VMXi). The best dataset (3.8 Å resolution) was collected at the beamline P14, PETRA III (DESY), EMBL Hamburg. The X-ray images were processed by XDS (
      • Kabsch W.
      XDS.
      ), and the data were scaled using Aimless (
      • Evans P.R.
      • Murshudov G.N.
      How good are my data and what is the resolution.
      ) of the CCP4 package (
      • Winn M.D.
      • Ballard C.C.
      • Cowtan K.
      • Dodson E.J.
      • Emsley P.
      • Evans P.R.
      • et al.
      Overview of the CCP4 suite and current developments.
      ). The data processing statistics are listed in Table 2.
      The structure of the C-P4H-II-Δ281 complex was solved by molecular replacement, using Phaser (
      • McCoy A.J.
      • Grosse-Kunstleve R.W.
      • Adams P.D.
      • Winn M.D.
      • Storoni L.C.
      • Read R.J.
      Phaser crystallographic software.
      ) and Molrep (
      • Vagin A.
      • Teplyakov A.
      Molrep: an automated program for molecular replacement.
      ). In the first step, the crystal structure of Cr-P4H (PDB entry: 2JIG) (
      • Koski M.K.
      • Hieta R.
      • Böllner C.
      • Kivirikko K.I.
      • Myllyharju J.
      • Wierenga R.K.
      The active site of an algal prolyl 4-hydroxylase has a large structural plasticity.
      ) was used as a model to find two copies of the CAT domain. The two copies formed a dimer with the orientation of the local twofold axis consistent with the self-rotation function calculated using Molrep (
      • Vagin A.
      • Teplyakov A.
      Molrep: an automated program for molecular replacement.
      ). Such a confirmation was important at this point because, with low-resolution data, incorrect molecular replacement solutions with a good contrast are not unusual, whereas electron density maps from a partial structure are not very instructive. In the next steps, two copies of each of the a, a’, and b’ (in this order) domains of the β/PDI subunit were positioned using Phaser. The a and a’ domains were found by using as search models the structures of these domains, as present in the oxidized (“open”) form of human PDI (PDB entry: 4EL1) (
      • Wang C.
      • Li W.
      • Ren J.
      • Fang J.
      • Ke H.
      • Gong W.
      • et al.
      Structural insights into the redox-regulated dynamic conformations of human protein disulfide isomerase.
      ), and the b’ domain was found by using the structure of this domain in the β/PDI subunit of the MTP complex (PDB entry: 6I7S (
      • Biterova E.I.
      • Isupov M.N.
      • Keegan R.M.
      • Lebedev A.A.
      • Sohail A.A.
      • Liaqat I.
      • et al.
      The crystal structure of human microsomal triglyceride transfer protein.
      )). The PDI domains were located around the central CAT domain dimer and followed the same local twofold symmetry as the previously positioned CAT domains. The initial refinement trials with the partial model with Phenix (
      • Liebschner D.
      • Afonine P.V.
      • Baker M.L.
      • Bunkoczi G.
      • Chen V.B.
      • Croll T.I.
      • et al.
      Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix.
      ), Refmac5 (
      • Murshudov G.N.
      • Skubak P.
      • Lebedev A.A.
      • Pannu N.S.
      • Steiner R.A.
      • Nicholls R.A.
      • et al.
      REFMAC5 for the refinement of macromolecular crystal structures.
      ), and Lowrestr (
      • Kovalevskiy O.
      • Nicholls R.A.
      • Murshudov G.N.
      Automated refinement of macromolecular structures at low resolution using prior information.
      ) gave Rwork and Rfree of around 40.0 and 45.0%, respectively. The b domain was not found by molecular replacement searches, but it was positioned manually in the weak electron density map. First, the PDI structure (PDB entry: 4EL1) was superposed onto the already positioned a and b' domains using Coot (
      • Emsley P.
      • Lohkamp B.
      • Scott W.G.
      • Cowtan K.
      Features and development of Coot.
      ), and subsequently, the b domain position was optimized manually. The complete structure was then refined using the BUSTER-TNT software (Global Phasing Limited) (
      • Bricogne G.
      • Blanc E.
      • Brandl M.
      • Flensburg C.
      • Keller P.
      • Paciorek W.
      • et al.
      BUSTER Version 2.10.3.
      ), and model corrections were done using Coot (
      • Emsley P.
      • Lohkamp B.
      • Scott W.G.
      • Cowtan K.
      Features and development of Coot.
      ). In this model, the linker region at the N terminus of the α-subunit construct and the C-terminal tail of the α-subunit were built as a polyalanine model. Rwork and Rfree were 33.8% and 35.4%, respectively.
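The Rwork and Rfree values quoted here follow the conventional crystallographic definition, R = Σ| |Fobs| − k|Fcalc| | / Σ|Fobs|, evaluated over the working and the test reflections, respectively. The following Python sketch illustrates this bookkeeping only. The function names, the single scale factor k, and the list-based input are assumptions made for illustration; refinement programs such as BUSTER and Refmac5 apply more elaborate scaling and bulk-solvent corrections that are not reproduced here.

```python
def r_factor(f_obs, f_calc, scale=1.0):
    """Conventional R-factor: sum(| |Fobs| - k*|Fcalc| |) / sum(|Fobs|)."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("sum of |Fobs| is zero")
    numerator = sum(abs(abs(o) - scale * abs(c)) for o, c in zip(f_obs, f_calc))
    return numerator / denominator


def r_work_and_r_free(f_obs, f_calc, free_flags, scale=1.0):
    """Split reflections by a boolean test-set flag and return (Rwork, Rfree)."""
    work = [(o, c) for o, c, is_free in zip(f_obs, f_calc, free_flags) if not is_free]
    free = [(o, c) for o, c, is_free in zip(f_obs, f_calc, free_flags) if is_free]
    r_work = r_factor([o for o, _ in work], [c for _, c in work], scale)
    r_free = r_factor([o for o, _ in free], [c for _, c in free], scale)
    return r_work, r_free
```

Because the test reflections are excluded from refinement, a gap between the two values that grows during refinement is commonly read as a sign of overfitting, which is why both numbers are reported at each stage.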
      At this stage, the predicted models of the AlphaFold2 (
      • Jumper J.
      • Evans R.
      • Pritzel A.
      • Green T.
      • Figurnov M.
      • Ronneberger O.
      • et al.
      Highly accurate protein structure prediction with AlphaFold.
      ,
      • Tunyasuvunakool K.
      • Adler J.
      • Wu Z.
      • Green T.
      • Zielinski M.
      • Žídek A.
      • et al.
      Highly accurate protein structure prediction for the human proteome.
      ) and Robetta (
      • Baek M.
      • DiMaio F.
      • Anishchenko I.
      • Dauparas J.
      • Ovchinnikov S.
      • Lee G.R.
      • et al.
      Accurate prediction of protein structures and interactions using a three-track neural network.
      ) structure prediction machine learning software became available. Both models predicted the same conformation for the regions that had been built as polyalanine stretches in the intermediately refined C-P4H-II-Δ281 structure, and these predicted conformations were also the same as those built in this intermediate structure. Subsequently, the sequences of the predicted model were assigned to the polyalanine stretches of the α-subunits, and the structure refinement was revisited, using the AlphaFold2 models of the CAT domain and the β/PDI subunit to generate additional geometric restraints.
      The final structure is the result of three additional series of refinement with BUSTER, Refmac5, and again BUSTER. Model correction in Coot was done using BUSTER maps, whereas the intermediate Refmac5 series helped to push the optimization forward. Rwork/Rfree in the last BUSTER refinement run before and in the first BUSTER refinement run after the Refmac5 series were 27.8%/30.4% and 26.0%/28.0%, respectively. The slow progress of the refinement is possibly associated with the inaccuracy of the initial positioning of the domains of the less ordered β/PDI subunits and smaller gradients for their atoms during refinement (the centers of the individual domains of the β/PDI subunits, calculated as averaged coordinates of Cα-atoms, moved during the three series of refinement by 0.17 to 0.52 Å, whereas the centers of the two CAT domains moved by only 0.12 and 0.13 Å).
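The domain displacements given in parentheses are distances between domain centers, each center being the average of the Cα coordinates of that domain. A minimal Python sketch of this calculation is given below. The function names and the representation of a domain as a list of (x, y, z) tuples in Å are assumptions made for illustration, and the sketch assumes that both coordinate sets are expressed in the same crystal frame, so that no superposition is needed.

```python
import math


def ca_centroid(ca_coords):
    """Average the Cα coordinates (x, y, z) of one domain."""
    n = len(ca_coords)
    if n == 0:
        raise ValueError("domain contains no Cα atoms")
    return tuple(sum(atom[i] for atom in ca_coords) / n for i in range(3))


def centroid_shift(ca_before, ca_after):
    """Distance (in the input units) between the domain centers of two models."""
    return math.dist(ca_centroid(ca_before), ca_centroid(ca_after))
```

Applied per domain to the models before and after the three refinement series, such a function would yield one displacement value per domain, comparable with the ranges quoted above.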
      The refinement protocols for BUSTER and Refmac5 were similar and included TLS refinement (one “big” cycle in the case of BUSTER or 10 cycles in the case of Refmac5) followed by restrained refinement of atomic coordinates and the overall B-factor (5–7 “big” cycles or 20 cycles). In both protocols, the relative weight between X-ray and geometry terms was estimated by the respective program. B-factors of all atoms were reset to 50 Å² prior to each refinement run. In the TLS refinement cycles, the entire CAT domains and individual domains of the β/PDI subunits were assigned as TLS groups (based on the results of preliminary trials, where subunit-based definition of TLS groups gave higher R-factors). TLS parameters were recalculated from scratch in each run. The predicted models of the CAT domain and the β/PDI subunit were generated using AlphaFold2, as implemented at the EBI (
      • Tunyasuvunakool K.
      • Adler J.
      • Wu Z.
      • Green T.
      • Zielinski M.
      • Žídek A.
      • et al.
      Highly accurate protein structure prediction for the human proteome.
      ), and these models were used as reference structures in the BUSTER refinement, from which BUSTER internally generated local structural similarity restraints (LSSRs) (
      • Smart O.S.
      • Womack T.O.
      • Flensburg C.
      • Keller P.
      • Paciorek W.
      • Sharff A.
      • et al.
      Exploiting structure similarity in refinement: automated NCS and target-structure restraints in BUSTER.
      ). In the case of Refmac5, the external restraints (similar to LSSRs but named differently) were generated explicitly using ProSMART (
      • Nicholls R.A.
      • Long F.
      • Murshudov G.N.
      Low-resolution refinement tools in REFMAC5.
      ), and weights were customized for the repulsion restraints (increased from 1 to 2) and the X-ray term (fixed at 0.0004). The final BUSTER model was validated with a Refmac5 test refinement cycle, using the same parameters but without external restraints. Comparing the final model (BUSTER with LSSR) with the control model (Refmac5 without external restraints) gives an overall Cα rmsd of 0.15 Å (0.14–0.17 Å for individual subunits), with a maximum Cα difference of 0.5 Å, for Gly298 of subunit B, showing that the LSSRs did not introduce any considerable model bias. Sharpened maps for model building and structure analysis were calculated using the default map coefficients of BUSTER.
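The model-bias check described above compares equivalent Cα positions in two refined models and reports the rmsd together with the largest single deviation. A minimal Python sketch of such a comparison follows. The function name and the dictionary input keyed by (chain, residue number) are assumptions made for illustration, and the sketch assumes that both models are in the same crystal frame, so that the coordinates are compared directly without superposition; the program actually used for this comparison is not stated in the text.

```python
import math


def ca_rmsd_and_max(model_a, model_b):
    """Compare Cα positions of two models.

    Each model maps (chain_id, residue_number) to an (x, y, z) tuple.
    Returns (rmsd, key_of_largest_deviation, largest_deviation),
    computed over the residues present in both models.
    """
    common = sorted(set(model_a) & set(model_b))
    if not common:
        raise ValueError("the models share no residues")
    deviations = {key: math.dist(model_a[key], model_b[key]) for key in common}
    rmsd = math.sqrt(sum(d * d for d in deviations.values()) / len(deviations))
    worst = max(deviations, key=deviations.get)
    return rmsd, worst, deviations[worst]
```

Restricting the input dictionaries to one chain at a time would give per-subunit values of the kind quoted in parentheses above.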
      The quality of the structure was monitored using the validation tools in Coot, Molprobity (
      • Williams Ch.J.
      • Headd J.J.
      • Moriarty N.W.
      • Prisant M.G.
      • Videau L.L.
      • Deis L.N.
      • et al.
      MolProbity: more and better reference data for improved all-atom structure validation.
      ), and the wwPDB validation server throughout the refinement. The refinement statistics of the final model (Rwork = 24.4%, Rfree = 27.8%, as calculated by BUSTER) are listed in Table 2. The structures of both CAT domains as well as of the a’ and b’ domains of both β/PDI subunits in the final model are well defined by the electron density map (Figs. S5–S7). In addition, the structures of the PDI a domains are well defined, and the two least well-defined PDI b domains still unambiguously match the electron density map.

      Data availability

      The structure presented in this article has been deposited in the PDB with the code 7ZSC.

      Supporting information

      This article contains supporting information (
      • Koski M.K.
      • Hieta R.
      • Böllner C.
      • Kivirikko K.I.
      • Myllyharju J.
      • Wierenga R.K.
      The active site of an algal prolyl 4-hydroxylase has a large structural plasticity.
      ,
      • Koski M.K.
      • Hieta R.
      • Hirsilä M.
      • Rönkä A.
      • Myllyharju J.
      • Wierenga R.K.
      The crystal structure of an algal prolyl 4-hydroxylase complexed with a proline-rich peptide reveals a novel buried tripeptide binding motif.
      ,
      • Wang C.
      • Li W.
      • Ren J.
      • Fang J.
      • Ke H.
      • Gong W.
      • et al.
      Structural insights into the redox-regulated dynamic conformations of human protein disulfide isomerase.
      ,
      • Dong G.
      • Wearsch P.A.
      • Peaper D.R.
      • Cresswell P.
      • Reinisch K.M.
      Insights into MHC class I peptide loading from the structure of the tapasin-ERp57 thiol oxidoreductase heterodimer.
      ,
      • Myllykoski M.
      • Sutinen A.
      • Koski M.K.
      • Kallio J.P.
      • Raasakka A.
      • Myllyharju J.
      • et al.
      Structure of transmembrane prolyl 4-hydroxylase reveals unique organization of EF and dioxygenase domains.
      ,
      • Chowdhury R.
      • McDonough M.A.
      • Mecinovic J.
      • Loenarz C.
      • Flashman E.
      • Hewitson K.S.
      • et al.
      Structural basis for binding of hypoxia-inducible factor to the oxygen-sensing prolyl hydroxylases.
      ,
      • Ahmed S.
      • Ayscough A.
      • Barker G.R.
      • Canning H.E.
      • Davenport R.
      • Downham R.
      • et al.
      1,2,4-Triazolo-[1,5-a]pyridine HIF prolylhydroxylase domain-1 (PHD-1) inhibitors with a novel monodentate binding interaction.
      ,
      • Longbotham J.E.
      • Levy C.
      • Johannissen L.O.
      • Tarhonskaya H.
      • Jiang S.
      • Loenarz C.
      • et al.
      Structure and mechanism of a viral collagen prolyl hydroxylase.
      ).

      Conflict of interest

      A patent for the production system used to make the protein for structural studies using sulfhydryl oxidases in the cytoplasm of E. coli is held by the University of Oulu: Method for producing natively folded proteins in a prokaryotic host (Patent number: 9238817; date of patent: January 19, 2016). Inventor: L. W. R.
      J. M. owns equity in FibroGen, Inc, which develops HIF-P4H inhibitors as potential therapeutics. This company supports research in the J. M. group.
      All the other authors declare that they have no conflicts of interest with the contents of this article.

      Acknowledgments

      The use of the facilities and expertise of the BCO Structural Biology and Molecular Biophysics core facilities as well as the Biocenter Kuopio Structural Biology core facility, members of Biocenter Finland, Instruct-ERIC Centre Finland, and FINStruct is gratefully acknowledged. We also acknowledge the expert support of the Proteomics and Protein Analysis core facility of the BCO. Datasets were collected at ESRF (ID-29), DLS (I03, I04, VMXi), and PETRA III (P14). We gratefully acknowledge the expert assistance of the beamline scientists at the synchrotron beamlines. We thank Sarel Fleishman (Weizmann Institute of Science, Israel) for making available the AlphaFold2 protocol that was used to predict the structure of the α2-dimer. We thank Jothi Anantharajan, who started the studies of the truncated variants during her PhD project on the characterization of the domains of C-P4H-I. Minna Siurua and Eeva Lehtimäki are acknowledged for expert technical assistance in the activity assays. We also thank Ville Ratas and Tiila-Riikka Kiema for expert support concerning the crystallization experiments and in-house data collection, respectively. The Fourier transform ion cyclotron resonance laboratory is supported by the European Regional Development Fund (grant no.: A70135) and the European Network of Fourier Transform Ion Cyclotron Resonance Mass Spectrometry Centers (grant agreement no.: 731077).

      Author contributions

      A. V. M., R. S., M. K. K., and R. K. W. conceptualization; A. V. M., R. S., A. L., A. M. S., K. K., R. V., H. T., U. B., J. J., M. L., M. K. K., and R. K. W. methodology; A. V. M., R. S., A. L., A. M. S., M. L., M. K. K., and R. K. W. investigation; R. V., J. J., L. R., J. M., M. K. K., and R. K. W. resources; A. V. M., R. S., M. K. K., and R. K. W. writing–original draft; A. V. M., R. S., A. L., A. M. S., R. V., H. T., U. B., J. J., M. L., L. R., J. M., M. K. K., and R. K. W. writing–review and editing; M. K. K. and R. K. W. visualization; J. J., L. R., J. M., M. K. K., and R. K. W. supervision; M. K. K. and R. K. W. project administration.

      Funding and additional information

      This study was supported by the Academy of Finland through Academy Project grant 296498 (to J. M.) and Center of Excellence 2012 to 2017 grant 251314 (to J. M.), by the S. Jusélius Foundation (to J. M.) and the Jane and Aatos Erkko Foundation (to J. M.) as well as by the Academy of Finland grant 318182 (to L. W. R.) and by a grant from the Sigrid Jusélius Foundation (to M. K. K./R. K. W.).

      References

        • Myllyharju J.
        Prolyl 4-hydroxylases, the key enzymes of collagen biosynthesis.
        Matrix Biol. 2003; 22: 15-24
        • Myllyharju J.
        Prolyl 4-hydroxylases, key enzymes in the synthesis of collagens and regulation of the response to hypoxia, and their roles as treatment targets.
        Ann. Med. 2008; 40: 402-417
        • Gorres K.L.
        • Raines R.T.
        Prolyl 4-hydroxylase.
        Crit. Rev. Biochem. Mol. Biol. 2010; 45: 106-124
        • Salo A.M.
        • Myllyharju J.
        Prolyl and lysyl hydroxylases in collagen synthesis.
        Exp. Dermatol. 2021; 30: 38-49
        • Shoulders M.D.
        • Raines R.T.
        Collagen structure and stability.
        Annu. Rev. Biochem. 2009; 78: 929-958
        • Rappu P.
        • Salo A.M.
        • Myllyharju J.M.
        • Heino J.
        Role of prolyl hydroxylation in the molecular interactions of collagens.
        Essays Biochem. 2019; 63: 325-335
        • Franklin T.J.
        Therapeutic approaches to organ fibrosis.
        Int. J. Biochem. Cell Biol. 1997; 29: 79-89
        • Vasta J.D.
        • Raines R.T.
        Collagen Prolyl 4-hydroxylase as a therapeutic target.
        J. Med. Chem. 2018; 61: 10403-10411
        • Shi R.
        • Gao S.
        • Zhang J.
        • Xu J.
        • Graham L.M.
        • Yang X.
        • et al.
        Collagen prolyl 4-hydroxylases modify tumor progression.
        Acta Biochim. Biophys. Sin. (Shanghai). 2021; 53: 805-814
        • Myllyharju J.
        • Kivirikko K.I.
        Characterization of the iron- and 2-oxoglutarate-binding sites of human prolyl 4-hydroxylase.
        EMBO J. 1997; 16: 1173-1180
        • Ellgaard L.
        • Ruddock L.W.
        The human protein disulphide isomerase family: substrate interactions and functional properties.
        EMBO Rep. 2005; 6: 28-32
        • Hatahet F.
        • Ruddock L.W.
        Protein disulfide isomerase: a critical evaluation of its function in disulfide bond formation.
        Antioxid. Redox Signal. 2009; 11: 2807-2850
        • Khan H.A.
        • Mutus B.
        Protein disulfide isomerase a multifunctional protein with multiple physiological roles.
        Front. Chem. 2014; 2: 70
        • Helaakoski T.
        • Annunen P.
        • Vuori K.
        • MacNeil I.A.
        • Pihlajaniemi T.
        • Kivirikko K.I.
        Cloning, baculovirus expression, and characterization of a second mouse prolyl 4-hydroxylase α-subunit isoform: formation of an α2β2 tetramer with the protein disulfide-isomerase/β subunit.
        Proc. Natl. Acad. Sci. U. S. A. 1995; 92: 4427-4431
        • Kukkola L.
        • Hieta R.
        • Kivirikko K.I.
        • Myllyharju J.
        Identification and characterization of a third human, rat, and mouse collagen prolyl 4-hydroxylase isoenzyme.
        J. Biol. Chem. 2003; 278: 47685-47693
        • Helaakoski T.
        • Veijola J.
        • Vuori K.
        • Rehn M.
        • Chow L.T.
        • Taillon-Miller P.
        • et al.
        Structure and expression of the human gene for the α subunit of prolyl 4-hydroxylase. The two alternatively spliced types of mRNA correspond to two homologous exons the sequences of which are expressed in a variety of tissues.
        J. Biol. Chem. 1994; 269: 27847-27854
        • Nokelainen M.
        • Nissi R.
        • Kukkola L.
        • Helaakoski T.
        • Myllyharju J.
        Characterization of the human and mouse genes for the alpha subunit of type II prolyl 4-hydroxylase. Identification of a previously unknown alternatively spliced exon and its expression in various tissues.
        Eur. J. Biochem. 2001; 268: 5300-5309
        • Holster T.
        • Pakkanen O.
        • Soininen R.
        • Sormunen R.
        • Nokelainen M.
        • Kivirikko K.I.
        • et al.
        Loss of assembly of the main basement membrane collagen, type IV, but not fibril-forming collagens and embryonic death in collagen prolyl 4-hydroxylase-I null mice.
        J. Biol. Chem. 2007; 282: 2512-2519
        • Tolonen J.-P.
        • Salo A.M.
        • Finnilä M.
        • Aro E.
        • Karjalainen E.
        • Ronkainen V.-P.
        • et al.
        Reduced bone mass in collagen prolyl 4-hydroxylase P4ha1+/-;P4ha2-/- compound mutant mice.
        JBMR Plus. 2022; 6: e10630
        • Annunen P.
        • Helaakoski T.
        • Myllyharju J.
        • Veijola J.
        • Pihlajaniemi T.
        • Kivirikko K.I.
        Cloning of the human prolyl 4-hydroxylase alpha subunit isoform α(II) and characterization of the type II enzyme tetramer. The α(I) and α(II) subunits do not form a mixed α(I)α(II)β2 tetramer.
        J. Biol. Chem. 1997; 272: 17342-17348
        • Myllyharju J.
        • Kivirikko K.I.
        Identification of a novel proline-rich peptide-binding domain in prolyl 4-hydroxylase.
        EMBO J. 1999; 18: 306-312
        • Anantharajan J.
        • Koski M.K.
        • Kursula P.
        • Hieta R.
        • Bergmann U.
        • Myllyharju J.
        • et al.
        The structural motifs for the substrate binding and dimerization of the α subunit of collagen prolyl 4-hydroxylase.
        Structure. 2013; 21: 2107-2118
        • Koski M.K.
        • Anantharajan J.
        • Kursula P.
        • Dhavala P.
        • Murthy A.V.
        • Bergmann U.
        • et al.
        Assembly of the elongated collagen prolyl 4-hydroxylase α2β2 heterotetramer around a central α2 dimer.
        Biochem. J. 2017; 474: 751-769
        • Pekkala M.
        • Hieta R.
        • Bergmann U.
        • Kivirikko K.I.
        • Wierenga R.K.
        • Myllyharju J.
        The peptide-substrate-binding domain of collagen prolyl 4-hydroxylases is a tetratricopeptide repeat domain with functional aromatic residues.
        J. Biol. Chem. 2004; 279: 52255-52261
        • Murthy A.V.
        • Sulu R.
        • Koski M.K.
        • Tu H.
        • Anantharajan J.
        • Sah-Teli S.K.
        • et al.
        Structural enzymology binding studies of the peptide-substrate-binding domain of human collagen prolyl 4-hydroxylase (type-II): high affinity peptides have a PxGP sequence motif.
        Protein Sci. 2018; 27: 1692-1703
        • Koski M.K.
        • Hieta R.
        • Böllner C.
        • Kivirikko K.I.
        • Myllyharju J.
        • Wierenga R.K.
        The active site of an algal prolyl 4-hydroxylase has a large structural plasticity.
        J. Biol. Chem. 2007; 282: 37112-37123
        • Koski M.K.
        • Hieta R.
        • Hirsilä M.
        • Rönkä A.
        • Myllyharju J.
        • Wierenga R.K.
        The crystal structure of an algal prolyl 4-hydroxylase complexed with a proline-rich peptide reveals a novel buried tripeptide binding motif.
        J. Biol. Chem. 2009; 284: 25290-25301
        • Dunwell J.M.
        • Purvis A.
        • Khuri S.
        Cupins: the most functionally diverse protein superfamily?
        Phytochem. 2004; 65: 7-17
        • Islam Md.S.
        • Leissing T.M.
        • Chowdhury R.
        • Hopkins R.J.
        • Schofield C.J.
        2-oxoglutarate-dependent oxygenases.
        Annu. Rev. Biochem. 2018; 20: 659-672
        • John D.C.
        • Grant M.E.
        • Bulleid N.J.
        Cell-free synthesis and assembly of prolyl 4-hydroxylase: the role of the beta-subunit (PDI) in preventing misfolding and aggregation of the alpha-subunit.
        EMBO J. 1993; 12: 1587-1595
        • Vuori K.
        • Pihlajaniemi T.
        • Myllylä R.
        • Kivirikko K.I.
        Site-directed mutagenesis of human protein disulphide isomerase: effect on the assembly, activity and endoplasmic reticulum retention of human prolyl 4-hydroxylase in Spodoptera frugiperda insect cells.
        EMBO J. 1992; 11: 4213-4217
        • Koivunen P.
        • Pirneskoski A.
        • Karvonen P.
        • Ljung J.
        • Helaakoski T.
        • Notbohm H.
        • et al.
        The acidic C-terminal domain of protein disulfide isomerase is not critical for the enzyme subunit function or for the chaperone or disulfide isomerase activities of the polypeptide.
        EMBO J. 1999; 18: 65-74
        • Pirneskoski A.
        • Ruddock L.W.
        • Klappa P.
        • Freedman R.B.
        • Kivirikko K.I.
        • Koivunen P.J.
        Domains b' and a' of protein disulfide isomerase fulfill the minimum requirement for function as a subunit of prolyl 4-hydroxylase. The N-terminal domains a and b enhances this function and can be substituted in part by those of ERp57.
        J. Biol. Chem. 2001; 276: 11287-11293
        • Koivunen P.
        • Salo K.E.
        • Myllyharju J.
        • Ruddock L.W.
        Three binding sites in protein-disulfide isomerase cooperate in collagen prolyl 4-hydroxylase tetramer assembly.
        J. Biol. Chem. 2005; 280: 5227-5235
        • Wang C.
        • Li W.
        • Ren J.
        • Fang J.
        • Ke H.
        • Gong W.
        • et al.
        Structural insights into the redox-regulated dynamic conformations of human protein disulfide isomerase.
        Antioxid. Redox Signal. 2013; 19: 36-45
        • Biterova E.I.
        • Isupov M.N.
        • Keegan R.M.
        • Lebedev A.A.
        • Sohail A.A.
        • Liaqat I.
        • et al.
        The crystal structure of human microsomal triglyceride transfer protein.
        Proc. Natl. Acad. Sci. U. S. A. 2019; 116: 17251-17260
        • Dong G.
        • Wearsch P.A.
        • Peaper D.R.
        • Cresswell P.
        • Reinisch K.M.
        Insights into MHC class I peptide loading from the structure of the tapasin-ERp57 thiol oxidoreductase heterodimer.
        Immunity. 2009; 30: 21-32
        • Jumper J.
        • Evans R.
        • Pritzel A.
        • Green T.
        • Figurnov M.
        • Ronneberger O.
        • et al.
        Highly accurate protein structure prediction with AlphaFold.
        Nature. 2021; 596: 583-589
        • Hatahet F.
        • Nguyen V.D.
        • Salo K.E.
        • Ruddock L.W.
        Disruption of reducing pathways is not essential for efficient disulfide bond formation in the cytoplasm of E. coli.
        Microb. Cell Fact. 2010; 9: 67
        • Nguyen V.D.
        • Hatahet F.
        • Salo K.E.
        • Enlund E.
        • Zhang C.
        • Ruddock L.W.
        Pre-expression of a sulfhydryl oxidase significantly increases the yields of eukaryotic disulfide bond containing proteins expressed in the cytoplasm of E.coli.
        Microb. Cell Fact. 2011; 10: 1
        • Myllykoski M.
        • Sutinen A.
        • Koski M.K.
        • Kallio J.P.
        • Raasakka A.
        • Myllyharju J.
        • et al.
        Structure of transmembrane prolyl 4-hydroxylase reveals unique organization of EF and dioxygenase domains.
        J. Biol. Chem. 2020; 296: 100197
        • Chowdhury R.
        • McDonough M.A.
        • Mecinovic J.
        • Loenarz C.
        • Flashman E.
        • Hewitson K.S.
        • et al.
        Structural basis for binding of hypoxia-inducible factor to the oxygen-sensing prolyl hydroxylases.
        Structure. 2009; 17: 981-989
        • Ahmed S.
        • Ayscough A.
        • Barker G.R.
        • Canning H.E.
        • Davenport R.
        • Downham R.
        • et al.
        1,2,4-Triazolo-[1,5-a]pyridine HIF prolylhydroxylase domain-1 (PHD-1) inhibitors with a novel monodentate binding interaction.
        J. Med. Chem. 2017; 60: 5663-5672
        • Longbotham J.E.
        • Levy C.
        • Johannissen L.O.
        • Tarhonskaya H.
        • Jiang S.
        • Loenarz C.
        • et al.
        Structure and mechanism of a viral collagen prolyl hydroxylase.
        Biochemistry. 2015; 54: 6093-6105
        • Culpepper M.A.
        • Scott E.E.
        • Limburg J.
        Crystal structure of prolyl 4-hydroxylase from Bacillus anthracis.
        Biochemistry. 2010; 49: 124-133
        • John D.C.
        • Bulleid N.J.
        Prolyl 4-hydroxylase: Defective assembly of alpha-subunit mutants indicates that assembled alpha-subunits are intramolecularly disulfide bonded.
        Biochemistry. 1994; 33: 14018-14025
        • Lumb R.A.
        • Bulleid N.J.
        Is protein disulfide isomerase a redox-dependent molecular chaperone?
        EMBO J. 2002; 21: 6763-6770
        • Lamberg A.
        • Pihlajaniemi T.
        • Kivirikko K.I.
        Site-directed mutagenesis of the alpha subunit of human prolyl 4-hydroxylase. Identification of three histidine residues critical for catalytic activity.
        J. Biol. Chem. 1995; 270: 9926-9931
        • Walker K.W.
        • Gilbert H.F.
        Scanning and escape during protein-disulfide isomerase-assisted protein folding.
        J. Biol. Chem. 1997; 272: 8845-8848
        • Wijma H.J.
        • Floor R.J.
        • Jekel P.A.
        • Baker D.
        • Marrink S.J.
        • Janssen D.B.
        Computationally designed libraries for rapid enzyme stabilization.
        Protein Eng. Des. Sel. 2014; 27: 49-58
        • Klappa P.
        • Ruddock L.W.
        • Darby N.J.
        • Freedman R.B.
        The b' domain provides the principal peptide-binding site of protein disulfide isomerase but all domains contribute to binding of misfolded proteins.
        EMBO J. 1998; 17: 927-935
        • Nguyen V.D.
        • Wallis K.
        • Howard M.J.
        • Haapalainen A.M.
        • Salo K.E.
        • Saaranen M.J.
        • et al.
        Alternative conformations of the x region of human protein disulphide-isomerase modulate exposure of the substrate binding b' domain.
        J. Mol. Biol. 2008; 383: 1144-1155
        • Baek M.
        • DiMaio F.
        • Anishchenko I.
        • Dauparas J.
        • Ovchinnikov S.
        • Lee G.R.
        • et al.
        Accurate prediction of protein structures and interactions using a three-track neural network.
        Science. 2021; 373: 871-876
        • Hieta R.
        • Kukkola L.
        • Permi P.
        • Pirilä P.
        • Kivirikko K.I.
        • Kilpeläinen I.
        • et al.
        The peptide-substrate-binding domain of human collagen prolyl 4-hydroxylases. Backbone assignments, secondary structure, and binding of proline-rich peptides.
        J. Biol. Chem. 2003; 278: 34966-34974
        • de Jong L.
        • van der Kraan I.
        • de Waal A.
        The kinetics of the hydroxylation of procollagen by prolyl 4-hydroxylase. Proposal for a processive mechanism of binding of the dimeric hydroxylating enzyme in relation to the high kcat/Km ratio and a conformational requirement for hydroxylation of -X-Pro-Gly- sequences.
        Biochim. Biophys. Acta. 1991; 1079: 103-111
        • Kivirikko K.I.
        • Myllylä R.
        Posttranslational enzymes in the biosynthesis of collagen: intracellular enzymes.
        Methods Enzymol. 1982; 82: 245-304
        • Marie G.
        • Serani L.
        • Laprévote O.
        An on-line protein digestion device for rapid peptide mapping by electrospray mass spectrometry.
        Anal. Chem. 2000; 72: 5423-5430
        • Daniel E.
        • Maksimainen M.M.
        • Smith N.
        • Ratas V.
        • Biterova E.
        • Murthy S.N.
        • et al.
        IceBear: an intuitive and versatile web application for research-data tracking from crystallization experiment to PDB deposition.
        Acta Crystallogr. D Biol. Crystallogr. 2021; 77: 151-163
        • Kabsch W.
        XDS.
        Acta Crystallogr. D Biol. Crystallogr. 2010; 66: 125-132
        • Evans P.R.
        • Murshudov G.N.
        How good are my data and what is the resolution?
        Acta Crystallogr. D Biol. Crystallogr. 2013; 69: 1204-1214
        • Winn M.D.
        • Ballard C.C.
        • Cowtan K.
        • Dodson E.J.
        • Emsley P.
        • Evans P.R.
        • et al.
        Overview of the CCP4 suite and current developments.
        Acta Crystallogr. D Biol. Crystallogr. 2011; 67: 235-242
        • McCoy A.J.
        • Grosse-Kunstleve R.W.
        • Adams P.D.
        • Winn M.D.
        • Storoni L.C.
        • Read R.J.
        Phaser crystallographic software.
        J. Appl. Cryst. 2007; 40: 658-674
        • Vagin A.
        • Teplyakov A.
        Molrep: an automated program for molecular replacement.
        J. Appl. Cryst. 1997; 30: 1022-1025
        • Liebschner D.
        • Afonine P.V.
        • Baker M.L.
        • Bunkoczi G.
        • Chen V.B.
        • Croll T.I.
        • et al.
        Macromolecular structure determination using X-rays, neutrons and electrons: recent developments in Phenix.
        Acta Crystallogr. D Biol. Crystallogr. 2019; 75: 861-877
        • Murshudov G.N.
        • Skubak P.
        • Lebedev A.A.
        • Pannu N.S.
        • Steiner R.A.
        • Nicholls R.A.
        • et al.
        REFMAC5 for the refinement of macromolecular crystal structures.
        Acta Crystallogr. D Biol. Crystallogr. 2011; 67: 355-367
        • Kovalevskiy O.
        • Nicholls R.A.
        • Murshudov G.N.
        Automated refinement of macromolecular structures at low resolution using prior information.
        Acta Crystallogr. D Biol. Crystallogr. 2016; 72: 1149-1161
        • Emsley P.
        • Lohkamp B.
        • Scott W.G.
        • Cowtan K.
        Features and development of Coot.
        Acta Crystallogr. D Biol. Crystallogr. 2010; 66: 486-501
        • Bricogne G.
        • Blanc E.
        • Brandl M.
        • Flensburg C.
        • Keller P.
        • Paciorek W.
        • et al.
        BUSTER Version 2.10.3.
        Glob. Phasing Ltd, Cambridge, United Kingdom. 2017
        • Tunyasuvunakool K.
        • Adler J.
        • Wu Z.
        • Green T.
        • Zielinski M.
        • Žídek A.
        • et al.
        Highly accurate protein structure prediction for the human proteome.
        Nature. 2021; 596: 590-596
        • Smart O.S.
        • Womack T.O.
        • Flensburg C.
        • Keller P.
        • Paciorek W.
        • Sharff A.
        • et al.
        Exploiting structure similarity in refinement: automated NCS and target-structure restraints in BUSTER.
        Acta Crystallogr. D Biol. Crystallogr. 2012; 68: 368-380
        • Nicholls R.A.
        • Long F.
        • Murshudov G.N.
        Low-resolution refinement tools in REFMAC5.
        Acta Crystallogr. D Biol. Crystallogr. 2012; 68: 404-417
        • Williams Ch.J.
        • Headd J.J.
        • Moriarty N.W.
        • Prisant M.G.
        • Videau L.L.
        • Deis L.N.
        • et al.
        MolProbity: more and better reference data for improved all-atom structure validation.
        Protein Sci. 2018; 27: 293-315