Prolyl 4-Hydroxylation of α-Fibrinogen

Plasma proteome analysis requires sufficient power to compare numerous samples and detect changes in protein modification, because the protein content of human samples varies significantly among individuals, and many plasma proteins undergo changes in the bloodstream. A label-free proteomics platform developed in our laboratory, termed “Two-Dimensional Image Converted Analysis of Liquid chromatography and mass spectrometry (2DICAL),” is capable of these tasks. Here, we describe successful detection of novel prolyl hydroxylation of α-fibrinogen using 2DICAL, based on comparison of plasma samples of 38 pancreatic cancer patients and 39 healthy subjects. Using a newly generated monoclonal antibody 11A5, we confirmed the increase in prolyl-hydroxylated α-fibrinogen plasma levels and identified prolyl 4-hydroxylase A1 as a key enzyme for the modification. Competitive enzyme-linked immunosorbent assay of 685 blood samples revealed dynamic changes in prolyl-hydroxylated α-fibrinogen plasma level depending on clinical status. Prolyl-hydroxylated α-fibrinogen is presumably controlled by multiple biological mechanisms, which remain to be clarified in future studies.

For a comprehensive analysis of plasma proteins, a sufficient number of blood samples must be compared to overcome simple interindividual heterogeneity, because the protein content of human samples varies significantly among individuals. Sufficient power is also needed to detect protein modifications, because many plasma proteins undergo changes in the bloodstream (1). Although proteomic technologies have advanced (2,3), there remains room for improvement. Various isotope-labeling and identification-based methods have been developed for quantitative proteomics (4-6), but the number of samples that can be compared by current isotope-labeling methods is limited, and identification-based proteomics cannot capture information on unknown modifications.
A label-free proteomics platform developed in our laboratory, termed "Two-Dimensional Image Converted Analysis of Liquid chromatography and mass spectrometry (2DICAL)" (7), directly compares liquid chromatography and mass spectrometry (LC-MS) data and detects a protein modification by finding changes in the mass to charge ratio (m/z) and retention time (RT). Enhanced methods for accurate MS peak alignment across multiple LC runs have enabled the successful implementation of clinical studies requiring comparison of a large number of samples (8,9). Using 2DICAL to analyze plasma samples of pancreatic cancer patients and healthy controls, we discovered a novel prolyl hydroxylation of α-fibrinogen.
Fibrinogen and its modifications have been investigated because of their clinical importance (10,11). Separately, prolyl hydroxylation has attracted attention since the discovery of the hypoxia-inducible factor 1α (HIF1α) prolyl-hydroxylase and its role in switching HIF1α functions (12). Prolyl hydroxylation in other proteins has been actively sought, but only a few such proteins have been identified (13). Only one study has reported prolyl hydroxylation of fibrinogen, at the β chain (14).
Here, we report the detection of prolyl 4-hydroxylated α-fibrinogen by plasma proteome analysis. This protein modification changes dynamically in plasma depending on the clinical status and is a candidate plasma biomarker.

EXPERIMENTAL PROCEDURES
Clinical Samples-Seventy-seven plasma samples (38 patients with pancreatic ductal adenocarcinoma and 39 healthy controls) were obtained from the National Cancer Center Hospital (Tokyo, Japan) (Sets 1 and 2), and 9 plasma samples (5 patients with pancreatic ductal adenocarcinoma and 4 healthy controls) were obtained from the Tokyo Medical University Hospital (Tokyo, Japan) (Set 3) (15). A further 685 plasma samples from patients with various diseases and healthy controls (Set 4) were collected prospectively from seven medical institutions associated with the "Third-Term Comprehensive Control Research for Cancer" and will be described in detail elsewhere. Written informed consent was obtained from every subject. The study was reviewed and approved by the ethics committee of each institute.
Sample Preparation-To 20 μl of a plasma sample, 900 μl of phosphate-buffered saline and 100 μl of Con A-agarose (Vector, Burlingame, CA) were added, and the sample was incubated at 4°C for 2 h. After extensive washing with phosphate-buffered saline, proteins bound to Con A were eluted by competition with 100 mM mannose. To 30 μl of the eluted sample, 10 μl of 5 M urea, 2.5 μl of 1 M NH4HCO3, and 3.3 μg of sequencing grade modified trypsin (Promega, Madison, WI) were added. After digestion at 37°C for 20 h, peptides were dried with a SpeedVac concentrator (Thermo Electron, Holbrook, NY) and then dissolved in 50 μl of 0.1% formic acid.
Peak Alignment Across Multiple LC-MS Runs-MS peaks were detected, normalized, and quantified using the in-house 2DICAL software package, as described previously (7). To increase the accuracy of peak alignment across multiple LC-MS runs, we supplemented our previous method (8,9) with a greedy algorithm of the kind used for fast DNA sequence alignment.
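The text does not spell out the greedy alignment procedure, so the following is only a minimal sketch of the general idea of greedy peak matching between two LC-MS runs, not the 2DICAL implementation. It assumes each peak is a hypothetical `(m/z, RT, intensity)` tuple, that matching starts from the most intense reference peaks, and that the tolerances default to ±0.1 m/z and ±0.5 min (values borrowed from the preparative-run tolerance quoted in this section, used here purely for illustration).

```python
from typing import List, Tuple

# Hypothetical peak representation: (m/z, retention time in min, intensity)
Peak = Tuple[float, float, float]


def greedy_align(ref: List[Peak], run: List[Peak],
                 mz_tol: float = 0.1, rt_tol: float = 0.5) -> List[Tuple[int, int]]:
    """Greedily pair peaks of `run` with peaks of `ref`.

    Reference peaks are visited from most to least intense; each takes the
    closest still-unmatched peak of `run` inside the tolerance window.
    Returns (ref_index, run_index) pairs sorted by ref_index.
    """
    order = sorted(range(len(ref)), key=lambda i: ref[i][2], reverse=True)
    used = set()
    pairs = []
    for i in order:
        mz, rt, _ = ref[i]
        best_j, best_cost = None, None
        for j, (mz2, rt2, _) in enumerate(run):
            if j in used:
                continue
            dmz, drt = abs(mz2 - mz), abs(rt2 - rt)
            if dmz > mz_tol or drt > rt_tol:
                continue
            # Tolerance-normalised distance, so m/z and RT weigh equally
            cost = dmz / mz_tol + drt / rt_tol
            if best_cost is None or cost < best_cost:
                best_j, best_cost = j, cost
        if best_j is not None:
            used.add(best_j)  # a greedy choice is never revisited
            pairs.append((i, best_j))
    return sorted(pairs)


# Illustrative (made-up) peaks for two runs
ref_run = [(827.40, 20.0, 500.0), (819.40, 21.0, 300.0), (552.00, 20.0, 200.0)]
other_run = [(819.45, 21.2, 280.0), (827.38, 20.3, 450.0), (412.20, 10.0, 50.0)]
pairs = greedy_align(ref_run, other_run)
```

Because each decision is final, a greedy pass is fast but can miss a globally better assignment; that trade-off is the usual reason for pairing it with a more careful alignment step.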
Protein and Modification Identification-MS and MS/MS data were acquired by preparative LC-MS runs with a tolerance of ±0.1 m/z and ±0.5 min of RT using QTOF Ultima and linear ion trap (LTQ)-Orbitrap (Thermo Fisher Scientific, Waltham, MA) mass spectrometers. The MS/MS data were analyzed with Mascot software (Matrix Science, London, UK) including oxidized histidine, oxidized methionine, and hydroxyproline as possible modifications. Chemical formulas were determined with Xcalibur software (Thermo Fisher Scientific) with a mass tolerance of 5 ppm.
Immunoblotting-Protein samples were separated by SDS-PAGE and electroblotted onto polyvinylidene difluoride membranes (Millipore, Billerica, MA). Blots were visualized with an enhanced chemiluminescence kit (GE Healthcare, Bucks, UK) and quantified as described previously (20).
Competitive ELISA-100 μl of plasma diluted 20-fold with phosphate-buffered saline or 100 μl of serially diluted HyP-ESS standard peptide was incubated with 100 μl of 1 μg/ml horseradish peroxidase-conjugated 11A5 antibody for 30 min. 50 μl of the solution was added to 96-well microtiter plates precoated with 50 ng of HyP-ESS peptide and incubated for 1 h. After extensive washing, wells were incubated with the OPD solution for 10 min, and absorbance at 490 nm was measured (supplemental Fig. S6D).
Statistical Analyses-Mann-Whitney U test was performed with the open-source statistical language R (version 2.7.0) (9).

RESULTS
Large Scale Quantitative Plasma Proteomics of Pancreatic Cancer Patients-Seventy-seven plasma samples (38 from patients with pancreatic cancer and 39 from healthy controls) were obtained from National Cancer Center Hospital. We used concanavalin A (Con A) to concentrate plasma glycoproteins (21). This "glycocapturing" procedure removed albumin and reduced the concentration of other abundant plasma proteins (22). Various aberrations of protein glycosylation accumulate in cancer (23,24). Most tumor markers of pancreatic cancer used clinically, including CA19-9, DUPAN-2, and NCC-ST-439, are known to be carbohydrate antigens (23,25). Each sample was anonymized, randomized, and measured in triplicate by 2DICAL. A total of 115,325 independent MS peaks were detected within a mass range of 250-1600 m/z and an LC RT of 0-45 min (Fig. 1A). The correlation coefficient (CC) and coefficient of variation (CV) values for the triplicate data were over 0.95 and under 0.15, respectively, in most subjects. To increase statistical robustness, the 77 samples were separated at random into two experimental sets (Set 1 (18 pancreatic cancer patients and 19 healthy controls) and Set 2 (20 pancreatic cancer patients and 20 healthy controls)), and the two sets were analyzed independently. We selected 10 peptide peaks showing a statistically significant difference between the cancer patients and controls (>2-fold difference, p < 0.0005 (Mann-Whitney U test), average peak intensity of >10 in either the cancer samples or the control samples) in both sets. We further selected 6 peaks (supplemental Fig. S1 and Table S1) by inspecting the 2DICAL reports with various two-dimensional views (Fig. 1B). The difference between cancer patients and controls was further validated in an independent set (Set 3, consisting of 5 pancreatic cancer patients and 4 healthy controls) obtained from another medical institution (Tokyo Medical University Hospital) (supplemental Fig. S2).
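The three selection criteria above (fold difference, Mann-Whitney p value, minimum average intensity) can be sketched as a per-peak filter. This is an illustrative sketch only, under several stated assumptions: the study ran the test in R, whereas this version is a plain-Python two-sided Mann-Whitney test using the normal approximation with tie and continuity corrections; "fold difference" is taken to be the ratio of group mean intensities; and the function names and example intensities are hypothetical.

```python
import math
from statistics import mean
from typing import Sequence


def mann_whitney_p(x: Sequence[float], y: Sequence[float]) -> float:
    """Two-sided Mann-Whitney U p value (normal approximation, tie-corrected)."""
    n1, n2 = len(x), len(y)
    pooled = sorted((v, g) for g, vals in enumerate((x, y)) for v in vals)
    ranks = [0.0] * len(pooled)
    tie_term = 0.0
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = midrank
        t = j - i
        tie_term += t ** 3 - t
        i = j
    r1 = sum(r for r, (_, g) in zip(ranks, pooled) if g == 0)
    u1 = r1 - n1 * (n1 + 1) / 2.0
    n = n1 + n2
    mu = n1 * n2 / 2.0
    var = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        return 1.0
    z = max((abs(u1 - mu) - 0.5) / math.sqrt(var), 0.0)  # continuity correction
    return math.erfc(z / math.sqrt(2.0))  # equals 2 * (1 - Phi(z))


def is_candidate(cancer: Sequence[float], control: Sequence[float],
                 fold: float = 2.0, alpha: float = 0.0005,
                 min_intensity: float = 10.0) -> bool:
    """Apply the >2-fold, p < 0.0005, and mean intensity > 10 criteria."""
    m_cancer, m_control = mean(cancer), mean(control)
    low, high = min(m_cancer, m_control), max(m_cancer, m_control)
    if high <= min_intensity:
        return False
    if low > 0 and high / low <= fold:
        return False
    return mann_whitney_p(cancer, control) < alpha


# Made-up intensities for one peak in a set of 18 patients and 19 controls
cancer = [40.0 + i for i in range(18)]
control = [10.0 + 0.5 * i for i in range(19)]
similar = [38.0 + i for i in range(19)]
```

In the study, a peak had to pass the filter independently in both Set 1 and Set 2, which in this sketch would amount to requiring `is_candidate(...)` to hold for each set's intensities.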
Target MS/MS Analysis for Peak Identification-Target MS/MS data were acquired from preparative LC-MS. The MS/MS spectra of the peaks of 552 m/z and 827 m/z matched the same ESSSHHP*GIAEFPSR sequence of fibrinogen α-polypeptide isoform α-E preproprotein (NP_000499/NP_068657) with the highest Mascot scores (supplemental Fig. S3 and not shown; * indicates a mismatch (described below)). These peaks were considered to be differently charged masses (triply and doubly charged, respectively) derived from the same peptide. The triply charged 546 m/z peak is considered to be a mass with neutral loss of H2O, because its appearance was almost identical to the peaks of 552 and 827 m/z. The peak of 1141 m/z matched another peptide sequence of fibrinogen α-polypeptide isoform α-E preproprotein, TFP*GFFSPMLGEFVSETESR, with the highest Mascot score (supplemental Fig. S4). No significant match was found with the 412 m/z peak despite its high-quality MS/MS spectrum (data not shown), probably because of an unknown posttranslational modification, a non-annotated gene sequence Fig. 2 and data not shown). However, the intensity of the three 16-dalton-smaller MS peaks did not differ significantly between pancreatic cancer patients and controls (Fig. 2B).
Determination of the 16-Dalton Increase by High Resolution MS-To clarify the nature of the 16-dalton increase, the peptides of 827 and 819 m/z as well as 1141 and 1133 m/z were analyzed with a high resolution Orbitrap mass spectrometer. In both pairs, the mass difference between the larger and the smaller peptide was 15.995 daltons, which is attributable exclusively to the addition of one oxygen atom (Fig. 3, A and B). MS/MS revealed that the addition took place on the proline 565 and 530 residues (Fig. 3C and supplemental Fig. S5A). A 155.0808 m/z fragment observed in the high resolution MS/MS spectrum of the 819 m/z peak and a 171.0762 m/z fragment observed in the spectrum of the 827 m/z peak led to the exclusive identification of their chemical formulas as C7H11O2N2 and C7H11O3N2, respectively (Fig. 3D). Formulas C7H11O3N2 and C7H11O2N2 match hydroxyproline-glycine and unmodified proline-glycine, respectively.
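The formula assignment above can be checked with simple arithmetic. The sketch below is not the Xcalibur procedure; it assumes the two fragments are singly charged cations whose elemental composition already includes the charging proton (so the ion mass is the formula mass minus one electron), and it uses standard monoisotopic atomic masses. The helper names are hypothetical.

```python
# Standard monoisotopic masses (Da) and the electron mass
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.00307401, "O": 15.99491462}
ELECTRON = 0.00054858


def cation_mz(formula: dict, charge: int = 1) -> float:
    """m/z of a cation whose formula already contains the charging proton(s)."""
    neutral = sum(MONO[element] * count for element, count in formula.items())
    return (neutral - charge * ELECTRON) / charge


def ppm_error(observed: float, theoretical: float) -> float:
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


PRO_GLY = {"C": 7, "H": 11, "O": 2, "N": 2}  # unmodified proline-glycine fragment
HYP_GLY = {"C": 7, "H": 11, "O": 3, "N": 2}  # hydroxyproline-glycine fragment

err_pro = ppm_error(155.0808, cation_mz(PRO_GLY))  # fragment of the 819 m/z peak
err_hyp = ppm_error(171.0762, cation_mz(HYP_GLY))  # fragment of the 827 m/z peak
shift = cation_mz(HYP_GLY) - cation_mz(PRO_GLY)    # mass of one oxygen atom
```

Under these assumptions both observed fragments fall within the 5 ppm tolerance stated in the methods, and the difference between the two theoretical values is the mass of a single oxygen atom (about 15.995 Da), in line with the shift measured for the intact peptides.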
Detection of Prolyl 4-Hydroxylation of α-Fibrinogen-The physiologically stable oxidation of proline occurs exclusively at the carbon in the fourth position (supplemental Fig. S5B). We used Ganp (germinal center-associated nuclear protein) (27) transgenic mice to produce a monoclonal antibody (named 11A5) that reacts with a synthetic peptide ESSSHHP(O)GIAEFPSR (P(O), 4-hydroxyproline) (named HyP-ESS) but not with an unmodified synthetic peptide with the same amino acid sequence (ESS) (supplemental Fig. S6A). GANP mice can produce highly diverse antibodies and have been used successfully to generate high affinity antibodies to various difficult antigens (18). We were unable to produce a monoclonal antibody with specificity for TFP(O)GFFSPMLGEFVSETESR (data not shown). Fibrinogen α-polypeptide is produced and secreted mainly by the liver. α-Fibrinogen with hydroxylation of its proline 565 residue (hereafter, αFG-565HyP) as well as the 3 polypeptides (the α-, β-, and γ-chains) of fibrinogen were detected in the lysates (Fig. 4A) and supernatants (data not shown) of cultured hepatic cells and several hepatocellular carcinoma cell lines by immunoblotting with 11A5 monoclonal antibody. The 4-hydroxylation of proline is catalyzed by two types of enzymes: collagen-type and HIF1-type prolyl 4-hydroxylases (12,28,29). There are 4 collagen-type (P4HA1, P4HA2, P4HA3, and P4HB) and 3 HIF1-type (EGLN1, EGLN2, and EGLN3) prolyl 4-hydroxylase genes annotated in the human genome, but only knockdown of P4HA1 by siRNA inhibited the production of αFG-565HyP by KIM-1 cells (Fig. 4B and supplemental Fig. S7A), indicating the involvement of P4HA1 (EC 1.14.11.2) in the 4-hydroxylation of the proline 565 residue of α-fibrinogen, at least in this cell line.

Prolyl 4-Hydroxylated α-Fibrinogen in Clinical Samples-The plasma level of αFG-565HyP was increased in pancreatic cancer patients, but the levels of α-, β-, and γ-fibrinogen did not show any differences between pancreatic cancer patients and healthy controls (Fig. 4C). The levels of αFG-565HyP and α-fibrinogen were not significantly correlated (CC = 0.22) (supplemental Fig. S6, B and C). There was a significant correlation (CC = 0.81) between the intensity of the 827 m/z peak detected by 2DICAL and the level of αFG-565HyP determined by immunoblotting with 11A5 antibody (supplemental Fig. S7B), indicating the quantitative accuracy of 2DICAL. A competitive ELISA utilizing anti-HyP-ESS (11A5) monoclonal antibody was constructed (supplemental Fig. S6D), and the plasma level of αFG-565HyP was quantified in 685 individuals (Set 4) (Fig. 5). The plasma samples were collected prospectively from 7 medical institutions to ensure the absence of bias during the process of sample preparation. The ELISA showed high reproducibility with a median CV value of 0.079 among triplicates. There was a significant difference (p = 3.80 × 10⁻¹⁵, Mann-Whitney U test) in the plasma level of αFG-565HyP between 160 pancreatic cancer patients (2.26 ± 2.28 arbitrary units) and 113 healthy controls (0.91 ± 1.24) (Fig. 5A). The plasma level of αFG-565HyP was not elevated in patients with Stage IA (UICC, International Union Against Cancer) pancreatic cancer (p = 0.811), but patients with Stage IB or more advanced disease showed a significant increase of plasma αFG-565HyP (p = 2.99 × 10⁻² to 1.88 × 10⁻¹²) (Fig. 5B and supplemental Table S2). An elevated plasma level of αFG-565HyP was also observed in various cancers and chronic inflammatory disease. Patients with cancers of other organs (including the bile duct (p = 4.24 × 10⁻⁵), liver (p = 1.08 × 10⁻³), esophagus (p = 2.07 × 10⁻⁴), stomach (p = 5.95 × 10⁻⁴), and colon/rectum (p = 9.29 × 10⁻⁶)) as well as patients with chronic pancreatitis (p = 3.89 × 10⁻²) showed a significant increase in plasma αFG-565HyP (Fig. 5C and supplemental Table S3). Patients with benign pancreatic tumor/cyst (p = 0.216) or cholecystitis (p = 0.111) showed no significant difference from the controls.
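The reproducibility figure quoted above is a median of per-sample coefficients of variation across triplicate measurements. As a minimal sketch of that calculation, assuming the CV is the sample standard deviation divided by the mean (the text does not say which standard deviation estimator was used) and using made-up triplicate readings:

```python
from statistics import mean, median, stdev
from typing import List, Sequence


def cv(values: Sequence[float]) -> float:
    """Coefficient of variation: sample standard deviation over the mean."""
    return stdev(values) / mean(values)


def median_cv(replicates: List[Sequence[float]]) -> float:
    """Median CV across samples, each measured in replicate."""
    return median([cv(r) for r in replicates])


# Hypothetical triplicate readings for three samples (arbitrary units)
triplicates = [[1.0, 1.1, 0.9], [2.0, 2.2, 1.8], [0.5, 0.5, 0.5]]
summary = median_cv(triplicates)
```

Reporting the median rather than the mean keeps a few noisy samples from dominating the summary, which matters when levels near the assay's lower range inflate the CV.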

DISCUSSION
Plasma proteomics by liquid chromatography and mass spectrometry (LC-MS) has been a challenge because of the complexity and individual diversity of human samples. We developed a simple but robust method that enables the quantitative comparison of multiple LC-MS data. In this study, we identified 6 MS peaks whose intensity was significantly different between 38 cancer patients and 39 healthy controls (Figs. 1 and 2 and supplemental Figs. S1 and S2) among a total of 115,325 peaks derived from Con A-binding plasma glycoproteins. High resolution MS/MS analysis revealed that 4 of the 6 peaks were derived from prolyl 4-hydroxylated plasma α-fibrinogen (Fig. 3). Artificial oxidation of peptides/proteins frequently occurs during the preparative procedures for MS analysis, especially during separation by SDS-PAGE. However, the plasma samples from cancer patients and healthy controls used in this study were collected, stored, and processed in an identical manner and were not separated by SDS-PAGE. We deliberately validated the native hydroxylation of the proline residue of plasma α-fibrinogen by immunoblotting and ELISA with a modification-specific monoclonal antibody (Figs. 4 and 5).
Prolyl hydroxylation is essential for the folding, secretion, and stability of the collagen triple helix (28,29). Collagen was long considered to be the only protein hydroxylated on its proline residues, but recently the von Hippel Lindau (VHL) tumor suppressor gene product-mediated degradation of HIF1α was revealed to be regulated by prolyl hydroxylation (12). Prolyl 4-hydroxylation also regulates the stability of argonaute 2 protein (13). However, it is largely unknown which other proteins are prolyl-hydroxylated and how the modification regulates the function of proteins. We found that the collagen-type prolyl 4-hydroxylase P4HA1 is essential for the production of αFG-565HyP (Fig. 4B). Consistent with this, the consensus Xaa-Pro-Gly sequence of collagen (13,30) was seen in the prolyl hydroxylation sites of α-fibrinogen (supplemental Fig. S5A). Prolyl-hydroxylated α-fibrinogen was produced in cultured hepatic cells and several hepatocellular carcinoma cell lines but not in pancreatic cancer cell lines (data not shown). An immunohistochemical study using antibody 11A5 showed that prolyl-hydroxylated α-fibrinogen was present at the inflammation site around the pancreatic cancer cells (data not shown). The change in the plasma level of the modification may be determined by the balance between its production and consumption in the body. Hydroxylation at proline 530 of α-fibrinogen was strongly correlated with αFG-565HyP (supplemental Fig. S7C). Multiple biological mechanisms may be involved in the regulation of prolyl-hydroxylated α-fibrinogen.
Post-translational modifications, such as glycosylation, phosphorylation, and oxidation, cause small differences in the molecular weight of proteins. Prolyl-hydroxylated peptides are 16 daltons larger than their unmodified counterparts, but this small change in molecular weight can readily be detected by 2DICAL as differences in the m/z values as well as the RT of the peptide peaks. The peaks derived from unmodified plasma fibrinogen fragments appeared in different locations (compare Fig. 2, A and B). Such modifications may be overlooked by MS/MS-based identification-oriented proteome approaches (31-34).
In this study, we were able to pinpoint the prolyl 4-hydroxylation of α-fibrinogen peptides in a large dataset of plasma samples (115,325 MS peaks × 231 LC-MS runs (77 cases in triplicate) = 27 million data points). In a large independent validation cohort, a newly constructed ELISA revealed elevated plasma levels of prolyl 4-hydroxylated α-fibrinogen in pancreatic cancer as well as in other cancers and chronic inflammatory disease. Future studies will be needed to reveal the function of prolyl-hydroxylated α-fibrinogen, its regulation, and its clinical utility.