Corresponding authors: Angelo D’Alessandro, PhD, Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz Medical Campus, 12801 East 17th Ave., Aurora, CO 80045, Phone # 303-724-0096.
The Red Blood Cell (RBC)-Omics study, part of the larger NHLBI-funded Recipient Epidemiology and Donor Evaluation Study (REDS-III), aims to understand the genetic contribution to blood donor RBC characteristics. Previous work identified donor demographic, behavioral, genetic, and metabolic underpinnings to blood donation, storage, and (to a lesser extent) transfusion outcomes, but none have yet linked the genetic and metabolic bodies of work. We performed a Genome-Wide Association (GWA) analysis using RBC-Omics study participants with generated untargeted metabolomics data to identify metabolite quantitative trait loci (mQTL) in RBCs. We performed GWA analyses of 382 metabolites in 243 individuals imputed using the 1000 Genomes Project phase 3 all-ancestry reference panel. Analyses were conducted using ProbABEL and adjusted for sex, age, donation center, number of whole blood donations in the past two years, and first ten principal components of ancestry. Our results identified 423 independent genetic loci associated with 132 metabolites (p < 5 x 10-8). Potentially novel locus-metabolite associations were identified for the region encoding heme transporter FLVCR1 and choline, and for lysophosphatidylcholine acyltransferase LPCAT3 and lysophosphatidylserine 16.0, 18.0, 18.1, and 18.2; these associations are supported by published rare disease and mouse studies. We also confirmed previous metabolite GWA results for associations including N(6)-Methyl-L-lysine and protein PYROXD2, and various carnitines and transporter SLC22A16. Association between pyruvate levels and G6PD polymorphisms was validated in an independent cohort and novel murine models of G6PD deficiency (African and Mediterranean variants). We demonstrate that it is possible to perform metabolomics-scale GWA analyses with a modest, trans-ancestry sample size.
) that allow them to take up and metabolize gases (O2, CO2) and small molecule substrates. This task is facilitated by their transit through the whole body every ∼20 seconds, during the average 120 day lifespan of RBCs.(
) has fueled the efforts towards personalized medicine.
RBC transfusion is a life-saving intervention for 4.5 million Americans every year. The logistics of producing over 100 million units of blood available for transfusion every year around the world necessitates storage of RBC components in the blood bank. However, storage in the blood bank is characterized by a series of biochemical(
) Appreciation for the role of RBC metabolism in storage quality and post-transfusion performances has informed the concept of the metabolic age of the unit – as opposed to the chronological age of the unit (i.e., days elapsed since the time of donation).(
) In a move towards personalized transfusion medicine, the Recipient Epidemiology and Donor Evaluation Study – REDS III RBC Omics - was designed to test the hypothesis that donor biology plays a significant role in the quality of donated RBC.(
) Genetic heterogeneity of blood donors was associated with the RBC propensity to hemolyze spontaneously, or following oxidative, osmotic or mechanical insults. Quantitative trait-loci analyses were used to identify polymorphic genes that contribute to an increased resistance/susceptibility to RBC hemolysis.(
) Of note, polymorphisms associated with a compromised (< 10%) residual activity of glucose 6-phosphate dehydrogenase (G6PD) were associated with an increased susceptibility of stored RBCs to lyse following oxidant insults.(
) G6PD is the rate-limiting enzyme of the pentose phosphate pathway, the main antioxidant pathway in RBCs. Thus, G6PD is crucial for the synthesis of the reducing cofactor NADPH, which is required to preserve glutathione homeostasis and reduce multiple antioxidant enzymes as they exert their catalytic activities.(
Blood donors are a selected population, with the basic requirement for volunteer blood donation being the absence of serious underlying medical conditions, adequate hemoglobin levels, absence of risk factors for transfusion-transmitted infections, and not taking specific teratogenic or other medications(
) that are grounds for deferral. As such, epidemiology studies have been facilitated by the study of large cohorts of blood donors, similar to investigations on SARS-CoV-2 incidence in the general population based on serological characterization of routine blood donors.(
) generated as part of the REDS III – RBC Omics study to perform a metabolite Quantitative Trait Loci (mQTL) analysis of routine blood donors. The study builds on previous mQTL reports in the context of cardiovascular diseases, asthma, or neurodegenerative diseases, including Alzheimer’s disease.(
) However, most of these studies focused on plasma, urine and cerebrospinal fluid, with limited analyses targeting erythrocyte metabolism – which is the focus of this study. Given the importance of RBC metabolism as a window into systems homeostasis, findings reported here could be relevant not just for transfusion medicine research, but also for diverse areas of physiology where RBC metabolism is impaired (e.g., exercise,(
Metabolomics analyses were performed on packed RBC samples derived from stored RBC components from 250 donors who had been previously characterized at the genome level via the Precision Transfusion Medicine array (Figure 1.A).(
) Through the workflow summarized in Supplementary Figure 1, a total of 2,831 SNP-metabolite associations were observed below the genome-wide significance threshold (p < 5.0 x 10-8). Data are summarized in Supplementary Table 1 by identifying the SNP with the smallest p-value within a +/- 500 kilobase range as the lead SNP; individual SNP associations are also reported extensively in Supplementary Table 1. In Figure 1.B, we list the top 10 hits, identified by the annotated gene closest to the significant SNP, in order of -log10(p). Manhattan plots overlaying all the significant hits (FDR < 5 x 10-8), with metabolite-gene pairs annotated, are shown in Figure 1.C. QQ plots for the top nine metabolite-associated SNPs are shown in Figure 1.D. Sensitivity analyses for 46 metabolites examining the impact of 1) more stringent variant QC; 2) the choice of imputation strategy for missing metabolite data; 3) the effect of blood storage additive; and 4) ancestry, are reported in Supplementary Table 1. Genetic associations identified for six metabolites have been previously reported (Supplementary Table 1). Ancestry plots were generated to show normalized metabolite abundances as a function of alleles, as distributed across genetic ancestries of the donors enrolled in this study (Figure 2.A). We further characterized the mQTL loci by generating LocusZoom plots to examine the local LD structure and performed in silico functional annotation using OASIS.
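The lead-SNP rule described above (smallest p-value within a +/- 500 kilobase range) can be sketched as a greedy pass over the significant hits. This is an illustrative sketch only, not the pipeline used in the study: the `Hit` record, the `lead_snps` function, and the greedy strategy are assumptions introduced here for clarity.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    """One SNP-metabolite association (hypothetical record layout)."""
    snp: str
    chrom: str
    pos: int      # base-pair position
    p: float      # association p-value


def lead_snps(hits: List[Hit], window: int = 500_000, alpha: float = 5e-8) -> List[Hit]:
    """Greedy lead-SNP selection for a single metabolite.

    Hits are visited from the smallest p-value upward; a hit becomes a
    lead SNP unless it lies within `window` base pairs of an already
    chosen lead on the same chromosome.
    """
    significant = sorted((h for h in hits if h.p < alpha), key=lambda h: h.p)
    leads: List[Hit] = []
    for h in significant:
        near_existing = any(
            lead.chrom == h.chrom and abs(lead.pos - h.pos) <= window
            for lead in leads
        )
        if not near_existing:
            leads.append(h)
    return leads
```

Under this greedy scheme, a hit that falls outside the window of every chosen lead starts a new locus even if it is close to a non-lead hit; other clumping rules (for example, LD-based) would give different locus counts.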
Consistent with previous mQTL studies(citations 40,41,64-67), the top SNP associated with levels of methyl-lysine, rs4539242, is in high linkage disequilibrium (LD) with both the missense mutation M461T (R2=1.0 in Europeans) and synonymous mutation F484F (R2=0.94 in Europeans), observations that represent an internal quality control for the present analysis (Manhattan plots and LocusZoom in Figure 2.B-C, respectively). Both mutations are themselves associated with levels of methyl-lysine (p=4.22x10-13 and p=1.28x10-44, respectively; Supplementary Table 1).
The region coding for the enzyme lysophosphatidylcholine acyltransferase 3 (LPCAT3) was found to be genetically heterogeneous across volunteer blood donors. Polymorphisms in the region coding for LPCAT3 were associated with RBC levels of lysophosphatidylserines (LPS), including linoleyl- (18:2), palmitoyl- (16:0), stearoyl- (18:0) or oleyl-LPS (18:1). The lead SNP, rs73264680, is in perfect LD in Europeans with the missense mutation rs1984564/I217T within LPCAT3 (Figure 3.A-C; Supplementary Table 1; residue mapped against the structure of LPCAT3 - 7F3X.pdb in Figure 3.D).
The lead SNP associated with UDP N acetyl glucosamine, rs4316067 (Supplementary Table 1), is located in an intron of NT5C3A (Figure 4A-B). The lead SNP associated with choline, rs2047287 (Figure 4C-D), is in strong LD (D’=1.0; R2=0.768 in Europeans) with the missense mutation T544M in FLVCR1.
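The LD statistics quoted here (D’ and R2) can differ substantially for the same pair of variants. As a worked illustration, the sketch below computes both from haplotype and allele frequencies using the standard definitions. The function name `ld_stats` and the example frequencies are hypothetical and are not taken from the study data.

```python
def ld_stats(p_a: float, p_b: float, p_ab: float):
    """Return (D, D', r^2) for two biallelic loci.

    p_a  : frequency of allele A at the first locus
    p_b  : frequency of allele B at the second locus
    p_ab : frequency of the A-B haplotype
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b
    # D is normalised by its largest attainable magnitude given the allele frequencies.
    if d > 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = 0.0 if d == 0 else abs(d) / d_max
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical example: allele A (20%) is only ever seen together with allele B (25%).
d, d_prime, r2 = ld_stats(p_a=0.20, p_b=0.25, p_ab=0.20)
# D' is 1 (no recombinant haplotype observed), while r^2 is about 0.75
# because the two alleles differ in frequency.
```

This is why a lead SNP can be in complete LD by D’ with a coding variant while R2 stays below 1: R2 also penalizes differences in allele frequency.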
Polymorphisms in bifunctional epoxide hydrolase 2 (EPHX2; missense mutation rs751141/R221Q; p = 4.55 x 10-10; λ = 1.01) and in spermine oxidase (SMOX), where the intronic rs11087622 is in LD (R2=0.16; D’=0.71 in Europeans) with the synonymous mutation A392A (Supplementary Table 1), are associated with variability in the levels of oxylipins (12,13-EpOME) and spermine, respectively (Figure 5).
A missense variant (rs12210538/M409T) in the carnitine transporter SLC22A16 is associated with variability in the levels of RBC free and acetyl-carnitine (Supplementary Table 1). Manhattan and LocusZoom plots are shown in Supplementary Figure 2. Additional associations were observed for other acyl-carnitines, including palmitoyl-carnitine (nearest gene HTR5A-AS1) and undecanoyl-carnitine (nearest gene EPHX2).
A series of significant associations was identified (Supplementary Figures 3-15; UniProt names for the protein products of the nearest gene are given in parentheses in this paragraph) for oxylipins such as 9,10-EpOME (EPHX2), 14-DHoHE (LOC401177 or LOC100505817), and 9-HETE and 15-HETE (NAP1L3); dopamine (G6PD); glycolytic metabolites (glucose and VAV2; hexose phosphate, including fructose 6-phosphate, and FN3K; lactate and PNMA5); purines (hypoxanthine and TACR2, TSPAN15; urate and FOLR1 and APP); amino acids (glutamine and PLEKHB2; glutamate and BACH1-IT2; methionine and TOP1MT; taurine and LPHN3; threonine and mR8058); free fatty acids (palmitoleic acid and METTL2B; oleic acid; arachidonic acid and FADS1; docosapentaenoic acid and SGCZ); sphingolipids (sphingosine 1-phosphate and EDARADD or SORCS2 or KDM6A); and uridine diphosphate (UDP and ZNF485).
Finally, polymorphisms in SPTA1 and G6PD are associated with variability in the levels of S-adenosyl-methionine (intronic; p = 2.52 x 10-10; λ = 1.03; Supplementary Table 1) and pyruvate μM (missense mutation V98M; p = 2.87 x 10-12; λ = 1.12; Supplementary Table 1), respectively (Figure 6.A-B and C-D for Manhattan plots and LocusZoom).
Sensitivity and Replication analyses
We performed several sensitivity and replication analyses, including GWAS of 46 metabolites in 176 study participants with available metabolomics data generated from RBC samples stored for 42 days, as replication of findings in fresh blood from the same subjects. The top associations were replicated for N6-Methyl-L-Lysine, LPS16.0-18.2, UDP N-acetyl-glucosamine, choline, undecanoyl-carnitine, spermine, spermine uM, and L-carnitine. These findings replicated in each of the five sensitivity analyses, with either the original lead SNP or a SNP in high LD with it reaching genome-wide significance (Supplementary Table 1). For docosahexaenoic acid (FA22.6), the association with rs28603189 replicated when stringent QC criteria were applied, when a different missing metabolite imputation strategy was employed, and when analysis was restricted to the participants whose RBC samples were stored in Additive 3 (R2=0.94), but not in the day 42 storage samples or the European-only analysis. The association between rs12033733 and S-adenosyl-L-methionine withstood the stringent QC but none of the other sensitivity analyses. Finally, although there were many genome-wide significant associations with pyruvate uM, the lambda was 1.115 and the QQ plot was troubling (Figure 1), potentially indicating unaccounted-for population structure for this metabolite. The association between pyruvate uM and rs142516556, a SNP near the G6PD gene, remained robust to the stringent QC, imputation method, and Additive 3 restricted GWAS (Supplementary Table 1).
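The lambda values quoted in this section are genomic inflation factors. A minimal sketch of the conventional calculation is given below, assuming 1-degree-of-freedom association tests: the median observed chi-square statistic divided by the median expected under the null. The function names are illustrative, and this is not necessarily how the study's software computed lambda.

```python
from statistics import NormalDist, median
from typing import Iterable

_STD_NORMAL = NormalDist()


def chi2_1df_from_p(p: float) -> float:
    """Convert a two-sided p-value (0 < p <= 1) into a 1-df chi-square statistic."""
    z = _STD_NORMAL.inv_cdf(p / 2.0)  # lower-tail quantile; stable for very small p
    return z * z


def genomic_inflation(pvalues: Iterable[float]) -> float:
    """Genomic inflation factor: median observed / median expected chi-square (1 df)."""
    observed = median(chi2_1df_from_p(p) for p in pvalues)
    expected = _STD_NORMAL.inv_cdf(0.75) ** 2  # median of chi-square with 1 df
    return observed / expected
```

A lambda close to 1 indicates that the bulk of test statistics follows the null distribution; values noticeably above 1, as reported for pyruvate uM, are commonly read as a sign of residual population structure or other systematic bias.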
Elevated pyruvate and pyruvate/lactate ratios are recapitulated in an independent human cohort of blood donors and mouse models of G6PD deficiency
Pyruvate levels were found to be inversely proportional to G6PD activity in fresh RBCs from an independent cohort of G6PD deficient (n=10) and sufficient (n=27) blood donors (Figure 7.A-C). The differences in pyruvate levels between the two groups were exacerbated during blood bank storage up to 42 days (Figure 7.D). Similarly, RBCs from G6PD deficient mice (African A- and Mediterranean variant – Med-) and WT C57BL6/J or humanized canonical G6PD mice (Figure 7.E) were incubated with 1,2,3-13C3-glucose for 1h, to determine metabolic fluxes through glycolysis and the pentose phosphate pathway (PPP). Results (Figure 7.F) confirmed significant decreases in the labeled levels of oxidative phase metabolites of the PPP (13C3-phosphogluconate and 13C2-ribose-phosphate) in A- and Med- mice, which corresponded to increases in the ratios of labeled 13C3-pyruvate/lactate.
As part of the REDS-III RBC-Omics study, a cohort of 12,535 volunteer blood donors were enrolled to donate a unit of blood that was processed into RBC components that were characterized for storage hemolysis parameters. DNA samples derived from donation WBC were genotyped using a Precision Transfusion Medicine array mapping 879,000 SNPs.(
) As a result, 27 loci were associated with measures of hemolysis following blood storage, the most significant being the association between ANK and SPTA1 with osmotic hemolysis (5.85 x 10-28 and 1.01 x 10-22, respectively) and G6PD with oxidative hemolysis (2.66 x 10-17).(
) Here we performed the first mQTL analysis of RBCs from 250 recalled RBC-Omics donors, a subset of the 12,535 enrolled and genotyped donors in the RBC-Omics study.
Overall, we report 2,831 SNP-metabolite associations meeting genome-wide significance. Of note, the smallest p-value found in the present study, p=1.90x10-63 for the association between rs4539242 within PYROXD2 and the RBC levels of N-methyl-lysine, was much smaller than the p-values describing any of the 27 loci associated with RBC hemolysis phenotypes in this population (citation 28), despite the much smaller cohort (250 vs 12,535). This suggests that the metabolic signatures are more directly determined by genetics than is hemolytic propensity, with the latter having a larger etiologic contribution from environmental factors. The association between PYROXD2 and methyl-lysine had already been reported in previous mQTL studies,(
) and thus serves as an internal control for the present analysis.
Novel findings include the association between polymorphisms in the heme transporter FLVCR1 and the RBC levels of choline. Previous genomics data have shown a strong linkage (co-dependency: 0.41 Pearson) between FLVCR1 and the enzyme choline kinase A (https://depmap.org/portal/gene/CHKA?tab=overview). By controlling intracellular heme pools, FLVCR1 is known to play a role in the differentiation of committed erythroid progenitors.(
) In this view, it is interesting to note that SAM levels were associated with polymorphisms in the structural protein spectrin alpha 1 (SPTA1), recently identified as one of the main targets of methylation of deamidated asparagines in stored RBCs.(
) The association between choline and FLVCR1 is also relevant owing to the role of choline metabolism as a substrate for phosphocholine metabolism in phospholipid synthesis in terminal erythropoiesis.(
The RBC levels of multiple lysophosphatidylserines (LPS 16:0, 18:0, 18:1 and 18:2) were associated with variation in the lysophosphatidylcholine acyltransferase 3 gene (LPCAT3). LPCAT3 is a key enzyme of the Lands cycle,(
) These results are suggestive of a role of LPCAT3 in phosphatidylserine metabolism, a class of lipids whose exposure in the outer membrane leaflet regulates erythrophagocytosis and clearance from the bloodstream, with implications for post-transfusion recovery of stored RBCs.(
) In this view, it is worth noting that the levels of multiple acyl-carnitines, in equilibrium with the acyl-CoAs as part of the Lands cycle, were found to be associated with polymorphisms in the carnitine transporter SLC22A16. This observation may indicate an inter-subject variability in membrane lipid damage-repair capacity, with implications for exercise physiology or kidney disease, since this pathway is impacted by acute exercise(
) Heterogeneity in the RBC levels of some (bacteria or, under sterile ex vivo conditions, oxidant-stress derived) odd chain acyl-carnitines (e.g., undecanoyl-carnitine) was associated with polymorphisms in the coding region of the gene EPHX2, which was in turn also associated with variance in the levels of several linoleyl-derived oxylipins (9,10-EpOME, 12,13-EpOME), which are lipid mediators released by RBCs in response to hypoxia.(
) This is relevant in that mature RBCs have been found to express functional fatty acid desaturases (FADS), and FADS activity – which is dependent on iron - was found to increase in response to storage-induced or pathological oxidant stress in vitro and in vivo.(
) Spermine oxidase (SMOX) was found to be polymorphic in routine blood donors, and these polymorphisms were associated with varying levels of the polyamine spermine, the substrate of its enzymatic activity.
One of the main findings of the genomic arm of the REDS-III RBC-Omics Study was the identification of polymorphisms associated with the expression of a less active isoform of G6PD (African variant) that are associated with an increased susceptibility to end of storage hemolysis of RBCs following oxidant insults.(
) Parallel metabolomics studies identified an impact of donor sex, ethnicity and age on the antioxidant systems (especially glutathione-dependent systems) of stored RBCs, with an emphasis on an impairment in the stored RBC capacity to activate the pentose phosphate pathway(
) (G6PD is the rate-limiting enzyme of this pathway). These results were independently corroborated by the observation that failure to activate the PPP is a hallmark of the metabolic lesion to stored RBCs, in part attributable to the inability to inhibit glycolysis via the reversible oxidation of glyceraldehyde 3-phosphate dehydrogenase (GAPDH)(
) Here we report that the levels of pyruvate in fresh RBCs and pyruvate/lactate ratios in stored RBCs are associated with the same G6PD polymorphisms. A causal role of this correlation is established in humanized murine models of G6PD deficiency, since the same metabolic change is seen in RBCs that differ only in their form of G6PD (African or Mediterranean(
) variants vs. nondeficient human form). These observations could be partly explained by the compensatory over-activation of NADH-dependent methemoglobin reductase to cope with increased oxidant stress in G6PD deficient erythrocytes.(
) Indeed, methemoglobin reductase would compete with lactate dehydrogenase for NADH, rendering the enzymatic step of lactic fermentation to regenerate NADH back to NAD+ no longer critical to preserve glycolytic fluxes (NAD+ is an essential cofactor for GAPDH activity upstream to pyruvate and ATP synthesis in glycolysis). The G6PD African variant in this donor population was also linked to variance in the levels of dopamine, confirming previous biomarker analyses from the metabolome of G6PD deficient vs sufficient blood donors(
). This is interesting in that monoamine oxidase-dependent dopamine synthesis is an NADPH-dependent process, with implications relevant to exercise physiology (e.g., the sense of well-being/stimulatory effect after exercise(
The present study has several limitations. First, mQTL analyses were based upon genomic characterization of a cohort of volunteer routine blood donors. As such, disease-related polymorphisms that would be identified in cohorts of non-healthy patients (i.e., from persons who are ineligible to donate blood) are intrinsically not amenable to identification under our study design. On the other hand, while the donor population is sufficiently healthy to donate blood, and as a result probably has biases similar to other ‘healthy worker’ cohorts (
)) but at a lower rate than the general population. As such, some of the genome-wide associations reported here (e.g., carnitine and SLC22A16) may be translationally relevant beyond transfusion medicine when interpreted in the context of markers relevant to specific diseases (e.g., carnitine metabolism and obesity(
)). Furthermore, fresh (∼10 day old, i.e., the freshest samples available for this cohort) RBCs from volunteer donors were tested in this study. As such, it is unclear whether the findings are relevant to transfusion medicine (e.g., genetic underpinning of metabolic heterogeneity in end of storage RBCs) or to physiological (e.g., hypoxia) or pathological conditions in which alterations to RBC metabolism are mechanistically relevant. Indeed, some metabolic markers of the RBC storage lesion only accumulate in end of storage units (e.g., hypoxanthine).(
) However, replication studies were performed on end of storage (day 42) blood from the same units and donors, though only 176 biological replicates were available. As such, the present findings are suggestive of clinical relevance in the field of transfusion medicine, to the extent that the metabolic heterogeneity of fresh and end of storage units is associated with, or is an etiological driver of, post-transfusion performances, such as intra- or extra-vascular hemolysis and post-transfusion recoveries (
). Future studies will need to address this issue in larger cohorts, by focusing on RBC samples stored for longer periods of time. Although the small (n=250) number of participants available still allowed for robust association discovery, a larger number of samples in more ancestrally diverse participants will increase the statistical power of future work and provide insights that are relevant to specific populations. Such studies could pave the way for the use of other orthogonal omics approaches to metabolomics (e.g., proteomics) to maximize the value of the genetic and metabolic data already available for this well-curated cohort. Similar studies could be possible on other cohorts from patients with hematological conditions, such as sickle cell disease, where metabolite levels could not only be associated with, but also mechanistically contribute to the etiology of thromboinflammatory comorbidities of clinical relevance (e.g., sphingosine 1-phosphate and systemic hypoxemia,(
RBC-Omics was conducted under regulations applicable to all human subject research supported by federal agencies as well as requirements for blood product manipulation specified and approved by the FDA. The data coordinating center (RTI International) of REDS-III was responsible for the overall compliance of human subjects regulatory protocols including institutional review board approval from each participating blood center, from the REDS-III Central Laboratory (Vitalant Research Institute), and from the data coordinating center, as previously detailed.(
) Donors were enrolled at the four participating REDS-III US blood centers. Overall, 13,758 whole blood donors were enrolled and 13,403 (97%) age 18+ provided informed consent to participate in the study; of these, 12,353 were evaluated for hemolysis parameters (spontaneous, oxidative or osmotic) on RBCs stored for ∼39-42 days. Extreme hemolyzers (5th and 95th percentile) from the donors tested for end of storage oxidative hemolysis were asked to donate a second unit of blood. These units were sterilely sampled for metabolomics analyses (n = 250 for the freshest available time points, i.e., < 14 storage days). Blood collection, sample processing and other aspects of the screening and recall phases of the RBC-Omics Study have been extensively described.(
An isotopically labeled internal standard mixture including a mix of 13C15N-labeled amino acid standards (2.5 μM) was prepared in methanol. A volume of 100 μl of frozen RBC aliquots was mixed with water and the mixture of isotopically labeled internal standards (1:1:1, v/v/v). The samples were extracted with methanol (final concentration of 80% methanol). After incubation at −20°C for 1 hour, the supernatants were separated by centrifugation and stored at −80°C until analysis. Samples were vortexed and insoluble material pelleted as described.(
Analyses were performed using a Vanquish UHPLC coupled online to a Q Exactive mass spectrometer (Thermo Fisher, Bremen, Germany). Samples were analyzed using a 3 minute isocratic condition or a 5, 9 and 17 min gradient as described.(
) Additional analyses, including untargeted analyses and FISh score calculation via MS/MS, were performed against the ChemSpider database with Compound Discoverer 2.0 (Thermo Fisher, Bremen, Germany).
Metabolite QC and Processing
The quality control and processing of metabolites are detailed in Supplementary Figure 1. We first selected only those metabolites measured at Day 10 of storage, for which 250 participants had metabolite data. Missing values and zeros were both treated as missing. We removed the following metabolites from further processing: a duplicate carnosine, a duplicate lorazepam, phosphate, triacanthine, and acetyl-L-carnitine. 507 metabolites remained for further processing. We also removed 22 drug metabolites with concentrations detected in greater than 50% of the participants, leaving 487 metabolites. We then separated the participant data by blood storage additive type and excluded metabolites with greater than 10% missingness from each additive set, respectively. After removing these metabolites, 382 remained. We separated these 382 metabolites into those quantified absolutely (against stable isotope-labeled internal standards, as described(
)) vs relatively. Relatively quantified metabolites were natural log-transformed. A suffix “(uM)” was added to the label of all the metabolites for which absolute concentrations were determined. These groups of metabolites then had missing metabolite levels imputed using QRILC,(
) and 243 study participants who had both metabolomics data and imputation data on serial samples from stored RBC components that passed respective quality control procedures. We adjusted for sex, age (continuous), frequency of blood donation in the last two years (continuous), blood donor center, and ten ancestry PCs. Statistical significance was determined using a p-value threshold of 5x10-8. We only considered variants with a minimum minor allele frequency of 1% and a minimum imputation quality score of 0.80.
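The metabolite filtering steps described above (zeros treated as missing, exclusion above 10% missingness within additive sets, natural log transformation of relatively quantified metabolites) can be sketched as follows. This is a simplified illustration under stated assumptions: the data layout, the function names, and the reading that a metabolite must pass the missingness threshold in every additive set are assumptions made here, and the QRILC imputation step is not reproduced.

```python
import math
from typing import Dict, List, Optional, Sequence

Value = Optional[float]


def is_missing(v: Value) -> bool:
    """Missing entries and zeros are treated identically as missing."""
    return v is None or v == 0


def missing_fraction(values: Sequence[Value]) -> float:
    return sum(1 for v in values if is_missing(v)) / len(values)


def passing_metabolites(
    data: Dict[str, List[Value]],   # metabolite -> one value per participant
    additive: List[str],            # storage additive label per participant
    max_missing: float = 0.10,
) -> List[str]:
    """Keep metabolites whose missingness is <= max_missing in every additive set."""
    groups = sorted(set(additive))
    keep = []
    for name, values in data.items():
        per_group = (
            [v for v, a in zip(values, additive) if a == g] for g in groups
        )
        if all(missing_fraction(g_vals) <= max_missing for g_vals in per_group):
            keep.append(name)
    return keep


def ln_transform(values: Sequence[Value]) -> List[Value]:
    """Natural-log transform; missing entries stay missing for later imputation."""
    return [None if is_missing(v) else math.log(v) for v in values]
```

Keeping missing entries as `None` after the transform leaves the choice of imputation method (QRILC in the main analysis, minimum-value substitution in a sensitivity analysis) as a separate downstream step.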
Replication and Sensitivity Analyses
For replication, we followed the same procedures for post-processing of metabolites measured at Day 42 of storage. There were 176 participants with metabolite data generated from Day 42 samples. Association analyses and statistical significance were determined as described above. We selected 46 metabolites, oversampled for top hits from the GWAS analysis of early storage samples to analyze for potential replication and in the sensitivity analyses described below.
We performed four sensitivity analyses using 46 metabolites and the original 243 recalled RBC-Omics participants. We performed a “stringent” GWAS, which required that evaluated variants have a minimum minor allele frequency of 5% and a minimum imputation quality score of 0.90. We performed an analysis using only those participants whose blood donations were collected at one of the three centers that used Additive 3 in their storage protocol. We also performed a sensitivity analysis including only those participants of European ancestry and using the variant data imputed using the European reference panel. Finally, we re-imputed the missing metabolites data as described above, swapping out the QRILC imputation procedure for a simple substitution of the missing value with the lowest detected value for the metabolite in question.
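Two of the sensitivity analyses above differ from the main analysis only in thresholds or in the imputation rule, which the sketch below makes explicit. The threshold values come from the text (minor allele frequency 1% and imputation quality 0.80 in the main analysis; 5% and 0.90 in the "stringent" analysis). The class, constant, and function names are hypothetical, and the code is an illustration rather than the study's implementation.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class VariantQC:
    min_maf: float    # minimum minor allele frequency
    min_info: float   # minimum imputation quality score


MAIN = VariantQC(min_maf=0.01, min_info=0.80)
STRINGENT = VariantQC(min_maf=0.05, min_info=0.90)


def passes_qc(allele_freq: float, info: float, qc: VariantQC) -> bool:
    """Apply MAF and imputation-quality filters to one variant."""
    maf = min(allele_freq, 1.0 - allele_freq)  # fold frequency to the minor allele
    return maf >= qc.min_maf and info >= qc.min_info


def impute_with_minimum(values: Sequence[Optional[float]]) -> List[float]:
    """Replace missing values (None or 0) with the lowest detected value."""
    detected = [v for v in values if v is not None and v != 0]
    if not detected:
        raise ValueError("no detected values to impute from")
    floor = min(detected)
    return [floor if (v is None or v == 0) else v for v in values]
```

A variant with allele frequency 0.03 and imputation quality 0.85, for example, would be analyzed in the main GWAS but dropped from the stringent one.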
OASIS (Omics Analysis, Search & Information), a TOPMed-funded resource, was used to annotate the top SNPs. OASIS annotation includes information on position, chromosome, allele frequencies, closest gene, type of variant, position relative to the closest gene model, whether the variant is predicted to be functionally consequential, tissue-specific gene expression, and other information.
We generated LocusZoom plots locally using v1.4 and plotted a margin of +/- 200 kilobases around each lead SNP against the November 2014 1000 Genomes European ancestry build.
Comparison with GWAS Catalog
Lead SNPs for all metabolites were queried using the LDLink tool LDtrait (query date 5/6/2022)(
) by selecting an R2 threshold of 0.8 in a +/- 500,000 base pair window in the combined five European ancestries using Genome Build GRCh37. We noted SNPs that have been previously associated with other traits and considered replicated associations as those SNPs with previously reported associations to the same metabolite as found in our study population.
All animal procedures were approved by the University of Virginia IACUC (protocol no. 4269). Humanized G6PD deficient (A-, Med-) and non-deficient (HuCan) mice were generated by replacing the murine G6PD locus in Bruce4 ES cells (C57BL/6 background) with either the A- (V68M/N126D), Med- (S188F) or HuCan (B+) variant (manuscript in preparation). In short, nucleofected ES cells were drug-selected (Neo), G418-resistant clones were isolated, and the presence of homologous recombination (and absence of random integration) was confirmed (data not shown). Clones were then developed into full animals and correct homologous recombination was reconfirmed. The Neo cassette was flanked with FRT sites and removed by breeding with a germline FLP transgenic mouse; the FLP was subsequently removed. Generation of Cre-inducible G6PD Med- deficient mice was described previously. (
An independent cohort was enrolled in this study at Columbia University and New York Blood Center in New York, under IRB protocols no. AAAJ6862 and 401165, respectively. Male volunteers were recruited using flyers, person-to-person communication, and email, between November 2012 and August 2017. Screening was limited to males because G6PD deficiency is X-linked. Following screening and confirmation of G6PD activity, 10 G6PD-deficient and 30 G6PD-normal males donated 1 unit of whole blood at the New York Blood Center (New York, New York), (
) each of which was processed into packed RBCs, leukoreduced, prior to metabolomics analysis.
Data availability statement
All the mQTL results and related elaborations described in the present study are provided in Supplementary Table 1. The raw genomics data were made available as per Page et al. J Clin Investigation 2021 (reference 28). The raw metabolomics data were made available as per D’Alessandro et al. Transfusion 2019 (reference 29).
Research reported in this publication was funded by the National Institute of General Medical Sciences (RM1GM131968 to ADA), and R01HL146442 (ADA), R01HL149714 (ADA), R01HL148151 (ADA), R01HL161004 (ADA), and R21HL150032 (ADA) from the National Heart, Lung, and Blood Institute. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
The authors wish to acknowledge NHLBI Recipient Epidemiology and Donor Evaluation Study-III (REDS-III), which was supported by NHLBI contracts NHLBI HHSN2682011-00001I, -00002I, -00003I, -00004I, -00005I, -00006I, -00007I, -00008I, and -00009I. The authors would like to express their deep gratitude to Dr. Simone Glynn of NHLBI for her outstanding support throughout this study, to the RBC-Omics research staff at all participating blood centers and testing labs for their exceptional performance and contribution to this project, and to all blood donors who agreed to participate in this study.
5Department of Biochemistry and Molecular Genetics, University of Colorado Denver – Anschutz Medical Campus, Aurora, CO, USA
Disclosure of Conflict of interest
Though unrelated to the contents of this manuscript, the authors declare that AD is a founder of Omix Technologies Inc and Altis Biosciences LLC. AD is an SAB member for Hemanext Inc and FORMA Therapeutics Inc. AD is a consultant for Rubius Therapeutics. JCZ is a consultant for Rubius Therapeutics and a founder of Svalinn Therapeutics. All other authors have no conflicts of interest to disclose.
The NHLBI Recipient Epidemiology Donor Evaluation Study-III (REDS-III), Red Blood Cell (RBC)-Omics Study, is the responsibility of the following persons:
- Versiti Milwaukee, WI: A.E. Mast, J.L. Gottschall, W. Bialkowski, L. Anderson, J. Miller, A. Hall, Z. Udee, V. Johnson
- The Institute for Transfusion Medicine (ITXM), Pittsburgh, PA: D.J. Triulzi, J.E. Kiss, P.A. D’Andrea
- University of California, San Francisco, San Francisco, CA: E.L. Murphy, A.M. Guiltinan
- American Red Cross Blood Services, Farmington, CT: R.G. Cable, B.R. Spencer, S.T. Johnson
- Data coordinating center: RTI International, Rockville, MD: D.J. Brambilla, M.T. Sullivan, S.M. Endres-Dighe, G.P. Page, Y. Guo, N. Haywood, D. Ringer, B.C. Siege
- Central and testing laboratories: Vitalant Research Institute, San Francisco, CA: M.P. Busch, M.C. Lanteri, M. Stone, S. Keating
- Pittsburgh Heart, Lung, Blood, and Vascular Medicine Institute, Division of Pulmonary, Allergy and Critical Care Medicine, University of Pittsburgh, Pittsburgh, PA: T. Kanias, M. Gladwin
- Steering committee chairman: University of British Columbia, Victoria, BC, Canada: S.H. Kleinman
- National Heart, Lung, and Blood Institute, National Institutes of Health: S.A. Glynn, K.B. Malkin, A.M. Cristman
MPB led the REDS-III RBC-Omics project. AM and GPP performed mQTL analyses. AD performed metabolomics analyses. EAH and ROF co-ordinated human G6PD studies. KD and JCZ performed mouse studies. AM, AD, GP prepared figures and tables. AD wrote the first version of the manuscript and all authors contributed to its finalization.