
Phenomewide Association Study of Health Outcomes Associated With the Genetic Correlates of 25 Hydroxyvitamin D Concentration and Vitamin D Binding Protein Concentration

Published online by Cambridge University Press:  22 April 2024

Hailey A. Kresge
Affiliation:
Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
Freida Blostein
Affiliation:
Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
Slavina Goleva
Affiliation:
Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
Clara Albiñana
Affiliation:
National Centre for Register-Based Research, Aarhus University, Aarhus V, Denmark; Department of Psychiatry, University of Oxford, Oxford, UK
Joana A. Revez
Affiliation:
Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD, Australia
Naomi R. Wray
Affiliation:
Department of Psychiatry, University of Oxford, Oxford, UK; Institute for Molecular Bioscience, University of Queensland, Brisbane, QLD, Australia; Queensland Brain Institute, University of Queensland, Brisbane, QLD, Australia
Bjarni J. Vilhjálmsson
Affiliation:
National Centre for Register-Based Research, Aarhus University, Aarhus V, Denmark; Bioinformatics Research Centre, Aarhus University, Aarhus C, Denmark; Novo Nordisk Foundation Center for Genomic Mechanisms of Disease, Broad Institute, Cambridge, MA, USA
Zhihong Zhu
Affiliation:
National Centre for Register-Based Research, Aarhus University, Aarhus V, Denmark
John J. McGrath
Affiliation:
National Centre for Register-Based Research, Aarhus University, Aarhus V, Denmark; Queensland Brain Institute, University of Queensland, Brisbane, QLD, Australia; Queensland Centre for Mental Health Research, The Park Centre for Mental Health, Wacol, QLD, Australia
Lea K. Davis*
Affiliation:
Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA; Division of Genetic Medicine, Vanderbilt University Medical Center, Nashville, TN, USA; Division of Neurology, Pharmacology and Special Education, Vanderbilt Kennedy Center, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Psychiatry and Behavioral Sciences, Vanderbilt University Medical Center, Nashville, TN, USA; Department of Molecular Physiology and Biophysics, Vanderbilt University, Nashville, TN, USA
*Corresponding author: Lea K. Davis; Email: [email protected]

Abstract

While it is known that vitamin D deficiency is associated with adverse bone outcomes, it remains unclear whether low vitamin D status may increase the risk of a wider range of health outcomes. We had the opportunity to explore the association between common genetic variants associated with both 25 hydroxyvitamin D (25OHD) and the vitamin D binding protein (DBP, encoded by the GC gene) with a comprehensive range of health disorders and laboratory tests in a large academic medical center. We used summary statistics for 25OHD and DBP to generate polygenic scores (PGS) for 66,482 participants with primarily European ancestry and 13,285 participants with primarily African ancestry from the Vanderbilt University Medical Center Biobank (BioVU). We examined the predictive properties of PGS25OHD and two scores related to DBP concentration with respect to 1322 health-related phenotypes and 315 laboratory-measured phenotypes from electronic health records. In those with European ancestry: (a) the PGS25OHD and PGSDBP scores, and individual SNPs rs4588 and rs7041, were associated with both 25OHD concentration and 1,25 dihydroxyvitamin D concentrations; (b) higher PGS25OHD was associated with decreased concentrations of triglycerides and cholesterol, and reduced risks of vitamin D deficiency, disorders of lipid metabolism, and diabetes. In general, the findings for the African ancestry group were consistent with findings from the European ancestry analyses. Our study confirms the utility of PGS and two key variants within the GC gene (rs4588 and rs7041) to predict the risk of vitamin D deficiency in clinical settings and highlights the shared biology between vitamin D-related genetic pathways and a range of health outcomes.

Type
Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2024. Published by Cambridge University Press on behalf of International Society for Twin Studies

While there is no doubt that vitamin D deficiency is causally associated with adverse bone outcomes (e.g., rickets in children, osteoporosis in adults), the influence of vitamin D on other health outcomes remains poorly understood (Holick, 2007). Cross-sectional observational studies often report an association between vitamin D deficiency (as defined by serum 25 hydroxyvitamin D [25OHD] concentration less than 25 nmol/L) and an increased risk of many different health outcomes, such as cancer, autoimmune disease, cardiovascular disease, and psychiatric disorders (Holick & Chen, 2008; Manson, Cook et al., 2019). In most instances, these associations merely reflect the well-accepted finding that poor general health can lead to low 25OHD concentration because of reduced outdoor activity and reduced exposure to bright sunshine. In addition, prior risk factors such as obesity and smoking can confound the apparent association between vitamin D deficiency and adverse health outcomes.

Recently, large, randomized controlled trials of vitamin D supplementation have not supported a causal role for vitamin D in health outcomes related to cancer, cardiovascular disease and bone outcomes (Chou et al., 2020; de Boer et al., 2019; LeBoff et al., 2020; Lucas & Wolf, 2019; Manson, Bassuk, Buring et al., 2020; Manson, Bassuk, Cook et al., 2020; Manson, Cook et al., 2019; Manson, Mora et al., 2019; Neale et al., 2022). These findings have lowered expectations about the role of vitamin D deficiency as a causal risk factor for many adverse health outcomes. However, because randomized controlled trials rarely extend beyond a few years, they are less able to detect exposure-risk relationships that have a long latency (e.g., suboptimal vitamin D status over many decades may contribute to the gradual loss of bone mineral density, and result in later-life osteoporosis; Heaney, 2003). In these scenarios, Mendelian randomization (MR) studies may be informative, as it is assumed that common genetic variants that influence phenotypes such as 25OHD concentrations would operate in a stable fashion across the entire lifespan.
To date, MR studies related to 25OHD have found evidence to support causal pathways with (a) multiple sclerosis (Jiang et al., 2021; Manousaki et al., 2017; Mokry et al., 2015; Rhead et al., 2016), (b) ovarian cancer (Ong et al., 2016), and (c) dyslipidemia (Revez et al., 2020). On the other hand, MR analyses by Revez and colleagues (2020) showed evidence supporting a causal effect of a range of other health outcomes on 25OHD levels, but 25OHD only had an apparent causal effect on such health outcomes in the presence of horizontal (or biologically) pleiotropic variants, which influence both 25OHD concentration and health outcomes through independent pathways.

The analysis of other key elements of the vitamin D pathway may help clarify these findings. Recently, Albiñana and colleagues (2023) published a genomewide association study (GWAS) of the concentration of the vitamin D binding protein (DBP), a circulating protein involved in the transport and storage of 25OHD. Based on the genetic correlates of DBP, MR studies confirmed a strong positive and unidirectional association between DBP concentration and 25OHD concentration. Furthermore, there was a robust association between the genetic variants associated with higher DBP (higher polygenic score of DBP, PGSDBP) and higher (measured) concentration of 25OHD in the UK Biobank (UKB) sample. This study also used a set of genetic instruments adjusted for the prominent cis-protein quantitative trait loci (cis-pQTLs) in the GC gene (which encodes the DBP protein). Based on this subset of genetic variants, additional associations were found with a range of clinical phenotypes in the UKB, including reduced risk of hypertension, reduced pulse rate, reduced risk of gastritis and duodenitis, and an increased risk of allergic rhinitis and agranulocytosis. To the best of our knowledge, no studies have used both the GWAS findings from 25OHD and DBP to help clarify the role of vitamin D status across a wide range of health outcomes.

Phenome-wide association studies (PheWAS) and laboratory-wide association studies (LabWAS) (Goldstein et al., 2020) can explore the associations between (a) the genetic correlates of potential risk factors such as 25OHD and DBP concentration, and (b) a wide range of disease and laboratory phenotypes in clinical settings (Dennis, Sealock, Straub et al., 2021; Denny et al., 2010; Wei et al., 2017). A previous PheWAS examined the association between a polygenic risk score (PGS) for 25OHD based on 6 independent genetic loci and a wide range of phenotypes available in the UKB (Meng et al., 2019). This study found no evidence of an association between 25OHD concentration and over 900 different clinical outcomes, but the authors noted that the study may have lacked the power to detect small effect sizes. We had the opportunity to conduct a PheWAS using the more powerful GWAS based on the UKB (n = 417,580), which identified 143 independent variants (Revez et al., 2020).
In addition, we used the GWAS findings related to DBP (n = 65,589, 26 independent variants) from Albiñana and colleagues (2023), which allowed us to look for convergent evidence from these two key vitamin D pathway components. The summary statistics from these two GWAS analyses were used to predict a wide range of diseases and laboratory phenotypes available within the Vanderbilt University Medical Center (VUMC) electronic health record (EHR) in conjunction with VUMC’s DNA repository, BioVU. Importantly, the VUMC cohort also represents a healthcare-seeking population, compared to the volunteer ascertainment of UKB, which provides additional opportunities to investigate the relationship between vitamin D and illness across the medical phenome.

Methods

Study Population and Data Access Approval

Data for this study were obtained with permission from the Vanderbilt University Medical Center Biobank (VUMC BioVU) DNA databank in conjunction with the de-identified version of the VUMC EHR called the Synthetic Derivative. The study was approved by the VUMC IRB (IRB#190418). The study population included only patients genotyped on the Illumina Expanded Multi-Ethnic Genotyping Array (MEGAex). The database includes demographics, vital measurements, ICD9 and ICD10 codes, Current Procedural Terminology (CPT) codes, laboratory test results, medications, and clinical notes recorded from 1994 to 2021. Detailed information about BioVU’s data management and quality control, ethical considerations, and continuing patient engagement has been previously published (Bowton et al., 2014; Denny et al., 2010; Pulley et al., 2008; Ritchie et al., 2010; Roden et al., 2008). Of note, date shifting within a 1-year timeframe was adopted as a strategy to reduce potential identifiability. While dates are shifted by a consistent number of days within an individual’s medical record (i.e., birthday and all visits are shifted by the same number of days), the selected interval for the date-shifting differs between individuals. This practice limits our ability to detect seasonal associations with 25OHD concentrations because we lack precise dates for laboratory testing and code assignment.
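The date-shifting strategy described above can be illustrated with a minimal sketch. The offset range (1 to 364 days backward), the function names, and the toy records are assumptions made for illustration only; the details of the actual BioVU implementation are not given here.

```python
import random
from datetime import date, timedelta

def draw_offset(rng, max_days=364):
    """Draw one per-patient shift in days (hypothetical range: 1..max_days, backward)."""
    return -rng.randint(1, max_days)

def shift_record(dates, offset_days):
    """Shift every date in one patient's record by the same number of days."""
    return [d + timedelta(days=offset_days) for d in dates]

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Toy records (hypothetical): a birthday followed by two visits per patient.
records = {
    "patient_A": [date(1970, 3, 1), date(2015, 1, 10), date(2015, 7, 10)],
    "patient_B": [date(1985, 9, 5), date(2018, 1, 10), date(2018, 2, 9)],
}

shifted = {}
for pid, dates in records.items():
    offset = draw_offset(rng)                   # the offset differs between individuals
    shifted[pid] = shift_record(dates, offset)  # but is constant within an individual

# Intervals between events within a patient are preserved by the shift.
gap_before = records["patient_A"][2] - records["patient_A"][1]
gap_after = shifted["patient_A"][2] - shifted["patient_A"][1]
print(gap_before == gap_after)
```

Because each patient's offset is unknown to the analyst, the calendar month of a shifted laboratory date no longer indicates the true season of testing, which is why seasonal effects on 25OHD cannot be modeled directly.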

Genotyping and Quality Control

Genotypes for 94,474 individuals who received care at VUMC were obtained through BioVU. Genotypes were measured on the MEGAex array (Zhao et al., 2018), and ancestral clusters for individuals of inferred European or African ancestry were selected as previously described (Dennis, Sealock, Straub et al., 2021). Genotyping data within each ancestry group were imputed and underwent quality control checks as previously described. Briefly, European and African ancestry boundaries were calculated using Eigenstrat (Price et al., 2006). Data were imputed using the Michigan Imputation Server with the Haplotype Reference Consortium reference panel (McCarthy et al., 2016). Genotyping data were then subjected to a series of ancestry-specific QC filters, including thresholds on minor allele frequency (<0.05), imputation quality (R² <0.3), and π (<0.2). The resulting dataset contained 6,360,678 variants from 66,917 people of European ancestry and 12,897,448 variants from 13,329 people of African ancestry.

We restricted the samples to individuals with complete data on EHR-reported sex and median age in the database (66,482 and 13,285 individuals of European and African ancestry, respectively). From these subsets we calculated the principal components (PCs) of genetic ancestry on a randomly selected subset of 250,000 SNPs using Flash PCA (Abraham & Inouye, 2014) and an in-house script (Abraham et al., 2017).

Phenotype Data

PheWAS

Phenotypic data were represented using phecodes generated by hierarchical clustering of related ICD codes (Denny et al., 2013). ICD-9 and 10 codes were mapped to 1664 phecode categories according to the Phecode Map v1.2 (https://phewascatalog.org/phecodes), as implemented in the PheWAS R package v0.12 (Carroll et al., 2014). Patients were assigned to the case group for a given phecode if they had at least two different ICD-9 or 10 codes that mapped to a given phecode, or if they had at least two separate occurrences (i.e., on different days) of a single ICD-9 or 10 code that mapped to the given phecode, both of which are validated strategies to improve the positive predictive value of phecodes (Denny et al., 2013). The control group excluded patients with only one component ICD-9 or 10 code, or with one or more ICD-9 or 10 codes that mapped to related phecodes (as defined by the Phecode Map v1.2).
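The case, control, and exclusion rules above can be written out as a short sketch. This is an illustration of the stated logic rather than the PheWAS R package's own code; the function name, the placeholder ICD labels, and the toy mapping are hypothetical.

```python
from collections import defaultdict

def phecode_status(events, icd_to_phecode, target, related):
    """Classify one patient for one phecode.

    events: list of (icd_code, day) tuples from the patient's record.
    Returns "case", "control", or "excluded".
    """
    days_by_code = defaultdict(set)  # ICD codes that map to the target phecode
    has_related = False
    for icd, day in events:
        phecode = icd_to_phecode.get(icd)
        if phecode == target:
            days_by_code[icd].add(day)
        elif phecode in related:
            has_related = True

    two_different_codes = len(days_by_code) >= 2
    one_code_two_days = any(len(days) >= 2 for days in days_by_code.values())
    if two_different_codes or one_code_two_days:
        return "case"
    if days_by_code or has_related:
        # A single component code, or any code in a related phecode:
        # neither a case nor an eligible control.
        return "excluded"
    return "control"

# Hypothetical mapping with placeholder labels (not real ICD codes or phecodes).
icd_to_phecode = {"ICD_A": "PHE_1", "ICD_B": "PHE_1", "ICD_C": "PHE_1_REL", "ICD_Z": "PHE_9"}
related_to_phe_1 = {"PHE_1_REL"}

patient_events = [("ICD_A", 10), ("ICD_A", 45)]  # same code on two different days
print(phecode_status(patient_events, icd_to_phecode, "PHE_1", related_to_phe_1))
```

Keeping single-code patients out of both groups trades sample size for cleaner phenotype definitions, which matters when thousands of phecodes are tested at once.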

LabWAS

We used the previously described QualityLab and LabWAS pipelines to perform quality control and analysis of quantitative clinical laboratory (lab) test data in the EHR (Dennis, Sealock, Straub et al., 2021). We extracted data on all lab tests collected in the routine clinical care of VUMC patients, resulting in data from 939 lab tests after the QualityLab pipeline was applied (Dennis, Sealock, Straub et al., 2021). SNP-based heritability of lab values was previously calculated and described in detail. As we are using polygenic risk scores to predict lab values, we restricted the analysis to tests with a non-zero estimated SNP-based heritability. This resulted in 318 labs available for analysis. In this primary analysis, we used the median lab values adjusted for cubic splines of median age at lab ascertainment (4 knots). We transformed lab values to fit the normal distribution to improve the performance of the linear regression models (McCaw et al., 2020). We applied the rank-based inverse normal quantile transformation (RINT) to all labs, which ensured trait normality by replacing the value of each observation with its quantile from the standard normal distribution.
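As an illustration of the transformation described above, the following sketch takes the median of each person's repeated lab values and applies a rank-based inverse normal transformation. The Blom offset (c = 3/8), the average-rank handling of ties, and the toy values are assumptions; the pipeline's exact settings are not stated here, and the age-spline adjustment is omitted.

```python
from statistics import NormalDist, median

def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def rint(values, c=3 / 8):
    """Rank-based inverse normal transform with offset c (Blom's 3/8 assumed)."""
    n = len(values)
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - c) / (n - 2 * c + 1)) for r in average_ranks(values)]

# Toy longitudinal lab values per person (hypothetical, deliberately skewed).
labs = {"p1": [10, 12, 50], "p2": [8], "p3": [200, 180], "p4": [15, 15, 16], "p5": [9, 11]}
person_medians = [median(v) for v in labs.values()]
transformed = rint(person_medians)
for pid, raw, z in zip(labs, person_medians, transformed):
    print(pid, raw, round(z, 3))
```

The transform keeps the ordering of individuals but discards the original spacing between values, so an extreme outlier (such as p3 here) no longer dominates a linear regression.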

Vitamin D can be measured clinically in a variety of forms. Overall vitamin D status is routinely assessed by assaying the transport and storage forms such as 25 hydroxyvitamin D3 and the closely related 25 hydroxyvitamin D2. Typically, the more abundant form, the D3 type, is the product of actinic pathways (i.e., the action of ultraviolet light on the skin). Both D3 and D2 can be obtained via supplements. The active hormonal form of vitamin D is 1,25 dihydroxyvitamin D (1,25OHD; either D2 or D3), which has a short half-life and is typically measured in picogram level concentrations. The assays for 25OHD and 1,25OHD were based on chemiluminescent magnetic microparticle immunoassays or quantitative chemiluminescent immunoassays respectively. The VUMC pathology laboratory participates in quality-assurance programs organized by DEQAS (the Vitamin D External Quality Assessment Scheme) and the National Institute of Standards and Technology (NIST; Dai et al., 2018). Here, we have included measurements for 25OHD by two different assays (25OHD_a2, n = 9,472; 25OHD_a3, n = 9,450) and 1,25OHD by three different assays (1,25OHD_a1, n = 18,247; 1,25OHD_a4, n = 3,227; 1,25OHD_a5, n = 2,672).

Statistical Analysis

Polygenic score model training

We generated several PGSs based on GWAS of 25OHD and DBP concentration. For 25OHD, we used the original 25OHD GWAS summary statistics reported by Revez et al. (2020). An additional GWAS of 25OHD was conducted in a sample of 8306 UKB participants with 25OHD concentrations available and genetically inferred predominant African ancestry. Ancestry was inferred based on a two-step approach described elsewhere (Wang et al., 2020). GWAS was conducted as described in Revez et al. (2020). Briefly, 25OHD concentrations were normalized with RINT and genetic variants were tested for association with RINT 25OHD using fastGWA (Jiang et al., 2019). Covariates included in the model were age, sex, month of assessment, supplement intake, and the first 10 within-ancestry PCs.

For DBP, we used the two scores provided by Albiñana et al. (2023), based on neonatal dried blood spots from the iPSYCH case-cohort sample (n = 65,589; Pedersen et al., 2018). The first score (PGSDBP), which is based on the entire genome, is dominated by the very large effect cis-pQTLs within the GC gene (which encodes the DBP protein). The second score (PGSDBP_GC) excludes variants within the GC gene and is better able to identify trans-pQTL variants. The iPSYCH sample did not include enough individuals of African ancestry to generate ancestry-specific DBP summary statistics.

All PGSs were calculated with PRS-CS (Polygenic Risk Score – Continuous Shrinkage), a Bayesian polygenic prediction method that imposes continuous shrinkage priors on SNP effect sizes (Ge et al., 2019). These priors can be represented as global-local scale mixtures of normals, which allows the model to adapt flexibly to differing genetic architectures while remaining computationally efficient. The shrinkage parameter was automatically learnt from the data (i.e., using PRS-CS-auto). SNP effect estimates were obtained from GWAS summary statistics, and the score was calculated using a linkage disequilibrium reference panel from 503 European samples from the 1000 Genomes Project phase 3 (1000 Genomes Project Consortium et al., 2015) for the European and African ancestry analyses. For the score generated using the GWAS summary statistics for 25OHD from samples of predominantly African ancestry, the shrinkage parameter was set to 1e-2 due to the small GWAS sample size and the score was calculated using a linkage disequilibrium reference panel from 661 African ancestry samples from the 1000 Genomes Project phase 3 (1000 Genomes Project Consortium et al., 2015). PGS estimates were scaled to have a mean of zero and a standard deviation (SD) of 1 within ancestry strata before testing for association with any outcome variables.
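The final scoring and scaling steps can be sketched as follows. This does not implement PRS-CS; it assumes that shrunken per-SNP weights have already been estimated, and shows only how a score is formed as a weighted sum of effect-allele dosages and then standardized within each ancestry stratum. All SNP identifiers, weights, and dosages are hypothetical, and the use of the sample SD is an assumption.

```python
from statistics import mean, stdev

def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0..2) for one person."""
    return sum(weights[snp] * dose for snp, dose in dosages.items() if snp in weights)

def standardize_within_strata(scores, strata):
    """Scale scores to mean 0 and SD 1 separately within each stratum."""
    scaled = {}
    for label in set(strata.values()):
        ids = [pid for pid in scores if strata[pid] == label]
        mu = mean(scores[pid] for pid in ids)
        sd = stdev(scores[pid] for pid in ids)  # sample SD (assumption)
        for pid in ids:
            scaled[pid] = (scores[pid] - mu) / sd
    return scaled

# Hypothetical posterior effect sizes, as a shrinkage method might output.
weights = {"snp_1": 0.30, "snp_2": -0.10, "snp_3": 0.05}

# Hypothetical effect-allele dosages for six people in two ancestry strata.
dosages = {
    "EA_1": {"snp_1": 0, "snp_2": 1, "snp_3": 2},
    "EA_2": {"snp_1": 2, "snp_2": 0, "snp_3": 1},
    "EA_3": {"snp_1": 1, "snp_2": 2, "snp_3": 0},
    "AA_1": {"snp_1": 1, "snp_2": 1, "snp_3": 1},
    "AA_2": {"snp_1": 0, "snp_2": 0, "snp_3": 0},
    "AA_3": {"snp_1": 2, "snp_2": 2, "snp_3": 2},
}
strata = {pid: pid.split("_")[0] for pid in dosages}

raw = {pid: polygenic_score(d, weights) for pid, d in dosages.items()}
scaled = standardize_within_strata(raw, strata)
for pid in sorted(scaled):
    print(pid, round(scaled[pid], 3))
```

Standardizing within strata means that regression coefficients are interpreted per SD of the score in that ancestry group, rather than on a scale that mixes the two score distributions.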

LabWAS of PGS25OHD, PGSDBP and PGSDBP_GC

After QC, we applied RINT to the median (across longitudinal measures within a person) lab values, to account for skewness and non-normality in the subsequent LabWAS. In this analysis, we tested the association between the predictor variables (PGS25OHD, PGSDBP and PGSDBP_GC) and all heritable clinically measured laboratory tests. Additionally, we imposed a minimum sample size requirement of 100 for a laboratory test to be included in the LabWAS analysis, bringing the number of labs tested in each scan to 315 in the European ancestry set and 230 in the African ancestry set. We examined the influence of each of the three PGS on each of the validated LabWAS variables, controlling for sex, median age across all ICD codes in the medical record, and the top 10 principal components to adjust for genetic ancestry. Results are reported as beta coefficients and their standard errors per SD increase in the PGS. The Bonferroni-corrected threshold for statistical significance across labs for the European ancestry samples was 0.05/315 = 1.59e-04 and for the African ancestry samples was 0.05/230 = 2.17e-04 (based on the number of labs tested).
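The Bonferroni thresholds used in this study follow directly from the number of tests in each scan, as the short sketch below shows. The function name is hypothetical; the lab counts are those given above, and the phecode counts are those given in the PheWAS subsection that follows.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold that controls the family-wise error rate at alpha."""
    return alpha / n_tests

# Numbers of tests per scan, as reported in the text.
n_tests_by_scan = {
    "LabWAS, European ancestry": 315,
    "LabWAS, African ancestry": 230,
    "PheWAS, European ancestry": 1322,
    "PheWAS, African ancestry": 688,
}

for scan, n in n_tests_by_scan.items():
    print(f"{scan}: {bonferroni_threshold(n):.2e}")
```

To three significant figures these are 1.59e-04, 2.17e-04, 3.78e-05, and 7.27e-05; the last is the value reported in rounded form as 7e-5 for the African ancestry PheWAS.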

PheWAS of PGS25OHD, PGSDBP and PGSDBP_GC

The PheWAS analysis was conducted using the PheWAS R package v0.12 (Carroll et al., 2014). As with LabWAS, we required phecodes to include at least 100 cases (leading to 1322 tested phecodes in the European ancestry set, 688 in the African ancestry set), and we included covariates for sex, median age, and the first 10 PCs estimated from genetic data. Results are reported as odds ratios (ORs) and their 95% confidence intervals (CIs) per SD increase in each of the three PGSs (for either 25OHD or DBP concentration). The Bonferroni-corrected threshold for statistical significance across all tested phecodes was 0.05/1,322 = 3.78e-05 for the European ancestry set and 0.05/688 = 7e-5 for the African ancestry samples.
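For readers less familiar with the reporting scale, the sketch below shows how a logistic regression coefficient per SD of PGS translates into an OR with a 95% CI. It assumes a Wald interval on the log-odds scale, which may differ from how the PheWAS R package computes its intervals; the coefficient and standard error are hypothetical and not taken from the study.

```python
import math

def odds_ratio_ci(beta, se, z=1.959964):
    """Convert a log-odds coefficient into an OR with a Wald confidence interval.

    z defaults to the standard normal quantile for a two-sided 95% interval.
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Hypothetical log-odds change per 1-SD increase in a PGS, with its standard error.
or_, lower, upper = odds_ratio_ci(beta=-0.10, se=0.02)
print(f"OR = {or_:.2f}, 95% CI [{lower:.2f}, {upper:.2f}]")
```

An OR below 1 with an interval that excludes 1 corresponds to lower odds of the phenotype per SD increase in the score.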

Post-hoc analyses of PGSDBP and PGSDBP_GC PheWAS findings

The study by Albiñana et al. (2023) included PheWAS analyses of PGSDBP and PGSDBP_GC based on the UKB, examining 25OHD concentration and a subset of UKB phenotypes (i.e., 1149 phenotypes, including 1027 diseases and a range of anthropometric, brain imaging and infectious disease antigen phenotypes). Based on the findings from the current study, we attempted to replicate selected findings in the other UKB phenotypes not examined in the earlier study. The PheWAS analysis was conducted in the UKB using the same models as outlined in Albiñana et al. (2023). The quantitative traits were normalized using RINT with mean zero and variance 1. The PRSs were generated using SBayesR (Lloyd-Jones et al., 2019) with the reference LD matrix estimated from 1,145,953 HapMap3 SNPs in the UKB. PRSs were computed for 348,501 individuals of European ancestry. The individuals were genetically unrelated (relationship < .05). The covariates included in the model were sex, age and the first 20 PCs.

The influence of rs4588 and rs7041 on PheWAS and LabWAS

In addition to the polygenic scores, we examined the influence of two missense variants within the GC gene (rs4588, rs7041) on the variables of interest. Albiñana et al. (2023) had previously demonstrated that the rs7041 variant explained 54% of the variance of DBP concentration in neonatal dried blood spots. For the individual SNPs, we examined an additive model (i.e., 0, 1, 2 coding for effect allele).
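The additive coding mentioned above simply counts copies of the effect allele carried by each person. A minimal sketch follows; the genotype string format and the allele letters are hypothetical and are not intended to represent the actual alleles of rs4588 or rs7041.

```python
def additive_dosage(genotype, effect_allele):
    """Count copies of the effect allele in a two-allele genotype (0, 1, or 2)."""
    alleles = genotype.split("/")
    if len(alleles) != 2:
        raise ValueError(f"expected two alleles, got {genotype!r}")
    return sum(allele == effect_allele for allele in alleles)

# Hypothetical genotypes at a biallelic SNP with alleles A and C; A is the effect allele.
for genotype in ("C/C", "A/C", "A/A"):
    print(genotype, additive_dosage(genotype, effect_allele="A"))
```

Under this coding, the regression coefficient is the expected change in the outcome for each additional copy of the effect allele.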

Results

Our analyses included 88,019 BioVU patients of European (n = 66,482) or African ancestry (n = 13,285). In the European ancestry (EA) sample, 56% of patients were female and the mean age was 48.71 years. In the African ancestry (AA) sample, 61% of patients were female and the mean age was 38.6 years. See Table 1 for additional characteristics of patients included.

Table 1. Counts and univariate statistics for key demographic variables of the European and African ancestry groups

Note: EHR, electronic health record; SD, standard deviation; Q1−Q3, first and third quartile.

European Ancestry — PGS25OHD

With respect to PheWAS (i.e., clinical phenotypes) in those with European ancestry, higher PGS25OHD was associated (as expected) with lower odds of vitamin D deficiency (OR = 0.84, 95% CI [0.82, 0.86]; n cases = 5768, n controls = 45,960). Within the phenotypes that met the Bonferroni-adjusted threshold, of the nine top phenotypes (Figure 1), five were associated with altered lipid concentrations (e.g., reduced odds of hypercholesterolemia, OR = 0.92, 95% CI [0.90, 0.95]; n cases = 6925, n controls = 41,747). Two of the top nine phenotypes were related to a reduced risk of diabetes (e.g., reduced odds of Type 2 diabetes, OR = 0.95, 95% CI [0.93, 0.97], n cases = 10,202, n controls = 46,320) (Supplementary data 1).

Figure 1. The association between PGS25OHD and disease phenotypes in individuals with primarily European ancestry (n = 66,482).

Note: Associations for 1322 phenotypes are shown. On the x-axis, the phenotypes are clustered according to broad phenotype categories represented by different colors. P values are shown on the y-axis, with upturned triangles representing positive associations and downturned triangles representing negative associations. The top phenotypes with p values exceeding the Bonferroni multiple testing threshold (p < 3.78e-5) are labeled. Full details are provided in Supplementary data 1.

LabWAS results (Figure 2) were consistent with the clinical diagnoses, with higher PGS25OHD associated with both increased 1,25OHD concentration (β = 0.16, 95% CI [0.14, 0.17], n total = 18,247, r² = .03) and increased 25OHD concentration (β = 0.18, 95% CI [0.16, 0.20]; n total = 9472, r² = .03). Laboratory tests related to the measurement of cholesterol (β = −0.04, 95% CI [−0.05, −0.03], n total = 30,329, r² = .002) and triglycerides (β = −0.06, 95% CI [−0.07, −0.05], n total = 30,534, r² = .003) had small but significant inverse associations with PGS25OHD, in keeping with the disease phenotypes described above. Finally, higher PGS25OHD was associated with a small but significant reduction in glucose concentration (β = −0.015, 95% CI [−0.02, −0.008], n total = 62,280, r² = .0003) (Supplementary data 2).

Figure 2. The association between PGS25OHD and laboratory results, in individuals with primarily European ancestry (n = 66,482).

Note: Associations for 315 laboratory results are shown. On the x-axis, the laboratory tests are clustered according to broad organ or pathology categories, represented by different colors. P values are shown on the y-axis, with upturned triangles representing positive associations and downturned triangles representing negative associations. Laboratory tests with p values exceeding the Bonferroni multiple testing threshold (p < 1.59e-04), shown as a pink horizontal reference line, are labeled. Full details are provided in Supplementary data 2. 25OHD_a2 and 25OHD_a3 are two different types of 25 hydroxyvitamin D assays. 1,25OHD_a1 and 1,25OHD_a4 are two different types of 1,25 dihydroxyvitamin D assays. Trigs, triglycerides; Chol, cholesterol; LDL.C, low density lipoprotein cholesterol; Gluc, glucose.

European Ancestry — PGSDBP and PGSDBP_GC

No PheWAS associations with PGSDBP exceeded the Bonferroni adjusted p-value threshold in those with European ancestry (Figure 3). However, vitamin D deficiency was nominally significant (OR = 0.96, 95% CI [0.93, 0.98], n cases = 5768, n controls = 45,960) (Figure 4, Supplementary data 3).

Figure 3. The associations between PGSDBP, PGSDBP_GC and disease phenotypes in individuals with primarily European ancestry (n = 66,482).

Note: Panel A, PheWAS for PGSDBP. Panel B, PheWAS for PGSDBP_GC. Associations for 1322 phenotypes are shown. On the x-axis, the phenotypes are clustered according to broad phenotype categories represented by different colors. P values are shown on the y-axis, with upturned triangles representing positive associations and downturned triangles representing negative associations. The five phenotypes with the smallest p values are labeled; however, none of the phenotypes exceeded the Bonferroni multiple testing threshold (p < 3.78e-05). In Panel B, vitamin D deficiency is also labeled for reference. Full details are provided in Supplementary data 3 and 5.

Figure 4. The associations between PGSDBP, PGSDBP_GC and laboratory measures in individuals with primarily European ancestry (n = 66,482).

Note: Panel A, LabWAS for PGSDBP. Panel B, LabWAS for PGSDBP_GC. Associations for 315 laboratory results are shown. On the x-axis, the laboratory tests are clustered according to broad organ or pathology categories, represented by different colors. P values are shown on the y-axis, with upturned triangles representing positive associations and downturned triangles representing negative associations. Laboratory tests with p values exceeding the Bonferroni multiple testing threshold (p < 1.59e-04), shown as a pink horizontal reference line, are labeled. 25OHD_a2 and 25OHD_a3 are two different types of 25 hydroxyvitamin D assays. 1,25OHD_a1 and 1,25OHD_a4 are two different types of 1,25 dihydroxyvitamin D assays (see Methods). WBC, leukocytes (#/volume) in blood by automated count. LymAbs, lymphocytes (#/volume) in blood by automated count. MonAbs, absolute count of monocytes. NtAb, absolute count of neutrophils. EoAb, absolute count of eosinophils. Full details are provided in Supplementary data 4 and 6.

With respect to LabWAS, in those with European ancestry, the two PGS related to DBP identified distinct findings (Figure 4). For PGSDBP (which is strongly influenced by cis-pQTLs within the GC gene, which encodes DBP), there were small but significant associations with both 25OHD (e.g., β = 0.08, 95% CI [0.06, 0.10], n total = 9472, r² = .006) and 1,25OHD (β = 0.04, 95% CI [0.03, 0.06], r² = .002) (Supplementary data 4).

For the PGSDBP_GC (which adjusts for variants within the GC gene to identify trans-pQTLs), there were no significant findings in the PheWAS analyses (Supplementary data 5). However, for the PGSDBP_GC LabWAS analyses, we found small but significant reductions in white blood cell counts (leukocytes/lymphocytes, monocytes, neutrophils, eosinophils). For example, leukocyte counts were reduced in those with higher PGSDBP_GC values (β = −0.044, 95% CI [−0.051, −0.037], n total = 64,775, r² = .002) (Supplementary data 6). As post-hoc analyses, we examined blood count related phenotypes in the UKB and confirmed a reduction in a range of comparable blood count related variables (Supplementary data 7). For example, higher PGSDBP_GC values were significantly associated with reduced lymphocyte (i.e., leukocyte) count with a similar effect size as found in the main analysis (β = −0.039, 95% CI [−0.042, −0.035], n total = 291,968, r² = .002).

African Ancestry — PGS25OHD

We performed the PheWAS and LabWAS of the primarily African ancestry sample using summary statistics derived from the UKB African-ancestry population. The African ancestry derived PGS25OHD identified one significant PheWAS finding, with higher genetically predicted 25OHD concentration being associated with a reduced risk of type 2 diabetes with renal manifestations (OR = 0.61, 95% CI [0.49, 0.78], n cases = 589, n controls = 9455). With respect to LabWAS findings, none were significant based on the Bonferroni-adjusted threshold (Supplementary data 8 and 9).

We also conducted the PheWAS and LabWAS of primarily African ancestry individuals using the PGS25OHD trained on European derived summary statistics. Despite the much larger discovery sample size, no association exceeded the Bonferroni-corrected p-value threshold, but several of the diagnoses that were associated with PGS25OHD in the larger European target sample were nominally significant (p < .05) in the African ancestry target sample. For example, within the top 16 hits for the PGS25OHD LabWAS analyses, three were for vitamin D-related measures (i.e., 25OHD or 1,25OHD). Those with higher PGS25OHD scores had higher concentration of 1,25OHD (β = 0.05, 95% CI [0.02, 0.09], n total = 3279, r² = .003). Also in the top 16 were two measures related to cholesterol (e.g., cholesterol [mass/volume] in serum or plasma, β = −0.04, 95% CI [−0.07, −0.02], n total = 5979, r² = .002) (Supplementary data 10 and 11).

African Ancestry — PGSDBP and PGSDBP_GC

With respect to PGSDBP and PGSDBP_GC, we were restricted to using the PGS based on the original European-ancestry derived summary statistics. Based on these PGS, there were no significant PheWAS findings; however, the top hit for PGSDBP_GC was a nominally significant protective finding for multiple sclerosis (OR = 0.76, 95% CI [0.65, 0.90], n cases = 159, n controls = 10,501). With respect to PGSDBP and PGSDBP_GC LabWAS findings, there were no significant findings; however, there was a small, nominally significant association between PGSDBP and 25OHD concentration (β = 0.09, 95% CI [0.002, 0.169], n total = 473, r² = .008). Full details of these analyses can be found in Supplementary data 12, 13, 14 and 15.

The Influence of rs4588 and rs7041 on PheWAS and LabWAS Variables

The allele frequencies for rs4588 and rs7041 in the BioVU sample are shown in Supplementary Table 16. The presence of the G allele in rs4588, and the C allele in rs7041, were associated with higher concentration of 1,25OHD in both the European and African ancestry groups (Supplementary data 16).

With respect to PheWAS, in the European ancestry sample, for the two individual SNPs within the GC gene, rs4588 was significantly associated with the clinical diagnosis of vitamin D deficiency (rs4588, OR = 0.86, 95% CI [0.83, 0.90], p = 1.98E-11, n cases = 5767, n controls = 45,944). rs7041 also had a significant association with vitamin D deficiency (rs7041, OR = 0.90, 95% CI [0.87, 0.94], p = 1.77E-7, n cases = 5763, n controls = 45,935). However, there were no significant findings in the African ancestry group (Supplementary data 17, 18, 19 and 20). With respect to LabWAS in the European ancestry group, both individual SNPs were significantly associated with both 25OHD and 1,25OHD concentrations (e.g., rs4588 and 25OHD_a3, n total = 9450, β = 0.22, SE = 0.15, p = 4.36E-46; rs4588 and 1,25OHD_a1, n total = 18,247, β = 0.15, p = 1.25E-44; Supplementary data 21, 22, 23 and 24). With respect to the African ancestry sample, rs4588 was nominally significantly associated with both 25OHD and 1,25OHD, while rs7041 was only nominally significantly associated with 1,25OHD.

Discussion

It was reassuring that the most recently published PGS for 25OHD (Revez et al., Reference Revez, Lin, Qiao, Xue, Holtz, Zhu, Zeng, Wang, Sidorenko, Kemper, Vinkhuyzen, Frater, Eyles, Burne, Mitchell, Martin, Zhu, Visscher, Yang and McGrath2020) was able to predict 25OHD concentration and vitamin D deficiency. This study confirms that the genetic loci associated with 25OHD and DBP concentrations also predict a wide range of medical conditions and laboratory measurements within electronic health records in a general hospital setting. For example, we found that this same PGS predicted the risk of several phenotypes previously linked to vitamin D in observational and MR studies, including dyslipidemia and diabetes. In addition, the genetic correlates of DBP concentration also predicted 25OHD and 1,25OHD concentrations, and were associated with a range of white blood cell related measures. We will expand on these findings below.

Of particular interest, our findings lend weight to the hypothesis that variants associated with 25OHD are horizontally (or biologically) pleiotropic (Hemani et al., Reference Hemani, Bowden and Davey Smith2018), and influence 25OHD concentration among other biological functions, such as lipid pathways. In analyses that excluded potentially horizontally pleiotropic variants, Revez and colleagues (Reference Revez, Lin, Qiao, Xue, Holtz, Zhu, Zeng, Wang, Sidorenko, Kemper, Vinkhuyzen, Frater, Eyles, Burne, Mitchell, Martin, Zhu, Visscher, Yang and McGrath2020) identified a persistent association between genetically predicted higher 25OHD concentration and a lower risk of dyslipidemia. Many of the variants identified by Revez and colleagues were in genes related to lipid and lipoprotein pathways (e.g., DHCR7, APOE, APOC1, DOCK7, CELSR2, LIPC, PCSK9). While the mechanisms linking lipid and vitamin D pathways are poorly understood, there is evidence that vitamin D can inhibit activity of DHCR7, which encodes a key enzyme that diverts 7-dehydrocholesterol away from vitamin D biosynthesis and converts it to cholesterol (Prabhu et al., Reference Prabhu, Luu, Sharpe and Brown2016; Zou & Porter, Reference Zou and Porter2015). Regardless of the precise biological mechanisms, there is now convergent evidence from MR (Revez et al., Reference Revez, Lin, Qiao, Xue, Holtz, Zhu, Zeng, Wang, Sidorenko, Kemper, Vinkhuyzen, Frater, Eyles, Burne, Mitchell, Martin, Zhu, Visscher, Yang and McGrath2020) and the current PheWAS study linking low 25OHD concentrations to an increased risk of dyslipidemia and higher concentrations of (a) triglyceride, (b) cholesterol, and (c) low density lipoprotein cholesterol. However, randomized clinical trials of vitamin D supplements have not reported strong effects on these phenotypes within their study timeframes (Costenbader et al., Reference Costenbader, MacFarlane, Lee, Buring, Mora, Bubes, Kotler, Camargo, Manson and Cook2019; Meng et al., Reference Meng, Matthan, Angellotti, Pittas and Lichtenstein2020; Ohlund et al., Reference Ohlund, Lind, Hernell, Silfverdal, Liv and Karlsland Akeson2020). Thus, the clinical implications of these findings should be treated cautiously.

Our study also found that variants associated with higher 25OHD were associated with a reduced risk of diabetes and lower plasma glucose concentration. There are several potential biological mechanisms that could underpin this association. Revez et al. (Reference Revez, Lin, Qiao, Xue, Holtz, Zhu, Zeng, Wang, Sidorenko, Kemper, Vinkhuyzen, Frater, Eyles, Burne, Mitchell, Martin, Zhu, Visscher, Yang and McGrath2020) found that PGS25OHD predicted a range of behaviors measured in the UKB including indoor activities (negatively associated with ‘hours spent using a computer’) and outdoor activity (positively associated with ‘duration of walks’ and ‘duration of vigorous activity’). Thus, at least some of the predictive properties of PGS25OHD may be mediated by genetic variants associated with behaviors that influence actinic production of vitamin D. These same variables may influence body mass index, and subsequent risk of type 2 diabetes. Thus, the association between PGS25OHD and diabetes may operate via pathways other than a direct influence of 25OHD concentration on the risk of diabetes.

To the best of our knowledge, this work also provides the first evidence to show that the PGS for 25OHD predicts 1,25OHD concentration. Studies of 1,25OHD are challenging because the half-life of this small molecule is short compared to 25OHD (several hours compared to one to two weeks, respectively; Zerwekh, Reference Zerwekh2008), and the concentration of 1,25OHD is tightly controlled by parathyroid hormone and calcium homeostasis. Several factors can uncouple the association between 25OHD and 1,25OHD. It has been reported that in the presence of both vitamin D deficiency (i.e., low 25OHD concentrations), and low calcium concentration, 1,25OHD concentrations can rise sharply — thus, this molecule is not regarded as a reliable measure of overall vitamin D status (Holick, Reference Holick2009). From a clinical perspective, data from randomized controlled trials found that the use of oral vitamin D supplements is associated with an increase in the concentration of 1,25OHD (Zittermann et al., Reference Zittermann, Ernst, Birschmann and Dittrich2015). Albiñana et al. (Reference Albiñana, Zhu, Borbye-Lorenzen, Boelt, Cohen, Skogstrand, Wray, Revez, Privé, Petersen, Bulik, Plana-Ripoll, Musliner, Agerbo, Børglum, Hougaard, Nordentoft, Werge, Mortensen and McGrath2023) tested bidirectional MR models between the genetic correlates of 25OHD and DBP concentrations. They found a unidirectional association, which supports the hypothesis that higher DBP concentration may extend the functional half-life of 25OHD. Of interest, the two individual SNPs within the GC gene (rs4588 and rs7041) were associated with both 25OHD and 1,25OHD in the LabWAS for the European-ancestry sample. Within the smaller African ancestry sample the two individual SNPs were nominally significantly associated with 1,25OHD (rs4588 also had a nominally significant association with 25OHD). These findings provide new insights into the genetic architecture of vitamin D metabolism.

The vitamin D binding protein has a range of biological functions in addition to the transport of 25OHD and 1,25OHD (e.g., T-cell response, C5a-mediated chemotaxis, macrophage activation; Bouillon et al., Reference Bouillon, Schuit, Antonio and Rastinejad2019). Albiñana et al. (Reference Albiñana, Zhu, Borbye-Lorenzen, Boelt, Cohen, Skogstrand, Wray, Revez, Privé, Petersen, Bulik, Plana-Ripoll, Musliner, Agerbo, Børglum, Hougaard, Nordentoft, Werge, Mortensen and McGrath2023) found evidence from MR that increased DBP concentration, based on the GWAS findings adjusted for variants in the GC gene, was associated with a reduced risk of rheumatoid arthritis and multiple sclerosis. While we found a nominal association between PGSDBP_GC and multiple sclerosis in the African ancestry sample, these disorders were not confidently detected in the current study. We did, however, find a range of decreased white blood cell trait counts associated with PGSDBP_GC. Pleiotropic variants may account for this finding. A missense variant in SH2B3 is both (a) a ‘master regulator’ influencing the concentration of over 50 plasma proteins (Ferkingstad et al., Reference Ferkingstad, Sulem, Atlason, Sveinbjornsson, Magnusson, Styrmisdottir, Gunnarsdottir, Helgason, Oddsson, Halldorsson, Jensson, Zink, Halldorsson, Masson, Arnadottir, Katrinardottir, Juliusson, Magnusson, Magnusson and Stefansson2021; Pietzner et al., Reference Pietzner, Wheeler, Carrasco-Zanini, Cortes, Koprulu, Worheide, Oerton, Cook, Stewart, Kerrison, Luan, Raffler, Arnold, Arlt, O’Rahilly, Kastenmuller, Gamazon, Hingorani, Scott and Langenberg2021; Sun et al., Reference Sun, Maranville, Peters, Stacey, Staley, Blackshaw, Burgess, Jiang, Paige, Surendran, Oliver-Williams, Kamat, Prins, Wilcox, Zimmerman, Chi, Bansal, Spain, Wood and Butterworth2018), and (b) associated with a range of hematological measurements and disorders (Morris et al., Reference Morris, Butler, Perkins, Kershaw and Babon2021). The active form of vitamin D (1,25OHD) is a potent driver of cellular differentiation (in keeping with other steroid hormones) and in the presence of vitamin D deficiency, the hematological cell lines may be less differentiated, which in turn may explain the decrease in mature cell counts (Medrano et al., Reference Medrano, Carrillo-Cruz, Montero and Perez-Simon2018).

The genetic correlations of GWAS summary statistics can be difficult to interpret, as cases used to derive the summary statistics may have an increased risk of additional correlated phenotypes (compared to non-cases). For example, it is feasible that the individuals in the UKB who had lipid-related phenotypes also had low 25OHD as a consequence of their impaired health (e.g., diabetes, obesity), and the GWAS methodology and subsequent PheWAS studies may detect both the target and correlated phenotypes (previously referred to as the ‘phenotypic hitchhiking’ effect; Dennis, Sealock, Levinson et al., Reference Dennis, Sealock, Levinson, Farber-Eger, Franco, Fong, Straub, Hucks, Song, Linton, Fontanillas, Elson, Ruderfer, Abdellaoui, Sanchez-Roige, Palmer, Boomsma, Cox, Chen and Davis2021). Regardless of these issues, the findings of our study lend weight to the hypothesis that vitamin D pathways and lipid-related phenotypes may have shared biological pathways.

Finally, despite a much-reduced discovery sample size, the PGS25OHD based on African ancestry derived summary statistics detected an association between PGS25OHD and type 2 diabetes with renal manifestations in the primarily African ancestry target sample. Importantly, this compares to an absence of significant associations in the exact same target sample when using the PGS25OHD trained on summary statistics from a primarily European ancestry sample. These findings illustrate that the absence of associations in the latter analysis is largely due to underrepresentation in the European ancestry GWAS and strongly signal the need for more ancestrally diverse genetic research in general and in vitamin D genetic studies specifically (Sirugo et al., Reference Sirugo, Williams and Tishkoff2019).

The study has several strengths. The electronic health records used in this study included a large sample of patients, with extensive information on treated phenotypes and laboratory tests. The PGS instrument was based on a more powerful GWAS compared to the previous (null) PheWAS (Meng et al., Reference Meng, Li, Timofeeva, He, Spiliopoulou, Wei, Gifford, Wu, Varley, Joshi, Denny, Farrington, Zgaga, Dunlop, McKeigue, Campbell and Theodoratou2019). However, there were several important limitations. The discovery sample for 25OHD was based on the UKB, which is not representative of the general community (Fry et al., Reference Fry, Littlejohns, Sudlow, Doherty, Adamska, Sprosen, Collins and Allen2017). As a result, if selective processes are associated with both the predictor and outcome variable, collider biases may be introduced (Munafo et al., Reference Munafo, Tilling, Taylor, Evans and Davey Smith2018), which can subsequently lead to spurious associations. Our African ancestry sample was small, and we were not able to examine diverse ancestries beyond African and European ancestry groups. Ideally, variant imputation and PGS generation should be based on appropriate African ancestry samples. Thus, our results are unlikely to be generalizable to other ancestries. Additionally, the Vanderbilt health system is a tertiary referral center, and may not be representative of population-based samples. Lastly, private health insurance is required in most primary care clinics at VUMC, which further limits the socioeconomic diversity of the patient population.
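The collider bias mentioned among the limitations can be illustrated with a toy simulation. The sketch below is not based on the study data. It assumes a predictor and an outcome that are independent in the full population, and a selection process (for example, cohort participation) that depends on both. All names and parameter values are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_collider(n=20000, seed=1):
    """Return (r in full population, r in selected subset, subset size)."""
    rng = random.Random(seed)
    # Predictor and outcome are independent by construction.
    predictor = [rng.gauss(0, 1) for _ in range(n)]
    outcome = [rng.gauss(0, 1) for _ in range(n)]
    # Selection depends on both variables plus noise, so it acts as a collider.
    selected = [
        (x, y)
        for x, y in zip(predictor, outcome)
        if x + y + rng.gauss(0, 1) > 0
    ]
    r_all = pearson(predictor, outcome)
    r_selected = pearson([x for x, _ in selected], [y for _, y in selected])
    return r_all, r_selected, len(selected)


r_all, r_selected, n_selected = simulate_collider()
print(f"full population r = {r_all:.3f}")
print(f"selected subset (n = {n_selected}) r = {r_selected:.3f}")
```

In this toy setting the correlation is close to zero in the full population but clearly negative within the selected subset, which is how a spurious association can be induced when the analysis is restricted to selected participants.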

Conclusions

Genetic instruments designed to predict vitamin D status were shown to have face validity in the large sample of European and African ancestry patients treated in a specialist health setting. The polygenic score for 25OHD predicted clinical vitamin D deficiency, and also predicted the concentration of the active form of vitamin D, 1,25 dihydroxyvitamin D. In addition, two missense SNPs within the GC gene (rs4588 and rs7041) independently predicted both 25OHD and 1,25OHD concentrations, and thus could act as informative genetic instruments in MR models. Other phenotypes associated with our predictors include lipid-related diagnoses and diabetes. These findings lend weight to the hypothesis that low vitamin D may contribute to these clinical features.

Supplementary material

To view supplementary material for this article, please visit https://doi.org/10.1017/thg.2024.19.

Availability of data and materials

The data that support the findings of this study are available from Vanderbilt University Medical Center but restrictions apply to the availability of these data, which were used under license for the current study, and so are not publicly available. Data are however available from the authors upon reasonable request and with permission of Vanderbilt University Medical Center. The data in question must first be reviewed by the Integrated Data Access and Services Core to ensure that the de-identification is complete and no potentially identifying information remains. Please contact the Vanderbilt Institute for Clinical and Translational Research ([email protected]) for more information.

Details for Phecodes can be found here: Phecode Map v1.2: https://phewascatalog.org/phecodes.

Code for QualityLab and LabWAS software used to generate the results presented in this paper can be found here (https://bitbucket.org/straubp_vandy/quality_labs/) and here (https://bitbucket.org/juliasealock/labwas/).

Acknowledgments

The authors thank the Vanderbilt University Medical Center Biobank and Mass General Brigham Biobank for providing genomic and health information data.

Authors’ contributions

LKD and JJM conceived and designed the study. HK and FB were responsible for the analyses using the VUMC databases and prepared the figures and supplementary material. Additional analyses related to UKB data were conducted by JR and ZZ. LKD, JJM, HK and FB drafted the manuscript. All authors contributed to the revision of the manuscript. All authors read and approved the final manuscript.

Funding statement

LKD was supported by a grant from the NIMH (R56MH120736). HAK was supported by a Vanderbilt MSTP training grant (T32GM007347). JMcG was supported by the Danish National Research Foundation (Niels Bohr Professorship) and receives support from the Queensland Department of Health, via The Park Centre for Mental Health. FB was supported by a Vanderbilt training grant from the NHGRI (T32HG008341).

The development and maintenance of the SD was supported by the National Center for Research Resources, Grant UL1 RR024975-01, and is now at the National Center for Advancing Translational Sciences, Grant 2 UL1 TR000445-06. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

The dataset(s) used for the analyses described were obtained from Vanderbilt University Medical Center’s BioVU, which is supported by numerous sources: institutional funding, private agencies, and federal grants. These include the NIH funded Shared Instrumentation Grant S10RR025141; and CTSA grants UL1TR002243, UL1TR000445, and UL1RR024975. Genomic data are also supported by investigator-led projects that include U01HG004798, R01NS032830, RC2GM092618, P50GM115305, U01HG006378, U19HL065962, R01HD074711; and additional funding sources listed at https://victr.vumc.org/biovu-funding/.

Ethics approval and consent to participate

BioVU Consent form is provided to patients in the outpatient clinic environments at VUMC. The consent states policies on data sharing and privacy and, upon consent, makes any blood leftover from clinical care eligible for BioVU banking. The VUMC Institutional Review Board oversees BioVU and approved this project. All data included in this study was de-identified and unlinked to any identifying information. This study was reviewed by the Vanderbilt University Medical Center IRB (IRB#172020) and designated as non-human subjects research. The research was conducted in accordance with the principles of the Declaration of Helsinki.

Competing interests

BJV is a member of the scientific advisory board for Allelica. The other authors have no competing interests.

List of abbreviations

1,25OHD: 1,25 dihydroxyvitamin D

25OHD: 25 hydroxyvitamin D

DBP: vitamin D binding protein

EHR: Electronic health record

GWAS: genomewide association study

MR: Mendelian randomization

PGS: polygenic score

PheWAS: Phenome-wide association study

LabWAS: Laboratory-wide association study

RINT: rank-based inverse normal transformation

SNP: single nucleotide polymorphism

UKB: UK Biobank

Footnotes

*

Joint first authors.

Joint senior authors.

References

Abraham, G., & Inouye, M. (2014). Fast principal component analysis of large-scale genome-wide data. PloS One, 9, e93766. https://doi.org/10.1371/journal.pone.0093766 CrossRefGoogle ScholarPubMed
Abraham, G., Qiu, Y., & Inouye, M. (2017). FlashPCA2: principal component analysis of Biobank-scale genotype datasets. Bioinformatics, 33, 27762778. https://doi.org/10.1093/bioinformatics/btx299 CrossRefGoogle ScholarPubMed
Albiñana, C., Zhu, Z., Borbye-Lorenzen, N., Boelt, S. G., Cohen, A. S., Skogstrand, K., Wray, N. R., Revez, J. A., Privé, F., Petersen, L. V., Bulik, C. M., Plana-Ripoll, O., Musliner, K. L., Agerbo, E., Børglum, A. D., Hougaard, D. M., Nordentoft, M., Werge, T., Mortensen, P. B., … McGrath, J. J. (2023). Genetic correlates of vitamin D-binding protein and 25-hydroxyvitamin D in neonatal dried blood spots. Nature Communications, 14, 852. https://doi.org/10.1038/s41467-023-36392-5 CrossRefGoogle ScholarPubMed
Bouillon, R., Schuit, F., Antonio, L., & Rastinejad, F. (2019). Vitamin D binding protein: A historic overview. Frontiers in Endocrinology, 10, 910. https://doi.org/10.3389/fendo.2019.00910 CrossRefGoogle ScholarPubMed
Bowton, E., Field, J. R., Wang, S., Schildcrout, J. S., Van Driest, S. L., Delaney, J. T., Cowan, J., Weeke, P., Mosley, J. D., Wells, Q. S., Karnes, J. H., Shaffer, C., Peterson, J. F., Denny, J. C., Roden, D. M., & Pulley, J. M. (2014). Biobanks and electronic medical records: Enabling cost-effective research. Science Translational Medicine, 6, 234cm233. https://doi.org/10.1126/scitranslmed.3008604 CrossRefGoogle ScholarPubMed
Carroll, R. J., Bastarache, L., & Denny, J. C. (2014). R PheWAS: Data analysis and plotting tools for phenome-wide association studies in the R environment. Bioinformatics, 30, 23752376. https://doi.org/10.1093/bioinformatics/btu197 CrossRefGoogle ScholarPubMed
Chou, S. H., LeBoff, M. S., & Manson, J. E. (2020). Is the sun setting on vitamin D? Clinical Chemistry, 66, 635637. https://doi.org/10.1093/clinchem/hvaa074 CrossRefGoogle ScholarPubMed
Costenbader, K. H., MacFarlane, L. A., Lee, I. M., Buring, J. E., Mora, S., Bubes, V., Kotler, G., Camargo, C. A. Jr., Manson, J. E., & Cook, N. R. (2019). Effects of one year of vitamin D and marine omega-3 fatty acid supplementation on biomarkers of systemic inflammation in older US adults. Clinical Chemistry, 65, 15081521. https://doi.org/10.1373/clinchem.2019.306902 CrossRefGoogle ScholarPubMed
Dai, Q., Zhu, X., Manson, J. E., Song, Y., Li, X., Franke, A. A., Costello, R. B., Rosanoff, A., Nian, H., Fan, L., Murff, H., Ness, R. M., Seidner, D. L., Yu, C., & Shrubsole, M. J. (2018). Magnesium status and supplementation influence vitamin D status and metabolism: Results from a randomized trial. American Journal of Clinical Nutrition, 108, 12491258. https://doi.org/10.1093/ajcn/nqy274 CrossRefGoogle ScholarPubMed
de Boer, I. H., Zelnick, L. R., Ruzinski, J., Friedenberg, G., Duszlak, J., Bubes, V. Y., Hoofnagle, A. N., Thadhani, R., Glynn, R. J., Buring, J. E., Sesso, H. D., & Manson, J. E. (2019). Effect of vitamin D and omega-3 fatty acid supplementation on kidney function in patients with type 2 diabetes: A randomized clinical trial. JAMA, 322, 18991909. https://doi.org/10.1001/jama.2019.17380 CrossRefGoogle ScholarPubMed
Dennis, J., Sealock, J., Levinson, R. T., Farber-Eger, E., Franco, J., Fong, S., Straub, P., Hucks, D., Song, W. L., Linton, M. F., Fontanillas, P., Elson, S. L., Ruderfer, D., Abdellaoui, A., Sanchez-Roige, S., Palmer, A. A., Boomsma, D. I., Cox, N. J., Chen, G., … Davis, L. K. (2021). Genetic risk for major depressive disorder and loneliness in sex-specific associations with coronary artery disease. Molecular Psychiatry, 26, 42544264. https://doi.org/10.1038/s41380-019-0614-y CrossRefGoogle ScholarPubMed
Dennis, J. K., Sealock, J. M., Straub, P., Lee, Y. H., Hucks, D., Actkins, K., Faucon, A., Feng, Y. A., Ge, T., Goleva, S. B., Niarchou, M., Singh, K., Morley, T., Smoller, J. W., Ruderfer, D. M., Mosley, J. D., Chen, G., & Davis, L. K. (2021). Clinical laboratory test-wide association scan of polygenic scores identifies biomarkers of complex disease. Genome Medicine, 13, 6. https://doi.org/10.1186/s13073-020-00820-8 CrossRefGoogle ScholarPubMed
Denny, J. C., Bastarache, L., Ritchie, M. D., Carroll, R. J., Zink, R., Mosley, J. D., Field, J. R., Pulley, J. M., Ramirez, A. H., Bowton, E., Basford, M. A., Carrell, D. S., Peissig, P. L., Kho, A. N., Pacheco, J. A., Rasmussen, L. V., Crosslin, D. R., Crane, P. K., Pathak, J., … Roden, D. M. (2013). Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data. Nature Biotechnology, 31, 11021110. https://doi.org/10.1038/nbt.2749 CrossRefGoogle ScholarPubMed
Denny, J. C., Ritchie, M. D., Basford, M. A., Pulley, J. M., Bastarache, L., Brown-Gentry, K., Wang, D., Masys, D. R., Roden, D. M., & Crawford, D. C. (2010). PheWAS: demonstrating the feasibility of a phenome-wide scan to discover gene-disease associations. Bioinformatics, 26, 12051210. https://doi.org/10.1093/bioinformatics/btq126 CrossRefGoogle ScholarPubMed
Ferkingstad, E., Sulem, P., Atlason, B. A., Sveinbjornsson, G., Magnusson, M. I., Styrmisdottir, E. L., Gunnarsdottir, K., Helgason, A., Oddsson, A., Halldorsson, B. V., Jensson, B. O., Zink, F., Halldorsson, G. H., Masson, G., Arnadottir, G. A., Katrinardottir, H., Juliusson, K., Magnusson, M. K., Magnusson, O. T., … Stefansson, K. (2021). Large-scale integration of the plasma proteome with genetics and disease. Nature Genetics, 53, 17121721. https://doi.org/10.1038/s41588-021-00978-w CrossRefGoogle ScholarPubMed
Fry, A., Littlejohns, T. J., Sudlow, C., Doherty, N., Adamska, L., Sprosen, T., Collins, R., & Allen, N. E. (2017). Comparison of sociodemographic and health-related characteristics of UK Biobank Participants with those of the general population. American Journal of Epidemiology, 186, 10261034. https://doi.org/10.1093/aje/kwx246 CrossRefGoogle ScholarPubMed
Ge, T., Chen, C. Y., Ni, Y., Feng, Y. A., & Smoller, J. W. (2019). Polygenic prediction via Bayesian regression and continuous shrinkage priors. Nature Communications, 10, 1776. https://doi.org/10.1038/s41467-019-09718-5
Goldstein, J. A., Weinstock, J. S., Bastarache, L. A., Larach, D. B., Fritsche, L. G., Schmidt, E. M., Brummett, C. M., Kheterpal, S., Abecasis, G. R., Denny, J. C., & Zawistowski, M. (2020). LabWAS: Novel findings and study design recommendations from a meta-analysis of clinical labs in two independent biobanks. PLoS Genetics, 16, e1009077. https://doi.org/10.1371/journal.pgen.1009077
Heaney, R. P. (2003). Long-latency deficiency disease: insights from calcium and vitamin D. American Journal of Clinical Nutrition, 78, 912–919. https://doi.org/10.1093/ajcn/78.5.912
Hemani, G., Bowden, J., & Davey Smith, G. (2018). Evaluating the potential role of pleiotropy in Mendelian randomization studies. Human Molecular Genetics, 27, R195–R208. https://doi.org/10.1093/hmg/ddy163
Holick, M. F. (2007). Vitamin D deficiency. New England Journal of Medicine, 357, 266–281. https://doi.org/10.1056/NEJMra070553
Holick, M. F. (2009). Vitamin D status: Measurement, interpretation, and clinical application. Annals of Epidemiology, 19, 73–78. https://doi.org/10.1016/j.annepidem.2007.12.001
Holick, M. F., & Chen, T. C. (2008). Vitamin D deficiency: a worldwide problem with health consequences. American Journal of Clinical Nutrition, 87, 1080S–1086S. https://doi.org/10.1093/ajcn/87.4.1080S
Jiang, L., Zheng, Z., Qi, T., Kemper, K. E., Wray, N. R., Visscher, P. M., & Yang, J. (2019). A resource-efficient tool for mixed model association analysis of large-scale data. Nature Genetics, 51, 1749–1755. https://doi.org/10.1038/s41588-019-0530-8
Jiang, X., Ge, T., & Chen, C. Y. (2021). The causal role of circulating vitamin D concentrations in human complex traits and diseases: A large-scale Mendelian randomization study. Scientific Reports, 11, 184. https://doi.org/10.1038/s41598-020-80655-w
LeBoff, M. S., Chou, S. H., Murata, E. M., Donlon, C. M., Cook, N. R., Mora, S., Lee, I. M., Kotler, G., Bubes, V., Buring, J. E., & Manson, J. E. (2020). Effects of supplemental vitamin D on bone health outcomes in women and men in the VITamin D and OmegA-3 TriaL (VITAL). Journal of Bone and Mineral Research, 35, 883–893. https://doi.org/10.1002/jbmr.3958
Lloyd-Jones, L. R., Zeng, J., Sidorenko, J., Yengo, L., Moser, G., Kemper, K. E., Wang, H., Zheng, Z., Magi, R., Esko, T., Metspalu, A., Wray, N. R., Goddard, M. E., Yang, J., & Visscher, P. M. (2019). Improved polygenic prediction by Bayesian multiple regression on summary statistics. Nature Communications, 10, 5086. https://doi.org/10.1038/s41467-019-12653-0
Lucas, A., & Wolf, M. (2019). Vitamin D and health outcomes: Then came the randomized clinical trials. JAMA, 322, 1866–1868. https://doi.org/10.1001/jama.2019.17302
Manousaki, D., Dudding, T., Haworth, S., Hsu, Y. H., Liu, C. T., Medina-Gomez, C., Voortman, T., van der Velde, N., Melhus, H., Robinson-Cohen, C., Cousminer, D. L., Nethander, M., Vandenput, L., Noordam, R., Forgetta, V., Greenwood, C. M. T., Biggs, M. L., Psaty, B. M., Rotter, J. I., … Richards, J. B. (2017). Low-frequency synonymous coding variation in CYP2R1 has large effects on vitamin D levels and risk of multiple sclerosis. American Journal of Human Genetics, 101, 227–238. https://doi.org/10.1016/j.ajhg.2017.06.014
Manson, J. E., Bassuk, S. S., Buring, J. E., & VITAL Research Group. (2020). Principal results of the VITamin D and OmegA-3 TriaL (VITAL) and updated meta-analyses of relevant vitamin D trials. Journal of Steroid Biochemistry and Molecular Biology, 198, 105522. https://doi.org/10.1016/j.jsbmb.2019.105522
Manson, J. E., Bassuk, S. S., Cook, N. R., Lee, I. M., Mora, S., Albert, C. M., Buring, J. E., & VITAL Research Group. (2020). Vitamin D, marine n-3 fatty acids, and primary prevention of cardiovascular disease current evidence. Circulation Research, 126, 112–128. https://doi.org/10.1161/CIRCRESAHA.119.314541
Manson, J. E., Cook, N. R., Lee, I. M., Christen, W., Bassuk, S. S., Mora, S., Gibson, H., Gordon, D., Copeland, T., D’Agostino, D., Friedenberg, G., Ridge, C., Bubes, V., Giovannucci, E. L., Willett, W. C., Buring, J. E., & VITAL Research Group. (2019). Vitamin D supplements and prevention of cancer and cardiovascular disease. New England Journal of Medicine, 380, 33–44. https://doi.org/10.1056/NEJMoa1809944
Manson, J. E., Mora, S., & Cook, N. R. (2019). Marine n-3 fatty acids and vitamin D supplementation and primary prevention. Reply. New England Journal of Medicine, 380, 1879–1880. https://doi.org/10.1056/NEJMc1902636
McCarthy, S., Das, S., Kretzschmar, W., Delaneau, O., Wood, A. R., Teumer, A., Kang, H. M., Fuchsberger, C., Danecek, P., Sharp, K., Luo, Y., Sidore, C., Kwong, A., Timpson, N., Koskinen, S., Vrieze, S., Scott, L. J., Zhang, H., Mahajan, A., … Haplotype Reference Consortium. (2016). A reference panel of 64,976 haplotypes for genotype imputation. Nature Genetics, 48, 1279–1283. https://doi.org/10.1038/ng.3643
McCaw, Z. R., Lane, J. M., Saxena, R., Redline, S., & Lin, X. (2020). Operating characteristics of the rank-based inverse normal transformation for quantitative trait analysis in genome-wide association studies. Biometrics, 76, 1262–1272. https://doi.org/10.1111/biom.13214
Medrano, M., Carrillo-Cruz, E., Montero, I., & Perez-Simon, J. A. (2018). Vitamin D: Effect on haematopoiesis and immune system and clinical applications. International Journal of Molecular Sciences, 19, 2663. https://doi.org/10.3390/ijms19092663
Meng, H., Matthan, N. R., Angellotti, E., Pittas, A. G., & Lichtenstein, A. H. (2020). Exploring the effect of vitamin D3 supplementation on surrogate biomarkers of cholesterol absorption and endogenous synthesis in patients with type 2 diabetes-randomized controlled trial. American Journal of Clinical Nutrition, 112, 538–547. https://doi.org/10.1093/ajcn/nqaa149
Meng, X., Li, X., Timofeeva, M. N., He, Y., Spiliopoulou, A., Wei, W. Q., Gifford, A., Wu, H., Varley, T., Joshi, P., Denny, J. C., Farrington, S. M., Zgaga, L., Dunlop, M. G., McKeigue, P., Campbell, H., & Theodoratou, E. (2019). Phenome-wide Mendelian-randomization study of genetically determined vitamin D on multiple health outcomes using the UK Biobank study. International Journal of Epidemiology, 48, 1425–1434. https://doi.org/10.1093/ije/dyz182
Mokry, L. E., Ross, S., Ahmad, O. S., Forgetta, V., Smith, G. D., Leong, A., Greenwood, C. M., Thanassoulis, G., & Richards, J. B. (2015). Vitamin D and risk of multiple sclerosis: A Mendelian randomization study. PLoS Medicine, 12, e1001866. https://doi.org/10.1371/journal.pmed.1001866
Morris, R., Butler, L., Perkins, A., Kershaw, N. J., & Babon, J. J. (2021). The role of LNK (SH2B3) in the regulation of JAK-STAT signalling in haematopoiesis. Pharmaceuticals (Basel), 15, 24. https://doi.org/10.3390/ph15010024
Munafo, M. R., Tilling, K., Taylor, A. E., Evans, D. M., & Davey Smith, G. (2018). Collider scope: when selection bias can substantially influence observed associations. International Journal of Epidemiology, 47, 226–235. https://doi.org/10.1093/ije/dyx206
Neale, R. E., Baxter, C., Romero, B. D., McLeod, D. S. A., English, D. R., Armstrong, B. K., Ebeling, P. R., Hartel, G., Kimlin, M. G., O’Connell, R., van der Pols, J. C., Venn, A. J., Webb, P. M., Whiteman, D. C., & Waterhouse, M. (2022). The D-Health Trial: A randomised controlled trial of the effect of vitamin D on mortality. Lancet Diabetes & Endocrinology, 10, 120–128. https://doi.org/10.1016/S2213-8587(21)00345-4
Ohlund, I., Lind, T., Hernell, O., Silfverdal, S. A., Liv, P., & Karlsland Akeson, P. (2020). Vitamin D status and cardiometabolic risk markers in young Swedish children: A double-blind randomized clinical trial comparing different doses of vitamin D supplements. American Journal of Clinical Nutrition, 111, 779–786. https://doi.org/10.1093/ajcn/nqaa031
1000 Genomes Project Consortium, Auton, A., Brooks, L. D., Durbin, R. M., Garrison, E. P., Kang, H. M., Korbel, J. O., Marchini, J. L., McCarthy, S., McVean, G. A., & Abecasis, G. R. (2015). A global reference for human genetic variation. Nature, 526, 68–74. https://doi.org/10.1038/nature15393
Ong, J. S., Cuellar-Partida, G., Lu, Y., Ovarian Cancer Study, A., Fasching, P. A., Hein, A., Burghaus, S., Beckmann, M. W., Lambrechts, D., Van Nieuwenhuysen, E., Vergote, I., Vanderstichele, A., Anne Doherty, J., Anne Rossing, M., Chang-Claude, J., Eilber, U., Rudolph, A., Wang-Gohrke, S., Goodman, M. T., … MacGregor, S. (2016). Association of vitamin D levels and risk of ovarian cancer: A Mendelian randomization study. International Journal of Epidemiology, 45, 1619–1630. https://doi.org/10.1093/ije/dyw207
Pedersen, C. B., Bybjerg-Grauholm, J., Pedersen, M. G., Grove, J., Agerbo, E., Baekvad-Hansen, M., Poulsen, J. B., Hansen, C. S., McGrath, J. J., Als, T. D., Goldstein, J. I., Neale, B. M., Daly, M. J., Hougaard, D. M., Mors, O., Nordentoft, M., Borglum, A. D., Werge, T., & Mortensen, P. B. (2018). The iPSYCH2012 case-cohort sample: New directions for unravelling genetic and environmental architectures of severe mental disorders. Molecular Psychiatry, 23, 6–14. https://doi.org/10.1038/mp.2017.196
Pietzner, M., Wheeler, E., Carrasco-Zanini, J., Cortes, A., Koprulu, M., Worheide, M. A., Oerton, E., Cook, J., Stewart, I. D., Kerrison, N. D., Luan, J., Raffler, J., Arnold, M., Arlt, W., O’Rahilly, S., Kastenmuller, G., Gamazon, E. R., Hingorani, A. D., Scott, R. A., … Langenberg, C. (2021). Mapping the proteo-genomic convergence of human diseases. Science, 374, eabj1541. https://doi.org/10.1126/science.abj1541
Prabhu, A. V., Luu, W., Sharpe, L. J., & Brown, A. J. (2016). Cholesterol-mediated degradation of 7-dehydrocholesterol reductase switches the balance from cholesterol to vitamin D synthesis. Journal of Biological Chemistry, 291, 8363–8373. https://doi.org/10.1074/jbc.M115.699546
Price, A. L., Patterson, N. J., Plenge, R. M., Weinblatt, M. E., Shadick, N. A., & Reich, D. (2006). Principal components analysis corrects for stratification in genome-wide association studies. Nature Genetics, 38, 904–909. https://doi.org/10.1038/ng1847
Pulley, J. M., Brace, M. M., Bernard, G. R., & Masys, D. R. (2008). Attitudes and perceptions of patients towards methods of establishing a DNA biobank. Cell Tissue Bank, 9, 55–65. https://doi.org/10.1007/s10561-007-9051-2
Revez, J. A., Lin, T., Qiao, Z., Xue, A., Holtz, Y., Zhu, Z., Zeng, J., Wang, H., Sidorenko, J., Kemper, K. E., Vinkhuyzen, A. A. E., Frater, J., Eyles, D., Burne, T. H. J., Mitchell, B., Martin, N. G., Zhu, G., Visscher, P. M., Yang, J., … McGrath, J. J. (2020). Genome-wide association study identifies 143 loci associated with 25 hydroxyvitamin D concentration. Nature Communications, 11, 1647. https://doi.org/10.1038/s41467-020-15421-7
Rhead, B., Baarnhielm, M., Gianfrancesco, M., Mok, A., Shao, X., Quach, H., Shen, L., Schaefer, C., Link, J., Gyllenberg, A., Hedstrom, A. K., Olsson, T., Hillert, J., Kockum, I., Glymour, M. M., Alfredsson, L., & Barcellos, L. F. (2016). Mendelian randomization shows a causal effect of low vitamin D on multiple sclerosis risk. Neurology Genetics, 2, e97. https://doi.org/10.1212/NXG.0000000000000097
Ritchie, M. D., Denny, J. C., Crawford, D. C., Ramirez, A. H., Weiner, J. B., Pulley, J. M., Basford, M. A., Brown-Gentry, K., Balser, J. R., Masys, D. R., Haines, J. L., & Roden, D. M. (2010). Robust replication of genotype-phenotype associations across multiple diseases in an electronic medical record. American Journal of Human Genetics, 86, 560–572. https://doi.org/10.1016/j.ajhg.2010.03.003
Roden, D. M., Pulley, J. M., Basford, M. A., Bernard, G. R., Clayton, E. W., Balser, J. R., & Masys, D. R. (2008). Development of a large-scale de-identified DNA biobank to enable personalized medicine. Clinical Pharmacology and Therapeutics, 84, 362–369. https://doi.org/10.1038/clpt.2008.89
Sirugo, G., Williams, S. M., & Tishkoff, S. A. (2019). The missing diversity in human genetic studies. Cell, 177, 1080. https://doi.org/10.1016/j.cell.2019.04.032
Sun, B. B., Maranville, J. C., Peters, J. E., Stacey, D., Staley, J. R., Blackshaw, J., Burgess, S., Jiang, T., Paige, E., Surendran, P., Oliver-Williams, C., Kamat, M. A., Prins, B. P., Wilcox, S. K., Zimmerman, E. S., Chi, A., Bansal, N., Spain, S. L., Wood, A. M., … Butterworth, A. S. (2018). Genomic atlas of the human plasma proteome. Nature, 558, 73–79. https://doi.org/10.1038/s41586-018-0175-2
Wang, Y., Guo, J., Ni, G., Yang, J., Visscher, P. M., & Yengo, L. (2020). Theoretical and empirical quantification of the accuracy of polygenic scores in ancestry divergent populations. Nature Communications, 11, 3865. https://doi.org/10.1038/s41467-020-17719-y
Wei, W. Q., Bastarache, L. A., Carroll, R. J., Marlo, J. E., Osterman, T. J., Gamazon, E. R., Cox, N. J., Roden, D. M., & Denny, J. C. (2017). Evaluating phecodes, clinical classification software, and ICD-9-CM codes for phenome-wide association studies in the electronic health record. PLoS ONE, 12, e0175508. https://doi.org/10.1371/journal.pone.0175508
Zerwekh, J. E. (2008). Blood biomarkers of vitamin D status. American Journal of Clinical Nutrition, 87, 1087S–1091S. https://doi.org/10.1093/ajcn/87.4.1087S
Zhao, S., Jing, W., Samuels, D. C., Sheng, Q., Shyr, Y., & Guo, Y. (2018). Strategies for processing and quality control of Illumina genotyping arrays. Briefings in Bioinformatics, 19, 765–775. https://doi.org/10.1093/bib/bbx012
Zittermann, A., Ernst, J. B., Birschmann, I., & Dittrich, M. (2015). Effect of Vitamin D or Activated Vitamin D on Circulating 1,25-Dihydroxyvitamin D Concentrations: A systematic review and metaanalysis of randomized controlled trials. Clinical Chemistry, 61, 1484–1494. https://doi.org/10.1373/clinchem.2015.244913
Zou, L., & Porter, T. D. (2015). Rapid suppression of 7-dehydrocholesterol reductase activity in keratinocytes by vitamin D. Journal of Steroid Biochemistry and Molecular Biology, 148, 64–71. https://doi.org/10.1016/j.jsbmb.2014.12.001
Table 1. Counts and univariate statistics for key demographic variables of the European and African ancestry groups

Figure 1. The association between PGS25OHD and disease phenotypes in individuals with primarily European ancestry (n = 66,482). Note: Associations for 1322 phenotypes are shown. On the x-axis, the phenotypes are clustered according to broad phenotype categories, represented by different colors. P values are shown on the y-axis, with upturned triangles representing positive associations and downturned triangles representing negative associations. The top phenotypes with p values passing the Bonferroni multiple testing threshold (p < 3.78e-05) are labeled. Full details are provided in Supplementary Data 1.

Figure 2. The association between PGS25OHD and laboratory results in individuals with primarily European ancestry (n = 66,482). Note: Associations for 315 laboratory results are shown. On the x-axis, the laboratory tests are clustered according to broad organ or pathology categories, represented by different colors. P values are shown on the y-axis, with upturned triangles representing positive associations and downturned triangles representing negative associations. Laboratory tests with p values passing the Bonferroni multiple testing threshold (p < 1.59e-04), shown as a pink horizontal reference line, are labeled. Full details are provided in Supplementary Data 2. 25OHD_a2 and 25OHD_a3 are two different types of 25 hydroxyvitamin D assays. 1,25OHD_a1 and 1,25OHD_a4 are two different types of 1,25 dihydroxyvitamin D assays. Trigs, triglycerides; Chol, cholesterol; LDL.C, low density lipoprotein cholesterol; Gluc, glucose.

Figure 3. The associations between PGSDBP, PGSDBP_GC and disease phenotypes in individuals with primarily European ancestry (n = 66,482). Note: Panel A, PheWAS for PGSDBP. Panel B, PheWAS for PGSDBP_GC. Associations for 1322 phenotypes are shown. On the x-axis, the phenotypes are clustered according to broad phenotype categories, represented by different colors. P values are shown on the y-axis, with upturned triangles representing positive associations and downturned triangles representing negative associations. The five phenotypes with the smallest p values are labeled; however, none of the phenotypes passed the Bonferroni multiple testing threshold (p < 3.78e-05). In Panel B, Vitamin D deficiency is also labeled for reference. Full details are provided in Supplementary Data 3 and 5.

Figure 4. The associations between PGSDBP, PGSDBP_GC and laboratory measures in individuals with primarily European ancestry (n = 66,482). Note: Panel A, LabWAS for PGSDBP. Panel B, LabWAS for PGSDBP_GC. Associations for 315 laboratory results are shown. On the x-axis, the laboratory tests are clustered according to broad organ or pathology categories, represented by different colors. P values are shown on the y-axis, with upturned triangles representing positive associations and downturned triangles representing negative associations. Laboratory tests with p values passing the Bonferroni multiple testing threshold (p < 1.59e-04), shown as a pink horizontal reference line, are labeled. 25OHD_a2 and 25OHD_a3 are two different types of 25 hydroxyvitamin D assays. 1,25OHD_a1 and 1,25OHD_a4 are two different types of 1,25 dihydroxyvitamin D assays (see Methods). WBC, leukocytes (#/volume) in blood by automated count. LymAbs, lymphocytes (#/volume) in blood by automated count. MonAbs, absolute count of monocytes. NtAb, absolute count of neutrophils. EoAb, absolute count of eosinophils. Full details are provided in Supplementary Data 4 and 6.
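
The Bonferroni thresholds quoted in the figure notes (p < 3.78e-05 for 1322 phenotypes; p < 1.59e-04 for 315 laboratory results) are consistent with dividing a family-wise alpha by the number of tests. The short sketch below reproduces that arithmetic. The alpha of 0.05 is an assumption, since the captions do not state it, and the function name is illustrative only.

```python
def bonferroni_threshold(n_tests: int, alpha: float = 0.05) -> float:
    """Return the per-test significance threshold alpha / n_tests.

    alpha = 0.05 is an assumed family-wise error rate; it is not
    stated in the figure notes.
    """
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


# Test counts taken from the figure notes.
phewas_threshold = bonferroni_threshold(1322)  # disease phenotypes
labwas_threshold = bonferroni_threshold(315)   # laboratory results

print(f"PheWAS threshold: {phewas_threshold:.2e}")
print(f"LabWAS threshold: {labwas_threshold:.2e}")
```

Under the assumed alpha, both values round to the thresholds given in the captions.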

Supplementary material: File

Kresge et al. supplementary material
