Polygenic scores (PGSs) have excited biomedical researchers for years, but they have recently received increasing attention from social scientists interested in normal psychological variation (Harden, 2021). PGSs have been touted as a way to study “the causes and consequences” of even complex behavioral phenotypes like intelligence (Plomin & von Stumm, 2018, p. 148). However, their utility remains as controversial as that of the genome-wide association studies (GWASs) from which they derive (Charney, 2022). Burt's target article provides a potentially devastating critique of the value of these scores.
Burt correctly notes that all serious scientists accept that genetic differences “influence – in some complex, context-dependent way – developmental differences.” Genetic variation is associated with phenotypic variation in part because DNA is used in the causal chain of events that builds phenotypes. Even so, this acknowledgment does not mean PGSs can offer useful predictions about individual outcomes, let alone causal insights about how to helpfully affect developmental processes.
As Turkheimer (2012) has observed, the statistical tools employed by social scientists working with genomic data have “not succeeded in discriminating actual causal processes from spurious correlations and non-causal associations” (p. 51). As a result, genomic social science can be “causally refractory… no one is about to use polygenic scores to figure out why children excel or fail in school or become addicted to drugs” (Turkheimer, 2019, p. 46). Understanding causation in ways that permit beneficial intervention requires experimental studies that are entirely unlike the correlational research that generates PGSs.
In fact, even if a researcher's goal is to predict rather than to elucidate causation, the correlations that yield many PGSs cannot be trusted. For example, the enormity of the data set used by Lee et al. (2018) – which involved a sample of 1.1 million individuals and produced one of the most highly regarded social science PGSs to date – ensured that some arbitrary correlations would inevitably appear to be “significant” (see Richardson & Jones [2019] for this argument). Consequently, it is unsurprising that Morris, Davies, and Smith (2020) found educational-outcome PGSs to have predictive accuracy that is “poor…at the individual level [and] … inferior to [that associated with] parental socioeconomic factors. [These scores] failed to accurately predict later achievement…[and] currently have limited use for accurately predicting individual educational performance” (p. 1). Likewise, Harden and Koellinger (2020) wrote “even the best currently available PGS for behavioural outcomes cannot make accurate predictions for the outcome of any specific individual” (p. 570).
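The sample-size point above can be made concrete with a small calculation. The following is only an illustrative sketch, not a reanalysis of Lee et al. (2018) or of any real data: the correlation of 0.005 and the function name `correlation_p_value` are hypothetical choices, and the p-value uses a normal approximation to the t distribution, which is reasonable for large samples. It shows how one and the same negligible correlation moves from clearly non-significant to highly "significant" as the sample grows toward 1.1 million.

```python
import math


def correlation_p_value(r: float, n: int) -> float:
    """Two-sided p-value for a Pearson correlation r observed in n cases.

    Uses the standard t statistic for a correlation and a normal
    approximation to the t distribution (an assumption that is
    reasonable for the large n considered here).
    """
    t = r * math.sqrt((n - 2) / (1.0 - r * r))
    # erfc(|t| / sqrt(2)) equals the two-sided tail area of a standard normal
    return math.erfc(abs(t) / math.sqrt(2.0))


# Hypothetical, deliberately tiny correlation chosen for illustration only.
R = 0.005

for n in (1_000, 100_000, 1_100_000):
    p = correlation_p_value(R, n)
    print(f"n={n:>9,}  r={R}  variance explained={R * R:.4%}  p={p:.2e}")
```

The share of variance accounted for is identical in every row; only the sample size changes, which is why a small p-value in a very large sample says little about predictive usefulness for any individual.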
Clearly, PGSs cannot be appropriately used for predicting individual outcomes. But making predictions is the best that correlational studies like GWASs can offer; because correlation does not indicate causation, GWASs cannot deliver effective treatments for behavioral challenges, either. If PGSs cannot be used to accurately predict individual outcomes and if the GWAS that gives rise to them cannot inspire effective interventions, these approaches should be understood to be of negligible value.
If the intended purpose of PGSs is to reveal something about individuals' “genetic propensities” (Harden et al., 2020, pp. 1, 2, 5), this negative assessment of their value does not merely reflect an immature state of the art. Instead, it is unlikely that PGSs will ever be of much value. This is because DNA segments are used differently in different contexts (Lickliter, 2017; Moore, 2001; Noble, 2006, 2012; Pan, Shai, Lee, Frey, & Blencowe, 2008; Waddington, 1957, 1968). As all phenotypes can be influenced by variable non-genetic factors, there can be no absolute “genetic propensity” for any phenotype, as a “propensity” in one context could very well not be a “propensity” in another context. Ultimately, the notion of “genetic potential” is unfounded, because genetic factors specify a norm of reaction, not a restricted range of reaction (Gottlieb, 1995); because phenotypic outcomes are not constrained by genomes that operate in context-independent ways, it will always be impossible to identify context-independent “genetic propensities.” Remember, our developmental contexts are not fixed – after all, humans throughout history have continually invented new modes of education that have exposed children to never-before-experienced contexts – so a genotype that contributes to a below average phenotype in many contexts could nonetheless contribute to an above average phenotype in other not-yet-explored contexts (Lewontin, 2000). As Burt stated, “the context-specificity of PGSs… precludes their use as ‘genetic potential’ in general.” I agree: Future refinement of PGSs will still not allow them to accurately characterize individuals' “genetic propensities.”
Contexts are crucial in phenotypic development in part because they affect the epigenetic states of genomes, thereby altering how those genomes work (Moore, 2017). Burt's article draws appropriately critical conclusions regarding PGSs, but it omits mention of this important phenomenon. Although one's genetic sequence is thought to remain unchanged across the lifespan – which PGS proponents consider to be a strength of these scores – it is now clear that identical genomes can function differently depending on the experiential histories of the individual twins containing those genomes (Fraga et al., 2005; Morgan, Sutherland, Martin, & Whitelaw, 1999). As a result, sequence data alone cannot lead to accurate predictions about developmental outcomes; the mere presence, in a cell, of a DNA segment with a particular sequence will have no functional consequences if that segment has been dramatically down-regulated via epigenetic mechanisms such as DNA methylation or histone modification (Moore, 2013, 2015, 2016). And because experiential factors like social status (Tung et al., 2012), diet (Morgan et al., 1999), and maternal deprivation (Provencal et al., 2012), for example, have been experimentally shown to epigenetically change genomic activity and phenotypic outcomes in mammals, the idea that evaluating a genome at conception could provide accurate insights into much-later-developing phenotypes should be recognized as fundamentally flawed.
Twenty-first century instantiations of behavioral genetics – including GWASs and the PGSs they generate – remain targets of valid criticism (Charney, 2022; Richardson & Jones, 2019; Turkheimer, 2012). Given molecular biologists' understanding that DNA, epigenetic processes, and contextual factors work together in interdependent ways to produce phenotypes that are in no way pre-specified in the genome, these latest attempts to predict behavioral outcomes from DNA sequence information alone are bound to fail.
Financial support
This research received no specific grant from any funding agency, commercial, or not-for-profit sectors.
Competing interest
None.