Burt must be strongly commended for challenging attempts to use polygenic scores (PGSs) in the social sciences. She is correct to note and emphasize the problems with any such attempt, especially those posed by haplotype–environment interactions and the unknown developmental biology of behaviorally relevant complex traits, particularly in humans. However, the technical problems with the construction and interpretation of PGSs are much worse than she presents.
PGSs are supposed to be indicative of the causal contribution of genes to phenotypes in individuals (Harden, 2021; Plomin & von Stumm, 2018). The first problem with PGSs is that they are based on quantitative estimates of associations between alleles and phenotypes obtained from genome-wide association studies (GWASs). But these associations are notoriously population dependent: Even the same physical trait (for instance, skin, iris, or hair pigmentation in humans [Sarkar, 2021, pp. 140–142]) is associated with different sets of loci (and alleles at these loci) in different populations. Consequently, any attempt to construct a causal account from these associations must provide independent warrant for causal attributions (Woodward, 2005). None has been forthcoming even though GWASs have a multi-decade history (Sarkar, 1998, 2021).
A more important problem is that PGSs are constructed by compounding these associations between different alleles and a trait over multiple (often thousands of) loci. The simplest compounding strategy is to use a weighted sum but, as Burt notes, more complicated statistical compounding techniques are also routinely used. The trouble is that, although these compounding methods are designed to eliminate bias arising from non-representative sampling of genomes, none of them incorporates the biological mechanisms by which a trait is generated during organismic development from zygote to adult (in sexual organisms). That is, they ignore the mechanisms that would empirically indicate which alleles at which loci are causally most relevant. Moreover, all extant compounding methods rely on adding contributions from different alleles at each implicated locus.
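The weighted-sum strategy can be sketched in a few lines. This is a minimal illustration under stated assumptions, not any particular published pipeline: the function name `polygenic_score`, the three loci, the weights, and the allele counts are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
def polygenic_score(dosages, weights):
    """Return the weighted sum of allele counts over loci.

    dosages: count (0, 1, or 2) of the scored allele at each locus
    weights: per-allele association estimate for each locus
    Both are hypothetical inputs used purely for illustration.
    """
    if len(dosages) != len(weights):
        raise ValueError("one weight per locus is required")
    return sum(w * d for w, d in zip(weights, dosages))


# Made-up association weights for three loci.
weights = [0.5, -0.25, 1.0]

# Made-up allele counts for one individual at the same three loci.
dosages = [2, 1, 0]

print(polygenic_score(dosages, weights))  # 0.75
```

In this sketch each locus contributes its weight times its allele count regardless of what is present at any other locus or in the environment; the score is a sum over loci and nothing else.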
Thus, the calculation of PGSs assumes an underlying linear model of gene action, as did traditional heritability analysis. Because of that, PGSs inherit all the non-additivity problems with heritability estimates that were recognized in the 1970s (Sarkar, 1998). The context then was the attempt to establish a causal connection between race and intelligence by figures such as Jensen (1969). Critics challenged not only Jensen's conclusions but also the methodology of heritability analysis, on the grounds that it made illegitimate assumptions about the additivity of gene action and ignored interactions between alleles within loci (dominance), between loci (epistasis), and between genotype and environment. The names of these critics read like a “Who's Who” of theoretical population and quantitative genetics of the 1970s: Feldman and Lewontin (1975), Jacquard (1983), Kempthorne (1978), and Lewontin (1974).
Most importantly, Layzer (1974) analyzed in detail a causal model in which the phenotype (P) is described as a mathematical function of genotype (G) and environment (E): P = f(G, E), with no constraint on the functional form (f). This very general assumption is enough to show that the phenotypic variance cannot be modeled as a sum only of variances (e.g., the genotypic variance, the environmental variance, and a gene–environment interaction variance). Rather, the phenotypic variance must include a large number of covariances between variables. Thus, no additive model, however enhanced (as is supposedly the case for PGSs), can capture the variability of phenotypes, let alone the phenotypic values in individuals. (Additionally, in humans, the required covariances are impossible to estimate from accessible empirical data.)
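How covariance terms enter can be seen in a standard textbook special case, offered here only as an illustration and not as Layzer's own derivation. Assume, hypothetically, that G and E are scalar random variables and that f is either purely additive or smooth enough for a first-order expansion about the means; the symbols f_G and f_E, denoting the partial derivatives of f evaluated at the means, are introduced here for convenience.

```latex
% Special case f(G, E) = G + E:
\operatorname{Var}(P) = \operatorname{Var}(G) + \operatorname{Var}(E)
                      + 2\,\operatorname{Cov}(G, E)

% First-order approximation for a smooth f(G, E):
\operatorname{Var}(P) \approx f_G^{2}\,\operatorname{Var}(G)
                      + f_E^{2}\,\operatorname{Var}(E)
                      + 2\,f_G f_E\,\operatorname{Cov}(G, E)
```

Even in these simplest cases a covariance term appears whenever genotype and environment are correlated, and going beyond first order for a general f brings in further mixed terms.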
Together, these results showed that heritability estimates: (i) do not allow causal inferences because of the additivity of variance problem; (ii) depend on the genotypic composition of the population (which changes every generation); (iii) depend on the limited range of environments to which a population has been exposed; and (iv) depend on interaction mechanisms such as dominance and epistasis, in addition to interactions between genotype and environment. This work was synthesized in Sarkar (1998).
After the Human Genome Project (HGP), the emergence of GWASs led to the revival of these criticisms by many prominent figures, including Lander (Zuk, Hechter, Sunyaev, & Lander, 2012) and Feldman (Feldman & Ramachandran, 2018), in discussions of the so-called missing heritability problem. It was correctly pointed out that traditional heritability scores were overestimates because of invalid additivity assumptions. Moreover, results (ii) and (iii) from the previous paragraph explain the population and context dependence of GWAS association values.
PGSs are touted as having sidestepped these problems (Plomin & von Stumm, 2018) but such claims are not credible. As noted earlier, PGS computation assumes an additive model of gene action that has been discredited by theoretical critiques of heritability analyses from the 1970s. Of course, this situation still admits the possibility that PGS values make accurate empirical predictions of phenotype, but no evidence has been produced for any such claim: Proponents of PGS use have been remarkably unwilling to make definite quantitative predictions of phenotypic values. Against this background it takes a very vivid imagination to believe that phenotypes can be determined according to an additive model of gene action that allows for no relevant interactions between alleles, loci, genotype, and environment (Sarkar, 2021). For the time being, PGSs seem to be more akin to astrology than science: full of calculations based on no more than pious beliefs such as a commitment to genetic determinism and reductionism. The social sciences would do well to ignore PGSs entirely.
Financial support
This work received no external funding.
Competing Interest
None.