Often wrong, sometimes useful: Including polygenic scores in social science research

Published online by Cambridge University Press: 11 September 2023

Jason Fletcher*
Affiliation:
Center for Demography of Health and Aging, La Follette School of Public Affairs, University of Wisconsin-Madison, Madison, WI 53706, USA

Abstract

This commentary seeks to briefly outline a clear-eyed middle ground between Burt's claims that the inclusion of polygenic scores (PGSs) is essentially useless for social science and proponents' vast overstatements and over-interpretations of these scores. Current practice of including PGSs in social science is often wrong but sometimes useful.

Type
Open Peer Commentary
Copyright
Copyright © The Author(s), 2023. Published by Cambridge University Press

Burt's goal is to challenge arguments about the value of including genetic measurements in social science research. The author focuses on a subset of “genetics” – the use of polygenic scores (PGSs) – and lists four key “limitations of PGS that undermine their utility for social science.” I'll summarize these as: (1) they are not “purely” genetic and are thus confounded; (2) the causal mechanism is unclear, and in cases where the mechanism is environmental, this is labeled “downward causation,” which is said to produce “artificial” genetic signals; (3) they are incomplete measures of genetic variation; and (4) their interpretation is context-dependent. On their face, these four limitations would seemingly apply to essentially all variables used in social science research – and this is a key double-edged sword of exploring the use of “genetics” in social science: to treat them on the one hand as special and on the other as “regular” variables. Proponents want them to be treated as regular when evaluating their general use and special when interpreting their effects; opponents want the opposite. As with other commenters, Burt's arguments are, in my view, too unfocused and often imprecise: they focus on the dissonance in opponents' treatment of these variables without acknowledging the dissonance in her own arguments. The arguments lack the specificity that is needed and conflate issues of interpretation of the PGS in an empirical application with the net scientific value of including the PGS at all. Instead, I believe the two key features of using PGSs are their utility in the specific application and the need to under-, rather than over-, interpret the PGS as “genetic” effects at all (see Footnote 1). Like models, PGS inclusion can be wrong but useful. Unfortunately, many proponents want to leverage the “wrong” to strengthen arguments about the importance of genetics more broadly and put less emphasis on the “useful.”

Burt is absolutely correct that the ambiguous nature of a PGS's interpretation has led far too many investigators to over-interpret and narrowly label a PGS as “genetic,” often to elevate the perceived importance of “genetics” in contributing to social science outcomes (see Footnote 2). For example, many investigators aspire to specifically distinguish a “genetic” effect from an effect stemming from a broader “family background” source. At present, I believe this effort is a fool's errand, and research that attempts such a separation should be understood as over-stepping and over-interpreting and largely dismissed as such (Fletcher, Wu, Li, & Lu, 2021).

However, let's return to some purported uses of PGSs that may shed light rather than only muddying the waters. In many investigations of whether an environmental exposure affects an outcome, researchers are worried that some “third factor” might cause both the environment and the outcome. In many such analyses, a standard and reasonable question is whether genetics and/or family background is the “third factor.” A very standard approach to partially address this specific concern is to compare siblings (i.e., hold constant shared family background and shared genetics). This approach is useful but imperfect (e.g., Boardman & Fletcher, 2015). In some circumstances it can provide useful, directional evidence of the importance of this particular third factor source. As an alternative – in a situation without sibling data, for example – researchers could instead control for PGS, perhaps in conjunction with a more formal sensitivity analysis of the original results (Oster, 2019). If the researchers do not attempt to interpret the effect of this third factor on the outcome (i.e., “genetic effects”), which follows standard practice in interpretations of an included potential third factor, then I believe the inclusion of PGS can be quite useful in standard social science analysis.

A second use is, essentially, imputation of variables the researcher does not have in the data. Again, the focus is not on interpreting the “effects” of the PGSs, but on using them as signals for where to collect social science data in the future. Burt describes cases where a researcher has data on both a trait of interest and a relevant PGS and chooses to use the PGS (e.g., for school grades) in a downstream analysis of, say, predicting high school graduation. But what about the case where school grades are not measured? Or when school grades are only measured post-treatment (e.g., for an early-life intervention)? In these cases a PGS – and even better, many different PGSs – could be used for hypothesis generation for future analysis. For example, an early (or in utero) intervention that is shown to interact differentially with PGS for cognition, PGS for ADHD, PGS for risk tolerance, and so on in predicting high school graduation could be both wrong (if we try to interpret the “genetic effects” directly) and useful (if we do not).
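To make the “third factor” use concrete, the following is a minimal simulation sketch, not an analysis from this commentary. It assumes a hypothetical latent family-background/genetic factor that drives both an exposure and an outcome, and a PGS that is only a noisy proxy for that factor. All variable names, effect sizes (a true exposure effect of 0.5), and the sample size are illustrative assumptions. The sketch compares the exposure coefficient with and without the PGS as a control; it does not implement the formal sensitivity analysis of Oster (2019).

```python
import random


def ols(y, columns):
    """Least-squares coefficients of y on an intercept plus the given columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Build the normal equations (X'X) b = X'y.
    A = [[sum(X[r][i] * X[r][j] for r in range(n)) for j in range(k)]
         for i in range(k)]
    b = [sum(X[r][i] * y[r] for r in range(n)) for i in range(k)]
    # Solve by Gaussian elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        b[c], b[p] = b[p], b[c]
        for r in range(c + 1, k):
            f = A[r][c] / A[c][c]
            for j in range(c, k):
                A[r][j] -= f * A[c][j]
            b[r] -= f * b[c]
    coef = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(A[i][j] * coef[j] for j in range(i + 1, k))
        coef[i] = (b[i] - tail) / A[i][i]
    return coef  # coef[0] is the intercept


def simulate(n=5000, seed=0):
    """Hypothetical data: a latent third factor confounds exposure and outcome."""
    rng = random.Random(seed)
    pgs, exposure, outcome = [], [], []
    for _ in range(n):
        g = rng.gauss(0, 1)                  # unobserved third factor
        x = 0.8 * g + rng.gauss(0, 1)        # exposure depends on it
        pgs.append(g + rng.gauss(0, 1))      # PGS: noisy, incomplete proxy
        exposure.append(x)
        outcome.append(0.5 * x + 1.0 * g + rng.gauss(0, 1))  # true effect 0.5
    return pgs, exposure, outcome


pgs, exposure, outcome = simulate()
unadjusted = ols(outcome, [exposure])[1]     # exposure coefficient, no control
adjusted = ols(outcome, [exposure, pgs])[1]  # exposure coefficient, PGS control
print(f"unadjusted: {unadjusted:.3f}  PGS-adjusted: {adjusted:.3f}  true: 0.500")
```

Under these assumptions, adding the PGS moves the exposure estimate toward the true value without reaching it, because the PGS captures the third factor only partially. The quantity of interest is the shift in the exposure coefficient; the coefficient on the PGS itself is deliberately left uninterpreted.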

Overall, Burt's paper summarizes a useful set of issues around the inclusion of PGS measures in social science research. I believe the issues raised are mostly correct when focusing on the broad misinterpretation of PGSs as representing “genetic effects” in the emerging literature. However, I believe these misinterpretations can be challenged and corrected directly without the need to abandon the inclusion of PGSs in a limited and focused role in social science research.

Financial support

This research received no specific grant from any funding agency, commercial, or not-for-profit sectors.

Competing interest

None.

Footnotes

1. As Boardman and Fletcher (2021) state “…even if each of the genetic variant effects that are added together were causal effects, the resulting summary measure [(i.e. PGS)] would not have a clear interpretation. Many researchers have used vague terms, such as genetic endowment, genetic risk, or genetic predisposition, in labeling these constructs… the fact that many of the genetic variant effects are not causal further challenges the interpretation – so much so that it is not clear that they can be called ‘genetic’ effects at all…”

2. As described in more detail in Fletcher (2022a, 2022b), Harden (2021) is a particularly poignant case of over-stating current knowledge and methods. Likewise, the new chapter by Madole and Harden (2023) overstates and oversells.

References

Boardman, J. D., & Fletcher, J. M. (2015). To cause or not to cause? That is the question, but identical twins might not have all of the answers. Social Science and Medicine, 127, 198–200.
Boardman, J. D., & Fletcher, J. M. (2021). Evaluating the continued integration of genetics into medical sociology. Journal of Health and Social Behavior, 62(3), 404–418.
Fletcher, J. (2022a). Backdoor to a dead end: A review essay: The genetic lottery: Why DNA matters for social equality by Kathryn Paige Harden. Population and Development Review, 48(1), 253–258.
Fletcher, J. (2022b). Lost in translation: Importing causal designs into behavioral genetics may be a dead end. https://osf.io/preprints/socarxiv/zyvkr/
Fletcher, J., Wu, Y., Li, T., & Lu, Q. (2021). Interpreting polygenic score effects in sibling analysis. bioRxiv.
Harden, K. P. (2021). The genetic lottery: Why DNA matters for social equality. Princeton University Press.
Madole, J., & Harden, K. P. (2023). Building causal knowledge in behavior genetics. Behavioral and Brain Sciences. https://doi.org/10.1017/S0140525X22000681
Oster, E. (2019). Unobservable selection and coefficient stability: Theory and evidence. Journal of Business & Economic Statistics, 37(2), 187–204.