
Does Stereotype Threat Contribute to the Political Knowledge Gender Gap? A Preregistered Replication Study of Ihme and Tausendpfund (2018)

Published online by Cambridge University Press:  16 March 2023

Flavio Azevedo*
Affiliation:
Department of Psychology, University of Cambridge, Cambridge, UK
Leticia Micheli
Affiliation:
Institute of Psychology, Leiden University, Leiden, The Netherlands
Deliah Sarah Bolesta
Affiliation:
Center for Criminological Research Saxony (ZKFS), Technical University Chemnitz, Chemnitz, Germany
*Corresponding author. Email: [email protected]

Abstract

The gender gap in political knowledge is a well-established finding in Political Science. One explanation for gender differences in political knowledge is the activation of negative stereotypes about women. As part of the Systematizing Confidence in Open Research and Evidence (SCORE) program, we conducted a two-stage preregistered and high-powered direct replication of Study 2 of Ihme and Tausendpfund (2018). While we successfully replicated the gender gap in political knowledge – such that male participants performed better than female participants – both the first (N = 671) and second stage (N = 831) of the replication of the stereotype activation effect were unsuccessful. Taken together (pooled N = 1,502), results indicate evidence of absence of the effect of stereotype activation on gender differences in political knowledge. We discuss potential explanations for these findings and put forward evidence that the gender gap in political knowledge might be an artifact of how knowledge is measured.

Type
Replication Study
Creative Commons
CC BY-SA 4.0
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-ShareAlike licence (http://creativecommons.org/licenses/by-sa/4.0/), which permits re-use, distribution, and reproduction in any medium, provided the same Creative Commons licence is used to distribute the re-used or adapted article and the original article is properly cited.
Copyright
© The Author(s), 2023. Published by Cambridge University Press on behalf of American Political Science Association

The gender gap in political knowledge is considered “one of the most robust findings in the field of political behavior” (Dow 2009, 117) and is thought to be linked to women’s lower political participation and representation (Ondercin and Jones-White 2011). The underlying reasons for this knowledge gap, however, remain contentious. Until recently, most studies focused on cultural and macro-level factors (Burns, Schlozman, and Verba 2001; Carpini and Keeter 2005). In contrast, Ihme and Tausendpfund (2018) offered a psychological explanation. Specifically, they explored whether the activation of negative stereotypes about women’s lower political knowledge can harm women’s performance.

According to the stereotype threat literature, exposure to negative stereotypes about one’s in-group increases anxiety, negative thinking, and psychological discomfort, all of which overload working memory and ultimately hamper cognitive performance (McGlone and Pfiester 2007; Pennington, Heim, Levy, and Larkin 2016). These psychological processes, in turn, reinforce the existing stereotypes (Schmader, Johns, and Forbes 2008). Conversely, non-stigmatized individuals exhibit enhanced task performance when exposed to negative stereotypes about their outgroup (i.e., a “stereotype lift”; Walton and Cohen 2003). Consistent with both stereotype threat and stereotype lift, Ihme and Tausendpfund (2018) found that female participants performed worse than male participants on a political knowledge test when gender stereotypes were activated (N = 377), and they observed no knowledge gap in the absence of activated gender stereotypes. Specifically, women performed worse (and men performed better) when gender stereotypes were activated compared to a control condition. These findings persisted even when controlling for political interest, ruling out the possibility that the results merely reflect women’s lower interest in the topic. Further, the effect of stereotype threat on the gender gap in political knowledge was more pronounced for female students of Politics, presumably because the test represented higher stakes for them as supposed experts on the topic. The authors concluded that “the often-found gender gap in political knowledge might – to some extent – be the result of stereotyping” (Ihme and Tausendpfund 2018, 12). These findings represent an important practical contribution, as they suggest that the political knowledge gender gap is not necessarily stable and could therefore be mitigated by a range of interventions (for a review, see Lewis and Sekaquaptewa 2016).

The effects of stereotype threat on gender differences in performance have not been consistent in the literature. Pruysers and Blais (2014) found no effect of stereotype threat on the political knowledge gap. McGlone, Aronson, and Kobrynowicz (2006) found that implicit and explicit cues of gender stereotype threat impaired women’s performance on a political knowledge test but did not improve men’s performance. Adding to the contention, careful examinations of stereotype threat effects in other domains, such as women’s and girls’ mathematics performance, reveal at most weak evidence in favor of the effect (Flore and Wicherts 2015; Flore, Mulder, and Wicherts 2018; Pennington, Litchfield, McLatchie, and Heim 2018). These inconsistent patterns call into question whether the effect of stereotype threat on the political knowledge gap is replicable and, if so, to what extent. To date, no direct replication of this effect has been conducted.

As part of a large-scale replication initiative led by the Center for Open Science’s SCORE program (Systematizing Confidence in Open Research and Evidence; https://www.cos.io/score), which aims to investigate the credibility of scientific claims in the social and behavioral sciences (Alipourfard et al. 2021), we conducted a preregistered (peer-reviewed), well-powered, two-stage direct replication of Ihme and Tausendpfund (2018).

Methods

As determined by SCORE, the focal claim we attempted to replicate was that “the activation of gender stereotypes affects performance on a political knowledge test” (Ihme and Tausendpfund 2018, 1). As in the original study, we employed a 2 (gender: male vs. female) × 2 (field of study/work: non-politics vs. politics) × 3 (stereotype activation: stereotype activated by gender question vs. stereotype activated by gender difference statement vs. stereotype not activated) between-subjects design. The original study included the variable field of study in all reported analyses; thus, even though this variable was not necessary for replicating the effect of gender stereotype activation on political knowledge, we included it so that our study design and analyses were as similar and comparable to the original study as possible. According to SCORE guidelines, the replication would be deemed successful if the statistical results showed a significant interaction (α = 0.05) between stereotype activation and gender. All study materials, including the ethical approval, power calculation, and preregistration, are publicly available on OSF (https://osf.io/8feku/?view_only=99a41a96c8cd43c4ab349e44d79919cd).

Sample

The required sample size for replicating the focal claim was determined with power analyses carried out using the “pwr” package (Champely 2020) in R (R Core Team 2020). Power calculations were performed in accordance with the guidelines of the Social Sciences Replication Project (http://www.socialsciencesreplicationproject.com/). As per SCORE guidelines, data collection proceeded in two stages, with a second round of data collected only if the first round resulted in an unsuccessful replication. Two power calculations were therefore performed to derive the sample sizes required for each stage. For the first round of data collection, the target was 90% power assuming that the true effect size of the interaction between gender and stereotype activation was 75% of that reported in the original study; this power analysis yielded a required sample of 667 participants. The pooled sample (first and second stages combined) was required to achieve 90% power assuming that the true effect size of the interaction was 50% of that reported in the original study; this second power analysis indicated that an additional 830 responses would be needed. Participants were recruited through a professional survey firm (https://www.cint.com), with attention checks included as recommended (Aronow, Kalla, Orr, and Ternovski 2020). Only American citizens older than 18 years who were studying or working at the time of the survey were invited to take part.
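
To illustrate the logic of this two-stage calculation, the sketch below shows how such a power analysis can be set up with the pwr package. The partial η² value, its conversion to Cohen’s f², and the count of model parameters are illustrative assumptions for the example, not the exact inputs of our preregistered calculation.

```r
# Illustrative sketch of the two-stage power calculation with pwr.
# Assumptions: the original interaction effect is expressed as partial eta-squared
# (eta_orig is a placeholder, not the value reported by Ihme and Tausendpfund),
# converted to Cohen's f2, and attenuated to 75% (Stage 1) or 50% (pooled) of f2.
library(pwr)

eta_orig <- 0.02                        # placeholder partial eta-squared of the original interaction
f2_orig  <- eta_orig / (1 - eta_orig)   # convert partial eta-squared to Cohen's f2

u <- 2  # numerator df of the gender x stereotype-activation interaction: (2 - 1) * (3 - 1)

stage1 <- pwr.f2.test(u = u, f2 = 0.75 * f2_orig, sig.level = 0.05, power = 0.90)
pooled <- pwr.f2.test(u = u, f2 = 0.50 * f2_orig, sig.level = 0.05, power = 0.90)

# Required N = denominator df (v) + number of estimated model parameters
# (12 design cells + 1 covariate = 13, matching the denominator df reported in the Results)
ceiling(stage1$v) + 13
ceiling(pooled$v) + 13
```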

Procedure

To ensure a fair and reliable replication attempt, the study design and analysis plan were peer-reviewed by independent researchers selected by SCORE and preregistered on OSF (https://osf.io/nxrg7). The study was approved by an independent IRB, BRANY (https://www.brany.com), and by the U.S. Army’s Human Research Protection Office (HRPO; #20-032-764; Award Number HR00112020015; HRPO Log Number A-21036.50).

According to existing definition efforts (Parsons et al. 2021), our study can be considered a direct replication of Study 2 of Ihme and Tausendpfund (2018), as it uses the same methodology and experimental design employed by the authors of the original study, with a few modifications. First, our sample was composed not only of students, as in the original study, but also of working adults. This modification was necessary to achieve the required sample size, which was considerably larger than in the original study, and allowed us to check whether the original findings (obtained with German students) generalize to the adult population of the United States. As a consequence, the political knowledge scale used in our study had to be adapted from the German political scenario to the contemporary political context of the United States (see Table S1). Second, because our sample comprised both students and working adults, the measurement of participants’ field of study had to be expanded to encompass fields of study or work. Data were collected online and hosted on Qualtrics. Both stages of data collection had exactly the same procedures and measures. Before participants answered the political knowledge test, we measured political interest and manipulated stereotype activation in the same way as Ihme and Tausendpfund (2018). We provide additional sample, procedural, and question-wording details in the Supplementary Materials.

Data analysis

Following the analyses reported in the original study and the analysis script made available by the original authors, we tested the replication claim that activation of gender stereotypes influences performance on a political knowledge test with a 2 (gender) × 2 (field of work/study) × 3 (stereotype activation) ANCOVA. The dependent variable was participants’ total score on the political knowledge test. As in the original study, a single political interest score was calculated per participant (i.e., the average of responses on the short scale of political interest) and included as a covariate. In addition, we used Bayesian analyses to adjudicate whether the results indicate absence of evidence or evidence of absence. All analyses were conducted in R. To increase comparability between the direct replication and the original results, we set the sums of squares in R to Type III, the default in the SPSS software used by the original authors to perform their analyses.
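
For concreteness, the sketch below shows one way to specify such a Type III ANCOVA in R with the car package. The data frame and its column names (gender, field, condition, interest, knowledge) are hypothetical stand-ins, and the simulated data serve only to make the snippet runnable; this is not our analysis script itself.

```r
# Sketch of the focal 2 x 2 x 3 ANCOVA with Type III sums of squares (car package).
# The data below are simulated stand-ins; column names are hypothetical.
library(car)

set.seed(1)
n <- 300
dat <- data.frame(
  gender    = factor(sample(c("female", "male"), n, replace = TRUE)),
  field     = factor(sample(c("non-politics", "politics"), n, replace = TRUE)),
  condition = factor(sample(c("not activated", "gender question", "difference statement"),
                            n, replace = TRUE)),
  interest  = runif(n, 1, 5),
  knowledge = rpois(n, lambda = 8)   # total test score
)

# Sum-to-zero contrasts so that Type III tests match the SPSS default behaviour
options(contrasts = c("contr.sum", "contr.poly"))

fit <- aov(knowledge ~ interest + gender * field * condition, data = dat)
Anova(fit, type = 3)   # Type III ANCOVA table; the focal term is gender:condition
```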

Results

Stage 1

Results of the ANCOVA yielded a non-significant interaction between stereotype activation and gender, F(2, 658) = 0.691, p = 0.501, partial η² = 0.002, 95% CI = [.00, .01], N = 671. Thus, according to the SCORE criteria, the replication was considered unsuccessful at the first stage (see Tables S2–S4 for detailed results). As preregistered, to provide further evidence regarding the (non)replicability of the effect of gender stereotype threat on gender differences in political knowledge, we then proceeded to a second stage of data collection.

Stage 2

The pooled analytical sample (first and second stages combined) comprised 1,502 participants (Mage = 45.87 years, SDage = 17.35; 48.74% female). The distribution of participants across conditions resembled that of the original study (see Table S5). Consistent with the original study and a large body of research, the ANCOVA results revealed a main effect of gender on political knowledge, such that men generally scored higher than women on the political knowledge test, F(1, 1489) = 28.61, p < 0.001, partial η² = 0.02, 95% CI = [.01, .04] (females: M = 7.36, SD = 3.62; males: M = 9.81, SD = 3.87). Also in line with the original study, we found no main effect of stereotype activation on political knowledge, F(2, 1489) = 0.27, p = 0.77, partial η² = 0.00, 95% CI = [.00, .00], and a significant effect of political interest, such that the more interested participants were in politics, the higher their score on the political knowledge test, F(1, 1489) = 194.78, p < 0.001, partial η² = 0.12, 95% CI = [.09, .15]. Our focal test, however, diverged from the results reported in the original study: the interaction between gender and stereotype activation was not significant, F(2, 1489) = 1.22, p = 0.30, partial η² = 0.00, 95% CI = [.00, .01]. Thus, according to the criteria outlined by SCORE, the replication of the effect of stereotype threat on the gender gap in political knowledge was unsuccessful even after the second stage of data collection.

We further explored the results by conducting Bonferroni-corrected pairwise comparisons with the emmeans function in R (Lenth 2022). As illustrated in Figure 1, males’ scores were significantly higher than females’ in the stereotype not activated condition, t(1489) = −7.42, p < 0.001, in the stereotype activated by gender question condition, t(1489) = −4.36, p < 0.001, and in the stereotype activated by gender difference statement condition, t(1489) = −6.02, p < 0.001. In addition, we found no evidence of either stereotype threat or stereotype lift: women’s performance did not decrease, nor did men’s performance increase, in the stereotype-activated conditions relative to the stereotype not activated condition (see Supplementary Materials section 3.3 for detailed analyses). The interaction between field of study/work and stereotype activation as well as the three-way interaction between field of study/work, stereotype activation, and gender were not significant (p = 0.32 and p = 0.81, respectively). Additional analyses and a comparison between the replication results and the results of the original study can be found in the Supplementary Materials (Tables S6–S7).

Figure 1 Unconditional means comparison of Political Knowledge Test scores for each gender and experimental condition.
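
These gender contrasts within each condition can be obtained roughly as sketched below, reusing the hypothetical ANCOVA fit from the earlier snippet; the object and variable names remain illustrative.

```r
# Bonferroni-corrected pairwise contrasts with emmeans
# (reuses the hypothetical `fit`, `gender`, and `condition` from the ANCOVA sketch above).
library(emmeans)

# Male vs. female contrast within each stereotype-activation condition
emm_gender <- emmeans(fit, ~ gender | condition)
pairs(emm_gender, adjust = "bonferroni")

# Stereotype threat/lift would instead appear as condition contrasts within each gender
emm_condition <- emmeans(fit, ~ condition | gender)
pairs(emm_condition, adjust = "bonferroni")
```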

Exploratory analyses

In order to evaluate our replication attempt, we computed evidence-updated replication Bayes factors for both stages of data collection (Ly, Etz, Marsman, and Wagenmakers 2019; Verhagen and Wagenmakers 2014). Using the “posterior distribution obtained from the original study as a prior distribution for the test of the data from the replication study” (Ly et al. 2019, 2504), we computed an overall Bayes factor of BF10(d_orig, d_rep) = 0.009 for the interaction term of gender and stereotype activation on political knowledge at Stage 1. Dividing this overall Bayes factor by the Bayes factor from the original data, BF10(d_orig) = 0.142, yielded a replication Bayes factor of BF10(d_rep | d_orig) = 0.064. For Stage 2, the overall Bayes factor for the interaction effect of gender and stereotype activation on political knowledge was BF10(d_orig, d_rep) = 0.001. Again, dividing this by the original study’s Bayes factor resulted in a replication Bayes factor of BF10(d_rep | d_orig) = 0.007. This means that the replication data are predicted 1/0.064 = 15.8 (Stage 1) or 1/0.007 = 143 (Stage 2) times better by the null hypothesis than by the alternative hypothesis supported by the original dataset. Hence, the replication cannot be deemed successful (Zwaan, Etz, Lucas, and Donnellan 2018).
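
The evidence-updating relation behind these numbers (Ly et al. 2019) can be written explicitly; using the Stage 1 values reported above:

```latex
\mathrm{BF}_{10}(d_{\mathrm{rep}} \mid d_{\mathrm{orig}})
  = \frac{\mathrm{BF}_{10}(d_{\mathrm{orig}}, d_{\mathrm{rep}})}{\mathrm{BF}_{10}(d_{\mathrm{orig}})}
  = \frac{0.009}{0.142} \approx 0.064,
\qquad
\frac{1}{0.064} \approx 15.8 \text{ in favor of the null hypothesis.}
```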

In addition, we evaluated, as per the original authors’ advice, whether the political knowledge scale is a “sufficiently difficult test.” Using Ihme and Tausendpfund’s (2018) original data, we compared the difficulty of the two scales. Comparing the distributions of political knowledge test scores in the original and the replication data revealed no significant differences at Stage 1 (z = −1.53, p = 0.06) or Stage 2 (z = −1.22, p = 0.11; see Figure 2). To examine this in more depth, we fitted an Item Response Theory two-parameter logistic (2PL) model. As indicated in Figure 3, both scales display equivalent levels of reliability across the latent construct θ (panel a), show equivalent test difficulty and total score across θ levels (panel b), and, albeit with some inter-item differences, have overall corresponding item difficulties (panel c). These findings suggest comparable scale properties for the original and the replication study, allowing us to rule out measurement-related (difficulty) issues as an explanation for the non-replication. A variety of robustness checks and additional exploratory analyses are reported in the Supplementary Materials (Tables S8–S20).

Figure 2 Frequency distribution of Political Knowledge Test scores for the replication data (blue) and the original data (pink) in Stage 1 (upper panel) and Stage 2 (bottom panel).

Figure 3 Results of IRT’s 2PL model of the Political Knowledge Test Scores.
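
As an illustration of the 2PL analysis summarized in Figure 3, the sketch below fits such a model with the mirt package and plots test information and expected total score across θ. The package choice and the simulated item responses are assumptions made for the example; this is not our analysis script or data.

```r
# Illustrative 2PL IRT sketch with the mirt package (assumed implementation;
# the binary item responses below are simulated stand-ins for the knowledge items).
library(mirt)

set.seed(2)
n_persons <- 500
n_items   <- 10
theta <- rnorm(n_persons)                 # latent political knowledge
a     <- runif(n_items, 0.8, 2.0)         # item discriminations
b     <- rnorm(n_items)                   # item difficulties

# Simulate 0/1 responses from a 2PL data-generating process
p    <- plogis(outer(theta, seq_len(n_items),
                     function(th, j) a[j] * (th - b[j])))
resp <- (matrix(runif(n_persons * n_items), n_persons, n_items) < p) * 1

fit_2pl <- mirt(data.frame(resp), model = 1, itemtype = "2PL", verbose = FALSE)

coef(fit_2pl, IRTpars = TRUE, simplify = TRUE)  # estimated discriminations and difficulties
plot(fit_2pl, type = "info")   # test information across theta (cf. Figure 3, panel a)
plot(fit_2pl, type = "score")  # expected total score across theta (cf. Figure 3, panel b)
```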

Discussion

Ihme and Tausendpfund (2018) proposed that the activation of negative gender stereotypes accounts for part of the gender gap in political knowledge. In our independent and well-powered direct replication, we find no evidence that the activation of gender stereotypes affects participants’ performance on a political knowledge test. Indeed, we find evidence of absence of this effect.

We note that some elements of our study design diverged from the original study and could have contributed to the observed non-replication. Our study was conducted with American students and working adults, whereas the original study included German students. As the United States has achieved relatively lower gender parity than Germany in political empowerment (World Economic Forum 2021), one could argue that negative stereotypes about women might be more salient for Americans than for Germans, undermining women’s cognitive performance even in the absence of stereotype activation (e.g., in the control condition). Although we cannot rule out that some populations are more vulnerable to gender stereotyping than others, we reduced cultural biases as much as possible by devising a political knowledge test that was both similar to the one used in the original study in its level of difficulty, as our data suggest, and relevant to the American political context. A comparison of the effect of stereotype threat on gender differences in political knowledge across countries with varying levels of gender equality would help clarify potential cultural differences in stereotype threat. Second, as a direct consequence of including working adults in our sample, it was necessary to expand the measure of field of study to encompass the field of work. We argue, however, that this should not have contributed to the unsuccessful replication. If our measure of field of study/work had inadvertently made participants aware of their affiliation with a Politics or Non-Politics group, the effects of gender stereotype activation on performance would presumably have become more salient. Instead, our results show that field of study/work did not influence the results (Tables S16–S17). An argument can be made, however, that the extensive list of topics in our study weakened participants’ identification with Politics. Nevertheless, adding the importance participants attributed to Politics in their study/work as a covariate in the analyses did not change the results (Tables S18–S19). We also conducted further tests restricting our sample to young and educated adults to achieve a composition more similar to the respondents in the original study, but we still could not replicate the effect of stereotype activation on the gender gap in political knowledge (Table S20).

We note that our failure to replicate the effect of stereotype threat on gender differences in political knowledge is consistent with recent research challenging the effect of stereotype threat on academic performance more broadly. Stoet and Geary (2012) showed that only about 30% of attempts to replicate the effect of stereotype threat on the gender gap in mathematics performance succeeded. In addition, a meta-analysis investigating the effect of gender stereotype threat on the performance of schoolgirls in stereotyped subjects (e.g., science, math) indicated several signs of publication bias in this literature (Flore and Wicherts 2015). Given these results, it is plausible that the effect of gender stereotype activation is small in magnitude and/or has been decreasing over time (Lewis and Michalak 2019).

Furthermore, we find robust evidence of a gender gap in political knowledge even after controlling for political interest. Our results support previous accounts suggesting that the gender gap in political knowledge may be an artifact of how knowledge is conceptualized and measured and of gender differences in attitudes toward standard tests. In line with previous research arguing that the political knowledge gap might be artificially inflated by a disproportionate number of men who are willing to guess rather than choose the “don’t know” option, even if that might lead to an incorrect answer (Mondak and Anderson 2004), we find that female participants attempted fewer questions and used the “don’t know” response option in the political knowledge test more frequently than their male counterparts, whereas men guessed their answers more frequently than women, resulting in a larger number of incorrect answers (Tables S8–S14). This suggests that factors other than knowledge might contribute to the gender gap in political knowledge (Mondak 1999). For example, gender differences in risk taking and competitiveness (Lizotte and Sidman 2009), as well as in self-confidence (Wolak 2020) and self-efficacy (Preece 2016), may lead women to second-guess themselves and be less prone to attempt questions they are unsure about. Meanwhile, higher competitiveness and confidence may lead men to guess and “gain the advantage from a scoring system that does not penalize wrong answers and rewards right ones” (Kenski and Jamieson 2000, 84). Measurement non-invariance, too, appears to detrimentally affect the interpretation and validity of political knowledge scales across several sociodemographic characteristics. For example, Lizotte and Sidman (2009) and Mondak and Anderson (2004) have shown that political knowledge instruments violate the equivalence assumption for gender, while Abrajano (2015) and Pietryka and MacIntosh (2013) found non-invariance across age, income, race, and education. In our own replication attempt, we also found evidence of measurement non-invariance using item response theory and showed that the magnitude of the systematic gender bias appears to be contingent on respondents’ knowledge levels, such that the lack of equivalence by gender is stronger at average scores and weaker at the extremes of the political knowledge continuum (see Table S21 and Figure S1).
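
One way to probe such gender non-invariance is a multiple-group IRT comparison, sketched below with mirt as an assumed implementation and simulated responses: a model that constrains item slopes and intercepts to be equal across gender is compared with one that lets them differ.

```r
# Sketch of a gender measurement-invariance check via multiple-group IRT (mirt).
# Assumed implementation with simulated stand-in data; not our analysis script.
library(mirt)

set.seed(3)
n       <- 600
n_items <- 10
gender  <- sample(c("female", "male"), n, replace = TRUE)
theta   <- rnorm(n)
b       <- rnorm(n_items)
resp    <- sapply(seq_len(n_items),
                  function(j) rbinom(n, 1, plogis(1.2 * (theta - b[j]))))

# Constrained model: item slopes and intercepts equal across gender groups
fit_equal <- multipleGroup(data.frame(resp), model = 1, group = gender,
                           invariance = c("slopes", "intercepts"),
                           verbose = FALSE)
# Configural model: item parameters free to differ by gender
fit_free  <- multipleGroup(data.frame(resp), model = 1, group = gender,
                           verbose = FALSE)

# A clearly better fit for the free model would indicate non-invariance (DIF) by gender
anova(fit_equal, fit_free)
```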

As Politics has essentially been a male-dominated field since its inception, it should not come as a surprise that current measures of political knowledge tend to favor what men typically know. Previous studies have shown that the mere inclusion of gendered items on scales of political knowledge lessens the gender gap (Barabas, Jerit, Pollock, and Rainey 2014; Dolan 2011). The investigation and validation of measures of political knowledge that take into account the fact that men and women may not only know different things but also react in different ways to standard tests is paramount for a more accurate understanding of the gender gap in political knowledge and its biases.

Finally, we note that measurement issues are not unique to political knowledge; they are pervasive in Political Science, with consequences for how we measure populism (Van Hauwaert, Schimpf, and Azevedo 2018, 2020; Wuttke, Schimpf, and Schoen 2020), operational ideology (Azevedo and Bolesta 2022; Azevedo, Jost, Rothmund, and Sterling 2019; Kalmoe 2020), and political psychological constructs such as authoritarianism, racial resentment, personality traits, and moral traditionalism (Azevedo and Jost 2021; Bromme, Rothmund, and Azevedo 2022; Pérez and Hetherington 2014; Pietryka and MacIntosh 2022). If the basic measurement properties of widely used constructs are flawed, it is likely that the insights derived from research relying on them will be biased. Valid, invariant, and theoretically derived instruments are urgently needed for the reliable accumulation of knowledge in Political Science.

Supplementary material

To view supplementary material for this article, please visit https://doi.org/10.1017/XPS.2022.35

Data availability

This work was carried out as part of the Center for Open Science’s Systematizing Confidence in Open Research and Evidence (SCORE) program, which is funded by the Defense Advanced Research Projects Agency. The data, code, and any additional materials required to replicate all analyses in this article are available at the Journal of Experimental Political Science Dataverse within the Harvard Dataverse Network, at https://doi.org/10.7910/DVN/ETUUOD. All study materials and preregistration information for the current study have been made publicly available via OSF and can be accessed at https://osf.io/8feku/?view_only=99a41a96c8cd43c4ab349e44d79919cd.

Acknowledgements

We would like to thank the staff and researchers at the Center for Open Science for their guidance and assistance, and especially Zach Loomas and Beatrix Arendt for their patience and kindness. We also would like to show our appreciation for Kimberly Quinn’s editorship of the preregistration stage. Lastly, we would like to thank Charlotte R. Pennington for providing helpful feedback on an earlier version of this manuscript.

Author contributions

Conceptualization: F.A., L.M., and D.S.B.; Data curation: F.A., L.M., and D.S.B.; Formal analysis: F.A., L.M., and D.S.B.; Investigation: F.A., L.M., and D.S.B.; Methodology: F.A., L.M., and D.S.B.; Project administration: F.A.; Software: F.A., L.M., and D.S.B.; Visualization: F.A., L.M., and D.S.B.; Writing (original draft): F.A. and L.M.; Writing (review and editing): F.A., L.M., and D.S.B.

Conflicts of interest

The authors report no conflicts of interest.

Ethics statement

The study reported here was approved by an independent IRB, BRANY (https://www.brany.com), and by the U.S. Army’s Human Research Protection Office (HRPO; #20-032-764; Award Number HR00112020015; HRPO Log Number A-21036.50), and it adheres to APSA’s Principles and Guidance for Human Subjects Research. More information can be found in the Supplementary Materials (section 1: “Procedures and Measures”).

Footnotes

This article has earned badges for transparent research practices. For details see the Data Availability Statement.

References

Abrajano, M. 2015. Reexamining the “Racial Gap” in Political Knowledge. The Journal of Politics 77(1): 44–54. https://doi.org/10.1086/678767
Alipourfard, N., Arendt, B., Benjamin, D. M., Benkler, N., Bishop, M. M., Burstein, M., … Wu, J. 2021. Systematizing Confidence in Open Research and Evidence (SCORE). OSF. https://doi.org/10.31235/osf.io/46mnb
Aronow, P. M., Kalla, J., Orr, L., and Ternovski, J. 2020. Evidence of Rising Rates of Inattentiveness on Lucid in 2020. OSF. https://doi.org/10.31235/osf.io/8sbe4
Azevedo, F., and Bolesta, D. S. 2022. Measuring Ideology: Current Practices, Its Consequences, and Recommendations. https://measuring.ideology.flavioazevedo.com/
Azevedo, F., Micheli, L., and Bolesta, D. S. 2022. Replication Data for: Does Stereotype Threat Contribute to the Political Knowledge Gender Gap? A Preregistered Replication Study of Ihme and Tausendpfund (2018). Harvard Dataverse. https://doi.org/10.7910/DVN/ETUUOD
Azevedo, F., and Jost, J. T. 2021. The Ideological Basis of Antiscientific Attitudes: Effects of Authoritarianism, Conservatism, Religiosity, Social Dominance, and System Justification. Group Processes & Intergroup Relations 24(4): 518–49. https://doi.org/10.1177/1368430221990104
Azevedo, F., Jost, J. T., Rothmund, T., and Sterling, J. 2019. Neoliberal Ideology and the Justification of Inequality in Capitalist Societies: Why Social and Economic Dimensions of Ideology Are Intertwined. Journal of Social Issues 75(1): 49–88. https://doi.org/10.1111/josi.12310
Barabas, J., Jerit, J., Pollock, W., and Rainey, C. 2014. The Question(s) of Political Knowledge. American Political Science Review 108(4): 840–55. https://doi.org/10.1017/S0003055414000392
Bromme, L., Rothmund, T., and Azevedo, F. 2022. Mapping Political Trust and Involvement in the Personality Space—A Meta-Analysis and New Evidence. Journal of Personality. https://doi.org/10.1111/jopy.12700
Burns, N., Schlozman, K. L., and Verba, S. 2001. The Private Roots of Public Action: Gender, Equality, and Political Participation. Cambridge: Harvard University Press. https://doi.org/10.4159/9780674029088
Carpini, M. X. D., and Keeter, S. 2005. Gender and Political Knowledge. In Gender and American Politics: Women, Men, and the Political Process, eds. Tolleson-Rinehart, S. and Josephson, J. J. Armonk: Sharpe. https://doi.org/10.4324/9781315289779
Champely, S. 2020. pwr: Basic Functions for Power Analysis. R Package Version 1.3-0. https://CRAN.R-project.org/package=pwr
Dolan, K. 2011. Do Women and Men Know Different Things? Measuring Gender Differences in Political Knowledge. The Journal of Politics 73(1): 97–107. https://doi.org/10.1017/S0022381610000897
Dow, J. K. 2009. Gender Differences in Political Knowledge: Distinguishing Characteristics-Based and Returns-Based Differences. Political Behavior 31(1): 117–36. https://doi.org/10.1007/s11109-008-9059-8
Flore, P. C., Mulder, J., and Wicherts, J. M. 2018. The Influence of Gender Stereotype Threat on Mathematics Test Scores of Dutch High School Students: A Registered Report. Comprehensive Results in Social Psychology 3(2): 140–74. https://doi.org/10.1080/23743603.2018.1559647
Flore, P. C., and Wicherts, J. M. 2015. Does Stereotype Threat Influence Performance of Girls in Stereotyped Domains? A Meta-Analysis. Journal of School Psychology 53(1): 25–44. https://doi.org/10.1016/j.jsp.2014.10.002
Ihme, T. A., and Tausendpfund, M. 2018. Gender Differences in Political Knowledge: Bringing Situation Back In. Journal of Experimental Political Science 5(1): 39–55. https://doi.org/10.1017/XPS.2017.21
Kalmoe, N. P. 2020. Uses and Abuses of Ideology in Political Psychology. Political Psychology 41(4): 771–93. https://doi.org/10.1111/pops.12650
Kenski, K., and Jamieson, K. H. 2000. The Gender Gap in Political Knowledge: Are Women Less Knowledgeable Than Men about Politics? In Everything You Think You Know about Politics…And Why You’re Wrong, ed. Jamieson, K. H. New York: Basic Books, 83–89.
Lenth, R. V. 2022. emmeans: Estimated Marginal Means, aka Least-Squares Means. R Package Version 1.8.1-1. https://CRAN.R-project.org/package=emmeans
Lewis, N. A. Jr., and Michalak, N. M. 2019. Has Stereotype Threat Dissipated Over Time? A Cross-Temporal Meta-Analysis. PsyArXiv. https://doi.org/10.31234/osf.io/w4ta2
Lewis, N. A. Jr., and Sekaquaptewa, D. 2016. Beyond Test Performance: A Broader View of Stereotype Threat. Current Opinion in Psychology 11: 40–3. https://doi.org/10.1016/j.copsyc.2016.05.002
Lizotte, M. K., and Sidman, A. H. 2009. Explaining the Gender Gap in Political Knowledge. Politics & Gender 5(2): 127–51. https://doi.org/10.1017/S1743923X09000130
Ly, A., Etz, A., Marsman, M., and Wagenmakers, E. J. 2019. Replication Bayes Factors from Evidence Updating. Behavior Research Methods 51(6): 2498–508. https://doi.org/10.3758/s13428-018-1092-x
McGlone, M. S., Aronson, J., and Kobrynowicz, D. 2006. Stereotype Threat and the Gender Gap in Political Knowledge. Psychology of Women Quarterly 30(4): 392–8. https://doi.org/10.1111/j.1471-6402.2006.00314.x
McGlone, M. S., and Pfiester, R. A. 2007. The Generality and Consequences of Stereotype Threat. Sociology Compass 1(1): 174–90. https://doi.org/10.1111/j.1751-9020.2007.00021.x
Mondak, J. J. 1999. Reconsidering the Measurement of Political Knowledge. Political Analysis 8(1): 57–82. https://doi.org/10.1093/oxfordjournals.pan.a029805
Mondak, J. J., and Anderson, M. R. 2004. The Knowledge Gap: A Reexamination of Gender-Based Differences in Political Knowledge. The Journal of Politics 66(2): 492–512. https://doi.org/10.1111/j.1468-2508.2004.00161.x
Ondercin, H. L., and Jones-White, D. 2011. Gender Jeopardy: What Is the Impact of Gender Differences in Political Knowledge on Political Participation? Social Science Quarterly 92(3): 675–94. https://doi.org/10.1111/j.1540-6237.2011.00787.x
Parsons, S., Azevedo, F., Elsherif, M. M., Guay, S., Shahim, O. N., Govaart, G. H., Norris, E., O’Mahony, A., Parker, A. J., Todorovic, A., Pennington, C. R., Garcia-Pelegrin, E., Lazić, A., Robertson, O. M., Middleton, S. L., Valentini, B., McCuaig, J., Baker, B. J., Collins, E., … Aczel, B. 2021. A Community-Sourced Glossary of Open Scholarship Terms. https://forrt.org/glossary/
Pennington, C. R., Heim, D., Levy, A. R., and Larkin, D. T. 2016. Twenty Years of Stereotype Threat Research: A Review of Psychological Mediators. PLoS ONE 11(1): e0146487. https://doi.org/10.1371/journal.pone.0146487
Pennington, C. R., Litchfield, D., McLatchie, N., and Heim, D. 2018. Stereotype Threat May Not Impact Women’s Inhibitory Control or Mathematical Performance: Providing Support for the Null Hypothesis. European Journal of Social Psychology 49(4): 717–34. https://doi.org/10.1002/ejsp.2540
Pérez, E., and Hetherington, M. 2014. Authoritarianism in Black and White: Testing the Cross-Racial Validity of the Child Rearing Scale. Political Analysis 22(3): 398–412. https://doi.org/10.1093/pan/mpu002
Pietryka, M. T., and MacIntosh, R. C. 2013. An Analysis of ANES Items and Their Use in the Construction of Political Knowledge Scales. Political Analysis 21(4): 407–29. https://doi.org/10.1093/pan/mpt009
Pietryka, M. T., and MacIntosh, R. C. 2022. ANES Scales Often Do Not Measure What You Think They Measure. The Journal of Politics 84(2): 1074–90. https://doi.org/10.1086/715251
Preece, J. R. 2016. Mind the Gender Gap: An Experiment on the Influence of Self-Efficacy on Political Interest. Politics & Gender 12(1): 198–217. https://doi.org/10.1017/S1743923X15000628
Pruysers, S., and Blais, J. 2014. Anything Women Can Do Men Can Do Better: An Experiment Examining the Effects of Stereotype Threat on Political Knowledge and Efficacy. The Social Science Journal 51(3): 341–9. https://doi.org/10.1016/j.soscij.2014.05.005
R Core Team. 2020. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/
Schmader, T., Johns, M., and Forbes, C. 2008. An Integrated Process Model of Stereotype Threat Effects on Performance. Psychological Review 115(2): 336–56. https://doi.org/10.1037/0033-295X.115.2.336
Stoet, G., and Geary, D. C. 2012. Can Stereotype Threat Explain the Gender Gap in Mathematics Performance and Achievement? Review of General Psychology 16(1): 93–102. https://doi.org/10.1037/a0026617
Van Hauwaert, S. M., Schimpf, C. H., and Azevedo, F. 2018. Public Opinion Surveys: Evaluating Existing Measures. In The Ideational Approach to Populism. Routledge, 154–72. https://doi.org/10.4324/9781315196923-7
Van Hauwaert, S. M., Schimpf, C. H., and Azevedo, F. 2020. The Measurement of Populist Attitudes: Testing Cross-National Scales Using Item Response Theory. Politics 40(1): 3–21. https://doi.org/10.1177/0263395719859306
Verhagen, J., and Wagenmakers, E. J. 2014. Bayesian Tests to Quantify the Result of a Replication Attempt. Journal of Experimental Psychology: General 143(4): 1457–75. https://doi.org/10.1037/a0036731
Walton, G. M., and Cohen, G. L. 2003. Stereotype Lift. Journal of Experimental Social Psychology 39(5): 456–67. https://doi.org/10.1016/S0022-1031(03)00019-2
Wolak, J. 2020. Self-Confidence and Gender Gaps in Political Interest, Attention and Efficacy. The Journal of Politics 82(4): 1490–501. https://doi.org/10.1086/708644
World Economic Forum. 2021. The Global Gender Gap Report 2021. https://www.weforum.org/reports/ab6795a1-960c-42b2-b3d5-587eccda6023
Wuttke, A., Schimpf, C., and Schoen, H. 2020. When the Whole Is Greater Than the Sum of Its Parts: On the Conceptualization and Measurement of Populist Attitudes and Other Multidimensional Constructs. American Political Science Review 114(2): 356–74. https://doi.org/10.1017/S0003055419000807
Zwaan, R. A., Etz, A., Lucas, R. E., and Donnellan, M. B. 2018. Making Replication Mainstream. Behavioral and Brain Sciences 41: E120. https://doi.org/10.1017/S0140525X17001972