1 Introduction
Bar-Hillel, Noah and Frederick (2018) discussed a class of riddles that challenge respondents to explain a situation which – at first blush – seems impossible or paradoxical. We called these riddles stumpers, because respondents often cannot fathom what they are missing, or what alternate representation might permit a solution. Stumpers are a subset of a broader class of insight problems (Gilhooly & Murphy, 2005), called “single step insight problems” by Murray and Byrne (2013).
We liken these stumpers to a play’s script, with our mind as the director arranging the scene. Subjects are stumped if the scene they first construct (their “mental model”; see Craik, 1943; Johnson-Laird, 1983) does not contain the solution, and they remain stumped until they are able to construct a different scene that can accommodate the script’s elements. The four stumpers we used are reproduced in Table 1, with solutions and the alternate representations that afford them shown in the Appendix. We extend our earlier paper here by examining how the ability to solve stumpers correlates with performance on two other types of tasks: the CRT and the CRAT.
Frederick (2005) studied a class of problems that he termed the “Cognitive Reflection Test” (or CRT). For these items, respondents never feel stumped: they typically come up with an answer immediately, offer it without hesitation, and are later surprised to learn it is not correct. The four items we used (see Table 2) are not from the original CRT but share many of the same properties. Except for Mary, three numerical answers, plus an option for “Other”, were provided for each item (shown in the Appendix).
Our subjects also answered four items from the Remote Associates Test (or RAT; Mednick, 1962, 1968), in which respondents seek a fourth word that is associated with each of three presented words (see Table 3). We used a more restrictive variant of this test, called the Compound Remote Associates Test or CRAT (Bowden & Jung-Beeman, 2003). The instructions explained: “In the next task, you will be presented with three words. You will have 15 seconds to think of a word that forms a word-pair with each of them,” and an example was provided.
We included these additional tests because we thought all required some form of creativity, defined by Mednick (1962) as “the forming of … elements into new combinations which … are in some way useful” (p. 221). Stumpers typically require subjects to visualize the narrative elements in new ways. As its name implies, the RAT requires respondents to search beyond the immediately available associations until they can find one that all three stem words share (Barr, Pennycook, Stolz & Fugelsang, 2015). Finally, the CRT items generally require respondents to subject their initial intuitive solutions to a subsequent search for potentially disqualifying observations: Mary herself counts as one of her mother’s children; the bear lost 20% of its pre-hibernation weight, when it weighed more than 1000 pounds; the food in a trough will be consumed faster when more animals feed from it, so certainly faster than 6 days; the 15th tallest and 15th shortest individual is the same person – Jerry – so simply adding those numbers will double count him.
2 Method
The four Study 2 stumpers were answered by 394 respondents, recruited on Amazon’s Mechanical Turk. Their mean age was 38, and 55% were female. Respondents were randomly assigned to one of four experimental groups, defined by the four stumpers in Table 1, and paid a dollar to participate. They answered a multi-screen questionnaire, administered through Qualtrics. The questionnaire began with one of our four stumpers, and ended with all the CRT items and RAT items shown in Tables 2 and 3 (full details can be found in Bar-Hillel et al., 2018). Thus, although each stumper was answered by only about 100 respondents, all 394 received the four cognitive reflection items and the four remote associate items. Respondents were allowed to leave any item they wished unanswered.
3 Results
We scored respondents’ stumper performance as 0 for “failed”, and 1 for “solved”. We scored their performance on the two other tasks by the number of items they solved (from 0 to 4). Cronbach’s alpha was .49 for the CRT and .50 for the CRAT, hence essentially identical.
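For readers unfamiliar with the reliability coefficient reported above, the following is a minimal sketch of how Cronbach’s alpha is computed from item-level scores. The response matrix is hypothetical (it is not our data), and the function name `cronbach_alpha` is an illustrative choice rather than part of any analysis code used in the study.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for a respondents-by-items matrix of scores.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total score))
    """
    k = len(item_scores[0])  # number of items on the scale
    # Sample variance of each item (column) across respondents.
    item_vars = [variance(column) for column in zip(*item_scores)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical data: 6 respondents x 4 items, each scored 0 (failed) or 1 (solved).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 1],
]
alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.3f}")
```

With only four dichotomous items per scale, modest alpha values such as those reported here are not unusual, since alpha tends to rise with the number of items.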
Figure 1 shows the scores on these other scales when respondents are split by success or failure on their stumper. Those who solved their stumper scored significantly higher on our four CRT items, but not on our four-item CRAT.
Table 4 shows the size of these effects, along with the associated point-biserial correlations between the scales (r). Solving the stumper predicted solving the CRTs (r(394) = .27, p < .001), but did not predict solving the CRATs (r(394) = .063, p = .210). These two correlations differ significantly (Z = 3.310, p < .01).
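A point-biserial correlation is simply Pearson’s r computed between a dichotomous variable (here, stumper solved or failed) and a numeric score. The sketch below illustrates the calculation and the accompanying t statistic. The small data set is hypothetical, and the function names `point_biserial` and `t_statistic` are illustrative assumptions, not the analysis code used for Table 4.

```python
import math


def point_biserial(solved, scores):
    """Point-biserial r between a 0/1 variable and a numeric score."""
    n = len(scores)
    group1 = [s for flag, s in zip(solved, scores) if flag == 1]
    group0 = [s for flag, s in zip(solved, scores) if flag == 0]
    p = len(group1) / n  # proportion coded 1
    mean1 = sum(group1) / len(group1)
    mean0 = sum(group0) / len(group0)
    mean_all = sum(scores) / n
    # Population standard deviation of all scores (divides by n).
    sd_pop = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    return (mean1 - mean0) / sd_pop * math.sqrt(p * (1 - p))


def t_statistic(r, n):
    """t statistic (df = n - 2) for testing whether a correlation r is zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Hypothetical data: 8 respondents, stumper outcome and a 0-4 score.
solved = [1, 1, 1, 0, 0, 0, 0, 1]
score = [3, 4, 2, 1, 2, 0, 1, 3]
r = point_biserial(solved, score)

# t statistics implied by the two reported correlations, taking n = 394.
t_crt = t_statistic(0.27, 394)
t_crat = t_statistic(0.063, 394)
print(f"r = {r:.3f}, t_crt = {t_crt:.2f}, t_crat = {t_crat:.2f}")
```

The t statistic for r = .063 falls well short of conventional two-tailed critical values, in line with the nonsignificant result reported for the CRAT.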
The correlation between the CRT and the CRAT, .187, was highly significant.
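Because the stumper–CRT and stumper–CRAT correlations share the stumper variable, comparing them requires a test for dependent correlations, which also uses the CRT–CRAT correlation. The text does not state which test produced the reported Z, so the sketch below assumes one common choice, the procedure of Meng, Rosenthal and Rubin (1992). The function name is illustrative. Applied to the three reported correlations with n = 394, it yields a Z of roughly 3.3, in the neighborhood of the reported value, though an exact match should not be expected if a different test was used.

```python
import math
from statistics import NormalDist


def compare_dependent_correlations(r_xy, r_xz, r_yz, n):
    """Z test for the difference between two correlations sharing variable x.

    Assumed method: Meng, Rosenthal & Rubin (1992). r_xy and r_xz are the two
    correlations being compared; r_yz is the correlation between the other
    two variables; n is the sample size.
    """
    z_xy = math.atanh(r_xy)  # Fisher z-transform
    z_xz = math.atanh(r_xz)
    mean_r2 = (r_xy ** 2 + r_xz ** 2) / 2
    f = min((1 - r_yz) / (2 * (1 - mean_r2)), 1.0)  # f is capped at 1
    h = (1 - f * mean_r2) / (1 - mean_r2)
    z = (z_xy - z_xz) * math.sqrt((n - 3) / (2 * (1 - r_yz) * h))
    p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_tailed


# Reported values: stumper-CRT = .27, stumper-CRAT = .063, CRT-CRAT = .187.
z, p = compare_dependent_correlations(0.27, 0.063, 0.187, 394)
print(f"Z = {z:.2f}, p = {p:.4f}")
```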
Stumper success was not significantly correlated with either age or gender (excepting the Accountant stumper, which was solved by 59% of women, but just 36% of men).
4 Discussion
Our interest in stumpers reflects our belief that they might reveal novel psychological principles, but it remains unabashedly exploratory. The present paper focuses on relations between performance on stumpers and two other types of problems, the CRT and the CRAT, which have each been studied extensively, and correlate with many other psychological variables (see lists in, e.g., Lee, Huggins & Therriault, 2014, for the RAT; Pennycook, Cheyne, Koehler & Fugelsang, 2016, for the CRT). But, aside from expecting the three tasks to correlate positively (as all intellective tasks do), we had no strong predictions, as all seem to involve a similar sort of skill: the ability or disposition to broaden one’s search beyond the elements that are initially most accessible.
Accordingly, we remain perplexed why solving stumpers correlates much more strongly with the CRT than with the CRAT, particularly since versions of those two scales often correlate strongly with – and thereby index – general cognitive ability (see, e.g., Frederick, 2005; Chein & Weisberg, 2014; Lee, Huggins & Therriault, 2014). We assume future research will reveal the essential differences among these related types of problems, and hope stumpers will join the ranks of the RAT, the CRT, insight problems, and other reasoning tasks as a tool for studying mental processes.
Appendix: Solutions
CRAT solutions:
WMS: Street; CFT: Cookie; RBS: Bath; AHC: Head