1. Introduction
Deviations from optimal behavior have long been studied in economics using various approaches. In the experimental literature, many studies have examined such deviations, especially when contingent reasoning is required. Another strand of the literature studies how complexity creates a cognitive burden and affects optimal choices.
In this paper, we bridge complexity in decision problems and contingent reasoning. Specifically, we investigate complexity in decision problems as a cause of failures in contingent reasoning. For this purpose, we introduce three dimensions of complexity: the number of contingencies, the existence of dominant or obviously dominant choices, and reducible states.
We do this in a decision problem that we develop, which has a similar structure to the decision problem studied by Martínez-Marquina et al. (2019). Specifically, we consider a labor market setting with two workers who vary in two dimensions: the revenue they bring and their willingness to accept. With this information about the workers (revenue and willingness to accept) but without knowing which worker they will hire, a subject chooses one wage for the job. Calculating the optimal wage requires contingent reasoning because the subject needs to consider which workers would accept any given wage.
For complexity, we consider three dimensions: the number of contingencies, the existence of reducible states, and the dominance property of choices. We borrow the first two concepts from Oprea (2020). He considers the number of states as a measure of complexity, whereas we consider the number of contingencies. In our setting, states represent the uncertainty resolved by nature, while contingencies are also affected by the workers’ actions. For example, for each wage choice, both which worker is assigned and whether the assigned worker would accept the wage matter. For the second dimension of complexity, a decision problem has reducible states if there exists an equivalent representation of the problem involving fewer states. We say this equivalent problem then involves “reduced” states. Finally, we also consider the dominant/obviously dominant strategies concept suggested by Li (2017), adapted to individual choice problems.
In addition, we test the Power of Certainty (henceforth PoC) – suggested by Martínez-Marquina et al. (2019) – by changing the baseline setting to one without uncertainty while keeping the number of contingencies and the dominance properties of choices the same.
Our results show that the number of contingencies has the largest effect on failures in contingent reasoning. The existence of dominant and obviously dominant choices also has an effect, but the effect of obvious dominance is smaller than the existing literature suggests. Reducible states do not affect failures in contingent reasoning. We also examine how often subjects choose undominated choices and suggest that people may exhibit contingent reasoning partially, even if they fail to implement it fully. Lastly, the PoC exists in our setting, too, and has the largest effect in problems with a higher number of contingencies.
1.1. Related literature
Existing papers in the literature that study contingent reasoning have done so using decision problems that differ from our labor market decision problem. For example, Charness & Levin (2009) and Martínez-Marquina et al. (2019) use an acquiring-a-company game to test for failures in contingent reasoning. Although these decision problems are different from ours, the problem studied by Martínez-Marquina et al. (2019) shares some similarities with ours. Specifically, both decision problems require subjects to eliminate dominated choices first. Then, non-dominated choices involve contingencies that depend on whether each firm (or worker) will reject a given offer. However, their work has a winner’s curse feature that ours does not. Other decision problems that have been studied include voting games (Ali et al., 2021; Esponda & Vespa, 2014), dynamic public goods games (Calford & Cason, 2024), investment decisions with endogenous selection (Esponda & Vespa, 2018), the sure-thing principle (Esponda & Vespa, 2021; Shafir & Tversky, 1992), and the Monty Hall problem (Friedman, 1998).
Ours is not the first paper to use the number of states as a measure of complexity. As mentioned earlier, Oprea (2020) considers the number of states as one measure of complexity in a typing task. He shows that rules that require more states are more complex and incur higher costs. We apply this concept of complexity to the number of contingencies in decision problems. Similarly, in her experiment, Puri (2020) defines the complexity of a lottery by its number of possible outcomes. For reducibility as a complexity measure, Oprea (2020) finds that reducible states are more costly than reduced states, which implies that reducible states make a rule more complex.
In other related work, Halevy (2007) studies the Ellsberg paradox and shows that a substantial portion of subjects violate the axiom of reduction of compound lotteries, which relates to our reducibility result. Other than Li (2017), several papers consider dominance in a decision problem. Kagel et al. (1987) and Kagel and Levin (1993) show failures in contingent reasoning in an auction setting, where optimal strategies in each auction have different dominance notions. Like ours, their settings also separate dominance and obvious dominance. Rabin & Weizsäcker (2009) also study dominated choices. They show that some subjects choose a first-order stochastically dominated option, which is consistent with our observation that a substantial portion of subjects fail to make an optimal choice even when a dominant choice exists.
There are other papers on complexity that consider dimensions different from ours. Banovetz and Oprea (2022) and Mononen (2022) consider complexity as (mental) costs, which may explain why subjects fail in contingent reasoning more in some domains of complexity than in others. (For further work on complexity, see Araujo & Piermont, 2023; Bossaerts & Murawski, 2017; Campbell, 1988; Carvalho & Silverman, 2019; Echenique et al., 2011.) In addition, there are papers in the ambiguity literature that explain ambiguity preferences with complexity. For example, Kovářík et al. (2016) explain the Ellsberg paradox via complexity aversion, and Aydogan et al. (2023) show that complexity and ambiguity preferences explain heterogeneity in Ellsberg’s urn setting.
The rest of the paper is organized as follows: Section 2 presents the experimental design. Section 3 shows the results, and Section 4 concludes the paper with a discussion.
2. Experimental design
2.1. Framework
We study a decision maker (henceforth DM) who faces wage decision problems under uncertainty with different dimensions of complexity. The DM has to decide on one wage for a job without knowing the exact quality of the workers. In this section, we formalize the notions of complexity that we consider.
2.1.1. Baseline – one job problem
There is one available job, and the DM wants to hire a worker to maximize their profits. There are two candidate workers: Ann and Bob. Only one of Ann or Bob applies for the job, each with an equal chance. The DM has to decide one wage for the job before observing who applies. Ann and Bob have different willingnesses to accept (θA and θB) and bring different revenues (vA and vB). We assume Ann always brings higher revenue than Bob (
$v_A \gt v_B$), Ann’s willingness to accept is lower than that of Bob (
$\theta_A \lt \theta_B$), and the revenue Ann brings is higher than Ann’s willingness to accept (
$v_A \gt \theta_A$). A worker will accept the job if and only if the wage the DM chooses is greater than or equal to their willingness to accept. Since Bob’s willingness to accept is greater than Ann’s, there is no wage that Ann rejects and Bob accepts. If the worker who applies rejects the wage, then the DM’s payoff will be 0. In this problem, the set of states of the world is
$\Omega=\{\mathbf{A}, \mathbf{B} \}$, where A represents the state where Ann applies for the job and B represents the other.
Niederle and Vespa (2023) describe the contingent reasoning process as thinking of all hypothetical states and computing payoffs for each action-state pair. We slightly modify this concept to define contingent reasoning. As a first step, we define the action-derived states as contingencies. Specifically, for any wage choice w, each state can be extended to be one where either a worker Accepts or Rejects (for simplicity, A and R, respectively). Thus, for the entire set of wage choices,
$\mathbb{R}_+$, we can consider a Cartesian product of extended states
$\{A, R\}\times \{A, R\} = \{(A, A), (A, R), (R, A), (R, R)\}$, where the first element of the pair represents Ann’s decision and the second element represents Bob’s decision. Each wage w will trigger exactly one of these extended states. We define each element of this Cartesian product as a contingency.
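To make the mapping from wages to contingencies concrete, here is a minimal sketch in Python. The function and the parameter values are our own illustrations, not from the experiment; the only substantive rule is that a worker accepts if and only if the wage is at least their willingness to accept.

```python
# Which contingency a wage w triggers in the one job problem.
# A worker accepts iff the wage is >= their willingness to accept.
def contingency(w, theta_A, theta_B):
    """Return (Ann's decision, Bob's decision) as 'A' (accept) / 'R' (reject)."""
    ann = 'A' if w >= theta_A else 'R'
    bob = 'A' if w >= theta_B else 'R'
    return (ann, bob)

# Hypothetical parameters with theta_A < theta_B, so (R, A) never occurs.
theta_A, theta_B = 10, 20
print(contingency(25, theta_A, theta_B))  # ('A', 'A')
print(contingency(15, theta_A, theta_B))  # ('A', 'R')
print(contingency(5, theta_A, theta_B))   # ('R', 'R')
```

Since $\theta_A \lt \theta_B$, no wage can trigger (R, A), which is why only three contingencies are relevant below.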
Next, for each of these contingencies, we call the wage that maximizes the DM’s utility in that contingency the contingency-optimal wage. Thus, there will be four contingency-optimal wages in our problem. We then define two levels of contingent reasoning: first, does the DM choose one of the contingency-optimal wages? Second, do they choose the utility-maximizing wage from that set?
In our baseline setting, we can simplify the set of contingencies. Since
$\theta_A \lt \theta_B$, we can rule out (R, A) and consider three relevant contingencies for any wage w: (1)
$w \geq \theta_B$, which triggers (A, A), (2)
$\theta_A \leq w \lt \theta_B$, which triggers (A, R), and (3)
$w \lt \theta_A$, which triggers (R, R). Assuming expected utility, the DM evaluates each w according to

$$U(w)=\begin{cases} \frac{1}{2}u(v_A-w)+\frac{1}{2}u(v_B-w) & \text{if } w \geq \theta_B,\\ \frac{1}{2}u(v_A-w)+\frac{1}{2}u(0) & \text{if } \theta_A \leq w \lt \theta_B,\\ u(0) & \text{if } w \lt \theta_A,\end{cases}$$

where $u(\cdot)$ is an increasing function with $u(0)=0$. Since $u(\cdot)$ is increasing, U decreases in w within each contingency. Thus, the contingency-optimal wage is the threshold value in each contingency:
$w=\theta_B$,
$w=\theta_A$, and
$w \in [0,\theta_A)$ respectively. Our first question is whether the DM chooses one of these three contingency-optimal wages. Then, we ask if they choose optimally from that set. Since
$v_A \gt \theta_A$, it is trivial that
$w \lt \theta_A$ would give a lower utility than θA, so we can rule out
$w \in [0,\theta_A)$. Thus, fully successful contingent reasoning requires choosing the wage that maximizes the DM’s utility from
$\{\theta_B, \theta_A\}$.
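For a risk-neutral DM ($u(x)=x$), choosing between the two remaining wages is a direct expected-value comparison. A minimal sketch with illustrative parameters of our own (not the experiment’s), satisfying $v_A \gt v_B$, $\theta_A \lt \theta_B$, and $v_A \gt \theta_A$:

```python
# Risk-neutral expected payoff of a wage in the one job problem.
# Each worker applies with probability 1/2; a rejected offer pays 0.
def expected_payoff(w, v_A, v_B, theta_A, theta_B):
    pay_if_ann = (v_A - w) if w >= theta_A else 0.0  # state A: Ann applies
    pay_if_bob = (v_B - w) if w >= theta_B else 0.0  # state B: Bob applies
    return 0.5 * pay_if_ann + 0.5 * pay_if_bob

# Hypothetical parameters.
v_A, v_B, theta_A, theta_B = 30, 20, 10, 18
best = max([theta_A, theta_B],
           key=lambda w: expected_payoff(w, v_A, v_B, theta_A, theta_B))
print(best)  # 10, i.e. w = theta_A is optimal for these numbers
```

With these numbers, $w=\theta_A$ yields 10 in expectation while $w=\theta_B$ yields 7; with a risk-averse $u$, or with different parameters, the comparison can flip, which is exactly why risk preferences must be elicited when no dominant choice exists.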
2.1.2. Various notions of dominance
As one dimension of complexity, we consider the dominance property of choices. Specifically, we introduce a modified concept of dominant and obviously dominant strategies, called dominant and obviously dominant choices. In addition to the well-known dominant strategies concept, Li (2017) suggests a concept called “obviously dominant strategies” and shows that obviously dominant strategies lead to rational choices more often than dominant strategies. We apply this notion, restricting choices first to those that are contingency-optimal.
Specifically, let
$W^*$ be the set of contingency-optimal choices. By a slight abuse of notation, let u be a function of wage and state, (w, s), in this subsection. We say that
$w^*$ is dominant (or a dominant choice) if
$u(w^*,s) \geq u(w,s)$ for all
$ s \in \Omega$ and for all
$w \in W^*$. We say that
$w^*$ is obviously dominant (or an obviously dominant choice) if
$\min_{s \in \Omega}u(w^*,s) \geq \max_{s \in \Omega}u(w,s)$ for all
$w \in W^*$. Note that any obviously dominant choice is also a dominant choice. For convenience, a “dominant choice” henceforth refers to a dominant but not obviously dominant choice. We regard a decision problem without a dominant choice as more complex than one with a dominant choice, and a problem with a dominant choice as more complex than one with an obviously dominant choice.
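The two dominance notions can be checked mechanically from state-contingent payoffs. A sketch of the definitions above, with hypothetical numbers of our own:

```python
# Classify a contingency-optimal choice w_star against the others in W_star.
# u[w][s] is the utility of choosing w when the realized state is s.
def classify(w_star, W_star, u, states):
    others = [w for w in W_star if w != w_star]
    # Dominant: weakly better state by state.
    dominant = all(u[w_star][s] >= u[w][s] for w in others for s in states)
    # Obviously dominant: worst case of w_star beats best case of every rival.
    obviously = all(min(u[w_star][s] for s in states) >=
                    max(u[w][s] for s in states) for w in others)
    if obviously:
        return "obviously dominant"
    if dominant:
        return "dominant"
    return "not dominant"

# One job example with v_B - theta_B <= 0 <= v_A - theta_B (hypothetical values).
v_A, v_B, theta_A, theta_B = 30, 12, 10, 15
u = {theta_A: {"A": v_A - theta_A, "B": 0},
     theta_B: {"A": v_A - theta_B, "B": v_B - theta_B}}
print(classify(theta_A, [theta_A, theta_B], u, ["A", "B"]))  # dominant
```

Here $\theta_A$ is dominant (20 ≥ 15 in state A, 0 ≥ −3 in state B) but not obviously dominant, because its worst case (0) is below the rival’s best case (15).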
In the baseline setting, if
$v_B-\theta_B \leq 0 \leq v_A -\theta_B$, then θA is a dominant choice, as in Panel (b) of Figure 1. If
$v_A-\theta_B \leq 0$ (which implies $v_B-\theta_B \lt 0$), then θA is an obviously dominant choice (Panel (c) of Figure 1). When a decision problem does not have a dominant choice, success in contingent reasoning depends on the DM’s risk preference. For example, as in Panel (a) of Figure 1,
$w=\theta_A$ has a higher mean but also higher variance compared to
$w=\theta_B$. In this case, the optimal choice depends on the DM’s risk preference. How we elicit a risk preference in the experiment is explained in Subsection 2.2.

Fig. 1 Dominance property of choices and the payoffs given in each state when (a) neither choice is dominant, (b)
$w=\theta_A$ dominates
$w=\theta_B$, and (c)
$w=\theta_A$ obviously dominates
$w=\theta_B$.
2.1.3. Reducible problem
Oprea (2020) proposes several notions of complexity in a decision problem based on the automata literature. “Reducibility” is one such notion. If a decision problem can be reduced to an equivalent problem with fewer states, then the non-reduced (we call it reducible) problem is more complex than the reduced problem. In line with this measure of complexity, we construct a decision problem with a reducible state that can be reduced to the one job problem.
Consider the following decision that corresponds to this complexity. Similar to the one job problem, suppose that there is only one available job position. In the one job problem, there is always only one applicant, either Ann or Bob. In our reducible problem, there are now three states: Ann applies (A), Bob applies (B), and both apply (AB). If the true state is that both apply for the job (AB), then a randomization device decides who will get the job. The DM has to decide the wage w before knowing which is the true state.
Let r be the probability that the randomization device assigns the job to Ann in state AB, and let $p_\omega$ be the probability that state ω is realized. If the DM satisfies the reduction of compound lotteries axiom, then this decision problem is equivalent to a decision problem with
$\Omega=\{\mathbf{A}, \mathbf{B}\}$ with probabilities
$(p_{\mathbf{A}}+p_\mathbf{AB}r,p_{\mathbf{B}}+p_\mathbf{AB}(1-r))$. If
$p_{\mathbf{A}}+p_\mathbf{AB}r=p_{\mathbf{B}}+p_\mathbf{AB}(1-r)=\frac{1}{2}$, then the problem can be reduced to the one job problem. We regard this reducible problem as more complex than the corresponding (reduced) one job problem.
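The reduction step can be sketched as follows. The probabilities and r are our own illustration (the text above only requires that the reduced probabilities equal one half each); we use exact fractions to avoid floating-point noise:

```python
from fractions import Fraction

# Reduce the three-state problem {A, B, AB} to two states {A, B}:
# in state AB, a randomization device assigns the job to Ann with probability r.
def reduce_states(p_A, p_B, p_AB, r):
    return {"A": p_A + p_AB * r, "B": p_B + p_AB * (1 - r)}

# Hypothetical probabilities: all three states equally likely, r = 1/2.
p = reduce_states(Fraction(1, 3), Fraction(1, 3), Fraction(1, 3), Fraction(1, 2))
print(p)  # {'A': Fraction(1, 2), 'B': Fraction(1, 2)} -> the one job problem
```

Any $(p_{\mathbf{A}}, p_{\mathbf{B}}, p_{\mathbf{AB}}, r)$ satisfying the displayed condition reduces to the same one job problem; the reducible treatment simply presents the unreduced three-state version.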
Within the reducible problems, we can also create additional complexity due to the dominance property of choices (without dominant choices, with dominant choices, and with obviously dominant choices) by using the same parametrizations as the one job problems.
2.1.4. Two jobs problem: Increasing the number of contingencies
Suppose now that there are two jobs, job x and job y, with one available position for each. There are two workers, Ann and Bob, who differ in their willingness to accept for each job (θij, where i denotes the worker and j the job) and in the revenue they bring to each job
$(v_{ij})$. We impose the parameter conditions that Bob’s willingness to accept is higher than Ann’s for both jobs (
$\theta_{Bj} \gt \theta_{Aj},\;j=x,y$), Ann brings more revenue than Bob for both jobs (
$v_{Aj} \gt v_{Bj},\;j=x,y$), both workers have a higher willingness to accept for job x (
$\theta_{ix} \gt \theta_{iy}$,
$i=$Ann, Bob), and Ann brings more revenue than her willingness to accept for both jobs (
$v_{Aj} \gt \theta_{Aj},\;j=x,y$).
With equal chances, (1) Ann applies for job x and Bob applies for job y, or (2) Ann applies for job y and Bob applies for job x. That means the states of the world are in
$\Omega=\{(\mathbf{AB}), (\mathbf{BA})\}$, where the first letter indicates which worker applies to job x. Under this uncertainty, the DM must decide one wage for each job, namely, wx and wy. As in the one job case, a worker will accept a wage offer as long as it is greater than or equal to their willingness to accept. The payoff of the DM is the sum of their earnings from the two jobs.
Even if the number of states is the same as the one job case, the number of contingencies is different. After ruling out non-feasible contingencies, there are nine contingencies:
$\{$Both accept job x, Only Ann accepts job x, Both reject job x
$\} \times \{$Both accept job y, Only Ann accepts job y, Both reject job y
$\}$. Assuming an expected utility representation, we have the following form:

$$U(w_x,w_y)=\frac{1}{2}u\big(\mathbb{1}\{w_x \geq \theta_{Ax}\}(v_{Ax}-w_x)+\mathbb{1}\{w_y \geq \theta_{By}\}(v_{By}-w_y)\big)+\frac{1}{2}u\big(\mathbb{1}\{w_x \geq \theta_{Bx}\}(v_{Bx}-w_x)+\mathbb{1}\{w_y \geq \theta_{Ay}\}(v_{Ay}-w_y)\big),$$

where the first term corresponds to state (AB), the second to state (BA), and $\mathbb{1}\{\cdot\}$ is the indicator function.
By a similar logic as the one job case, we can rule out wages
$w_j \lt \theta_{Aj}$ for any job j as they are dominated by
$w_j=\theta_{Aj}$. Then the only possible optimal wages are either θAj or θBj, for
$j \in\{x, y\}$. Thus, the contingency-optimal wage profiles
$(w_x, w_y)$ are given by
$\mathrm{W}^*=\{(\theta_{Ax},\theta_{Ay})$,
$(\theta_{Ax},\theta_{By})$,
$ (\theta_{Bx},\theta_{Ay})$,
$(\theta_{Bx},\theta_{By})\}$. In other words, after eliminating trivially dominated cases, the number of relevant contingencies is four.
Like in the one job problem, the range of parameters can be divided into three cases: (1)
$0 \leq v_{Bj}-\theta_{Bj} \leq v_{Aj}-\theta_{Bj} \leq v_{Aj}-\theta_{Aj}$ for all
$j \in \{x,y\}$, (2)
$v_{Bj}-\theta_{Bj} \leq 0 \leq v_{Aj}-\theta_{Bj}$ for all
$j \in \{x,y\}$, and (3)
$\min\{v_{Aj}-\theta_{Aj}, 0\} \geq \max\{v_{Aj}-\theta_{Bj}, v_{Bj}-\theta_{Bj}\}$ for all
$j \in \{x,y\}$. The first case does not have a dominant choice, thus an optimal choice depends on the DM’s risk preference. Again, how we elicit a risk preference will be explained in Subsection 2.2. The second case is one with a dominant choice, and the third case is one with an obviously dominant choice. In these two cases,
$(\theta_{Ax},\theta_{Ay})$ is the optimal wage profile.
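For a risk-neutral DM, the four contingency-optimal wage profiles can be evaluated directly. A sketch with illustrative parameters of our own (not the experiment’s) satisfying the stated conditions:

```python
from itertools import product

# One worker's contribution to one job: (v - w) if the offer is accepted, else 0.
def pay(w, v, theta):
    return (v - w) if w >= theta else 0.0

# Risk-neutral expected payoff of a wage profile (w_x, w_y) in the two jobs
# problem. State AB (Ann on job x, Bob on job y) and state BA each have prob 1/2.
def expected_payoff(w_x, w_y, v, th):
    state_AB = pay(w_x, v["Ax"], th["Ax"]) + pay(w_y, v["By"], th["By"])
    state_BA = pay(w_x, v["Bx"], th["Bx"]) + pay(w_y, v["Ay"], th["Ay"])
    return 0.5 * state_AB + 0.5 * state_BA

# Hypothetical parameters: theta_Bj > theta_Aj, v_Aj > v_Bj,
# theta_ix > theta_iy, v_Aj > theta_Aj.
v  = {"Ax": 40, "Bx": 25, "Ay": 30, "By": 18}
th = {"Ax": 20, "Bx": 28, "Ay": 10, "By": 15}
profiles = list(product([th["Ax"], th["Bx"]], [th["Ay"], th["By"]]))
best = max(profiles, key=lambda p: expected_payoff(p[0], p[1], v, th))
print(best)  # (20, 10), i.e. (theta_Ax, theta_Ay) for these numbers
```

Even though only two states exist, a subject must track four candidate wage profiles (and nine contingencies before elimination), which is the sense in which the two jobs problem is more complex than the one job problem.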
2.1.5. Power of certainty
Martínez-Marquina et al. (2019) show that if a decision problem is represented in two different ways, one in a probabilistic version and the other in a deterministic version, then failures in contingent reasoning increase in the probabilistic version. They call the difference in failure rates due to the existence of uncertainty the Power of Certainty. Even though the PoC might not be directly related to complexity, we also test whether the PoC exists in our setting with a simple modification.
The PoC environment offers the deterministic version of the wage decision problem we have discussed thus far. It is equivalent to the main model in terms of contingencies and (expected) outcomes but has no randomness in the states of the world. We implement this by having two jobs, one for Ann and one for Bob. The DM chooses a single wage that is used for both positions; Ann and Bob then decide independently whether to accept the offer, each accepting if and only if the wage is greater than or equal to their willingness to accept. Thus, the DM can hire two workers, one worker, or no worker depending on the chosen wage w. The DM is paid the sum of their earnings from both jobs. To keep payoffs consistent with the one job problem, we divide total profits by two. We refer to this as the one job PoC problem since it mimics the original one job problem but is deterministic.
Contingent reasoning requires choosing a wage from the set
$\{\theta_A,\theta_B\}$. With an expected utility representation, we have

$$U(w)=u\Big(\tfrac{1}{2}\big[\mathbb{1}\{w \geq \theta_A\}(v_A-w)+\mathbb{1}\{w \geq \theta_B\}(v_B-w)\big]\Big).$$
This is equivalent to the original one job problem not only in a contingency sense but also in a utility sense if the DM is risk-neutral.
The two jobs PoC problem is modified in a similar way. There are two kinds of jobs, job x and job y, with two available positions for each job. Both Ann and Bob apply for both jobs, and each can work both jobs if they accept both offers. Each worker’s willingness to accept and the revenue they bring are known to the DM. The DM submits one wage for each job, i.e., one wage for job x and one wage for job y. Depending on the wages chosen, the DM can hire two workers, one worker, or no workers for each job. For consistency, we again divide total profits by two. The contingency-optimal wages are
$W^*=\{(\theta_{Ax},\theta_{Ay}),(\theta_{Ax},\theta_{By}), (\theta_{Bx},\theta_{Ay}),(\theta_{Bx},\theta_{By})\}$. Thus, the relevant contingencies and preferences are the same as the original problem if the DM is risk-neutral. We can derive the expected utility from similar arguments. Note that since there is no uncertainty, risk preferences do not affect optimal choices in either PoC treatment.
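A sketch of the one job PoC payoff, with hypothetical numbers of our own: under risk neutrality, each candidate wage yields exactly the expected payoff of the corresponding uncertain one job problem, which is the sense in which the two representations are equivalent.

```python
# Deterministic payoff in the one job PoC problem: one wage w is offered to
# both Ann and Bob; each accepts iff w >= their willingness to accept, and
# total profits are halved for comparability with the one job problem.
def poc_payoff(w, v_A, v_B, theta_A, theta_B):
    total = ((v_A - w) if w >= theta_A else 0.0) \
          + ((v_B - w) if w >= theta_B else 0.0)
    return total / 2

# Hypothetical parameters (not the experiment's).
v_A, v_B, theta_A, theta_B = 30, 20, 10, 18
print(poc_payoff(theta_A, v_A, v_B, theta_A, theta_B))  # 10.0
print(poc_payoff(theta_B, v_A, v_B, theta_A, theta_B))  # 7.0
```

The same comparison must be made between $\theta_A$ and $\theta_B$, but the payoff of each wage is now certain, so no risk preference enters the choice.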
The PoC treatment cannot include reducible problems because certainty is a key feature of that design, so we exclude them.
2.2. Experimental implementation and hypotheses
2.2.1. Main treatment
Table 1 summarizes our design and hypotheses. The main treatment of the experiment employs a within-subjects design consisting of two parts: (1) wage choice problems and (2) lottery choice problems. Each subject experiences both. The first part of the experiment contains the labor market setting where a subject chooses a wage(s) for a job(s) as complexity in decision problems varies. In total, there are nine types of questions: {one job, two jobs, reducible} × {no dominant choice, dominant choice, obviously dominant choice}. Each question type has three questions with different parameters; thus, there is a total of 27 questions. That means each cell of Table 1 has three questions. After the 27 wage decision problems, subjects have 27 lottery choice problems that are equivalent to the wage decision problems. The purpose of the lottery choice problems is to measure subjects’ risk preferences. By doing so, we can identify subjects’ optimal wage choices for the problems where there is no dominant choice in the equivalent wage choice problem.
Table 1. Complexity and hypotheses

From this structure, we say that a subject fails in contingent reasoning if they choose (1) any wage other than a worker’s willingness to accept (for example, in the one job case, any wage other than θA or θB) or (2) a wage in the original problem that does not correspond to their chosen lottery in the lottery choice problem. With this concept of failures in contingent reasoning, we have the following three hypotheses.
Hypothesis 1.
Failures in contingent reasoning increase in the number of contingencies.
The first hypothesis can be tested by comparing the results from the first row (one job problems) and the second row (two jobs problems) of Table 1. That is, if subjects are more successful in the nine questions from the first row than in the nine questions from the second row, we can argue that failures in contingent reasoning increase in the number of contingencies.
Hypothesis 2.
Failures in contingent reasoning are more common in the reducible than in the equivalent reduced problem.
This hypothesis is also related to the reducibility argument in Oprea (2020). The questions in the third row (reducible problems) in Table 1 can be further reduced to the first row (one job (reduced) problems). Thus, if subjects perform better in questions corresponding to the first row compared to the third row, then this hypothesis is supported.
Hypothesis 3.
Failures in contingent reasoning decrease when a dominant choice exists. It further decreases when an obviously dominant choice exists.
This hypothesis follows from Li (2017) and can be tested by comparing the frequencies of optimal choices in the first (obviously dominant choice), second (dominant choice), and third (no dominant choice) columns of Table 1.
The second part of the experiment, lottery choices, is devised to identify optimal choices in problems without a dominant choice. As stated in the previous section, when there is no dominant choice, there is no objectively optimal wage from the set of contingency-optimal choices. Instead, the utility-maximizing choice depends on the DM’s risk preferences.
Rather than assuming subjects are risk averse, we introduce the lottery choice problems that are induced by the contingency-optimal wages from part 1 and check whether choices are consistent between the two. That means we create lotteries using only
$\{\theta_A,\theta_B\}$ or
${\{(\theta_{Ax},\theta_{Ay}),(\theta_{Ax},\theta_{By}),(\theta_{Bx},\theta_{Ay}),(\theta_{Bx},\theta_{By}) \}}$. Let
$\mathscr{L}(p_1,z_1;\dots;p_k,z_k)$ be a simple lottery that gives each zj with probability pj. Similarly, let
$\bar{\mathscr{L}}(p_1,\mathscr{L}_1; \dots;p_k,\mathscr{L}_k)$ be a compound lottery that gives each simple lottery
$\mathscr{L}_j$ with probability pj. Then, each lottery choice problem has one of the following three types:
(1) A choice between
$\mathscr{L}(\frac{1}{2},v_A-\theta_A ;\frac{1}{2},0)$ and
$\mathscr{L}(\frac{1}{2},v_A-\theta_B ;\frac{1}{2},v_B-\theta_B)$, for the one job problems, which corresponds to
$w=\theta_A$ and
$w=\theta_B$ respectively,
(2) A choice between
$\bar{\mathscr{L}}( \frac{1}{3},\mathscr{L}_1(\frac{1}{2},v_A-\theta_A; \frac{1}{2},0); \frac{1}{3},\mathscr{L}_2(1,v_A-\theta_A; 0,0); \frac{1}{3},\mathscr{L}_3(0,v_A-\theta_A;1,0))$ and
$\bar{\mathscr{L}}( \frac{1}{3},\mathscr{L}_1(\frac{1}{2},v_A-\theta_B; \frac{1}{2},v_B-\theta_B); \frac{1}{3},\mathscr{L}_2(1,v_A-\theta_B; 0,v_B-\theta_B); \frac{1}{3},\mathscr{L}_3(0,v_A-\theta_B;1,v_B-\theta_B))$ for the reducible problems, corresponding to
$w=\theta_A$ and
$w=\theta_B$ respectively,
(3) A choice among
$\mathscr{L}(\frac{1}{2},v_{Ax}-\theta_{Ax} ;\frac{1}{2},v_{Ay}-\theta_{Ay})$,
$\mathscr{L}(\frac{1}{2},v_{Ax}-\theta_{Ax}+v_{By}-\theta_{By} ;\frac{1}{2},v_{Ay}-\theta_{By})$,
$\mathscr{L}(\frac{1}{2},v_{Ax}-\theta_{Bx};\frac{1}{2},v_{Ay}-\theta_{Ay}+v_{Bx}-\theta_{Bx})$, and
$\mathscr{L}(\frac{1}{2},v_{Ax}-\theta_{Bx}+v_{By}-\theta_{By};\frac{1}{2},v_{Ay}-\theta_{By}+v_{Bx}-\theta_{Bx})$ for the two jobs problems, corresponding to
$(w_x,w_y)=(\theta_{Ax},\theta_{Ay})$,
$(w_x,w_y)=(\theta_{Ax},\theta_{By})$,
$(w_x,w_y)=(\theta_{Bx},\theta_{Ay})$, and
$(w_x,w_y)=(\theta_{Bx},\theta_{By})$ respectively.
In problems without a dominant choice, success in contingent reasoning is determined by the consistency between the wage decision problem and the lottery choice problem. If a subject chooses a wage other than one of the contingency-optimal wages, then regardless of their lottery choice, the subject is said to fail in contingent reasoning. If the subject chooses one of the contingency-optimal wages but chooses a lottery that is induced by a different wage in that set, then it is still a failure in contingent reasoning since the subject does not have a consistent preference. The subject succeeds in contingent reasoning only if they choose one of the contingency-optimal wages in the wage choice problem and the equivalent lottery in the corresponding lottery choice problem.
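This classification rule can be summarized as a small predicate. The pairing of wages to induced lotteries by index, and the numbers below, are our own illustration:

```python
# Success in contingent reasoning for problems without a dominant choice:
# the chosen wage must be contingency-optimal AND match the chosen lottery.
def succeeds(chosen_wage, chosen_lottery_index, contingency_optimal_wages):
    # contingency_optimal_wages: list where wage i induces lottery i.
    if chosen_wage not in contingency_optimal_wages:
        return False  # failure type (1): wage outside the set W*
    # failure type (2): a W* wage paired with a different wage's lottery
    return contingency_optimal_wages.index(chosen_wage) == chosen_lottery_index

# Example: W* = [theta_A, theta_B] = [10, 18]; lotteries indexed 0 and 1.
print(succeeds(10, 0, [10, 18]))  # True  (theta_A with its induced lottery)
print(succeeds(10, 1, [10, 18]))  # False (inconsistent preference)
print(succeeds(12, 0, [10, 18]))  # False (not contingency-optimal)
```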
Even though problems with a dominant choice and an obviously dominant choice have optimal solutions regardless of the DM’s risk preference, we still ask subjects to choose among the two or four lotteries to see whether they behave consistently.
2.2.2. Power of certainty treatment
For the Power of Certainty treatment, each subject has 18 questions presented from the first two rows (one job and two jobs problems) of Table 1 with a deterministic representation (one job PoC and two jobs PoC problems). For consistency, we present payoffs as
$\frac{1}{2}(v_i-w)$ if worker i accepts wage offer w. Since optimal wages are determined regardless of the DM’s risk preference, subjects in this treatment do not face lottery choice problems. Following Martínez-Marquina et al. (2019), we hypothesize that the PoC treatment decreases failures in contingent reasoning.
Hypothesis 4.
Failures in contingent reasoning decrease in the deterministic problems.
2.2.3. Procedure
At the start of the experiment, subjects were given instructions and comprehension check problems. Next, they faced 27 wage decision problems in random order and then 27 lottery choice problems in random order. In the PoC treatment, subjects instead answered 18 wage decision problems in random order, but were not given lottery choice problems.
In total, 162 Ohio State undergraduate students participated, 113 in the main treatment and 49 in the Power of Certainty treatment. Subjects were recruited using ORSEE (Greiner, 2015). The participation fee was $6 for the main treatment and $3 for the PoC treatment. Subjects also received an additional payment from one randomly selected question, with each point in the selected question converted to $0.30. The experiment was programmed in oTree (Chen et al., 2016).
3. Results
This section presents the results of the experiment. We find no learning effect, so we include all decision problems in the analysis. Appendix A provides a graph testing for learning effects.
3.1. The number of contingencies
The questions corresponding to the first row (one job problems) and second row (two jobs problems) of Table 1 are different only in the number of contingencies. Each row has three questions for each dominance property of choices (without dominance, dominance, and obvious dominance). Thus, comparing the results from the nine questions in the first row and the nine questions in the second row allows for testing the effect of the number of contingencies (Hypothesis 1) on failures in contingent reasoning.
Figure 2 illustrates the results for the number of contingencies. It shows the cumulative distribution functions (CDFs) of the number of questions answered incorrectly, either with wages that correspond to inconsistent preferences or wages outside of the contingency-optimal set. As the CDFs show, subjects submit incorrect answers more often in the two jobs problems, implying that subjects fail more in contingent reasoning when the number of contingencies is higher. A one-sided Wilcoxon signed-rank test confirms that the difference between the two distributions is statistically significant, with more failures in the two jobs problems (p-value < 0.0001). A Kolmogorov-Smirnov test likewise confirms that the two distributions differ (p-value < 0.0001).

Fig. 2 CDFs of the number of incorrect answers
Our regression results further confirm that the number of contingencies affects failures in contingent reasoning. For each class of question, a subject who submits correct choices for all nine questions is classified as a rational type. For one job problems, 26.5% of subjects are classified as rational types (p-value < 0.01). For two jobs problems, by contrast, only 3.5% of subjects are classified as rational types (p-value = 0.27). In addition, we conduct Fisher's exact test, and the result is consistent, with p-values less than 0.01. Detailed tables are in Appendix A.
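A comparison of rational-type shares like the one above can be illustrated with Fisher's exact test. The counts below are hypothetical, chosen only to roughly match the reported percentages (about 26.5% and 3.5% of 113 subjects); they are not the actual data.

```python
# Illustrative sketch with hypothetical counts, not the paper's data.
from scipy import stats

n = 113
rational_one_job = 30   # hypothetical count, roughly 26.5% of 113
rational_two_jobs = 4   # hypothetical count, roughly 3.5% of 113

# 2x2 contingency table: rational vs. non-rational, by problem class.
table = [[rational_one_job, n - rational_one_job],
         [rational_two_jobs, n - rational_two_jobs]]
odds_ratio, p = stats.fisher_exact(table, alternative="two-sided")
print(p)
```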
We therefore confirm that a larger number of contingencies increases failures in contingent reasoning.
Result 1.
Increasing the number of contingencies increases the failures in contingent reasoning.
3.2. Reducibility
By the same logic applied for testing the effect of the number of contingencies, we compare the first row (one job (reduced) problems) and the third row (reducible problems) of Table 1 to test the effect of reducibility.
As the CDFs in Figure 3 depict, the two distributions are very close. Specifically, one-sided and two-sided Wilcoxon signed-rank tests show no statistically significant difference (p-value = 0.7174 for both). The Kolmogorov-Smirnov test also confirms that the shapes of the distributions do not differ (p-value = 0.7189). This suggests that reducibility does not increase failures in contingent reasoning.

Fig. 3 CDFs of the number of incorrect answers
The regressions corroborate this result. In particular, the contrast p-value is 0.767, indicating no statistically significant difference between the two. Fisher's exact test likewise shows that the difference between reduced and reducible problems is insignificant. Detailed result tables for these tests are in Appendix A.
Result 2.
There is no difference in the failures in contingent reasoning between the reducible problem and the equivalent reduced problem.
3.3. Various notions of dominance
The following result shows the effect of complexity due to the dominance property of choices: without a dominant choice, with a dominant choice, and with an obviously dominant choice. How this dimension of complexity affects decisions can be tested by comparing the three columns of Table 1.
Figure 4 shows the distributions of the number of incorrect answers across the dominance properties of choices. As we hypothesized, subjects perform worst when there is no dominant choice. For example, the percentage of subjects who submit incorrect answers for more than five questions (over half of the total) is 23% for obviously dominant choices (Obv) and 28% for dominant choices (Dom), but rises drastically to 48% when no dominant choice is present (NoDom). Moreover, the difference across the three distributions is statistically significant (p-value < 0.001, Kruskal-Wallis test). By the Kolmogorov-Smirnov test, the obviously dominant and dominant distributions differ marginally (p-value = 0.0513), while the no-dominant-choice distribution differs significantly (p-value < 0.001). Thus, the CDFs verify that complexity due to the dominance property of choices affects the rate of failures in contingent reasoning, as we hypothesized.
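The three-way comparison above can be sketched as follows, again with hypothetical simulated error counts rather than the actual data: a Kruskal-Wallis test across the three dominance conditions, followed by pairwise Kolmogorov-Smirnov comparisons.

```python
# Illustrative sketch with hypothetical data, not the paper's dataset or code.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n = 113
obv = rng.binomial(9, 0.20, size=n)    # obviously dominant choice (Obv)
dom = rng.binomial(9, 0.30, size=n)    # dominant choice (Dom)
nodom = rng.binomial(9, 0.50, size=n)  # no dominant choice (NoDom)

kw = stats.kruskal(obv, dom, nodom)     # joint test across the three conditions
ks_obv_dom = stats.ks_2samp(obv, dom)   # pairwise distributional comparisons
ks_dom_nodom = stats.ks_2samp(dom, nodom)
print(kw.pvalue, ks_obv_dom.pvalue, ks_dom_nodom.pvalue)
```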

Fig. 4 CDFs of the number of incorrect answers
Regression results confirm these conclusions. Subjects fail in contingent reasoning most often when a dominant choice does not exist (0.9% rational type, p-value = 0.82). The difference between a dominant choice (34.5% rational type, p-value < 0.01) and an obviously dominant choice (50.4% rational type, p-value < 0.01) is statistically significant, but its magnitude is smaller than the difference between no dominant choice and a dominant choice. The regression table and Fisher's exact test result, which are consistent with all other results, are in Appendix A.
Result 3.
Failures in contingent reasoning increase when there is no dominant choice. The improvement due to a dominant choice compared to an obviously dominant choice exists but is smaller.
3.4. Power of certainty
We test whether the PoC persists in our setting. For consistency, we exclude the reducible problems from the main treatment and compare the remaining 18 questions with the PoC treatment, which presents the same problems in a deterministic way. We use a between-subjects design to test the PoC.
Figure 5a illustrates the difference in distributions between the main treatment and the PoC treatment. The difference is statistically significant (Wilcoxon rank-sum test, p-value < 0.001; Kolmogorov-Smirnov test, p-value < 0.001), with subjects submitting incorrect answers more often in the main treatment. This establishes the existence of the PoC in our setting.

Fig. 5 PoC CDFs. a) Overall b) One job problems c) Two jobs problems
We further investigate the PoC by separating the one job and two jobs problems. Figures 5b and 5c show the distributions for the one job and two jobs problems, respectively. There is no difference between the two treatments when we restrict attention to the one job problems (Wilcoxon rank-sum test, p-value = 0.87; Kolmogorov-Smirnov test, p-value = 0.5602). In the two jobs problems, on the other hand, subjects in the main treatment perform worse than those in the PoC treatment (p-value < 0.001 for both the Wilcoxon rank-sum and Kolmogorov-Smirnov tests). Thus, the PoC is driven by the two jobs problems. Regressions uphold these results as well; the regression tables are in Appendix A.
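Because the PoC comparison is between subjects rather than within subjects, the appropriate rank-based test is the Wilcoxon rank-sum (Mann-Whitney U) test. A minimal sketch with hypothetical data, assuming one error count per subject out of the 18 compared questions:

```python
# Illustrative sketch with hypothetical data, not the paper's dataset or code.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
main = rng.binomial(18, 0.45, size=113)  # hypothetical: main treatment subjects
poc = rng.binomial(18, 0.30, size=49)    # hypothetical: PoC treatment subjects

# One-sided rank-sum test: more incorrect answers in the main treatment.
u = stats.mannwhitneyu(main, poc, alternative="greater")
ks = stats.ks_2samp(main, poc)
print(u.pvalue, ks.pvalue)
```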
Result 4.
The Power of Certainty exists but is only driven by the two jobs cases.
3.5. Contingency-optimal but inconsistent wages
One concern that can arise is the distinction between contingent reasoning and consistency of preference: If a subject chooses one of the contingency-optimal wages that is not consistent with their lottery choice (hereafter, an inconsistent wage), it is unclear whether this is due to a failure of contingent reasoning, inconsistency of preference, or both. Alternatively, such inconsistency could reflect a preference for randomization over lotteries (Agranov & Ortoleva, 2017; Agranov et al., 2023). Even though we cannot distinguish between these explanations, we can still look at the frequency with which subjects choose contingency-optimal wages as a minimal measure of contingent reasoning.
The first column of Table 2 shows the percentage of subjects who always choose a contingency-optimal wage but choose an inconsistent wage at least once. These numbers suggest that a substantial share of subjects can at least rule out wages that fall outside the set of contingency-optimal wages. The second column reiterates the previous sections and indicates the percentage of subjects who always choose a contingency-optimal and consistent wage. The third column is the absolute failure rate: the percentage of subjects who fail to choose a contingency-optimal wage at least once.
Table 2. Inconsistent and consistent wages

From Table 2, we can see that a substantial portion of subjects unquestionably fail in contingent reasoning. As for the effect of complexity, the result is ambiguous: the 65.49% who fail in the dominant-choice problems represent all possible failures of contingent thinking, while without dominant choices, the failures of contingent reasoning are spread between the “inconsistent” and “absolute failure” categories. Whether the problems without dominant choices (the more complex ones) produce a higher rate of failures in contingent reasoning thus depends on what share of “inconsistent” subjects are, in fact, failing in contingent reasoning.
4. Discussion
In this paper, we study how the complexity of a decision problem can affect failures in contingent reasoning. For the three dimensions of complexity we consider, we find that reducible states do not affect the failures in contingent reasoning, while the other two have an effect.
Specifically, we find that an increase in the number of contingencies increases failures in contingent reasoning. We also find that dominance affects failures in contingent reasoning: A dominant choice reduces failures compared to when no dominant choice exists, and an obviously dominant choice reduces failures further. However, the difference between obviously dominant and dominant choices is not as significant as the existing literature in strategic settings suggests. For example, Kagel et al. (1987) show that in an auction setting, subjects barely deviate from equilibrium strategies when there is an obviously dominant strategy, but deviate persistently when the auction has a dominant but not obviously dominant strategy. Li (2017) also shows that subjects' performance improves substantially with an obviously dominant strategy compared to a dominant strategy. The difference in degree between this paper and the existing literature suggests that even if an (obviously) dominant choice is a parallel concept to an (obviously) dominant strategy, the impact on rational decision-making can differ depending on the context.
Lastly, we show that the PoC exists but is only significant in the two jobs problems. The magnitude of the PoC in our paper is smaller than that found by Martínez-Marquina et al. (2019). The difference could be due to several reasons. One possibility is a population difference: OSU undergraduates and MTurk subjects might have different sophistication levels. It is possible that as the level of sophistication of the population increases, the PoC becomes more relevant for problems with a larger number of contingencies, suggesting an interesting interaction between sophistication and the number of contingencies. The difference in results could also be driven by other design factors, such as how risk preferences are measured or differences in the structure of the decision problems.
In future studies, the experimental design can be extended to capture different degrees of complexity. One possible direction is to test s-complexity aversion, suggested by Oprea (2020), in a contingent reasoning environment by asking whether subjects are willing to bear a cost to avoid a larger number of contingencies. This would test whether failures in contingent reasoning are due to cognitive limitations or to preferences when the number of contingencies varies. We can also dive deeper into reducibility by changing the distribution in the reducible problems. Our current experiment uses only uniform distributions; we could instead use a distribution that requires more calculation but is still equivalent to a reduced problem. If it requires significantly more calculation and the results change, that would isolate a different level of complexity within the same family of problems. These directions would help further our understanding of how complexity and failures in contingent reasoning are related.
Supplementary material
The supplementary material for this article can be found at https://doi.org/10.1017/eec.2025.7.
Replication packages
To obtain replication material for this article, visit https://doi.org/10.17605/OSF.IO/QEMG4.
Acknowledgements
The author thanks Paul J. Healy, John Rehbeck, Yaron Azrieli, Leeat Yariv, participants at the ESA Global Meetings (2022), FUR (2022), MEA 86th Annual Meetings, members of the OSU Theory/Experimental Reading Group, two anonymous referees, and the editor, for valuable comments.
Funding statement
Funding for this project was provided by the JMCB Small Research Grant at OSU. This project was determined exempt from IRB review under protocol 2021E0330.