Predictably intransitive preferences

David J. Butler; Ganna Pogrebna

doi:10.1017/S193029750000766X


Published online by Cambridge University Press: 01 January 2023


David J. Butler: Department of Accounting, Finance and Economics, Griffith Business School, Griffith University, Gold Coast, Queensland
Ganna Pogrebna*: Alan Turing Institute, 96 Euston Road, Kings Cross, London, NW1 2DB
* Corresponding author: Department of Economics, Birmingham Business School, University of Birmingham, JG Smith Building, Birmingham, B15 2TT, Email: [email protected].


Abstract

The transitivity axiom is common to nearly all descriptive and normative utility theories of choice under risk. Contrary to both intuition and common assumption, the little-known ’Steinhaus-Trybula paradox’ shows the relation ’stochastically greater than’ will not always be transitive, in contradiction of Weak Stochastic Transitivity. We bespoke-design pairs of lotteries inspired by the paradox, over which individual preferences might cycle. We run an experiment to look for evidence of cycles, and violations of expansion/contraction consistency between choice sets. Even after considering possible stochastic but transitive explanations, we show that cycles can be the modal preference pattern over these simple lotteries, and we find systematic violations of expansion/contraction consistency.

Keywords

intransitivity cycles lotteries experiment expansion consistency

Type: Research Article
Information: Judgment and Decision Making, Volume 13, Issue 3, May 2018, pp. 217–236

DOI: https://doi.org/10.1017/S193029750000766X
Creative Commons: The authors license this article under the terms of the Creative Commons Attribution 3.0 License.
Copyright: Copyright © The Authors [2018] This is an Open Access article, distributed under the terms of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/), which permits unrestricted re-use, distribution, and reproduction in any medium, provided the original work is properly cited.

1 Introduction

Researchers have questioned the adequacy of Expected Utility Theory (EUT) as an account of choice under risk since Allais (1953) presented his famous ‘paradox’ examples. Economists question one axiom of EUT less than most: transitivity. Bar-Hillel & Margalit (1988) quote Luce & Raiffa’s (1957) definition of transitivity as "if A is preferred in the paired comparison (A, B) and B is preferred in the paired comparison (B, C), then A is preferred in the paired comparison (A, C)" (Luce & Raiffa, 1957, p. 16). Notice that the binary comparison (A, C) is therefore superfluous: a rational chooser can rely on transitivity to deliver the best option. Choice cycles cannot occur unless this chooser is exactly indifferent between A, B and C, or makes a mistake. In short, economists regard transitivity as a defining characteristic of rational choice.

In light of this consensus, Butler & Blavatskyy (2018) propose the following scenario. A fund manager offers a reward to the broker who selects one portfolio that outperforms the others over the following year. The decision maker’s (DM) preference then is to maximise the probability of earning the greater sum. Suppose there are three (statistically independent) portfolios: portfolio A yields $4m with probability 2/3 and $1m with probability 1/3; portfolio B yields $3m for sure; and portfolio C yields $5m with probability 1/3 and $2m with probability 2/3.

Suppose the broker begins by comparing {A,B}; she will choose portfolio A because A yields a higher outcome than B with probability 2/3. Next she compares {B,C}; she chooses portfolio B because B will yield a higher outcome than C, also with probability 2/3. Then, as a rational decision maker relying on transitivity for choice set {A,C}, she selects portfolio A over C. However, her faith in transitivity is disadvantageous. Had she not relied upon transitivity, her revealed preference in {A,C} would have been for C, because C yields a higher outcome than A with probability 5/9, or about 55.6%. The advantageous preference ordering across the set of pairwise choices is the cycle A ≻ B ≻ C ≻ A, contradicting the transitivity axiom.¹ While some may say this makes probable winner preferences, whether induced or elicited, unreasonable (e.g., Pratt, 1972), this paper takes a different view (e.g., Blyth, 1972; Bar-Hillel & Margalit, 1988). First, businesses do employ these kinds of incentives, which for pairwise decisions can lead to the preference (and choice) cycle in our example. Second, no utility theories with a transitivity axiom currently come with a warning that they cannot account for probable winner or related preferences, in which case it is not clear why we should deem their preferences unworthy of maximization.
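The three pairwise winning probabilities in this example can be checked by direct enumeration. The sketch below is illustrative only: the helper name `prob_beats` and the list-of-pairs representation are our own choices, and the calculation assumes the portfolios are statistically independent, as stated in the text.

```python
from fractions import Fraction
from itertools import product

# Each portfolio is a list of (payoff in $m, probability) pairs from the example.
A = [(4, Fraction(2, 3)), (1, Fraction(1, 3))]
B = [(3, Fraction(1))]
C = [(5, Fraction(1, 3)), (2, Fraction(2, 3))]

def prob_beats(first, second):
    """Probability that `first` pays strictly more than `second` (independence assumed)."""
    return sum(p * q for (x, p), (y, q) in product(first, second) if x > y)

p_ab = prob_beats(A, B)  # A beats B
p_bc = prob_beats(B, C)  # B beats C
p_ca = prob_beats(C, A)  # C beats A

print(p_ab, p_bc, p_ca)  # 2/3 2/3 5/9
```

All three probabilities exceed one half, which is what allows a preference for the probable winner to cycle.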

This cycle in our example is an illustration of a paradox first described by Steinhaus and Trybula (Steinhaus & Trybula, 1959). As a mathematical puzzle their paradox, which we denote STP for short, has inspired a small literature in applied statistics (e.g., from Usiskin, 1964, to Conrey et al., 2016). We may state it as follows: let choice objects A, B, C be independent random variables and let Pr(A ≻ B) denote the probability of choosing A over B. It is possible for Pr(A ≻ B), Pr(B ≻ C) and Pr(C ≻ A) to all exceed 50%, given a preference for the winner, contrary to Weak Stochastic Transitivity (WST).

Steinhaus & Trybula proved that, for three choice objects, each with three equiprobable attributes, the theoretical maximum ‘minimum’ (max-min) winning probability is (√5−1)/2 or 61.8%. It is because this value exceeds 50% that preference cycles may arise. In our earlier example, the smallest of the three ‘winning’ probabilities is 5/9, or about 55.6%. While we focus in the rest of this paper on preferences over simple lotteries, we should not forget the relevance of these objects to real economic decisions for choice under risk. Steinhaus & Trybula gave an application to testing the relative strength of randomly selected steel bars A, B, C, for which successive comparisons could exhibit a cycle: A stochastically stronger than B, B than C, but with C stochastically stronger than A. Other examples are not difficult to imagine.

Given that the STP relies on a preference for the most probable winner, as incentivised by the fund manager in our earlier example, does it have much relevance for decision theory and individual preferences more generally? This paper suggests that the answer is yes, even though the STP has passed mostly unnoticed in the decision theory literature (exceptions include Butler & Hey, 1987; Anand, 1993; Blavatskyy, 2006; Rubinstein & Segal, 2012). Even if very few individual preferences over lottery pairs are simply for the probable winner, the STP still serves as an important demonstration that imposing transitivity on an unrestricted domain of preference profiles will sometimes result in an inferior choice. We conjecture the STP can also serve as a heuristic in constructing new lotteries, over which a broader range of preferences may cycle, a claim we return to in the section on Inferences for Lottery Design and Experiment.

Kahneman (2012) reminds us that "The errors of a theory are rarely found in what it asserts explicitly; they hide in what it ignores or tacitly assumes". Transitivity must hold either if a value attaches to each option without reference to other alternatives (choice-set independence), or if an equivalent value results after comparing and contrasting the attributes of the available choice options. The latter process points us towards the flaw in how transitivity is applied to multi-attribute choice. This process will produce an equivalent value only if utility is sufficiently ‘linear in the differences’ between the options’ attribute values; see Tversky (1969), Fishburn (1982) or Loomes & Sugden (1982) for details. The STP relies on an extreme example of a non-linear additive difference choice rule, for which a larger difference in an attribute’s magnitude carries no extra weight.

However, individuals often form valuations of options in a comparative, context-dependent manner rather than attaching a context-independent value to each option (Russo & Dosher, 1983; Arieli et al., 2009; Noguchi & Stewart, 2014). Eye-tracking experiments provide clear empirical evidence against choice-set independence, at least when expected utilities are sufficiently ‘close’ to prompt a DM to compare the attributes of the alternatives.² Non-linearity sufficient to produce a preference cycle (dependent on the relative size of the attributes) may then occur.

The rest of the paper is organised as follows. In the next section, we discuss inferences for lottery design and describe our experiment. After that, we present the experimental results and conclude with a general discussion.

2 Inferences for Lottery Design and Experiment

2.1 General Implications

We now turn to using the STP objects to design our own lotteries without either inducing or assuming probable winner preferences. Let us consider decision making under risk when choice alternatives are lotteries, i.e., probability distributions over a nonempty finite set of outcomes. A decision maker faces a set of choice alternatives that contains at least two distinct elements. Next, the DM chooses the choice alternative that yields a strictly greater, context-independent expected utility. Often, no such choice alternative is present, so she compares the attributes of the available options to recognize where her preference lies. This step is required to avoid arbitrary choice; a growing body of evidence shows preferences are often known imperfectly (inter alia, Butler & Loomes, 2007).

Descriptively, the consensus is that true intransitive preference cycles are vanishingly infrequent. Evidence once taken to indicate systematic intransitivity (Tversky, 1969; Loomes, Starmer & Sugden, 1991) has since been either reinterpreted as not reflecting fundamental intransitivity (inter alia, Starmer & Sugden, 1993) or found by newer statistical methods to be compatible with noisy but transitive responses (Baillon et al., 2015; Birnbaum & Diecidue, 2015). In his highly influential 1969 article ‘Intransitivity of Preference’, Amos Tversky lamented "...in the absence of a model that guides the construction of the alternatives, one is unlikely to detect consistent violations of weak stochastic transitivity (WST)".

One reason why experiments to date have only rarely found evidence of intransitive behaviour is the lack of guidance from theory on selecting suitable lottery parameters. This lack of guidance is probably a result of the assumption that any utility function must apply to all lottery pairs. However, this presumes there is no ‘black hole’ in parameter space from which question sets can trigger preferences of a different kind. Drawing on the STP as a heuristic, we may address the problem Tversky faced and bespoke-design candidate lotteries.

Let each consequence represent a sum of money, in £, a very familiar, directly comparable outcome for which magnitudes are easily interpretable by our subjects. For simplicity and comparability, the probability of each outcome in our design is always 1/3, 2/3 or 1. We imposed some filters to guide our selection of triples. Since expected value differences between each choice object, within a given triple, reach a maximum of £4 ²/3 ± £1 ²/3 ± 35.7%, our first filter is to focus on sets with larger EV differences (see Figure 1 for the distribution). We avoid constructing triples informed by sets with equal or near-equal EVs, so as not to tip the balance towards a cycle simply through noise.

Figure 1: Experimental Flow.

Next, the second filter reduces cognitive load by requiring each of the three STP objects to reuse integers such that there are no more than two different amounts as consequences (e.g., 5, 2 and 2); we therefore exclude any triple with three different money consequences on any lottery. It is also important to keep the presentation of the number of attributes in each object equal rather than coalescing identical outcomes. This is because past experiments have found that the contrast between coalesced and non-coalesced outcomes (also known as event-splitting; Starmer & Sugden, 1993) can be mistaken for evidence of intransitivity (Birnbaum & Schmidt, 2008; Baillon et al., 2015). To control for this we display each consequence even when all three lead to the same sum. We then made a number of modifications that increased the prizes on offer and allowed for risk aversion in our experiment (risk aversion plays no role for probable winner preferences). We make no claim that this step in the parameter selection process involves more than a mix of informed guesswork and personal judgment.

Finally, as a third filter, to mimic the preference reversal (PR) phenomenon problems (Lichtenstein & Slovic, 1971) we decided to focus on the STP triples which have expected values strictly in the following order: $ ≻ P ≻ CE. Here, $ is a dollar bet (a lottery which yields a large outcome with low probability), P is a probability bet (a lottery which gives a small outcome with large probability) and CE is certainty. If we have three lotteries X, Y and Z, we will assume that Z stands for the dollar bet, X for the probability bet, and Y for certainty (a degenerate lottery).

An important implication of our design choices is that the direction of cycles may be subject to two opposing forces, in aggregate, because the direction of cycles for standard PR lotteries is opposite to the ‘probable winner’ cycle. However, this should not stop systematic cycles appearing at the individual level, if some people exhibit one tendency more strongly. With this in mind, we can now put forward our first testable hypothesis. Birnbaum & Schmidt (2008) succinctly state the currently dominant view regarding the evidence for intransitive preferences: "...we think the burden of proof should shift to those who argue that intransitive models are descriptive of more than five percent of the population".

Hypothesis 1: Drawing on the STP ingredients, we can design sets of lotteries for which cycles will occur with significantly greater frequency than 5%.

The lotteries we designed appear in the left-hand columns (columns 1–6) of Table 1.

Table 1: Binary choices and intransitivity.

In the spirit of Allais’ famous example, consider one such ‘bespoke’ lottery set, and assume your preferred lottery is incentivized. The three choice objects are statistically independent lotteries: each outcome is a monetary amount with a one-third probability attached. For each of the binary choice sets {X,Y}; {Y,Z}; {Z,X}, viewed separately, we ask the reader to consider her preference, ideally looking only at each decision in isolation. In combination, there are eight possible binary preference patterns, of which just two are intransitive. Consider X versus Y, where X provides £15 with probability 1/3, £15 with probability 1/3, or £3 with probability 1/3; and Y yields £10 for certain, i.e., £10 with probability 1/3, £10 with probability 1/3, or £10 with probability 1/3 (see Table 1). Suppose Y ≻ X. Now compare Y, which gives £10 for certain, and Z, which provides £27 with probability 1/3, £5 with probability 1/3, or £5 with probability 1/3. Perhaps here Z ≻ Y. Finally, compare Z, which yields £27 with probability 1/3, £5 with probability 1/3, or £5 with probability 1/3, and X, which provides £15 with probability 1/3, £15 with probability 1/3, or £3 with probability 1/3.³

In this case, maybe you found X ≻ Z. If you prefer Y ≻ X, Z ≻ Y and X ≻ Z, you have exhibited the preference cycle X ≻ Z ≻ Y ≻ X, the modal preference pattern for our subjects. The opposite cycle here is X ≻ Y ≻ Z ≻ X; we found these two intransitive patterns together exceeded, by a small majority, the six transitive patterns combined. Referring back to our opening example, suppose the consequences on each of X, Y and Z refer to investment returns on three portfolios and the probabilities are the historical frequencies. A consumer’s binary preferences over those risk-return combinations might also cycle, with implications for the structure of portfolios in finance.
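For readers who wish to verify the properties of this example triple, the following sketch computes the expected values and the pairwise ‘probable winner’ probabilities for X, Y and Z. It is our own illustration rather than part of the experimental software; the function names are hypothetical and independence between lotteries is assumed, as in the design.

```python
from fractions import Fraction
from itertools import product

THIRD = Fraction(1, 3)  # every displayed outcome has probability 1/3

# Outcomes in GBP for the example triple described in the text.
lotteries = {
    "X": [15, 15, 3],    # probability bet
    "Y": [10, 10, 10],   # certainty, shown as three identical outcomes
    "Z": [27, 5, 5],     # dollar bet
}

def expected_value(outcomes):
    return sum(Fraction(o) * THIRD for o in outcomes)

def prob_beats(first, second):
    """Probability that `first` pays strictly more than `second` (independence assumed)."""
    return sum(THIRD * THIRD for a, b in product(first, second) if a > b)

ev = {name: expected_value(o) for name, o in lotteries.items()}
win = {
    pair: prob_beats(lotteries[pair[0]], lotteries[pair[1]])
    for pair in [("X", "Y"), ("Y", "Z"), ("Z", "X")]
}

print(ev)   # X: 11, Y: 10, Z: 37/3
print(win)  # (X,Y): 2/3, (Y,Z): 2/3, (Z,X): 5/9
```

The expected values are ordered Z, X, Y, matching the $ ≻ P ≻ CE filter, while a pure probable winner rule would give the cycle X ≻ Y ≻ Z ≻ X, the opposite of the modal cycle reported above.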

2.2 Design Implications from Models of Probabilistic Choice

Although choice is often stochastic, and an intransitive cycle may arise from transitive latent preference due to noise, distinguishing structurally intransitive latent preferences from stochastic transitivity in experiments is not straightforward. How frequently can intransitive cycles arise for individuals with transitive preferences, but who choose probabilistically? For example, individuals may have transitive core preferences but choice probabilities are determined by embedding these preferences into a model of random errors (e.g., Blavatskyy, 2014). Such a modelling approach can generate a statistically significant asymmetry between the two possible cycle directions, but it cannot generate a proportion of intransitive cycles above 25% of all observed choice patterns, for any triple.
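The 25% ceiling can be illustrated with the simplest possible error story. The sketch below assumes a constant-error (‘tremble’) specification in which a DM with true order X ≻ Y ≻ Z flips each binary choice independently with the same probability e. This specification is our simplifying assumption for illustration and is not necessarily the model in Blavatskyy (2014).

```python
from fractions import Fraction

def cycle_probability(e):
    """Chance of observing either cycle when the true order is X > Y > Z and each
    binary choice is flipped independently with probability e (assumed)."""
    # X over Y and Y over Z reported correctly, X-versus-Z flipped: cycle X>Y>Z>X
    forward = (1 - e) * (1 - e) * e
    # X-versus-Y and Y-versus-Z flipped, X over Z reported correctly: cycle X>Z>Y>X
    reverse = e * e * (1 - e)
    return forward + reverse  # algebraically equal to e * (1 - e)

# Scan error rates from 0 to 1/2 in steps of 1/100.
grid = [Fraction(i, 100) for i in range(51)]
worst_case = max(cycle_probability(e) for e in grid)
print(worst_case)  # 1/4, reached at e = 1/2
```

Under this assumption the two cycle directions are unequal whenever e is below one half, yet their combined share never exceeds one quarter.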

A more promising model of probabilistic choice for rationalizing intransitive cycles is the random preference approach (e.g., Loomes & Sugden, 1995). As the extreme example, let us consider an individual who has three transitive preference orderings X ≻ Y ≻ Z, Z ≻ X ≻ Y and Y ≻ Z ≻ X with each ordering equally likely to be drawn when a choice is to be made. It is straightforward to see that in a direct binary choice between X and Y, this individual chooses X with probability 2/3. Likewise, in a direct binary choice between Y and Z, this individual chooses Y with probability 2/3. Finally, in a direct binary choice between X and Z, this individual chooses Z with probability 2/3, thereby violating weak stochastic transitivity. Thus, a model of random transitive preferences generates a maximum of (2³)/(3³) = 8/27 (29.6%) intransitive choice cycles. This limit involves a strong asymmetry between the two possible intransitive patterns; the maximum frequency of a particular cycle given random sampling is 1/4; see Rubinstein & Segal (2012) for proofs of these propositions.
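The 8/27 figure can be reproduced by enumeration. The sketch below assumes, as in the extreme example, that one of the three orderings is drawn afresh and independently for each binary choice; the function names are our own.

```python
from fractions import Fraction

# The three equally likely transitive orderings, written best to worst.
orderings = ["XYZ", "ZXY", "YZX"]

def prefers(order, a, b):
    """True if `a` is ranked above `b` in the given ordering."""
    return order.index(a) < order.index(b)

def choice_prob(a, b):
    """Probability of choosing `a` over `b` when one ordering is drawn at random."""
    return Fraction(sum(prefers(o, a, b) for o in orderings), len(orderings))

p_xy = choice_prob("X", "Y")
p_yz = choice_prob("Y", "Z")
p_zx = choice_prob("Z", "X")

# Independent draws for the three binary choices (an assumption of this sketch).
forward_cycle = p_xy * p_yz * p_zx                    # X over Y, Y over Z, Z over X
reverse_cycle = (1 - p_xy) * (1 - p_yz) * (1 - p_zx)  # the opposite cycle

print(p_xy, p_yz, p_zx)              # 2/3 2/3 2/3
print(forward_cycle, reverse_cycle)  # 8/27 1/27
```

The large gap between the two cycle frequencies in this sketch illustrates the asymmetry noted in the text.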

However, a model of random transitive preferences has another testable implication so far overlooked by a literature focused on binary choice sets. When comparing binary choice data with the choice data from a ternary set, we derive a new set of constraints that any stochastic but exclusively transitive preferences must meet. In such models of stochastic choice, the probability of choosing X from the ternary set {X,Y,Z} is given by the probability that a decision maker draws a preference order in which X is preferred to Y and X is preferred to Z. In contrast, for a direct binary choice between X and Y, this decision maker chooses X with a probability that is equal to the probability that he or she draws a preference order in which X is preferred to Y (but X may or may not be preferred to Z). Similarly, for a direct binary choice between X and Z, this decision maker chooses X with a probability that is equal to the probability that he or she draws a preference order in which X is preferred to Z (but X may or may not be preferred to Y). Hence, any model of random transitive preferences implies the following testable hypotheses. If any one of the three hypotheses fails to hold, no model of stochastic transitive preferences can be consistent with the data.

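Hypothesis 2: The probability of choosing X from the ternary set {X,Y,Z} cannot exceed

$$\min\{P(X,Y),\ P(X,Z)\} \tag{1}$$

where P(X,Y) denotes the probability of choosing X in a direct binary choice between X and Y.

Hypothesis 3: The probability of choosing Y from the ternary set {X,Y,Z} cannot exceed

$$\min\{P(Y,X),\ P(Y,Z)\} \tag{2}$$

Hypothesis 4: The probability of choosing Z from the ternary set {X,Y,Z} cannot exceed

$$\min\{P(Z,X),\ P(Z,Y)\} \tag{3}$$

Since, by definition, the probabilities of choosing X, Y and Z from the ternary set {X,Y,Z} must sum up to one, we have the following implication of any model of stochastic but transitive preferences:

$$1 \le \min\{P(X,Y),\ P(X,Z)\} + \min\{P(Y,X),\ P(Y,Z)\} + \min\{P(Z,X),\ P(Z,Y)\} \tag{4}$$

A decision maker who violates weak stochastic transitivity, such that P(X,Y)>0.5, P(Y,Z)>0.5 and P(Z,X)>0.5, must still satisfy the inequality

$$1 \le P(X,Z) + P(Y,X) + P(Z,Y) \tag{5}$$

which, because P(X,Z) = 1 − P(Z,X) and likewise for the other pairs, can be simplified as a triangle inequality

$$P(X,Y) + P(Y,Z) + P(Z,X) \le 2 \tag{6}$$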

The triangle inequalities (5) and (6), it is usually argued, produce a stronger test than WST to separate genuine intransitivity from stochastic transitivity. However, Birnbaum (2011) showed that the triangle inequalities could be satisfied even by underlying intransitive preferences. Furthermore, recent work by Müller-Trede et al. (2015) demonstrates how these inequalities may be violated even when underlying preferences are 100% transitive. Their experiment also shows clear violations of these inequalities.

In other words, the triangle inequalities for stochastic transitive preferences can be satisfied when preferences are intransitive and violated when preferences are transitive, raising a concern that they are not as useful for identifying true intransitive preference cycles as generally believed, though see Cavagnaro & Davis-Stober (2014) for an alternative view. For these reasons, among others, our experiment was not designed specifically to test the triangle inequalities, which ideally would require multiple repetitions of the same lottery pairs for every person. However, we can test H2–H4 below, for each triple violating WST. We also follow Birnbaum & Diecidue (2015) and repeat each set of choices once after a distractor task, which facilitates additional methods of separating noise from true preferences.
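A test of H2–H4 amounts to comparing each ternary choice share with the smaller of the two relevant binary choice probabilities, following the verbal argument above. The sketch below is a minimal illustration with made-up numbers, not our data; it assumes forced binary choice, so that the probability of choosing Y over X is one minus the probability of choosing X over Y, and the function names are hypothetical.

```python
from fractions import Fraction

def ternary_bounds(p_xy, p_yz, p_zx):
    """Upper bounds on ternary choice shares under random transitive preferences.
    p_xy is the binary probability of choosing X over Y, and so on."""
    return {
        "X": min(p_xy, 1 - p_zx),   # X must beat both Y and Z
        "Y": min(1 - p_xy, p_yz),   # Y must beat both X and Z
        "Z": min(p_zx, 1 - p_yz),   # Z must beat both X and Y
    }

def violations(ternary_shares, p_xy, p_yz, p_zx):
    """Names of the options whose ternary share exceeds its bound."""
    bounds = ternary_bounds(p_xy, p_yz, p_zx)
    return [name for name, share in ternary_shares.items() if share > bounds[name]]

# Hypothetical binary probabilities that violate WST, and hypothetical ternary shares.
p = Fraction(7, 10)
shares = {"X": Fraction(1, 2), "Y": Fraction(3, 10), "Z": Fraction(1, 5)}

bounds = ternary_bounds(p, p, p)
violated = violations(shares, p, p, p)
print(sum(bounds.values()), violated)  # 9/10 ['X']
```

In this hypothetical case the three bounds sum to less than one, so any set of ternary shares must violate at least one of the hypotheses.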

Consider a decision maker who: a) makes a direct binary choice between choice alternatives X and Y; and b) ranks choice alternatives X and Y as part of a ternary choice set in terms of their desirability. Ignoring the ranking of Z, this decision maker can reveal four different preference patterns:

• X ≻ Y and X is ranked more desirable than Y (revealed preferences i);
• Y ≻ X and Y is ranked more desirable than X (revealed preferences ii);
• X ≻ Y and Y is ranked more desirable than X (revealed preferences iii);
• Y ≻ X and X is ranked more desirable than Y (revealed preferences iv).

Revealed preferences i and ii are both consistent with the independence of irrelevant alternatives axiom. Revealed preferences iii and iv are inconsistent with this axiom. Thus, if preferences of a decision maker satisfy contraction and expansion consistency, we should observe patterns i and ii and not patterns iii and iv. Our experiment investigates whether this is the case.
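The four patterns can be coded mechanically from one binary choice and one ternary ranking. The sketch below is illustrative; the input encoding (a chosen label and a best-to-worst list) is our assumption and not the format of the experimental data.

```python
def classify(binary_choice, ternary_ranking):
    """Return 'i', 'ii', 'iii' or 'iv' for the pair (X, Y).

    binary_choice: 'X' or 'Y', the option chosen in the direct binary choice.
    ternary_ranking: list of 'X', 'Y', 'Z' ordered from most to least desirable.
    """
    x_ranked_higher = ternary_ranking.index("X") < ternary_ranking.index("Y")
    if binary_choice == "X":
        return "i" if x_ranked_higher else "iii"
    return "iv" if x_ranked_higher else "ii"

def is_consistent(pattern):
    """Patterns i and ii agree across choice sets; iii and iv do not."""
    return pattern in {"i", "ii"}

print(classify("X", ["Z", "X", "Y"]))  # i
print(classify("Y", ["X", "Z", "Y"]))  # iv
```

The position of Z in the ranking is ignored, as in the definition of the four patterns.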

2.3 Experimental Design

In total, 100 subjects (all undergraduate students at the University of Warwick) were invited to take part in the experiment. The experiment was programmed in Qualtrics and consisted of 100 questions divided into 5 parts (see Figure 1).

Earlier tests for preference cycles primarily used state-contingent consequences in matrix-style displays. Those displays facilitate between-act comparisons and enhance the possibility of, for instance, anticipated regret when consequences are state-contingent and thus the potential for cycles. Our design maintains statistical independence between the choice objects such that any observed preference cycles are more likely to be rooted in description-invariant, intransitive latent preferences. Finally, we include a ‘standard PR’ control set (Triple 9) to compare to our ‘new PR’ gambles that are the focus of our experiment.

In Part 1, we broke up the 11 triples into binary choices between individual lotteries and asked subjects to answer 33 questions (3 binary choice questions per triple). Table 1 provides a detailed list of all triples. We present each binary choice in the format shown in Figure 2 with two options, Left and Right. Each option shows a lottery with 3 equiprobable outcomes.

Figure 2: Binary choice display.

All binary choice questions gave subjects four different options on a slider. The initial starting point for each slider was "No preference". However, subjects were not able to proceed by leaving the slider in the original position (i.e., the choice of "No preference" was not allowed). Subjects were able to move the slider to the right and opt for "Slightly prefer Right" or "Strongly prefer Right" or, alternatively, to move the slider to the left and choose "Slightly prefer Left" or "Strongly prefer Left". Irrespective of whether a subject indicated slight or strong preference, we used only revealed preferences for "Left" or "Right" in the payoff calculations and we do not report the strength of preference results here. We randomized all 33 questions for each individual separately. In Part 5, we repeated all 33 binary questions, presenting them in a different random order to each subject.

In Part 2, we asked subjects to make two choices in each ternary set, for the most and next most preferred object for each of the 11 triples (see Figure 3).

Figure 3: Ternary Choice Display.

We sought to maximize the similarity to the binary choice task and so we incentivized the choice of best and next best in each ternary set to obtain the full ordering in the triple. We explained that we would draw two of the three lotteries at random and they would play out whichever of the two they had positioned higher, if the ternary set is selected for payment. We use the standard ’random lottery incentive system’ for the binary choices. Our aim was to keep to a minimum any extraneous ’choice versus ranking’ disparities, to allow as clean a test as possible of the expansion consistency property and the transitive random preference restrictions, H2–4.
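The payment rule for a ternary set can be sketched as follows. This is an illustration of the mechanism as described, not the procedure actually run in the laboratory; the function name, the data layout and the use of a seeded generator are our assumptions.

```python
import random

def play_ternary(ranking, lotteries, rng):
    """Simulate payment for one ternary set.

    ranking: lottery names ordered from most to least preferred.
    lotteries: name -> list of three equiprobable outcomes (GBP).
    rng: a random.Random instance.
    """
    drawn = rng.sample(ranking, 2)           # two of the three lotteries, at random
    played = min(drawn, key=ranking.index)   # the one the subject positioned higher
    payoff = rng.choice(lotteries[played])   # each outcome has probability 1/3
    return played, payoff

lotteries = {"X": [15, 15, 3], "Y": [10, 10, 10], "Z": [27, 5, 5]}
ranking = ["X", "Z", "Y"]  # a hypothetical subject's best-to-worst ordering

rng = random.Random(2018)
results = [play_ternary(ranking, lotteries, rng) for _ in range(1000)]
played_names = {name for name, _ in results}
print(sorted(played_names))
```

Under this rule the lowest-ranked lottery is never played, and the top-ranked lottery is played whenever it is drawn, so both the first and the second placement carry real consequences.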

We randomized the order in which the ternary sets appeared, as well as the order in which the lotteries appeared on each screen. To avoid lazy acceptance of the default ordering, it was not possible to accept the default ranking. If the default was by chance preferred, they first had to move away from the default then move back to it by deliberate choice.

In Part 3, subjects were offered a Distractor Task in order to create a break between the two repetitions of binary and ternary choice tasks. The distractor task consisted of 12 risky choices using a different display, the results of which are not reported here. In Part 4, all 11 ternary choice set problems were repeated in a different random order.

We asked subjects to complete all tasks in the experiment online in a 5-day window. All 100 subjects received an invitation to the laboratory to play out their decisions for real money. Each subject drew a question number (between 1 and 100) at random and received payment based on his/her choice in that question. In each question, we looked at the lottery option chosen by the subject and played out that lottery according to the description on the experimental display (e.g., Figure 2 and Figure 3). Immediately after the draw, subjects received their payoff in cash. All 100 subjects turned up to play out the lottery and receive their winnings. Finally, all subjects completed a detailed online survey covering questions such as domain-specific risk attitudes and a variety of demographic variables not reported on here.

3 Results

3.1 Descriptive Statistics

Table 2 reports the frequency of intransitive cycles for each of the eleven triples obtained from binary choices. Other than the PR control (Triple 9), we find the proportion of preference cycles averaged across both repetitions ranges from a low of 18% (Triples 2 and 6) to a high of 59% (Triple 4).⁴

Table 2: Frequency of intransitive cycles obtained from binary choices, in percent.

Notes: R1 — Repetition 1; R2 — Repetition 2; INRT - intransitive preferences; TR - transitive preferences.

In contrast, the PR ‘control’ triple produced just 6% intransitive patterns. Other than the control triple, the proportion of cycles for each triple is significantly greater than 5%, using Fisher’s exact test, supporting H1. The average proportion intransitive in the first block was 27.1%, followed by 25.8% for the second block, giving an overall intransitive proportion of 26.5%. If learning occurred between blocks, it did not reduce the occurrence of cycles noticeably, giving our first clue that error may not be the main cause of cycles. Of the eight possible preference orderings (two intransitive, six transitive), the mode was intransitive for three of the ten triples and an intransitive pattern was runner-up in a further five triples.
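To convey the order of magnitude involved in a comparison with the 5% benchmark, the sketch below computes a one-sided exact binomial tail probability. This is a swapped-in illustration and not the Fisher’s exact test reported above; treating the lowest observed proportion (18%) as 18 cyclers out of 100 subjects is a simplifying assumption, since that figure is an average over two repetitions.

```python
from math import comb

def binomial_upper_tail(k, n, p):
    """P(K >= k) for K ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

# Hypothetical count: 18 of 100 subjects cycling, tested against a 5% benchmark.
p_value = binomial_upper_tail(18, 100, 0.05)
print(p_value)
```

Even for the triple with the fewest cycles, a count of this size would be very unlikely if the true cycling rate were only 5%.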

At the individual level, between 30% and 85% of subjects cycled in each of the ten triples either once or on both repetitions. For instance, in triple 4 alone, 85 of the 100 individuals cycled at least once and 34 cycled on both occasions. It appears that a significant minority, a plurality, even an occasional majority, exhibit intransitive choice cycles, for these statistically independent pairs of simple, incentivized lotteries. Across the ten triples, every single one of the 100 subjects cycled at least once. A Spearman correlation of the number of cycles by individual between repetitions was +0.93, suggesting again that the cycles we observe are not simply random errors but latent intransitive preferences. Figure 4 shows the histogram of cycles by individual.

Figure 4: Histogram of Cycle Frequency by Individual.

The predominant direction of cycles for our lotteries is consistent with ‘reverse’ cycles, rather than the ‘probable winner’ cycles.⁵ The 26% average figure breaks down 19:7 in favour of the ‘reverse’ direction. For the eight triples where an intransitive pattern is either the mode or runner-up, six follow the ‘reverse’ direction and two the ‘probable winner’ direction. This likely reflects our choice of expected value rankings for X, Y and Z, as we noted previously. Unlike the STP, the random lottery incentive system does not induce a particular preference pattern; it elicits preferences rather than prescribes them.

3.2 Noisy Transitivity or Noisy Intransitivity?

A reasonable approximation to a true, ‘error free’ proportion for each preference pattern is to identify those subjects making the same three binary choices within a triple on both repetitions. To do so means they avoid six possible choice errors, for each ‘true’ preference pattern. Across the ten sets of triples (excluding the control), the diagonals for each triple in Table 2 reveal this occurs on 366 occasions out of 1000. Of these, 117 were of one of the two intransitive orderings and 249 were for one of the six transitive orderings. Thus, the share of revealed consistently intransitive preference patterns among all revealed consistent preference patterns was 32% (117 out of 366). To get a sense of how striking this finding is, a recent and unusually careful and thorough investigation of intransitive choice patterns was able to conclude: "...very few people repeat the same intransitive pattern on two replications of the same test. In other words, most violations that have been observed can be attributed to error rather than to true intransitivity" (Birnbaum & Diecidue, 2015). Yet we find that on average a typical intransitive pattern is 41% (117/2 vs 249/6) more likely to replicate than a typical transitive pattern.
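The two summary figures in this paragraph follow directly from the counts quoted in the text, as the short calculation below shows. The variable names are ours.

```python
from fractions import Fraction

# Counts of consistently repeated patterns, as reported in the text.
consistent_total = 366
intransitive = 117   # spread over 2 possible intransitive patterns
transitive = 249     # spread over 6 possible transitive patterns

# Share of consistent patterns that are intransitive.
share = Fraction(intransitive, consistent_total)

# Average replications per pattern type, and their ratio.
per_intransitive_pattern = Fraction(intransitive, 2)   # 58.5
per_transitive_pattern = Fraction(transitive, 6)       # 41.5
ratio = per_intransitive_pattern / per_transitive_pattern

print(f"{float(share):.1%}")      # 32.0%
print(f"{float(ratio) - 1:.1%}")  # 41.0%
```

Dividing by the number of available patterns of each type is what turns a 32% minority share into a higher per-pattern replication rate.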

Delving deeper, we see from the middle panels in Table 2 that the intransitive proportion of consistently revealed patterns across the triples varies from 17.8% (triple 8) to 83.9% (triple 4). The modal consistently revealed patterns are intransitive for triples 4, 7, 10 and 11 and runner-up for triples 1, 3, 5 and 6; that is, eight of the ten triples have an intransitive modal or second modal consistently revealed preference pattern. Therefore, consistently intransitive preferences appear to be revealed relatively more frequently than intransitive preferences that may or may not replicate. This result suggests that noise diminishes (rather than increases) intransitive preferences in revealed choice patterns. In other words, as the noise washes out, cyclical choice patterns increase their share of the total, a finding replicated at the individual level, as we next show. As a comparison, in our control, triple 9, we find the opposite: just 1 consistently intransitive person but 51 consistently transitive people, a 98% transitive share, in line with the consensus view on the rarity of cycles.

Another way to check whether fundamental intransitivity or noise is driving the data is to divide the subjects into two equal-size groups by rate of choice switching between repetitions. On inspection, we find a threshold of 10 or fewer inconsistencies separates 51 individuals with fewer and 49 individuals with more inconsistencies (or stochastic preferences). In Figure 5, we plot the number of cycles exhibited by each individual against the number of his or her choice switches between repetitions.

Figure 5: Individual Inconsistency versus Frequency of Cycles.

The graph shows a broad tendency for more cycles among the more consistent individuals, with cycles decreasing as the rate of choice switching increases. The most consistent group of 51 subjects committed an average of 6.16 cycles (out of a maximum possible 20), which is 33% more than the group of 49 noisier individuals, who averaged 4.63 cycles. This conclusion is not dependent on the threshold of 10, as is clear from Figure 5, offering further evidence that true intransitivity rather than noise is responsible for most of the observed cycles.⁶

Weak stochastic transitivity (WST) requires Pr (C ≻ A) to be at least as large as the minimum of Pr (B ≻ A) and Pr (C ≻ B), a requirement diametrically at odds with the STP. As noted earlier, violations of WST can also result if subjects have random preferences over exclusively transitive preference orders. However, we also showed that an overlooked implication of this claim is the set of constraints we identified in H2-H4. The frequency with which each lottery is chosen from the respective ternary choice set must satisfy each of H2-H4. In other words, any random preference model over transitive orderings capable of violating WST must meet H2-H4. If it does not, latent intransitive preferences are presumably the only remaining explanation. Our data show that WST is violated for Triples 1, 3, 4, 5, and 7 (see Table 3).⁷

Table 3: Intransitive cycles across binary and ternary choices.

Notes: In each Triple in Columns 6-11: row 1 shows data for binary choice task; row 2 shows data for ternary choice task; row 3 shows (data from ternary/data from binary) in %.

Triple 4 exhibits the strongest violation, followed closely by Triple 3. Averaged across both repetitions, in triple 4 we found: Pr (Y ≻ X) = 71.5%; Pr (Z ≻ Y) = 64.5% and also that Pr (Z ≻ X) = 27%. At 37.5 percentage points, this is a strikingly strong violation of WST. It is also a violation of ’simple scalability’ (Tversky & Russo, 1969). Table 4 shows the results of the hypothesis tests.
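The 37.5 percentage point figure can be reproduced with a minimal sketch. It applies the condition as worded in this section (Pr (C ≻ A) at least the minimum of Pr (B ≻ A) and Pr (C ≻ B)) to the triple 4 proportions quoted above, with A = X, B = Y and C = Z. The function names are hypothetical, and the second check, the common majority form of the condition, is included only for comparison.

```python
def wst_shortfall(p_ba, p_cb, p_ca):
    """Gap between the bound stated in the text and the observed Pr(C > A).

    Uses the condition as worded in this section: Pr(C > A) should be at
    least min(Pr(B > A), Pr(C > B)). A positive return value is the size
    of the violation; zero or negative means the condition holds.
    """
    return min(p_ba, p_cb) - p_ca


def violates_majority_form(p_ba, p_cb, p_ca):
    """Majority form: two majority preferences should imply the third."""
    return p_ba >= 0.5 and p_cb >= 0.5 and p_ca < 0.5


# Triple 4, averaged across both repetitions (values from the text).
p_yx, p_zy, p_zx = 0.715, 0.645, 0.27

gap = wst_shortfall(p_yx, p_zy, p_zx)
print(f"{gap * 100:.1f} percentage points")
print(violates_majority_form(p_yx, p_zy, p_zx))
```

Both checks agree for triple 4: the binding bound is Pr (Z ≻ Y) = 64.5%, and the observed Pr (Z ≻ X) = 27% falls below it and below one half.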

Table 4: Testing Hypotheses 2–4.

Notes: Shaded cells highlight cases when a hypothesis is rejected by the data.

In summary, for each triple where WST did not hold for the binary choice sets, any transitive random preference model must satisfy H2-H4 in the ternary sets. This is essential if the WST violations in the binary sets were a result of stochastic but transitive latent preferences. Taken as a whole, the tests reported above to separate noisy but transitive latent preferences from underlying intransitivity lean heavily in favor of the latter proposition.

Finally, triple 8 was one of two triples where intransitive patterns were relatively infrequent. The lotteries comprising triple 8 were designed to be a test of one intriguing ’ingredient’ in the STP recipe: a higher minimum consequence for Z than for X. A small pilot experiment had previously identified triple 4 as particularly prone to exhibit cycles (19 of 27 subjects cycled). We decided to make as few changes as possible to the lottery pairs of triple 4 when swapping the lowest payoff in X with that in Z. This change then required an increase in the maximum payoff in Z to keep the expected value above that for X.

The combined effect of these two changes is to drastically reduce the number of observed cycles, from 119 of 200 in triple 4 down to 42 of 200 in triple 8 (triple 8 resembles, but does not strictly satisfy, the STP). Even more striking is the reduction in the intransitive share of consistently revealed patterns, from 83.9% in triple 4 to 17.8% in triple 8. We conjecture that if researchers adopt this new PR design, they may find even stronger reversals than those that have comprised the PR paradox to date.

3.3 Testing Expansion Consistency and IIA

The second purpose for eliciting preferences in the respective ternary choice sets is to test for expansion consistency. Table 3 presents the results for all decisions, contrasting the binary and ternary choices, by triple. Table 5 reports separately the most preferred elements in all the ternary sets.

Table 5: Average ternary top preferences by triple.

The rightmost panels in Table 3 show the following. The binary choices in repetition 1 were consistent with binary choices in repetition 2 for 2043/3000 or 68.1% of decisions (excluding the control), which we take as the benchmark. Aggregating across these 10 sets of triples, for those choosing X ≻ Z in the binary set, 852 of 1240, or 68.7%, maintain the ranking in the ternary set. The binary/ternary comparison for X ≻ Z is essentially the same as the benchmark consistency, supporting (stochastic) expansion consistency when Y is included. However, this is just one of six binary preference ranks tested and is the only one to match the binary/binary consistency.

For the binary preference Y ≻ X, 699 of 1204 decisions, or just 58.1%, maintain this ranking in the ternary set. This is statistically significantly lower than 68.1%. Finally, there are 993 binary choices of Z ≻ Y; of these just 469, or 47.2%, maintain that rank in the respective ternary comparisons, so expansion consistency is rejected. This is the lowest consistency rate of the six binary ranks. It seems that preference for the riskiest option over the certainty reverses when the intermediate option is included in the set.
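The consistency rates quoted above can be recomputed from the reported counts, as in the sketch below. The text does not say which significance test was used, so the z statistic here is an assumption: a one-sample test of each rate against the 68.1% benchmark, treating the benchmark as fixed and decisions as independent. All names are illustrative.

```python
from math import sqrt

# Counts quoted in the text (ten non-control triples pooled).
benchmark = 2043 / 3000  # binary/binary repeat consistency
maintained = {
    "X over Z": (852, 1240),
    "Y over X": (699, 1204),
    "Z over Y": (469, 993),
}


def z_against_benchmark(kept, total, p0):
    """One-sample z statistic for a proportion against a fixed value p0.

    Assumption: p0 is treated as known and decisions as independent,
    which ignores clustering of decisions within subjects.
    """
    rate = kept / total
    return (rate - p0) / sqrt(p0 * (1 - p0) / total)


results = {}
for label, (kept, total) in maintained.items():
    rate = kept / total
    z = z_against_benchmark(kept, total, benchmark)
    results[label] = (rate, z)
    print(f"{label}: {rate:.1%} maintained, z = {z:.2f}")
```

Because each subject contributes many decisions, a test that respects within-subject dependence would be more conservative than this sketch; it is meant only to show the direction and rough size of the gaps.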

The other four comparisons are all around 58% consistent. Hence, most binary/ternary comparisons show clear evidence, beyond noise, of set-dependent preferences, inconsistent with expansion to, or contraction from, the ternary set. While some of the violations reflect the well-known compromise effect, the largest inconsistency above cannot be explained by any of the three known effects: attraction, similarity or compromise.

We can also test whether preferences revealed from a binary choice are consistent with preferences revealed from rankings within a ternary set at a more fine-grained level using the Conlisk (1989) test. This test compares the relative frequency of two inconsistent choice patterns. The first pattern is when a decision maker chooses X over Y in a binary choice but ranks Y over X in a ternary set. The second pattern is when a decision maker chooses Y over X in a binary choice but ranks X over Y in a ternary set. If these two choice patterns are due to indifferences, random errors, imprecision, noise or indecisiveness, they should occur, a priori, with equal or similar frequencies. In contrast, a decision maker who systematically reveals one choice pattern significantly more often than the other (for example, by following the most probable winner) is unlikely to simply reflect indifference or noise. Conlisk’s (1989) test formalizes this idea.
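The logic of comparing the two reversal frequencies can be sketched as follows. Conlisk’s exact statistic is not reproduced in this section, so the sketch substitutes a McNemar-type z statistic, which tests the same null hypothesis that the two discordant patterns are equally likely. The function name and the counts passed to it are hypothetical and are not data from the experiment.

```python
from math import erfc, sqrt


def discordant_z(n_first, n_second):
    """McNemar-type z for two discordant choice patterns.

    n_first:  subjects choosing X over Y in the binary task but ranking
              Y above X in the ternary set.
    n_second: subjects choosing Y over X in the binary task but ranking
              X above Y in the ternary set.
    Under the null that both reversals are equally likely, z is
    approximately standard normal. Returns (z, two-sided p-value).
    """
    n = n_first + n_second
    if n == 0:
        raise ValueError("no discordant observations")
    z = (n_first - n_second) / sqrt(n)
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal tail area
    return z, p_value


# Hypothetical counts for illustration only; not data from the experiment.
z, p = discordant_z(30, 10)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A z value near zero is what pure noise would produce; a large positive or negative value indicates that one direction of reversal dominates, which is the asymmetry the test is designed to detect.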

Table 6 presents the results of the Conlisk (1989) test comparing the consistency of binary choices with ternary rankings.

Table 6: Conlisk z (p-value) comparing binary choices X vs Y, Y vs Z and X vs Z with the corresponding rankings from a ternary {X,Y,Z}.

A significant positive or negative z value indicates that inconsistencies between binary choices and ternary rankings are not due to indifferences, random errors, imprecision or noise. In triples 7-10 binary choice is always statistically significantly different (at 5% significance level) from ternary rankings for all three comparisons (X vs Y, Y vs Z and X vs Z). For triple 3 binary choices are not statistically significantly different from ternary rankings (for all three comparisons). For other triples, there are significant differences for some but not all combinations (typically X vs Y and Y vs Z are significantly different but X vs Z is not). Thus, for all triples but triple 3 there appear to be significant inconsistencies between binary choice and ternary rankings that go beyond imprecision or noise.

More specifically, there is an asymmetry when Z is involved; when the binary choice favors Z (over X or over Y) the ranking is more likely to be overturned in the ternary set than if Z is disfavored in the binary set (see Figure 6). These results appear to confirm a preponderance of set-dependent preferences, in violation of expansion-consistency, as predicted earlier.

Figure 6: Set-Dependent Preferences.

The effect captured in Figure 6 suggests that when there is a straight choice between our modified $-bet and certainty, the majority of subjects (129 of 200 people) prefer the new $-bet. Of those 129, 97 reversed their preference when the new P-bet was included in the choice set. The set-dependent anchoring and adjustment effect (Slovic & Lichtenstein, 1983) may help explain Figure 6. Specifically, when people consider the new $-bet versus certainty, they tend to focus on the stake rather than probability and concentrate on the most extreme outcome.⁸ However, when we add the new P-bet to the choice set, our results suggest that subjects tend to concentrate on probabilities rather than stakes. They pay attention to the most extreme probability of 1 (certainty) and, hence, tend to choose certainty over the new $-bet.

4 Conclusion: Predictably Intransitive?

Where do our arguments leave the most popular utility theories of choice under risk (which all include transitivity)? The Steinhaus-Trybula Paradox shows that for multi-attribute risky choice objects, which we evaluate in binary and ternary choice sets, relying on transitivity can fail to select the most advantageous lotteries. The domain of application for transitive theories excludes choice-rules that are common and cognitively plausible and can violate transitivity where utility differences between options are not too large. It is likely that ternary choice sets using smaller differences in expected values than we used would show even greater rates of intransitive binary comparisons.

An innovative feature of our experiment was eliciting preferences in the ternary choice sets as well as the constituent binary sets. This allows us to investigate the predicted set-dependence of preferences and test transitive but random preference theory as a possible explanation of cycles. Results support our conjectures that the cycles reflect latent intransitive preference rather than noisy implementation of transitive preferences. We saw that although very little solid evidence for true intransitive preferences existed prior to our experiment, this ’absence of evidence’ should not be mistaken for ’evidence of the absence’ of preference cycles. This paper has identified limits to the descriptive invocation of transitivity. Our findings also point to a deeper underlying process at work in choice under risk, part of the growing evidence (Louie, Glimcher & Webb, 2015) that choices often bear the stamp of other options in the choice set, as well as latent preferences (Noguchi & Stewart, 2014).

One implication of our arguments and our experimental results is that individuals should not be modelled as possessing a core utility function (transitive or intransitive): many typically transitive individuals are the same people who violate transitivity in the circumstances we identify. This suggests neither a transitive nor intransitive ’core’ utility function can accurately describe preferences over all lotteries a person may encounter. Stewart, Reimers and Harris (2015) recently concluded, "The shape of the revealed utility ... function is, at least in part, a property of the question set and not the individual", in line with a constructed-preference paradigm. Our results point towards the constructed-preference paradigm as the more promising way forward.

Appendix

Figure A1: Experimental Instructions: Screenshot 1.

Figure A2: Experimental Instructions: Screenshot 2.

Figure A3: Experimental Instructions: Screenshot 3.

Figure A4: Experimental Instructions: Screenshot 4.

Figure A5: Experimental Instructions: Screenshot 5.

Footnotes

We would like to thank seminar participants at Warwick Business School, University of East Anglia, University of California, Irvine, Cal State Fullerton, University of Arizona, Curtin University, Murdoch University, Sydney University, Griffith University, University of Queensland and Queensland University of Technology for comments on earlier versions of this manuscript. David Butler acknowledges the support of the Australian Research Council (grant: DP1095681). Ganna Pogrebna acknowledges financial support from RCUK/EPSRC grants EP/N028422/1 and EP/P011896/1.

¹ We discuss her preference order over the ternary choice set {A,B,C} in the next section.

² Evidence from response times shows fast decisions when one option is clearly better. Response times lengthen as the DM needs to accumulate more evidence to trigger a choice.

³ To make sure that subjects in our experiment understand probabilities, we use a display with 3 differently colored marbles for each lottery option. We provide screenshots of this display in later sections of this paper.

⁴ Table A in the Online Supplementary Material provides detailed summary statistics including the frequency of intransitive cycles for each of the eleven triples.

⁵ The ’probable winner’ cycle refers to a case when CE ≻ $−bet ≻ P−bet ≻ CE and the ’reverse’ refers to a case when $−bet ≻ CE ≻ P−bet ≻ $−bet.

⁶ The caveat is there may be a modest uptick of cycles for the most inconsistent of all; but inspection of the graph shows just two out of 100 individuals drive the uptick, so it may not be reliable.

⁷ See Table B in the Online Supplementary Material for more details.

⁸ Kim, Seligman & Kable (2012) show in an eye-tracking study that when either the $-bet or the P-bet is compared to certainty, people tend to pay more attention to stakes than to probabilities, which is consistent with our findings. Kim et al. (2012) also find that when the $-bet is compared with the P-bet in a straight binary choice, people tend to focus on probabilities rather than stakes.

References

Allais, M. (1953). Le comportement de l’homme rationnel devant le risque: critique des postulats et axiomes de l’ecole Americaine. Econometrica, 21, 503–546.

Anand, P. (1993). The philosophy of intransitive preference. Economic Journal, 103, 337–346.

Arieli, A., Ben-Ami, Y., & Rubinstein, A. (2009). Fairness motivations and procedures of choice between lotteries as revealed through eye movements. Unpublished manuscript.

Baillon, A., Bleichrodt, H., & Cillo, A. (2015). A tailor-made test of intransitive choice. Operations Research, 63, 198–211.

Bar-Hillel, M., & Margalit, A. (1988). How vicious are cycles of intransitive choice? Theory and Decision, 24(2), 119–145.

Birnbaum, M. H. (2011). Testing mixture models of transitive preference: comment on Regenwetter, Dana and Davis-Stober. Psychological Review, 118, 675–683.

Birnbaum, M. H., & Diecidue, E. (2015). Testing a class of models that includes majority rule and regret theories: transitivity, recycling and restricted branch independence. Decision, 1, 145–190.

Birnbaum, M. H., & Schmidt, U. (2008). An experimental investigation of violations of transitivity in choice under uncertainty. Journal of Risk and Uncertainty, 37, 77–91.

Blavatskyy, P. R. (2006). Axiomatization of a preference for a probable winner. Theory and Decision, 60, 17–33.

Blavatskyy, P. R. (2014). Stronger utility. Theory and Decision, 76, 265–286.

Blyth, C. R. (1972). Some probability paradoxes in choice from among random alternatives. Journal of the American Statistical Association, 67(338), 366–373.

Butler, D. J., & Blavatskyy, P. R. (2018). Normative limits to expected utility theory’s transitivity axiom. Unpublished manuscript.

Butler, D. J., & Hey, J. (1987). Experimental economics: an introductory survey. Empirica: Austrian Economic Papers, 2, 157–186.

Butler, D. J., & Loomes, G. (2007). Imprecision as an account of the preference reversal phenomenon. American Economic Review, 97(1), 277–297.

Cavagnaro, D. R., & Davis-Stober, C. P. (2014). Transitive in our preferences, but transitive in different ways: an analysis of choice variability. Decision, 1, 102–122.

Conlisk, J. (1989). Three variants on the Allais example. American Economic Review, 79(3), 392–407.

Conrey, B., Gabbard, J., Grant, K., Liu, A., & Morrison, K. E. (2016). Intransitive dice. Mathematics Magazine, 89, 133–143.

Fishburn, P. C. (1982). Non-transitive measurable utility. Journal of Mathematical Psychology, 26, 31–67.

Kahneman, D. (2012). Thinking, fast and slow. Penguin, London & New York.

Kim, B. E., Seligman, D., & Kable, J. W. (2012). Preference reversals in decision making under risk are accompanied by changes in attention to different attributes. Frontiers in Neuroscience, 6(109), 1–10.

Lichtenstein, S., & Slovic, P. (1971). Reversals of preference between bids and choices in gambling decisions. Journal of Experimental Psychology, 89, 46–55.

Loomes, G., & Sugden, R. (1982). Regret theory: an alternative theory of rational choice under uncertainty. Economic Journal, 92, 805–824.

Loomes, G., & Sugden, R. (1987). Some implications of a more general form of regret theory. Journal of Economic Theory, 41, 270–287.

Loomes, G., Starmer, C., & Sugden, R. (1991). Observing violations of transitivity by experimental methods. Econometrica, 59, 425–439.

Louie, K., Glimcher, P. W., & Webb, R. (2015). Adaptive neural coding: from biological to behavioural decision-making. Current Opinion in Behavioral Sciences, 5, 91–99.

Luce, R. D., & Raiffa, H. (1957). Games and Decisions. Wiley, New York.

Müller-Trede, J., Sher, S., & McKenzie, C. R. M. (2015). Transitivity in context: a rational analysis of intransitive choice and context-sensitive preference. Decision, 1, 1–26.

Noguchi, T., & Stewart, N. (2014). In the attraction, compromise and similarity effects, alternatives are repeatedly compared in pairs on single dimensions. Cognition, 132, 44–56.

Pratt, J. W. (1972). Comment. Journal of the American Statistical Association, 67, 378–379.

Rubinstein, A., & Segal, U. (2012). On the likelihood of cyclic comparisons. Journal of Economic Theory, 147, 2483–2491.

Russo, J. E., & Dosher, B. A. (1983). Strategies for multi-attribute binary choice. Journal of Experimental Psychology: Learning, Memory and Cognition, 9, 676–696.

Slovic, P., & Lichtenstein, S. (1983). Preference reversals: a broader perspective. American Economic Review, 73(4), 596–605.

Starmer, C., & Sugden, R. (1993). Testing for juxtaposition and event-splitting effects. Journal of Risk and Uncertainty, 6, 235–254.

Steinhaus, H., & Trybula, S. (1959). On a paradox in applied probabilities. Bulletin of the Polish Academy of Sciences, 7, 67–69.

Stewart, N., Reimers, S., & Harris, A. J. L. (2015). On the origin of utility, weighting and discount functions: how they get their shapes and how to change their shapes. Management Science, 61, 687–705.

Tversky, A. (1969). Intransitivity of preferences. Psychological Review, 76, 31–48.

Tversky, A., & Russo, J. E. (1969). Substitutability and similarity in binary choices. Journal of Mathematical Psychology, 6, 1–12.