
Knowing me, knowing you: an experiment on mutual payoff information in the stag hunt and Prisoner's Dilemma

Published online by Cambridge University Press:  01 January 2025

Hazem Alshaikhmubarak*
Affiliation:
King Faisal University, Al Hofuf, Saudi Arabia
David Hales
Affiliation:
Global Innovations Bank, Kiester, MN, USA
Maria Kogelnik
Affiliation:
Yale University, New Haven, CT, USA
Molly Schwarz
Affiliation:
Federal Communications Commission, Washington, DC, USA
C. Kent Strauss
Affiliation:
University of California Santa Barbara, Santa Barbara, CA, USA

Abstract

We experimentally study how mutual payoff information affects strategic play. Subjects play the Prisoner's Dilemma or Stag Hunt game against randomly re-matched opponents under two information treatments. In our partial-information treatment, subjects are shown only their own payoff structure, while in our full-information treatment they are shown both their own and their opponent's payoff structure. In both treatments, they receive feedback on their opponent's action after each round. We find that mutual payoff information initially facilitates reaching the socially optimal outcome in both games. Play in the Prisoner's Dilemma converges toward the unique Nash equilibrium of the game under both information treatments, while in the Stag Hunt mutual payoff information has a substantial impact on play and equilibrium selection in all rounds of the game. Belief-learning model estimations and simulations suggest these effects are driven by both initial play and the way subjects learn.

Type
Original Paper
Copyright
Copyright © The Author(s), under exclusive licence to Economic Science Association 2024.

1 Introduction

Game-theoretic models are typically motivated by the idea that players reason about the behavior of others and choose their strategies accordingly. This reasoning can be informed directly by observing the payoff structure of the game or indirectly by observing and learning from the actions of other players. If information about another player's payoffs plays a pivotal role in affecting an individual's action choices, then varying the availability of this information may result in insightful differences in play. This paper addresses the following question: how does the common knowledge of all players’ payoffs, relative to only knowing one's private payoffs (and common knowledge thereof), impact play in strategic interactions? For brevity, we henceforth refer to this as the effect of mutual payoff information.

On the one hand, mutual payoff information may greatly impact individuals’ action choices. The type of introspective reasoning supported by directly observing others’ payoffs is often embedded in models of strategic decision making, such as higher-level reasoning in level-k models (Stahl and Wilson, 1994; Nagel, 1995), which assume that players know the whole payoff matrix. Additionally, experiments using eye tracking have found that subjects devote a sizable amount of attention to the payoffs of other players (Knoepfle et al., 2009; Polonio and Coricelli, 2019), and it has been documented that subjects engage in higher-level reasoning when other players’ payoffs can be observed (e.g., Kneeland, 2015). Thus, varying the availability of mutual payoff information may result in vastly different action choices among players.

On the other hand, the absence of mutual payoff information may have no impact on individuals’ choices. In providing an interpretation for his seminal equilibrium concept, Nash (1950) makes it explicit that “it is unnecessary to assume that the participants have full knowledge of the total structure of the game, or the ability and inclination to go through any complex reasoning processes.” Similarly, theoretical models of learning explore how equilibria can be reached and selected through processes of learning, adaptation, and/or imitation rather than introspection (Fudenberg and Levine, 2009), and uncoupled learning models (e.g., Hart and Mas-Colell, 2006; Foster and Young, 2006; Young, 2009; Babichenko, 2010) describe how equilibria can be reached in the absence of information about other players’ incentives or even their existence. Thus, the degree of payoff information available to subjects may cause no change in play.

We present the first experiment designed to study how mutual payoff information affects play in canonical two-by-two games. Subjects play one-shot stage games repeatedly with randomly re-matched opponents each round.Footnote 1 In our partial-information treatment, subjects observe their own payoffs and the action of their opponent after each round, but never observe the other's payoffs. Comparing this partial-information version to the full-information baseline treatment in which subjects observe the whole payoff matrix (in addition to actions) allows us to detect differences in play that arise due to the presence of mutual payoff information.

We explore play using the Prisoner's Dilemma (PD) and the Stag Hunt (SH). Mutual payoff information can reveal opportunities to coordinate on socially optimal outcomes; however, being aware of an opportunity to cooperate can increase the tension of a game if the cooperative outcomes are associated with actions that are dominated for at least one player. The appeal of contrasting the PD with the SH in our experiment is grounded in our conjecture that mutual payoff information affects behavior differently in these two games. The SH exhibits a tension between a mutually desirable outcome and avoiding personal risk. Knowledge of the other's payoffs arguably reduces the tension of the game by revealing a mutually beneficial outcome. The PD, on the other hand, exhibits a tension between a socially optimal outcome and personal gain. There is little reason not to choose the dominant action in the absence of mutual payoff information; however, introducing this information arguably increases the tension by making players aware that a socially optimal outcome can be reached at personal expense.

To our knowledge, this is the first experiment that applies these information treatments and this matching protocol to the PD and the SH. In Feltovich and Oda (2014), subjects play partial-information versions of the SH and PD as well as four other games, but no full-information treatments are run for comparison. The latter are essential for studying the effect of mutual payoff information. A detailed review of related experimental studies can be found in Appendix A.

We present three novel insights. First, the fraction of subjects who initially cooperate in the PD or who coordinate on the payoff-dominant equilibrium in the SH is substantially higher under full information than under partial information.Footnote 2 Second, to our knowledge we present the first evidence that mutual payoff information can affect equilibrium selection in the SH throughout all rounds: the vast majority of subjects choose the action consistent with the payoff-dominant equilibrium of the SH in the full-information treatment, while choosing the risk-dominant action under partial information.Footnote 3 Third, we find that play in the PD converges toward the unique NE of the game under both information treatments. Even in the absence of mutual payoff information, most subjects eventually choose actions that correspond to Nash equilibria in both games. Taken together, the effect of mutual payoff information on play is strong and persistent in the SH, while in the PD it is large initially and fades over rounds.

To investigate whether the information treatment effect operates through initial play, learning, or both, we estimate a special case of an experience-weighted attraction (EWA) model (Camerer and Ho, 1999). We find significant differences not only in the estimates of the initial attractions for each action, but also in the parameters pertaining to the ongoing learning process. Simulations based on these estimates suggest that the treatment effects in both games are driven not only by how subjects perceive the game initially but also by ongoing learning.

2 Experimental design

Overview. Subjects played one-shot stage games of the SH and PD repeatedly in randomly re-matched pairs with two information treatments per game. In the “Full” information treatment, subjects were shown the complete payoff matrix, while in the “Partial” information treatment they were shown only their own payoffs. Players made simultaneous choices and were notified of their opponent's action and their own resulting payoff at the end of each round.

Games, information treatments, and matching protocol. Fig. 1 shows the payoff matrices and the available payoff information for each game and treatment. The SH has two pure-strategy equilibria: one is payoff dominant, (X, X), and one is risk dominant, (Y, Y). The PD has one strictly dominant action, Y, and a unique equilibrium, (Y, Y). Treatments have the same payoffs (though they are partially hidden in the Partial treatment), thus keeping equilibria and best-response correspondences constant.Footnote 4 Appendix Figures D1 and D2 show screenshots of the interface.
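The equilibrium structure of the SH can be checked numerically. The following is a minimal sketch, assuming the row player's SH payoffs implied by the worked example in footnote 12 (11 for X against X, 1 for X against Y, 9 for Y against X, 5 for Y against Y); the function and variable names are hypothetical and not taken from the paper.

```python
from fractions import Fraction

# Row player's SH payoffs implied by the worked example in footnote 12:
# SH_PAYOFFS[(own action, opponent action)]
SH_PAYOFFS = {("X", "X"): 11, ("X", "Y"): 1, ("Y", "X"): 9, ("Y", "Y"): 5}


def indifference_probability(payoffs):
    """Probability q of the opponent playing X at which X and Y pay the same.

    Solves q*u(X,X) + (1-q)*u(X,Y) = q*u(Y,X) + (1-q)*u(Y,Y) for q.
    """
    gain_x_vs_x = payoffs[("X", "X")] - payoffs[("Y", "X")]  # edge of X if opponent plays X
    gain_y_vs_y = payoffs[("Y", "Y")] - payoffs[("X", "Y")]  # edge of Y if opponent plays Y
    return Fraction(gain_y_vs_y, gain_x_vs_x + gain_y_vs_y)


q_star = indifference_probability(SH_PAYOFFS)
# X is a best response only if the opponent plays X with probability >= q_star.
# If q_star > 1/2, Y is the best response to a 50/50 opponent, so (Y, Y) is risk dominant.
risk_dominant = "Y" if q_star > Fraction(1, 2) else "X"
payoff_dominant = "X" if SH_PAYOFFS[("X", "X")] > SH_PAYOFFS[("Y", "Y")] else "Y"
print(q_star, risk_dominant, payoff_dominant)
```

Under these assumed payoffs the indifference point coincides with the mixed-strategy equilibrium reported in footnote 10, and because it is the same in both information treatments, the basin of attraction of each action is unchanged by hiding the opponent's payoffs.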

Subjects were randomly and anonymously re-matched with other subjects each round.Footnote 5 The information treatment was common knowledge and the same for all subjects within a session. That is, in the Full treatment, it was common knowledge that subjects were being re-matched with other subjects who could also observe the whole payoff matrix. Similarly, in the Partial treatment, it was common knowledge that subjects were being re-matched with other subjects who could only observe their own payoffs.

We employed a two-population matching mechanism to ensure that subjects in the Partial treatment could not infer the full symmetric payoff structure. At the beginning of a treatment, subjects were randomly assigned to one of two groups (labeled A and B) and all subjects within a group had the same payoff structure. Throughout the treatment, subjects were exclusively matched with subjects of the opposite group. This procedure was announced at the start of each session, so while subjects could not infer their opponents’ payoffs by observing their own, they were aware that their opponents would always have the same payoff structure. This two-population matching mechanism was used in both information treatments for consistency.

Each experimental session consisted of two blocks of 40 rounds each, one with a Full treatment and the other with a Partial treatment, for a total of 80 rounds of play. Having multiple rounds allowed subjects to learn about the game. Table 1 provides an overview of how treatments were allocated across sessions. Subjects never played a Full before a Partial treatment of the same game to avoid inference that the payoffs in the second game were the same as in the first game. Table 2 describes the between- and within-subjects analyses, which we use to test for order effects.

Implementation. Instructions and comprehension questions are provided in the Appendices. Instructions were handed out and read aloud before each 40-round block. Subjects had to correctly answer a comprehension quiz before participating.Footnote 6 We programmed the interface using z-Tree (Fischbacher, 2007), conducted sessions in April and September 2018 at the Experimental and Behavioral Economics Laboratory (EBEL) at UCSB, and recruited 194 subjects through ORSEE (Greiner, 2015). Subjects had a median age of 20, and 16% of them indicated Economics as their major or intended major. Sessions lasted 45–55 min. Subjects received payoffs from a randomly selected round, plus a $7.00 show-up fee. The average total payment was $13.22 (min. $8.00, max. $20.00).

3 Results

We pool both the within- and between-subjects data to investigate the main results. Results in Appendix Tables D1 and D2 are qualitatively similar using alternative samples. Additionally, we rule out large differences in play due to order effects in Appendix Figure D3 and Tables D3 and D4.

Our main interest is in examining the impact of the information treatment on choosing action X, which is associated with the socially optimal outcome in both games. Panels (a) and (b) of Fig. 2 illustrate the average rate of choosing X for each game and information treatment. In the SH, there is a significant difference in play between treatments throughout all rounds. In the PD, there is initially a substantial difference in the rate of choosing action X, which diminishes towards the final rounds. To investigate the treatment effect more formally, we estimate regressions of the following form, separately for the SH and the PD:

$$\mathbb{1}\{\text{Choose Action X}\}_{i,t,s} = \alpha + \beta \, \text{Partial} + \gamma C_s + u_{i,t,s}, \tag{1}$$

where $\mathbb{1}\{\text{Choose Action X}\}_{i,t,s}$ is a binary indicator for subject $i$ choosing action X in round $t$ of session $s$. The vector $C_s$ is a set of dummy variables that flexibly controls for session size.Footnote 7 The variable Partial equals one if the action choice is made under the Partial treatment and zero otherwise. Thus, the estimated coefficient $\hat{\beta}$ can be interpreted as the percentage point difference in the probability of choosing X under the Partial treatment compared to the Full treatment. Table 3 presents the results of estimating Eq. 1 using ordinary least squares and standard errors clustered at the subject-session level.Footnote 8
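To illustrate why the coefficient on Partial reads as a percentage point gap, the sketch below fits a stripped-down version of Eq. 1 by OLS. It is a simplification under stated assumptions: the data are a purely illustrative toy sample (not the experimental data), the session-size controls are omitted, and no clustered standard errors are computed. All names are hypothetical.

```python
from statistics import mean

# Purely illustrative toy data (NOT from the experiment): one entry per choice.
# chose_x is 1 if action X was chosen; partial is 1 under the Partial treatment.
chose_x = [1, 1, 0, 1, 1, 0, 0, 1, 0, 0]
partial = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]


def ols_binary_regressor(y, d):
    """OLS of y on a constant and one regressor d: returns (alpha_hat, beta_hat)."""
    y_bar, d_bar = mean(y), mean(d)
    cov_dy = sum((di - d_bar) * (yi - y_bar) for di, yi in zip(d, y))
    var_d = sum((di - d_bar) ** 2 for di in d)
    beta_hat = cov_dy / var_d
    alpha_hat = y_bar - beta_hat * d_bar
    return alpha_hat, beta_hat


alpha_hat, beta_hat = ols_binary_regressor(chose_x, partial)
full_rate = mean(y for y, d in zip(chose_x, partial) if d == 0)
partial_rate = mean(y for y, d in zip(chose_x, partial) if d == 1)
print(alpha_hat, beta_hat, partial_rate - full_rate)
```

With a single binary regressor, the OLS slope equals the difference between the two group shares and the intercept equals the share in the omitted (Full) group. Adding the session-size dummies, as in Eq. 1, would change these numbers only to the extent that session size is correlated with treatment.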

Initial play. Mutual payoff information has a large effect on initial play in both games. In SH-Full, 86.5% choose X in the first round, compared to 32.0% in SH-Partial. For the PD, the corresponding rates are 64.3% and 17.0%. The same qualitative pattern emerges in the regression estimates for the early rounds. Column (2) of Table 3 indicates that in rounds 1–10 subjects choose action X about 66.9 percentage points (pp) more often in SH-Full than in SH-Partial (relative to a Full-information mean of 88.1%), and 30.4pp more often in PD-Full than in PD-Partial (70.2% of the Full-information mean of 43.3%).

Result 1

In both games, a substantially higher proportion of subjects initially choose X (the action supporting socially optimal outcomes) in the Full than in the Partial treatment.

Equilibrium selection and convergence. Next, we analyze how play evolves across the 40 rounds of a game. In the SH, the initial effect is remarkably persistent: across all rounds, subjects are 67.6pp more likely to choose action X in SH-Full than in SH-Partial (relative to a Full-information mean of 84.4%), as column (1) in panel (a) of Table 3 indicates.
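The persistence of the gap in the SH, and its decline in the PD, can be read directly off Table 3. The short sketch below copies the coefficients on Partial and the Full-information means from that table and expresses each gap as a share of the Full-information mean. The helper name `relative_gap` is hypothetical, and the calculation ignores the session-size controls, so it is a rough reading aid rather than a re-estimation.

```python
# Coefficients on Partial and Full-information means, copied from Table 3.
BLOCKS = ["Overall", "1-10", "11-20", "21-30", "31-40"]
TABLE_3 = {
    "SH": {"beta": [-0.676, -0.669, -0.683, -0.677, -0.673],
           "full_mean": [0.844, 0.881, 0.836, 0.822, 0.838]},
    "PD": {"beta": [-0.173, -0.304, -0.147, -0.156, -0.086],
           "full_mean": [0.232, 0.433, 0.205, 0.176, 0.113]},
}


def relative_gap(beta, full_mean):
    """Size of the Partial-vs-Full gap as a share of the Full-information mean."""
    return abs(beta) / full_mean


for game, rows in TABLE_3.items():
    for block, beta, full_mean in zip(BLOCKS, rows["beta"], rows["full_mean"]):
        print(f"{game} {block:>7}: {abs(beta) * 100:.1f}pp, "
              f"{relative_gap(beta, full_mean) * 100:.1f}% of the Full mean")
```

In percentage points the SH gap barely moves across the four 10-round blocks, while the PD gap in the final block is well under a third of its size in the first block.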

These results directly impact equilibrium selection and efficiency in the SH. Panels (c) and (d) of Fig. 2 show that subjects tend to reach the risk-dominant equilibrium in SH-Partial and the payoff-dominant equilibrium in SH-Full.Footnote 9 We estimate Eq. 1 using a binary indicator for reaching a pure-strategy Nash equilibrium as the outcome, which for the SH is (X, X) or (Y, Y), and report the results in Table 4.Footnote 10 Column (1) of panel (a) indicates that a pure Nash equilibrium is reached about 81% of the time in SH-Full, and 10pp less often in SH-Partial. Consequently, outcomes are more efficient with mutual payoff information in the SH, as the payoff-dominant equilibrium is more efficient than the risk-dominant one. Appendix Table D7 presents estimates of Eq. 1 when using the efficiency ratio as the outcome.Footnote 11 In SH-Partial, the efficiency ratio is on average 0.40 (45.8%) lower than in SH-Full (see column (1) of panel (a)). In sum, while equilibria tend to be achieved under both treatments, mutual payoff information crucially affects which equilibrium arises. This novel insight contributes to the literature on equilibrium selection in the SH (see Appendix A).
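As a rough illustration of the efficiency ratio described in footnote 11 (a pair's total payoff divided by the total payoff of the efficient outcome), the sketch below evaluates it for each SH outcome. It assumes the symmetric SH payoffs implied by the worked example in footnote 12; the function name is hypothetical.

```python
# Row player's SH payoffs implied by the worked example in footnote 12:
# SH_PAYOFFS[(own action, opponent action)]
SH_PAYOFFS = {("X", "X"): 11, ("X", "Y"): 1, ("Y", "X"): 9, ("Y", "Y"): 5}


def efficiency_ratio(action_a, action_b, payoffs=SH_PAYOFFS):
    """Joint payoff of a pair divided by the largest attainable joint payoff."""
    def joint(a, b):
        # In a symmetric game each player's payoff is payoffs[(own, other)].
        return payoffs[(a, b)] + payoffs[(b, a)]

    best = max(joint(a, b) for a in "XY" for b in "XY")
    return joint(action_a, action_b) / best


for pair in [("X", "X"), ("Y", "Y"), ("X", "Y")]:
    print(pair, round(efficiency_ratio(*pair), 3))
```

Under these assumed payoffs only (X, X) attains a ratio of one, so a shift from the payoff-dominant to the risk-dominant equilibrium mechanically lowers the average efficiency ratio.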

Result 2

Throughout all rounds of play, most subjects select action X (corresponding to the payoff-dominant equilibrium) in SH-Full and action Y (corresponding to the risk-dominant equilibrium) in SH-Partial.

In the PD, on the other hand, play converges toward the unique Nash equilibrium of the game under both information treatments, as panel (b) of Fig. 2 shows. We define convergence as the first round from which, on average across all sessions, at least 80% of subjects choose the defecting action Y in every remaining round of the game. This occurs at round 3 in PD-Partial and at round 24 in PD-Full. Panel (b) of Table 3 shows that the treatment effect diminishes greatly across rounds of play. Since more subject pairs reach the defecting equilibrium over time in both treatments (see panel (b) of Table 4), the gap in efficiency ratios between treatments also becomes smaller over time (see panel (b) of Appendix Table D7).
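The convergence criterion can be written as a small function. This is a sketch of one reading of the definition (the first round from which the share choosing Y stays at or above 80% until the end); the function name and the input series are hypothetical, and the shares below are toy values rather than the experimental data.

```python
def convergence_round(share_y_by_round, threshold=0.80):
    """First round (1-indexed) from which the share choosing Y stays at or above
    `threshold` in every remaining round; None if that never happens."""
    first = None
    for round_number, share in enumerate(share_y_by_round, start=1):
        if share >= threshold:
            if first is None:
                first = round_number
        else:
            first = None  # a later dip below the threshold resets the search
    return first


# Hypothetical per-round shares of action Y (NOT the experimental data).
toy_shares = [0.55, 0.82, 0.78, 0.81, 0.85, 0.90]
print(convergence_round(toy_shares))
```

Because any later dip below the threshold resets the candidate round, a single early round above 80% does not count as convergence unless the share stays there.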

Result 3

In the PD, play in both treatments converges toward the unique Nash equilibrium of the game.

4 Initial play versus learning: model and simulations

To understand whether the effect of mutual payoff information operates through initial play, learning, or both, we estimate a weighted fictitious play model of belief learning, a special case of the EWA learning model (Camerer and Ho, 1999). We discuss our model choice in Appendix B.1.

During each round, players choose X or Y based on their attractions (expected payoffs conditional on beliefs), which depend on past observations and a prior attraction. We assume all subjects use the same learning and decision-making mechanism, but may choose different actions due to different observed histories of play. Player $i$'s probability of choosing action $j \in \{X, Y\}$ in the next round is

$$P_i^j(t+1) = \frac{e^{\lambda \cdot A_i^j(t)}}{e^{\lambda \cdot A_i^X(t)} + e^{\lambda \cdot A_i^Y(t)}}, \tag{2}$$

where the sensitivity parameter λ ranges from 0 (a uniform random choice) to $\infty$ (always choosing the action with the highest attraction). The attraction of action $j$ at the end of round $t$ is defined as

$$A_i^j(t; \phi) = \frac{\phi^t \cdot A_i^j(0) + \sum_{m=0}^{t-1} \phi^m \cdot \pi_i\big(a_i^j, a_{-i}(t-m)\big)}{\sum_{n=0}^{t} \phi^n}, \tag{3}$$

where $a_i(t)$ and $a_{-i}(t)$ are the actions chosen by player $i$ and their opponent in round $t$. Player $i$'s realized payoff is $\pi_i(a_i(t), a_{-i}(t)) := \pi_i(t)$ and their hypothetical payoff (had they chosen $j$ in $t$) is $\pi_i(a_i^j, a_{-i}(t))$. Beliefs are weighted averages of the observed history of play and initial attractions $A_i^j(0)$, that is, subjects’ expected payoffs from either action before the first round, given their beliefs about their opponent's action, conditional on their own actions.Footnote 12 Beliefs are defined over two states: opponent playing X given oneself playing X, and opponent playing X given oneself playing Y.Footnote 13 The weighting decay parameter ϕ captures how much weight is put on observations of previous rounds, relative to the most recent round, and ranges from 0 (only the most recent round is weighted) to $\infty$ (only initial attractions are weighted).Footnote 14
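The attraction in Eq. (3) is a discounted average of the initial attraction and the payoffs an action would have earned against the opponent's observed choices. The sketch below is one way to compute it, assuming the SH payoffs implied by the worked example in footnote 12. The function name, its arguments, and the two-round history are hypothetical.

```python
# Row player's SH payoffs implied by the worked example in footnote 12:
# SH_PAYOFFS[(own action, opponent action)]
SH_PAYOFFS = {("X", "X"): 11, ("X", "Y"): 1, ("Y", "X"): 9, ("Y", "Y"): 5}


def attraction(action, initial_attraction, opponent_history, phi, payoffs=SH_PAYOFFS):
    """Attraction of `action` after t = len(opponent_history) rounds, as in Eq. (3).

    opponent_history[k] is the opponent's action in round k + 1.
    """
    t = len(opponent_history)
    numerator = phi ** t * initial_attraction
    for m in range(t):  # m = 0 is the most recent round; round t - m is m rounds back
        opponent_action = opponent_history[t - m - 1]
        numerator += phi ** m * payoffs[(action, opponent_action)]
    denominator = sum(phi ** n for n in range(t + 1))
    return numerator / denominator


history = ["X", "Y"]  # hypothetical: opponent played X in round 1, Y in round 2
print(attraction("X", 8.0, history, phi=1.0))  # equal weights on prior and both rounds
print(attraction("X", 8.0, history, phi=0.0))  # only the most recent round counts
```

With ϕ = 1 the attraction is the plain average of the initial attraction and the two hypothetical payoffs, while with ϕ = 0 it collapses to the hypothetical payoff against the opponent's latest action, which matches the two benchmark cases for the decay parameter.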

Using maximum-likelihood techniques, we estimate four parameters for each game and information treatment: λ, ϕ, $A^X(0)$, and $A^Y(0)$. Table 5 and Appendix Tables D8 and D9 present the results, and demonstrate significant treatment effects on the value of initial attractions $A^X(0)$ and $A^Y(0)$, and on parameter λ.Footnote 15 As λ affects both initial and ongoing play, these treatment effects are consistent with a hypothesis that both initial play and ongoing learning are affected by the presence of mutual payoff information.
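The first-round choice probabilities implied by the estimates follow from plugging the initial attractions into the logit rule of Eq. (2). The sketch below does this with the point estimates copied from Table 5; the function name is hypothetical, and the rewriting of the logit ratio as a function of the attraction gap is a standard algebraic identity rather than something stated in the paper.

```python
import math

# Point estimates copied from Table 5: (lambda, A_X(0), A_Y(0)).
TABLE_5 = {
    "SH-Full": (1.5831, 8.5765, 6.4400),
    "SH-Partial": (0.6516, 5.9218, 7.0199),
    "PD-Full": (0.4297, 9.4480, 6.4995),
    "PD-Partial": (0.7426, 7.1258, 8.5930),
}


def prob_choose_x(lam, attraction_x, attraction_y):
    """Logit choice rule of Eq. (2), written in terms of the attraction gap."""
    # exp(l*Ax) / (exp(l*Ax) + exp(l*Ay)) == 1 / (1 + exp(-l*(Ax - Ay)))
    return 1.0 / (1.0 + math.exp(-lam * (attraction_x - attraction_y)))


first_round = {cell: prob_choose_x(*params) for cell, params in TABLE_5.items()}
for cell, p in first_round.items():
    print(f"{cell}: {p:.4f}")
```

The resulting values agree, up to rounding of the published estimates, with the learning-model first-round probabilities of choosing X reported in Table 5. The calculation also shows that the same attraction gap translates into a more extreme choice probability when λ is larger.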

To examine if the model fits the data well, we conduct simulations based on parameter estimates. Appendix Figure D4 and Appendix Table D10 indicate very similar simulated and observed mean action rates. See Appendix B.2 for details on estimation and simulation techniques.

Initial play is captured by the initial attractions $A^X(0)$ and $A^Y(0)$ and by λ, while learning is captured by ϕ, λ, and the history-dependent attractions $A_i^X(t)$ and $A_i^Y(t)$. To investigate whether the treatment effect operates through initial play or learning, we swap parameter values between treatments of the same game. For example, to study whether the treatment affects initial play in SH-Full, we simulate behavior using the SH-Full parameters, except that we use the initial attraction parameters from SH-Partial. These simulations show whether each parameter difference is economically meaningful, rather than only suggesting the direction of its effect from the parameter estimates.
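The parameter-swap exercise can be sketched as a small simulation. The version below is an illustration under several assumptions, not the authors' procedure (which is described in Appendix B.2): it simulates a single session of 16 players, uses the SH payoffs implied by footnote 12 and the Table 5 point estimates, matches players across two groups at random each round, and updates attractions with a recursive form of Eq. (3). All function and variable names are hypothetical.

```python
import math
import random

SH_PAYOFFS = {("X", "X"): 11, ("X", "Y"): 1, ("Y", "X"): 9, ("Y", "Y"): 5}
# Table 5 point estimates for the SH.
SH_FULL = {"lam": 1.5831, "phi": 0.9649, "a_x0": 8.5765, "a_y0": 6.4400}
SH_PARTIAL = {"lam": 0.6516, "phi": 0.8290, "a_x0": 5.9218, "a_y0": 7.0199}


def simulate_share_x(params, n_per_group=8, rounds=40, seed=0):
    """Simulated share of X choices per round for one session of 2 * n_per_group players."""
    rng = random.Random(seed)
    n = 2 * n_per_group
    # Per player: numerator of Eq. (3) for X and Y, and the shared denominator.
    num = [{"X": params["a_x0"], "Y": params["a_y0"]} for _ in range(n)]
    den = [1.0] * n
    shares = []
    for _ in range(rounds):
        actions = []
        for i in range(n):
            gap = num[i]["X"] / den[i] - num[i]["Y"] / den[i]
            p_x = 1.0 / (1.0 + math.exp(-params["lam"] * gap))  # Eq. (2)
            actions.append("X" if rng.random() < p_x else "Y")
        # Two-population matching: group A (first half) meets a shuffled group B.
        group_b = list(range(n_per_group, n))
        rng.shuffle(group_b)
        for a, b in zip(range(n_per_group), group_b):
            for me, other in ((a, b), (b, a)):
                for action in "XY":
                    # Recursive form of Eq. (3): discount the past, add the new payoff.
                    hypothetical = SH_PAYOFFS[(action, actions[other])]
                    num[me][action] = params["phi"] * num[me][action] + hypothetical
                den[me] = params["phi"] * den[me] + 1.0
        shares.append(actions.count("X") / n)
    return shares


baseline = simulate_share_x(SH_FULL)
# Swap in the initial attractions estimated for SH-Partial; keep lambda and phi.
swapped = simulate_share_x({**SH_FULL, "a_x0": SH_PARTIAL["a_x0"], "a_y0": SH_PARTIAL["a_y0"]})
print(sum(baseline) / len(baseline), sum(swapped) / len(swapped))
```

A single seeded session is noisy, so a comparison in the spirit of the paper would average many simulated sessions per parameter set before contrasting the baseline and swapped paths.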

Appendix Figure D5 compares model simulations with estimated parameters (solid lines) and swapped parameters (dotted lines). As shown in panels (a) and (b), swapping initial attraction estimates dramatically affects simulated data in the SH, while differences in the PD disappear by round 15. The fraction playing X changes in opposite directions across information treatments, consistent with the estimates of $A^X(0)$ in Appendix Table D8. Swapping λ results in a lower fraction of playing X in both games, as shown in panels (c) and (d) of Appendix Figure D5. In PD-Full, the attraction is initially higher for X but then shifts to Y, explaining why the simulation with swapped ϕ estimates initially lies above the original but then falls below it, as shown in panel (f). Swapping ϕ has little effect except in SH-Full, as is seen in panel (e); here small changes greatly affect coordination on the payoff-dominant Nash equilibrium. Lower ϕ values cause rapid depreciation of weights on earlier round observations, increasing sensitivity to behavior volatility. Taken together, swapping parameter estimates highlights their importance in explaining the treatment effect, except for ϕ in the PD.

Result 4

Learning model parameter estimates and simulations suggest that the information treatment effect can be attributed to differences in both initial play and learning in both games.

From examining Fig. 2 and descriptive statistics, one might have concluded that the information treatment effect stems primarily from subjects’ initial perception of the game. Result 4 provides further nuance, and highlights the importance of ongoing learning in addition to initial play.

5 Discussion

We experimentally vary subjects’ access to opponents’ payoffs to investigate the effect of mutual payoff information on strategic play and find multiple statistically and economically significant results. We see a variety of opportunities for further research. Our results highlight the effects for a very limited set of environments, and it is not clear whether similar effects would be seen in different games or with different payoff structures. Note that equilibrium selection in games similar to SH-Full has been found to depend on the payoffs chosen (e.g., Battalio et al., 2001). Additionally, there is room for investigating the underlying causes of our treatment effect.Footnote 16 For example, while common knowledge of the payoff structure allows players to consider opponents’ payoffs when engaging in introspective reasoning, it also enables players to include consideration of opponents’ payoffs as part of their own preferences. Our experiment was not designed to isolate a specific cause for the effects of mutual payoff information but rather to document the overall trends that exist in our games. Future studies could tackle both of these limitations by examining whether such effects persist across other environments and by detailing additional nuances and drivers of the effects of mutual payoff information on players’ decision making.

Fig. 1 The games and information treatments from the row player's perspective

Fig. 2 Mean action X rates and resulting SH equilibrium

Subplots (a) and (b) show the share of subjects playing action X by game and treatment. Faded lines represent the mean rate by round for each session separately. Action X is associated with the socially optimal outcome in both games (payoff-dominant equilibrium in SH and cooperation in PD). Subplots (c) and (d) show the proportion of subject pairs that played an equilibrium in each round of the SH. Not all subject pairs played a Nash equilibrium, so the blue (dotted red) lines sum to less than one. Averaged across all rounds, subject pairs failed to reach an equilibrium in 19% (25%) of plays in the Full (Partial) treatment.

Table 1 Treatments by session

| Sessions | Part 1 | Part 2 | # Subjects per session |
|---|---|---|---|
| 1–3 | SH - Partial | SH - Full | 16, 16, 20 |
| 4–6 | PD - Partial | PD - Full | 16, 16, 18 |
| 7–9 | SH - Full | PD - Partial | 16, 14, 14 |
| 10–12 | PD - Full | SH - Partial | 16, 18, 14 |

Note: In each of the two parts, 40 rounds of a game were played.

Table 2 Between- vs. within-subjects analysis

| Analysis | Game | Data |
|---|---|---|
| Between subjects | SH | First part sessions 1–3, first part sessions 7–9 |
| Between subjects | PD | First part sessions 4–6, first part sessions 10–12 |
| Within subjects | SH | Sessions 1–3 (first and second part) |
| Within subjects | PD | Sessions 4–6 (first and second part) |

Table 3 Effect of partial-information treatment for selection of action X

| | (1) Overall | (2) Rounds 1–10 | (3) Rounds 11–20 | (4) Rounds 21–30 | (5) Rounds 31–40 |
|---|---|---|---|---|---|
| (a) Stag Hunt | | | | | |
| Partial-information | -0.676 | -0.669 | -0.683 | -0.677 | -0.673 |
| | (0.034) | (0.030) | (0.041) | (0.047) | (0.040) |
| Cluster p-value | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Full-information mean | 0.844 | 0.881 | 0.836 | 0.822 | 0.838 |
| Number of clusters | 144 | 144 | 144 | 144 | 144 |
| N | 7840 | 1960 | 1960 | 1960 | 1960 |
| (b) Prisoner's Dilemma | | | | | |
| Partial-information | -0.173 | -0.304 | -0.147 | -0.156 | -0.086 |
| | (0.022) | (0.037) | (0.028) | (0.031) | (0.026) |
| Cluster p-value | 0.000 | 0.000 | 0.000 | 0.000 | 0.001 |
| Full-information mean | 0.232 | 0.433 | 0.205 | 0.176 | 0.113 |
| Number of clusters | 142 | 142 | 142 | 142 | 142 |
| N | 7680 | 1920 | 1920 | 1920 | 1920 |

Note: The sample uses the pooled data. Action X is associated with the socially optimal outcome in both games. The regressions include controls for session size. Standard errors presented in parentheses are calculated using the cluster-robust method allowing for correlation between observations within a cluster. Clustering is at the session-subject level. Cluster p value indicates the p value from a two-sided t test of the null hypothesis that the treatment effect is zero using the cluster-robust standard error

Table 4 Effect of partial-information treatment for reaching an equilibrium outcome

| | (1) Overall | (2) Rounds 1–10 | (3) Rounds 11–20 | (4) Rounds 21–30 | (5) Rounds 31–40 |
|---|---|---|---|---|---|
| (a) Stag Hunt | | | | | |
| Partial-information | -0.100 | -0.174 | -0.050 | -0.076 | -0.099 |
| | (0.028) | (0.036) | (0.036) | (0.033) | (0.033) |
| Cluster p-value | 0.001 | 0.000 | 0.167 | 0.023 | 0.004 |
| Full-information mean | 0.811 | 0.808 | 0.798 | 0.831 | 0.808 |
| Number of clusters | 144 | 144 | 144 | 144 | 144 |
| N | 7840 | 1960 | 1960 | 1960 | 1960 |
| (b) Prisoner's Dilemma | | | | | |
| Partial-information | 0.266 | 0.405 | 0.241 | 0.258 | 0.158 |
| | (0.019) | (0.032) | (0.030) | (0.033) | (0.027) |
| Cluster p-value | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 |
| Full-information mean | 0.618 | 0.347 | 0.643 | 0.696 | 0.788 |
| Number of clusters | 142 | 142 | 142 | 142 | 142 |
| N | 7680 | 1920 | 1920 | 1920 | 1920 |

Note: The sample uses the pooled data. The regressions include controls for session size. Standard errors presented in parentheses are calculated using the cluster-robust method allowing for correlation between observations within a cluster. Clustering is at the session-subject level. Cluster p value indicates the p value from a two-sided t test of the null hypothesis that the treatment effect is zero using the cluster-robust standard error

Table 5 Learning model parameter estimates and 95% confidence intervals

| Parameter | SH-Full | SH-Partial | PD-Full | PD-Partial |
|---|---|---|---|---|
| λ | 1.5831 | 0.6516 | 0.4297 | 0.7426 |
| | (1.2159–1.8776) | (0.5184–0.7673) | (0.3347–0.5242) | (0.6615–0.9146) |
| ϕ | 0.9649 | 0.8290 | 0.8011 | 0.8769 |
| | (0.9011–1.0223) | (0.7112–0.9911) | (0.3009–0.9398) | (0.6019–85.013) |
| $A^X(0)$ | 8.5765 | 5.9218 | 9.4480 | 7.1258 |
| | (7.8948–9.7541) | (5.1815–6.4317) | (8.4351–10.5367) | (6.9829–8.5193) |
| $A^Y(0)$ | 6.4400 | 7.0199 | 6.4995 | 8.5930 |
| | (5.8775–6.6354) | (6.8786–7.2157) | (5.5125–7.3160) | (8.5141–12.9087) |
| $L(\lambda)$ | -1227.06 | -1542.83 | -1934.22 | -882.90 |
| n | 3840 | 4000 | 3920 | 3760 |
| $P^X(1)$ (learning model) | 0.9672 | 0.3284 | 0.7802 | 0.2517 |
| | (0.8002–0.9999) | (0.1636–0.5069) | (0.1991–0.9334) | (0.0614–0.3999) |
| $p^X(1)$ (binomial model) | 0.8646 | 0.3200 | 0.6429 | 0.1702 |
| | (0.8038–0.9358) | (0.2537–0.4366) | (0.5641–0.7486) | (0.1231–0.2825) |

Note: Results of tests of significance of the information treatment effect on parameter estimates for λ and ϕ in SH and PD are reported in Appendix Tables D8 and D9, respectively. Differences in values for λ are significant for both SH and PD ($p < 0.05$), while differences in values of ϕ are not significant for PD, and are only weakly significant for SH. For estimates of $p^X(1)$, we employ Agresti–Coull binomial confidence intervals (Agresti and Coull, 1998; Brown et al., 2001).

Funding

University of California, Santa Barbara.

Footnotes

Molly Schwarz: The opinions expressed in this article are those of the author and do not necessarily represent the views of the Federal Communications Commission or the United States Government.

A previous version of this paper was circulated as “Knowing Me, Knowing You: An Experiment on Mutual Payoff Information and Strategic Uncertainty.” We thank Ryan Oprea, Emanuel Vespa, and Sevgi Yuksel for their guidance and encouragement. We are grateful to participants of the UCSB Experimental Economics Seminar, Terri Kneeland, and Yi Zheng for helpful comments and suggestions. Emanuel Vespa inspired and instigated this work by noting that partial-information games are understudied. Funding from the UCSB Department of Economics is gratefully acknowledged. This study obtained IRB approval at UCSB. The replication material for the study is available at http://doi.org/10.17605/OSF.IO/Y2JS7.

Supplementary Information The online version contains supplementary material available at https://doi.org/10.1007/s40881-024-00167-5.

1 By playing multiple rounds, subjects could learn about the game, but not their opponents.

2 For consistency with prior literature, we use the words “cooperate” and “defect” to describe the action choices in the PD, with the acknowledgement that they are only well-defined from the player's perspective in the presence of mutual payoff information.

3 Strictly speaking, following Harsanyi and Selten's (1988) canonical definition of risk dominance, an equilibrium cannot be “risk-dominant” for a player who does not have access to full payoff information. For the sake of clarity and consistency, we will refer to the “risk-dominant” and “payoff-dominant” actions for both treatments in the SH game.

4 The basin of attraction for pure strategies X and Y thus remains constant across treatments in the SH. According to Embrey et al. (2017), the basin of attraction size for a strategy is positively correlated with its selection frequency.

5 Ghidoni et al. (2019) find that cooperation rates in a PD game with ten rounds are very similar when subjects are randomly re-matched in groups of 6 or with a new opponent each round.

6 Five of the 194 subjects initially answered some quiz questions incorrectly, but passed on the second attempt after receiving feedback and a new quiz version with different matrix entries.

7 We do not use session fixed effects in our main specification, since the estimate $\hat{\beta}$ would then only exploit the within-subjects data.

8 Results robust to using logit are found in Appendix Table D5.

9 The exact number of equilibria reached each round may partly depend on the random matching of pairs.

10 Our SH game also has a mixed-strategy Nash equilibrium in which subjects play X two-thirds and Y one-third of the time. Appendix Table D6 shows the share of subjects whose mix of actions is within 10pp of $p_X = 0.667$.
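As a check on the two-thirds figure, the indifference condition behind a mixed-strategy equilibrium can be written out. The derivation below is a sketch that uses the row player's SH payoffs quoted in footnote 12 (11 and 1 for action X, 9 and 5 for action Y) and assumes the game is symmetric, so that the same condition applies to both players; $p$ denotes the probability that the counterpart plays X.

```latex
\begin{align*}
E(X) &= 11p + 1\,(1 - p) = 10p + 1, \\
E(Y) &= 9p + 5\,(1 - p) = 4p + 5, \\
E(X) = E(Y) &\iff 10p + 1 = 4p + 5 \iff p = \tfrac{2}{3} \approx 0.667 .
\end{align*}
```

For $p > 2/3$ action X has the higher expected payoff, and for $p < 2/3$ action Y does.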

11 The efficiency ratio compares the total payoffs of both subjects in a round to the total payoffs of the efficient outcome. Random re-matching of subjects introduces variation in this ratio.

12 Attractions are constrained to the range of possible payoffs of an action, for consistency and easier interpretation. For a given expected probability of one's counterpart playing X in the first round, $P_X(1)$, there is a one-to-one correspondence between the expected value of X and the expected value of Y. For example, if a player assesses in our SH game that $E(X) = 8$ in the first round, this implies $P_X(1) = 0.7$ (note that $0.7 \times 11 + (1 - 0.7) \times 1 = 8$), which in turn implies that $E(Y) = 0.7 \times 9 + (1 - 0.7) \times 5 = 7.8$. We allow both initial attractions to range independently of one another.
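The arithmetic in footnote 12 can be reproduced with a few lines of Python. This is only an illustrative sketch, not the estimation code from the replication package: the payoff numbers are those quoted in the footnote, while the names `PAYOFF`, `expected_value`, and `implied_belief` are hypothetical helpers introduced here.

```python
# Row player's SH payoffs as quoted in footnote 12 (assumed layout):
# PAYOFF[own_action][counterpart_action]
PAYOFF = {
    "X": {"X": 11, "Y": 1},
    "Y": {"X": 9, "Y": 5},
}


def expected_value(action: str, p_x: float) -> float:
    """Expected payoff of `action` if the counterpart plays X with probability p_x."""
    row = PAYOFF[action]
    return p_x * row["X"] + (1 - p_x) * row["Y"]


def implied_belief(action: str, value: float) -> float:
    """Invert expected_value: the p_x at which `action` has expected payoff `value`."""
    row = PAYOFF[action]
    return (value - row["Y"]) / (row["X"] - row["Y"])


# An initial attraction of E(X) = 8 pins down the belief P_X(1),
# and that belief in turn pins down the belief-consistent E(Y).
p_x = implied_belief("X", 8.0)
e_y = expected_value("Y", p_x)
print(f"P_X(1) = {p_x:.3f}, E(Y) = {e_y:.3f}")
```

Because each expected value is linear and strictly increasing in the belief, the mapping can be inverted, which is the one-to-one correspondence the footnote refers to. Letting both initial attractions vary independently, as the authors do, relaxes this link.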

13 We allow for the possibility that symmetric outcomes (i.e., (X, X) or (Y, Y)) may be perceived as more likely.

14 $\phi = 1$ would indicate that all observed actions are weighted equally. Camerer and Ho (1999) comment that values of $\phi$ are “presumably between zero and one.” While we do not constrain the value of $\phi$ in our maximum-likelihood estimations, we note that estimated values of $\phi$ all lie within this range.

15 The treatment effect is also (weakly) significant for parameter $\phi$, but only for the SH. We thank an anonymous referee for pointing out that learning model parameter estimates can be heavily biased if subjects within a cell have parameter heterogeneity (Wilcox, 2006). We cannot test for such heterogeneity in our data.

16 In Alshaikhmubarak et al. (2021), a previous and more expansive version of this paper, we discuss how the interplay of social preferences and strategic uncertainty may play a role in shaping the information treatment effect.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

References

Agresti, A, Coull, BA. (1998). Approximate is better than “exact” for interval estimation of binomial proportions. The American Statistician, 52(2), 119–126.
Alshaikhmubarak, H., Hales, D., Kogelnik, M., Schwarz, M., & Strauss, K. (2021). Knowing me, knowing you: An experiment on mutual payoff information and strategic uncertainty. Available at SSRN 3915018.
Babichenko, Y. (2010). Uncoupled automata and pure Nash equilibria. International Journal of Game Theory, 39(3), 483–502. https://doi.org/10.1007/s00182-010-0227-9.
Battalio, R, Samuelson, L, Van Huyck, J. (2001). Optimization incentives and coordination failure in laboratory Stag Hunt games. Econometrica, 69(3), 749–764. https://doi.org/10.1111/1468-0262.00212.
Brown, LD, Cai, T, DasGupta, A. (2001). Interval estimation for a binomial proportion. Institute of Mathematical Statistics, 16(2), 101–117.
Camerer, C, Ho, T-H. (1999). Experience-weighted attraction learning in normal form games. Econometrica, 67(4), 827–874. https://doi.org/10.1111/1468-0262.00054.
Embrey, M, Fréchette, GR, Yuksel, S. (2017). Cooperation in the finitely repeated Prisoner's Dilemma. The Quarterly Journal of Economics, 133(1), 509–551. https://doi.org/10.1093/qje/qjx033.
Feltovich, N, Oda, SH. (2014). Effect of matching mechanism on learning in games played under limited information. Pacific Economic Review, 3, 260–277. https://doi.org/10.1111/1468-0106.12065.
Fischbacher, U. (2007). z-Tree: Zurich toolbox for ready-made economic experiments. Experimental Economics, 10, 171–178. https://doi.org/10.1007/s10683-006-9159-4.
Foster, D, Young, H. (2006). Regret testing leads to Nash Equilibrium. Theoretical Economics, 1, 341–367.
Fudenberg, D, Levine, DK. (2009). Self-confirming equilibrium and the Lucas critique. Journal of Economic Theory, 144(6), 2354–2371. https://doi.org/10.1016/j.jet.2008.07.007.
Ghidoni, R, Cleave, BL, Suetens, S. (2019). Perfect and imperfect strangers in social dilemmas. European Economic Review, 116, 148–159. https://doi.org/10.1016/j.euroecorev.2019.04.002.
Greiner, B. (2015). Subject pool recruitment procedures: Organizing experiments with ORSEE. Journal of the Economic Science Association, 1, 114–125. https://doi.org/10.1007/s40881-015-0004-4.
Harsanyi, J, Selten, R. (1988). A General Theory of Equilibrium Selection in Games, MIT Press.
Hart, S, Mas-Colell, A. (2006). Stochastic uncoupled dynamics and Nash Equilibrium. Games and Economic Behavior, 57(2), 286–303. https://doi.org/10.1016/j.geb.2005.09.007.
Kneeland, T. (2015). Identifying higher-order rationality. Econometrica, 83(5), 2065–2079. https://doi.org/10.3982/ECTA11983.
Knoepfle, DT, Wang, JT-Y, Camerer, CF. (2009). Studying learning in games using eye-tracking. Journal of the European Economic Association, 7(2–3), 388–398. https://doi.org/10.1162/JEEA.2009.7.2-3.388.
Nagel, R. (1995). Unraveling in guessing games: An experimental study. American Economic Review, 85(5), 1313–1326.
Nash, J. (1950). Non-cooperative games. Ph.D. Dissertation, Princeton University.
Polonio, L, Coricelli, G. (2019). Testing the level of consistency between choices and beliefs in games using eye-tracking. Games and Economic Behavior, 113, 566–586. https://doi.org/10.1016/j.geb.2018.11.003.
Stahl, D, Wilson, P. (1994). Experimental evidence on players' models of other players. Journal of Economic Behavior & Organization, 25(3), 309–327. https://doi.org/10.1016/0167-2681(94)90103-1.
Wilcox, NT. (2006). Theories of learning in games and heterogeneity bias. Econometrica, 74(5), 1271–1292. https://doi.org/10.1111/j.1468-0262.2006.00704.x.
Young, HP. (2009). Learning by trial and error. Games and Economic Behavior, 65(2), 626–643. https://doi.org/10.1016/j.geb.2008.02.011.
Fig. 1 The games and information treatments from the row player's perspective

Fig. 2 Mean action X rates and resulting SH equilibrium

Table 1 Treatments by session

Table 2 Between- vs. within-subjects analysis

Table 3 Effect of partial-information treatment for selection of action X

Table 4 Effect of partial-information treatment for reaching an equilibrium outcome

Table 5 Learning model parameter estimates and 95% confidence intervals
