Published online by Cambridge University Press: 25 September 2020
The Nash counterfactual considers the question: what would happen were I to change my behaviour, assuming no one else does? By contrast, the Kantian counterfactual considers the question: what would happen were everyone to deviate from some behaviour? We present a model that endogenizes the decision to engage in this type of Kantian reasoning. Autonomous agents using this moral framework receive psychic payoffs equivalent to the cooperate-cooperate payoff in the Prisoner’s Dilemma, regardless of the other player’s action. Moreover, if both interacting agents play the Prisoner’s Dilemma using this moral framework, their material outcomes are a Pareto improvement over the Nash equilibrium.
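
The contrast between the two counterfactuals can be illustrated with a minimal sketch of a one-shot Prisoner’s Dilemma. Everything here is an assumption made for illustration: the payoff values, the T/R/P/S labels (standard textbook notation, not taken from the paper), and the function names are hypothetical, and the sketch does not reproduce the paper’s model of how agents come to adopt Kantian reasoning. It only shows that unilateral deviation selects mutual defection, that universalized deviation selects mutual cooperation, and that the latter is a Pareto improvement in material payoffs.

```python
# Illustrative Prisoner's Dilemma payoffs (assumed values, not from the paper).
# Standard ordering: T (temptation) > R (reward) > P (punishment) > S (sucker).
T, R, P, S = 5, 3, 1, 0

ACTIONS = ("C", "D")  # cooperate, defect

# PAYOFF[(my_action, other_action)] is my material payoff.
PAYOFF = {
    ("C", "C"): R,
    ("C", "D"): S,
    ("D", "C"): T,
    ("D", "D"): P,
}


def nash_best_response(other_action):
    """Nash counterfactual: vary my own action, holding the other's fixed."""
    return max(ACTIONS, key=lambda a: PAYOFF[(a, other_action)])


def kantian_choice():
    """Kantian counterfactual: compare outcomes in which everyone acts alike."""
    return max(ACTIONS, key=lambda a: PAYOFF[(a, a)])


def is_nash_equilibrium(a1, a2):
    """True if neither player gains from a unilateral deviation."""
    return a1 == nash_best_response(a2) and a2 == nash_best_response(a1)


def kantian_psychic_payoff(other_action):
    """One reading of the abstract: the psychic payoff equals the
    cooperate-cooperate payoff R, whatever the other player does."""
    return PAYOFF[("C", "C")]


nash_profiles = [
    (a1, a2) for a1 in ACTIONS for a2 in ACTIONS if is_nash_equilibrium(a1, a2)
]
kantian_profile = (kantian_choice(), kantian_choice())

# Material payoffs to each player under the two solution concepts.
nash_material = PAYOFF[nash_profiles[0]]
kantian_material = PAYOFF[kantian_profile]
```

With these assumed payoffs, `nash_profiles` contains only mutual defection, while `kantian_profile` is mutual cooperation, which gives both players a strictly higher material payoff.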