Most prior research on the external validity of mixed-motive games has studied only a single game version, a single type of real-life prosocial behavior, or both. The present study takes a different approach. We used multiple game trials with different payoff structures to measure participants’ behavior in the Prisoner’s Dilemma, the Commons Dilemma, and the Public Goods Dilemma. We then examined the associations between these aggregated game behaviors and a wide set of self-reported prosocial behaviors, such as donations, commuting, and environmental behaviors. We also related these prosocial behavior measures to a dispositional measure of prosociality, social value orientation. We report evidence that the weak statistical relationships routinely observed in prior studies are at least partially a consequence of failures to aggregate. More specifically, aggregation over multiple game trials was especially effective for the Prisoner’s Dilemma and only somewhat effective for the Public Goods Dilemma. Aggregation on the side of the prosocial behaviors, in contrast, was effective for both of these games as well as for social value orientation. The Commons Dilemma, however, yielded consistently weak relationships with prosocial behavior, regardless of the level of aggregation. Based on these findings, we conclude that using multiple instances of game behavior and prosocial behavior is preferable to relying on a single measurement.