The Prisoner's Dilemma is one of the earliest (1950) "games" developed in game theory. Game theory, broadly put, is a branch of study that enables us to analyze behavior when individuals can make choices that give them some control -- but not complete control -- over what happens to them. The preponderance of that circumstance in our lives makes game theory a powerful tool for understanding behavior. Simulating the Prisoner's Dilemma is an excellent way of studying the issues of conflict vs. cooperation between individuals and among nations.
Because game theory is so broad, it has been used in everything from anthropology to economics to military strategy to zoology. The Prisoner's Dilemma, just one example of a game, is itself very broad in application. Harvesting fish on the high seas, shooting or not shooting at an enemy, engaging in resistance or civil disobedience that requires some threshold level of participation in order to be successful, and even deciding whether or not to study in a class that's graded on a curve, can all be described by the Prisoner's Dilemma. The game is useful for any class that explores problems of social coordination. (At the bottom of this page are buttons that will take you to these different scenarios, but I strongly suggest reading this page through first.)
The Prisoner's Dilemma is a fascinating game because it is one for which the optimal outcome, the one that would be best for both players, is not the one that those individual players will reach. The "invisible hand" fails. But: if the game is played over and over, the optimum might prevail. How? To see this, we examine the game in its "classic" form: the arrest.
Two people have been arrested separately, and are held in separate cells. They are not allowed to communicate. Each is told the following: "If you confess and your partner does not, you will go free and your partner will serve 40 years. If you both confess, you will each serve 2 years. If neither of you confesses, you will each serve 30 days on a lesser charge."
The "normal form" of a game organizes the information in a table such as the one below, listing behaviors and outcomes. One player is labeled the "Row Player." Her choices ("strategies") are listed in the far left column of the table. When she selects a strategy, she determines which row the game is played in. The other is called the "Column Player." His strategies are listed in the top row. When he selects a strategy, he determines which column the game is played in. (Thus, as indicated above, each player has some -- but not total -- control over the outcome.)
In the table below, the Row Player's strategies label the rows and the Column Player's strategies label the columns. In each cell, the Row Player's outcome is listed first, followed by the Column Player's.
                          Column: Confess        Column: Don't Confess
    Row: Confess          2 years, 2 years       Go Free, 40 years
    Row: Don't Confess    40 years, Go Free      30 days, 30 days
Each prisoner -- risk-averse or not -- will Confess, because confessing is a dominant strategy. Confessing is better if the other person confesses (that way, you get 2 years instead of 40), and confessing is also better if the other person does not confess (that way, you get to go free instead of having to serve 30 days). Thus the equilibrium -- the behavior we expect to see -- is that each player confesses. The optimum, however, is clearly for neither player to confess, since each player then gets only 30 days.
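This dominance argument can be checked mechanically. The sketch below is only an illustration (the variable names are mine; the sentences come from the table above): it expresses each outcome as a cost in days and finds the best reply to each possible move by the other prisoner.

```python
# Prison terms from the classic scenario, expressed as costs in days
# (lower is better). SENTENCE[(my_move, their_move)] = my sentence.
SENTENCE = {('Confess', 'Confess'): 2 * 365,    # both confess: 2 years each
            ('Confess', "Don't"):   0,          # I confess alone: go free
            ("Don't",   'Confess'): 40 * 365,   # I hold out alone: 40 years
            ("Don't",   "Don't"):   30}         # both hold out: 30 days each

# For each possible move by the other prisoner, find my best reply.
for their_move in ('Confess', "Don't"):
    best = min(('Confess', "Don't"), key=lambda m: SENTENCE[(m, their_move)])
    print(f"If the other plays {their_move}, my best reply is {best}")
```

Whichever move the other prisoner makes, the best reply is Confess -- which is exactly what makes the strategy dominant.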
A first thought is that an innocent person might protest her innocence and refuse to confess, but that person has to reflect that -- since she is innocent -- she doesn't know whether the other person is guilty. It would occur to the innocent person that the other detainee, if guilty, would jump at the chance to go free and implicate someone else. With 40 years at stake, confession has nothing to do with whether one is guilty. It has only to do with what is at risk. This "classic" form of the game can be used to discuss many issues in criminal justice: for example, can we ever assume that a confession indicates guilt? What truth value does a confession have? What risks do we run when we offer plea bargains in multi-defendant cases?
Following Axelrod's specification in The Evolution of Cooperation (New York: Basic Books, 1984), a Prisoner's Dilemma is any game, between any two players or any person vs. n other symmetrically situated players, where each must choose to "Cooperate" or "Defect," and the payoffs are as follows:
    T > R > P > S, where T is the Temptation payoff (you defect while the other cooperates), R the Reward for mutual cooperation, P the Punishment for mutual defection, and S the Sucker's payoff (you cooperate while the other defects); and
    R > (T+S)/2, so that mutual cooperation is better than alternating exploitation.
The ordering T > R > P > S says that the best position to be in is that of the Defector when the other player cooperates (the Temptation payoff T). The next best is to be a Cooperator when the other cooperates (the Reward R). The next best is to be a Defector when the other defects (the Punishment P). The worst is to be a Cooperator when the other defects (the Sucker's payoff S). The condition R > (T+S)/2 is important if the game can be repeated. It means that both individuals together would be better off cooperating with each other than they could be by taking turns shafting each other.
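Axelrod's tournament used the point values T=5, R=3, P=1, S=0 (those particular numbers are his; any values meeting both inequalities define a Prisoner's Dilemma). A quick check that they qualify:

```python
# Axelrod's canonical tournament payoffs (points; higher is better).
T, R, P, S = 5, 3, 1, 0  # Temptation, Reward, Punishment, Sucker

# The defining payoff ordering of a Prisoner's Dilemma.
assert T > R > P > S

# Mutual cooperation must beat taking turns exploiting each other.
assert R > (T + S) / 2   # 3 > 2.5

print("Both Prisoner's Dilemma conditions hold")
```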
The Prisoner's Dilemma would be a completely depressing game if not for the possibility that cooperation can evolve in the long run, even though in the short run it always seems better to strike and run (to defect). And constant defection is not, in fact, what we always see.
Robert Axelrod, in The Evolution of Cooperation, conducted a series of experiments in which the game was played repeatedly. Participants were invited to submit strategies for playing the game over and over, across an undefined horizon. He found that cooperation can arise if the game is repeated either indefinitely or over an unknown period, provided the players involved have a large enough chance of meeting again. This repeat meeting lets individuals punish defection or reward cooperation that happened in earlier periods.
Of course, if there are only two players, meeting again is certain. The uncertain or infinite time horizon is important: if it were known, for example, that the game was going to be played 5 times, each player would defect the 5th time, because they could not be punished later on. And, knowing that everyone was going to defect on the 5th time, each player would see no reason not to defect on the 4th time; since everyone will then defect on the 4th time, there is no harm in defecting on the 3rd, and so on, until the entire possibility of cooperation evaporates.
Axelrod's book, which would be an excellent supplementary reading for a number of courses, goes over the structure of the game. It also gives two fascinating real-world examples of "Cooperation Without Friendship or Foresight": "The Live-and-Let-Live System in Trench Warfare in World War I," (described in sub-pages to this document) and "The Evolution of Cooperation in Biological Systems."
Most important, Axelrod invited game theorists to play the multi-round Prisoner's Dilemma by computer, each submitting a strategy plan. He presented the fascinating conclusion that the most successful strategy for playing the Prisoner's Dilemma repeatedly is TIT FOR TAT. This strategy starts out "nice"; that is, it Cooperates in the first round of play. But it is observant, and it punishes, and it forgives: in each subsequent round of play it does whatever the other player did in the previous round. This combination of initial niceness, watchfulness, punishment, and forgiveness means the strategy can exploit every opportunity of getting the R payoff while minimizing the times it gets the S payoff. Over the long run, TIT FOR TAT accumulated a higher total score than any other strategy submitted to Axelrod's tournament -- not by beating its individual opponents (head-to-head, it can never outscore the strategy it is paired with), but by drawing cooperation from a wide range of them. Because TIT FOR TAT cooperates in an environment where others cooperate, but is not a victim when others defect, it did best overall. Thus Axelrod demonstrates that cooperation can evolve, even without communication, through self-interest, under the right circumstances. It is a hopeful conclusion.
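The flavor of Axelrod's tournament can be reproduced in a few lines. The sketch below is a minimal illustration (the function and variable names are mine, and the payoffs are Axelrod's canonical points), pitting TIT FOR TAT against itself and against a strategy that always defects:

```python
# Iterated Prisoner's Dilemma sketch with Axelrod's payoffs:
# T=5, R=3, P=1, S=0. PAYOFF[(move_a, move_b)] = (score_a, score_b).
PAYOFF = {('C', 'C'): (3, 3),   # R, R: mutual cooperation
          ('C', 'D'): (0, 5),   # S, T: sucker vs. temptation
          ('D', 'C'): (5, 0),   # T, S
          ('D', 'D'): (1, 1)}   # P, P: mutual defection

def tit_for_tat(my_history, their_history):
    # Cooperate first; afterwards, copy the opponent's last move.
    return 'C' if not their_history else their_history[-1]

def always_defect(my_history, their_history):
    return 'D'

def play(strategy_a, strategy_b, rounds=200):
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pay_a, pay_b = PAYOFF[(move_a, move_b)]
        score_a += pay_a
        score_b += pay_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

print(play(tit_for_tat, tit_for_tat))    # (600, 600): steady cooperation
print(play(tit_for_tat, always_defect))  # (199, 204): exploited once, then punishes
```

Against itself, TIT FOR TAT cooperates every round. Against ALWAYS DEFECT it is exploited exactly once, then punishes for the rest of the game, so it gives up very little.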
In the Prisoner's Dilemma game above T, R, P and S are expressed in years in prison, but they can be expressed in utility (for economists), money, fitness levels (for zoologists), or any other unit. In a classroom simulation, they can be expressed in points. It's possible to do it in extra credit points. I've also had success -- and much less angst and controversy -- by doing it in M&Ms and paying the students off during the next class period.
It is usually best to do this simulation after the students have seen the game in "normal" (table) form. Sometimes I have done it "morally neutral"; that is, labeled the strategies X and Y rather than Cooperate and Defect, so the choices are not emotionally charged. I list the payoffs but don't identify them by the (equally emotionally charged) letters T, R, P and S. You can also do it, however, by giving the students the scenario above, or one of the alternate scenarios suggested by the other tables. Or you can give the morally neutral version first and then switch to one of the others. To conduct the simulation, you will need:
Scenario pages: Neutral Form | Student's Dilemma | WWI: Live & Let Live | Open Sea Fishing | Trade vs. Raid | Holocaust Resistance | History of Game Theory