Prisoner's dilemma

The prisoner’s dilemma is a canonical example of a game analyzed in game theory that shows why two individuals might not cooperate, even if it appears that it is in their best interest to do so. It was originally framed by Merrill Flood and Melvin Dresher, working at RAND in 1950. Albert W. Tucker formalized the game with prison sentence payoffs and gave it the name "prisoner's dilemma" (Poundstone, 1992). A classic example of the prisoner's dilemma (PD) is presented as follows:

Two men are arrested, but the police do not possess enough information for a conviction. After separating the two men, the police offer both the same deal: if one testifies against his partner (defects / betrays) and the other remains silent (cooperates / assists), the betrayer goes free and the cooperator receives the full one-year sentence. If both remain silent, both are sentenced to only one month in jail for a minor charge. If each 'rats out' the other, each receives a three-month sentence. Each prisoner must choose either to betray or to remain silent, and neither learns the other's decision before making his own. What should they do?

If each player is assumed to care only about minimizing his own time in jail, the game is a non-zero-sum game in which each of the two players may either assist or betray the other. Each prisoner's sole concern is to maximize his own payoff. The interesting feature of this problem is that the logical decision leads both to betray the other, even though each would individually be better off if both cooperated.

In the classic version of this game, cooperating is strictly dominated by betraying, and as a result the only possible equilibrium of the game is for both prisoners to betray each other. Regardless of what the other prisoner chooses, one will always gain a greater payoff by betraying the other. Because betraying is always more beneficial than cooperating, all rational prisoners would seemingly betray each other.

In the iterated form of the game, the game is played repeatedly, and consequently both prisoners continually have an opportunity to penalize the other for the previous decision. If the number of rounds is known in advance, the finite length of the game means that, by backward induction, the two prisoners will betray each other in every round.

In casual usage, the label "prisoner's dilemma" may be applied to situations not strictly matching the formal criteria of the classic or iterative games, for instance, those in which two entities could gain important benefits from cooperating or suffer from the failure to do so, but find it merely difficult or expensive, not necessarily impossible, to coordinate their activities to achieve cooperation.

Strategy for the classic prisoner's dilemma

The normal game is shown below:

	Prisoner B stays silent (cooperates)	Prisoner B confesses (defects)
Prisoner A stays silent (cooperates)	Each serves 1 month	Prisoner A: 1 year Prisoner B: goes free
Prisoner A confesses (defects)	Prisoner A: goes free Prisoner B: 1 year	Each serves 3 months

Here, regardless of what the other decides, each prisoner gets a higher pay-off by betraying the other. For example, Prisoner A can state with certainty that no matter what Prisoner B chooses, Prisoner A is better off 'ratting him out' (defecting) than staying silent (cooperating). As a result, solely for his own benefit, Prisoner A should logically betray him. On the other hand, if Prisoner B reasons the same way, then both betray, and both receive a lower reward than if both had stayed quiet. Seemingly logical decisions result in both players being worse off than if each chose to lessen the sentence of his accomplice at the cost of spending more time in jail himself.
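
The dominance argument can be checked mechanically. The following is a minimal sketch that encodes the sentences from the table above in months; the names `SENTENCES`, `best_reply_for_a` and `defect_strictly_dominates` are illustrative and not part of any standard formulation.

```python
# Sentences in months, indexed by (A's action, B's action) -> (A's sentence, B's sentence).
# "C" = stay silent (cooperate), "D" = confess (defect). Values come from the table above.
SENTENCES = {
    ("C", "C"): (1, 1),
    ("C", "D"): (12, 0),
    ("D", "C"): (0, 12),
    ("D", "D"): (3, 3),
}

def best_reply_for_a(b_action):
    """Return the action that minimises A's sentence, given B's action."""
    return min("CD", key=lambda a_action: SENTENCES[(a_action, b_action)][0])

def defect_strictly_dominates():
    """True if confessing gives A a strictly shorter sentence whatever B does."""
    return all(SENTENCES[("D", b)][0] < SENTENCES[("C", b)][0] for b in "CD")

if __name__ == "__main__":
    for b in "CD":
        print(f"If B plays {b}, A's best reply is {best_reply_for_a(b)}")
    print("Defection strictly dominates:", defect_strictly_dominates())
    print("Both defect:", SENTENCES[("D", "D")], "Both silent:", SENTENCES[("C", "C")])
```

Because the game is symmetric, the same check applies to Prisoner B, which is why both end up with three months rather than one.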

Although they are not permitted to communicate, if the prisoners trust each other, they can both rationally choose to remain silent, lessening the penalty for both of them.

Generalized form

We can expose the framework of the traditional Prisoners’ Dilemma by removing its original prisoner setting, presented as the following:

There are two players and an impartial third party. Each player holds two cards, one with the word ‘Cooperate’ and the other with ‘Defect’. Each player gives one card to the third party, which removes any possibility of a player knowing the other’s decision in advance. At the end of the turn, payments are given based on the cards played.

Under the usual rules of the prisoner’s dilemma, suppose the two players are represented by colors, red and blue, and the choices are assigned point values. If the red player defects and the blue player cooperates, red gets the T payoff of 5 points while blue gets the S payoff of 0 points. If both cooperate they get the R payoff of 3 points each, while if they both defect they get the P payoff of 1 point each. The payoffs are shown below.

Example PD payoff matrix
	Cooperate	Defect
Cooperate	3, 3	0, 5
Defect	5, 0	1, 1

In simple terms, the matrix looks like this:

	Cooperate	Defect
Cooperate	win-win	lose more-win more
Defect	win more-lose more	lose-lose

It is then possible to make general the point values:

Canonical PD payoff matrix
	Cooperate	Defect
Cooperate	R, R	S, T
Defect	T, S	P, P

Here T stands for the Temptation to defect, R for the Reward for mutual cooperation, P for the Punishment for mutual defection and S for the Sucker's payoff. To be a prisoner's dilemma, the following must be true:

T > R > P > S

The above condition ensures that the equilibrium outcome is mutual defection, but that mutual cooperation Pareto-dominates equilibrium play. In addition to the above condition, if the game is repeated more than once, the following should hold:^[1]

2 R > T + S

If that condition does not hold, full cooperation is not necessarily Pareto optimal, as the players are collectively better off when they alternate between cooperation and defection, each taking turns to exploit the other.

These rules were established by cognitive scientist Douglas Hofstadter and form the formal canonical description of a typical game of prisoner's dilemma.
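
The two conditions are easy to test for any proposed set of payoffs. The sketch below is illustrative only; the function names are made up for this example, and the second set of values (T = 8) is a hypothetical case chosen to violate the repeated-game condition.

```python
def is_prisoners_dilemma(t, r, p, s):
    """Check the one-shot ordering T > R > P > S."""
    return t > r > p > s

def rewards_steady_cooperation(t, r, s):
    """Check the repeated-game condition 2R > T + S."""
    return 2 * r > t + s

def alternation_average(t, s):
    """Average per-round payoff when the players take turns exploiting each other."""
    return (t + s) / 2

if __name__ == "__main__":
    # Example values from the payoff matrix above: T=5, R=3, P=1, S=0.
    print(is_prisoners_dilemma(5, 3, 1, 0), rewards_steady_cooperation(5, 3, 0))
    # Hypothetical values T=8, R=3, P=1, S=0: still a one-shot dilemma, but
    # alternating yields (8 + 0) / 2 = 4 per round, more than R = 3.
    print(is_prisoners_dilemma(8, 3, 1, 0), rewards_steady_cooperation(8, 3, 0))
    print(alternation_average(8, 0))
```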

A simple special case occurs when the advantage of defection over cooperation is independent of what the co-player does and the cost of the co-player's defection is independent of one's own action, i.e. T+S = P+R.
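
The equality follows directly from the two independence statements. Written out in the T, R, P, S notation used above (the layout assumes the amsmath package):

```latex
\begin{align*}
\text{gain from defecting is the same against C and against D:}\quad & T - R = P - S \\
\text{loss from the co-player's defection is the same whether one plays C or D:}\quad & R - S = T - P \\
\text{either line rearranges to:}\quad & T + S = P + R
\end{align*}
```

The example payoffs 5, 3, 1, 0 do not satisfy this (T+S = 5, P+R = 4). A hypothetical set that does, while still meeting both conditions above, is T = 5, R = 3, P = 2, S = 0.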

The iterated prisoner's dilemma

If two players play prisoner's dilemma more than once in succession and they remember previous actions of their opponent and change their strategy accordingly, the game is called iterated prisoner's dilemma.

The iterated prisoner's dilemma game is fundamental to certain theories of human cooperation and trust. On the assumption that the game can model transactions between two people requiring trust, cooperative behaviour in populations may be modeled by a multi-player, iterated, version of the game. It has, consequently, fascinated many scholars over the years. In 1975, Grofman and Pool estimated the count of scholarly articles devoted to it at over 2,000. The iterated prisoner's dilemma has also been referred to as the "Peace-War game".^[2]

If the game is played exactly N times and both players know this, then it is always game theoretically optimal to defect in all rounds. The only possible Nash equilibrium is to always defect. The proof is inductive: one might as well defect on the last turn, since the opponent will not have a chance to punish the player. Therefore, both will defect on the last turn. Thus, the player might as well defect on the second-to-last turn, since the opponent will defect on the last no matter what is done, and so on. The same applies if the game length is unknown but has a known upper limit.
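
The unraveling can be illustrated numerically within a restricted family of strategies. The sketch below is not a proof and not the general argument: it assumes each player cooperates until a chosen cutoff round (or until the opponent defects) and defects from then on, uses the example payoffs T=5, R=3, P=1, S=0, and repeatedly replaces a cutoff with the best reply to it. All function names are hypothetical.

```python
# Example payoffs to the first player: T=5, R=3, P=1, S=0.
T, R, P, S = 5, 3, 1, 0
PAYOFF = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}

def cutoff_move(cutoff, opp_moves_so_far, round_no):
    """Defect from round `cutoff` onward, or as soon as the opponent has defected."""
    if round_no >= cutoff or "D" in opp_moves_so_far:
        return "D"
    return "C"

def score(my_cutoff, opp_cutoff, n_rounds):
    """Total payoff to the first player over n_rounds."""
    mine, theirs, total = [], [], 0
    for round_no in range(1, n_rounds + 1):
        a = cutoff_move(my_cutoff, theirs, round_no)
        b = cutoff_move(opp_cutoff, mine, round_no)
        total += PAYOFF[(a, b)]
        mine.append(a)
        theirs.append(b)
    return total

def best_cutoff(opp_cutoff, n_rounds):
    """Cutoff in 1..n_rounds+1 with the highest score against opp_cutoff.
    A cutoff of n_rounds+1 means 'never defect first'."""
    return max(range(1, n_rounds + 2), key=lambda k: score(k, opp_cutoff, n_rounds))

def unravel(n_rounds):
    """Start from 'never defect first' and iterate best replies until they stop changing."""
    cutoff = n_rounds + 1
    while True:
        reply = best_cutoff(cutoff, n_rounds)
        if reply == cutoff:
            return cutoff
        cutoff = reply

if __name__ == "__main__":
    print("Best reply to an opponent who never defects first:", best_cutoff(11, 10))
    print("Cutoff after iterating best replies:", unravel(10))
```

Each best reply defects one round earlier than the opponent plans to, so the cutoff walks back to the first round, mirroring the inductive step in the text.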

Unlike the standard prisoner's dilemma, in the iterated prisoner's dilemma the defection strategy is counter-intuitive and fails badly to predict the behavior of human players. Within standard economic theory, though, this is the only correct answer. The superrational strategy in the iterated prisoner's dilemma with fixed N is to cooperate against a superrational opponent, and in the limit of large N, experimental results on strategies agree with the superrational version, not the game-theoretic rational one.

For cooperation to emerge between game-theoretic rational players, the total number of rounds N must be random, or at least unknown to the players. In this case "always defect" may no longer be a strictly dominant strategy, only a Nash equilibrium. Among the results shown by Robert Aumann in a 1959 paper is that rational players repeatedly interacting for indefinitely long games can sustain the cooperative outcome.
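
One common way to make the random-length case concrete, offered here as an illustration rather than as Aumann's argument, is to assume that after every round another round follows with probability w (a symbol introduced only for this example) and that the opponent cooperates until the first defection and defects forever afterwards. Comparing expected totals for cooperating throughout against defecting immediately gives (layout assumes the amsmath package):

```latex
\begin{align*}
\frac{R}{1-w} &\ge T + \frac{wP}{1-w} \\
R &\ge (1-w)\,T + w\,P \\
w &\ge \frac{T-R}{T-P}
\end{align*}
```

With the example payoffs T = 5, R = 3, P = 1 this threshold is w ≥ 1/2: under these assumptions, cooperation can be sustained only if the game is at least as likely to continue as to stop after each round.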

Strategy for the iterated prisoner's dilemma

Interest in the iterated prisoner's dilemma (IPD) was kindled by Robert Axelrod in his book The Evolution of Cooperation (1984). In it he reports on a tournament he organized of the N-step prisoner's dilemma (with N fixed) in which participants have to choose their mutual strategy again and again, and have memory of their previous encounters. Axelrod invited academic colleagues all over the world to devise computer strategies to compete in an IPD tournament. The programs that were entered varied widely in algorithmic complexity, initial hostility, capacity for forgiveness, and so forth.

Axelrod discovered that when these encounters were repeated over a long period of time with many players, each with different strategies, greedy strategies tended to do very poorly in the long run while more altruistic strategies did better, as judged purely by self-interest. He used this to show a possible mechanism for the evolution of altruistic behaviour from mechanisms that are initially purely selfish, by natural selection.

The best deterministic strategy was found to be tit for tat, which Anatol Rapoport developed and entered into the tournament. It was the simplest of any program entered, containing only four lines of BASIC, and won the contest. The strategy is simply to cooperate on the first iteration of the game; after that, the player does what his or her opponent did on the previous move. Depending on the situation, a slightly better strategy can be "tit for tat with forgiveness." When the opponent defects, on the next move, the player sometimes cooperates anyway, with a small probability (around 1–5%). This allows for occasional recovery from getting trapped in a cycle of defections. The exact probability depends on the line-up of opponents.^{[citation needed]}
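
The two strategies described above can be sketched in a few lines. This is not Rapoport's original BASIC program; it is an illustrative Python version in which a strategy is assumed to be a function of both players' move histories, scored with the example payoff matrix (3, 0, 5, 1). The names `play` and `make_forgiving_tit_for_tat`, and the default forgiveness of 0.05, are assumptions made for this example.

```python
import random

# (row move, column move) -> (row payoff, column payoff), example values from the article.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5), ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(own_moves, opp_moves):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not opp_moves else opp_moves[-1]

def always_defect(own_moves, opp_moves):
    return "D"

def make_forgiving_tit_for_tat(forgiveness=0.05, rng=None):
    """Tit for tat that cooperates after a defection with probability `forgiveness`."""
    rng = rng or random.Random()
    def strategy(own_moves, opp_moves):
        if opp_moves and opp_moves[-1] == "D" and rng.random() >= forgiveness:
            return "D"
        return "C"
    return strategy

def play(strategy_a, strategy_b, rounds):
    """Play a match and return the two total scores."""
    moves_a, moves_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(moves_a, moves_b)
        b = strategy_b(moves_b, moves_a)
        pay_a, pay_b = PAYOFF[(a, b)]
        score_a += pay_a
        score_b += pay_b
        moves_a.append(a)
        moves_b.append(b)
    return score_a, score_b

if __name__ == "__main__":
    print(play(tit_for_tat, tit_for_tat, 5))
    print(play(tit_for_tat, always_defect, 5))
    print(play(make_forgiving_tit_for_tat(0.05, random.Random(0)), always_defect, 50))
```

Against an unconditional defector, tit for tat loses only the first round and then matches defection with defection; the forgiving variant gives up a few extra points there in exchange for being able to escape defection cycles against more responsive opponents.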

By analysing the top-scoring strategies, Axelrod^{[citation needed]} stated several conditions necessary for a strategy to be successful.

Nice: The most important condition is that the strategy must be "nice", that is, it will not defect before its opponent does (this is sometimes referred to as an "optimistic" algorithm). Almost all of the top-scoring strategies were nice; therefore, a purely selfish strategy will not "cheat" on its opponent first, for purely self-interested reasons.
Retaliating: However, Axelrod contended, the successful strategy must not be a blind optimist. It must sometimes retaliate. An example of a non-retaliating strategy is Always Cooperate. This is a very bad choice, as "nasty" strategies will ruthlessly exploit such players.
Forgiving: Successful strategies must also be forgiving. Though players will retaliate, they will once again fall back to cooperating if the opponent does not continue to defect. This stops long runs of revenge and counter-revenge, maximizing points.
Non-envious: The last quality is being non-envious, that is, not striving to score more than the opponent (note that a "nice" strategy can never score more than the opponent).

The optimal (points-maximizing) strategy for the one-time PD game is simply defection; as explained above, this is true whatever the composition of opponents may be. However, in the iterated-PD game the optimal strategy depends upon the strategies of likely opponents, and how they will react to defections and cooperations. For example, consider a population where everyone defects every time, except for a single individual following the tit for tat strategy. That individual is at a slight disadvantage because of the loss on the first turn. In such a population, the optimal strategy for that individual is to defect every time. In a population with a certain percentage of always-defectors and the rest being tit for tat players, the optimal strategy for an individual depends on the percentage, and on the length of the game.
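
The dependence on population mix and game length can be made concrete for the two strategies mentioned. The sketch below compares only tit for tat and always-defect (it does not search for the true optimum over all strategies), treats the opponent as drawn at random with a given tit for tat share, and uses the example payoffs T=5, R=3, P=1, S=0. The function names are illustrative.

```python
# Example payoffs from the article's matrix.
T, R, P, S = 5, 3, 1, 0

def expected_scores(tft_share, rounds):
    """Expected match score of a tit for tat player and of an always-defector
    when the opponent is tit for tat with probability tft_share."""
    tft_vs_tft = rounds * R               # mutual cooperation throughout
    tft_vs_alld = S + (rounds - 1) * P    # exploited once, then mutual defection
    alld_vs_tft = T + (rounds - 1) * P    # exploits once, then mutual defection
    alld_vs_alld = rounds * P
    tft = tft_share * tft_vs_tft + (1 - tft_share) * tft_vs_alld
    alld = tft_share * alld_vs_tft + (1 - tft_share) * alld_vs_alld
    return tft, alld

def break_even_share(rounds):
    """Tit for tat share at which both strategies expect the same score,
    or None if always-defect is ahead for every share."""
    denominator = rounds * R - T - rounds * P + 2 * P - S
    if denominator <= 0:
        return None
    return (P - S) / denominator

if __name__ == "__main__":
    print(expected_scores(0.0, 10))   # lone tit for tat among defectors
    print(expected_scores(0.5, 10))
    print(break_even_share(10))
    print(break_even_share(1))        # one-shot game: defection always ahead
```

With no other tit for tat players, the first pair of scores shows the small first-round loss described above; in a ten-round game the break-even share is already small, and it shrinks further as the game gets longer.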

A strategy called Pavlov (an example of Win-Stay, Lose-Switch) cooperates at the first iteration and whenever the player and co-player did the same thing at the previous iteration; Pavlov defects when the player and co-player did different things at the previous iteration. For a certain range of parameters, Pavlov beats all other strategies by giving preferential treatment to co-players which resemble Pavlov.
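
The Pavlov rule as described here depends only on the previous round. A minimal sketch, with strategies again assumed to be functions of the two move histories and `moves_played` a helper invented for this example:

```python
def pavlov(own_moves, opp_moves):
    """Cooperate first; afterwards cooperate only if both players made the
    same move in the previous round."""
    if not own_moves:
        return "C"
    return "C" if own_moves[-1] == opp_moves[-1] else "D"

def always_defect(own_moves, opp_moves):
    return "D"

def moves_played(strategy_a, strategy_b, rounds):
    """Return the move sequences of both players as strings."""
    moves_a, moves_b = [], []
    for _ in range(rounds):
        a = strategy_a(moves_a, moves_b)
        b = strategy_b(moves_b, moves_a)
        moves_a.append(a)
        moves_b.append(b)
    return "".join(moves_a), "".join(moves_b)

if __name__ == "__main__":
    print(moves_played(pavlov, pavlov, 4))
    print(moves_played(pavlov, always_defect, 4))
    # After a round in which exactly one player defected, both Pavlov players
    # defect once and then return to cooperation.
    print(pavlov(["C", "D"], ["C", "C"]), pavlov(["C", "D", "D"], ["C", "C", "D"]))
```

Two Pavlov players cooperate throughout, and recover from a single stray defection within two rounds. Against an unconditional defector, however, Pavlov alternates between cooperating and defecting, so it is exploited every other round.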

Deriving the optimal strategy is generally done in two ways:

Bayesian Nash Equilibrium: If the statistical distribution of opposing strategies can be determined (e.g. 50% tit for tat, 50% always cooperate) an optimal counter-strategy can be derived analytically.^[3]
Monte Carlo simulations of populations have been made, where individuals with low scores die off, and those with high scores reproduce (a genetic algorithm for finding an optimal strategy). The mix of algorithms in the final population generally depends on the mix in the initial population. The introduction of mutation (random variation during reproduction) lessens the dependency on the initial population; empirical experiments with such systems tend to produce tit for tat players (see for instance Chess 1988), but there is no analytic proof that this will always occur.
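
The population approach can be sketched as follows. To keep the example short and reproducible it swaps the Monte Carlo sampling for a deterministic update in which each strategy's share is rescaled by its average match score, and it omits mutation; the three strategies, the starting shares, the match length of 10 rounds and the 50 generations are all assumptions chosen for illustration.

```python
# Per-round payoff to the first player (article's example values).
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def tit_for_tat(own_moves, opp_moves):
    return "C" if not opp_moves else opp_moves[-1]

def always_defect(own_moves, opp_moves):
    return "D"

def always_cooperate(own_moves, opp_moves):
    return "C"

STRATEGIES = {
    "tit_for_tat": tit_for_tat,
    "always_defect": always_defect,
    "always_cooperate": always_cooperate,
}

def match_score(strategy_a, strategy_b, rounds):
    """Total payoff earned by strategy_a in one match against strategy_b."""
    moves_a, moves_b, total = [], [], 0
    for _ in range(rounds):
        a = strategy_a(moves_a, moves_b)
        b = strategy_b(moves_b, moves_a)
        total += PAYOFF[(a, b)]
        moves_a.append(a)
        moves_b.append(b)
    return total

def evolve(shares, rounds=10, generations=50):
    """Each generation, rescale every strategy's population share by its
    average score against the current population (no mutation)."""
    names = list(shares)
    table = {(a, b): match_score(STRATEGIES[a], STRATEGIES[b], rounds)
             for a in names for b in names}
    for _ in range(generations):
        fitness = {a: sum(shares[b] * table[(a, b)] for b in names) for a in names}
        mean = sum(shares[a] * fitness[a] for a in names)
        shares = {a: shares[a] * fitness[a] / mean for a in names}
    return shares

if __name__ == "__main__":
    start = {name: 1 / 3 for name in STRATEGIES}
    for name, share in evolve(start).items():
        print(f"{name}: {share:.3f}")
```

Starting from equal shares, always-defect gains at first by exploiting the unconditional cooperators, then declines as they become scarce, leaving tit for tat as the largest group. Other starting mixes can end differently, in line with the dependence on the initial population noted above.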

Although tit for tat is considered to be the most robust basic strategy, a team from Southampton University in England (led by Professor Nicholas Jennings and consisting of Rajdeep Dash, Sarvapali Ramchurn, Alex Rogers and Perukrishnen Vytelingum) introduced a new strategy at the 20th-anniversary iterated prisoner's dilemma competition, which proved to be more successful than tit for tat. This strategy relied on cooperation between programs to achieve the highest number of points for a single program. The university submitted 60 programs to the competition, which were designed to recognize each other through a series of five to ten moves at the start. Once this recognition was made, one program would always cooperate and the other would always defect, assuring the maximum number of points for the defector. If the program realized that it was playing a non-Southampton player, it would continuously defect in an attempt to minimize the score of the competing program. As a result,^[4] this strategy ended up taking the top three positions in the competition, as well as a number of positions towards the bottom.

This strategy takes advantage of the fact that multiple entries were allowed in this particular competition, and that the performance of a team was measured by that of the highest-scoring player (meaning that the use of self-sacrificing players was a form of minmaxing). In a competition where one has control of only a single player, tit for tat is certainly a better strategy. Because of this new rule, this competition also has little theoretical significance when analysing single agent strategies as compared to Axelrod's seminal tournament. However, it provided the framework for analysing how to achieve cooperative strategies in multi-agent frameworks, especially in the presence of noise. In fact, long before this new-rules tournament was played, Richard Dawkins in his book The Selfish Gene pointed out the possibility of such strategies winning if multiple entries were allowed, but remarked that most probably Axelrod would not have allowed them if they had been submitted. It also relies on circumventing rules about the prisoner's dilemma in that there is no communication allowed between the two players. When the Southampton programs engage in an opening "ten move dance" to recognize one another, this only reinforces just how valuable communication can be in shifting the balance of the game.

Continuous iterated prisoner's dilemma

Most work on the iterated prisoner's dilemma has focused on the discrete case, in which players either cooperate or defect, because this model is relatively simple to analyze. However, some researchers have looked at models of the continuous iterated prisoner's dilemma, in which players are able to make a variable contribution to the other player. Le and Boyd^[5] found that in such situations, cooperation is much harder to evolve than in the discrete iterated prisoner's dilemma. The basic intuition for this result is straightforward: in a continuous prisoner's dilemma, if a population starts off in a non-cooperative equilibrium, players who are only marginally more cooperative than non-cooperators get little benefit from assorting with one another. By contrast, in a discrete prisoner's dilemma, tit for tat cooperators get a big payoff boost from assorting with one another in a non-cooperative equilibrium, relative to non-cooperators. Since nature arguably offers more opportunities for variable cooperation than for a strict dichotomy of cooperation or defection, the continuous prisoner's dilemma may help explain why real-life examples of tit for tat-like cooperation are extremely rare in nature (e.g. Hammerstein^[6]) even though tit for tat seems robust in theoretical models.

Morality

While it is sometimes thought that morality must involve the constraint of self-interest, David Gauthier famously argues that co-operating in the prisoner's dilemma on moral principles is consistent with self-interest and the axioms of game theory.^[7] In his opinion, it is most prudent to give up straightforward maximizing and instead adopt a disposition of constrained maximization, according to which one resolves to cooperate in the belief that the opponent will respond with the same choice, whereas in the classical PD it is explicitly stipulated that the response of the opponent does not depend on the player's choice. This form of contractarianism claims that good moral thinking is just an elevated and subtly strategic version of basic means-end reasoning.

Douglas Hofstadter expresses a strong personal belief^{[citation needed]} that the mathematical symmetry is reinforced by a moral symmetry, along the lines of the Kantian categorical imperative: defecting in the hope that the other player cooperates is morally indefensible. If players treat each other as they would treat themselves, then they will cooperate.

Real-life examples

These particular examples, involving prisoners and bag switching and so forth, may seem contrived, but there are in fact many examples in human interaction as well as interactions in nature that have the same payoff matrix. The prisoner's dilemma is therefore of interest to the social sciences such as economics, politics and sociology, as well as to the biological sciences such as ethology and evolutionary biology. Many natural processes have been abstracted into models in which living beings are engaged in endless games of prisoner's dilemma. This wide applicability of the PD gives the game its substantial importance.

In politics

In political science, for instance, the PD scenario is often used to illustrate the problem of two states engaged in an arms race. Both will reason that they have two options, either to increase military expenditure or to make an agreement to reduce weapons. Either state will benefit from military expansion regardless of what the other state does; therefore, they both incline towards military expansion. The paradox is that both states are acting rationally, but producing an apparently irrational result. This could be considered a corollary to deterrence theory.

In environmental studies

In environmental studies, the PD is evident in crises such as global climate change. All countries will benefit from a stable climate, but any single country is often hesitant to curb CO₂ emissions. The immediate benefit to an individual country of maintaining current behavior is perceived to be greater than the eventual benefit to all countries if behavior were changed, which helps explain the current impasse concerning climate change.^[8]

In psychology

In addiction research/behavioral economics, George Ainslie points out^[9] that addiction can be cast as an intertemporal PD problem between the present and future selves of the addict. In this case, defecting means relapsing, and it is easy to see that not defecting both today and in the future is by far the best outcome, and that defecting both today and in the future is the worst outcome. The case where one abstains today but relapses in the future is clearly a bad outcome—in some sense the discipline and self-sacrifice involved in abstaining today have been "wasted" because the future relapse means that the addict is right back where he started and will have to start over (which is quite demoralizing, and makes starting over more difficult). The final case, where one engages in the addictive behavior today while abstaining "tomorrow" will be familiar to anyone who has struggled with an addiction. The problem here is that (as in other PDs) there is an obvious benefit to defecting "today", but tomorrow one will face the same PD, and the same obvious benefit will be present then, ultimately leading to an endless string of defections.

In economics

Advertising is sometimes cited as a real-life example of the prisoner’s dilemma. When cigarette advertising was legal in the United States, competing cigarette manufacturers had to decide how much money to spend on advertising. The effectiveness of Firm A’s advertising was partially determined by the advertising conducted by Firm B. Likewise, the profit derived from advertising for Firm B is affected by the advertising conducted by Firm A. If both Firm A and Firm B choose to advertise during a given period, the advertising cancels out, receipts remain constant, and expenses increase due to the cost of advertising. Both firms would benefit from a reduction in advertising. However, should Firm B choose not to advertise, Firm A could benefit greatly by advertising. Nevertheless, the optimal amount of advertising by one firm depends on how much advertising the other undertakes. As the best strategy depends on what the other firm chooses, there is no dominant strategy, and this is not a prisoner's dilemma but rather an example of a stag hunt. The outcome is similar, though, in that both firms would be better off were they to advertise less than in the equilibrium. Sometimes cooperative behaviors do emerge in business situations. For instance, cigarette manufacturers endorsed the creation of laws banning cigarette advertising, understanding that this would reduce costs and increase profits across the industry.^[10] This analysis is likely to be pertinent in many other business situations involving advertising.

Without enforceable agreements, members of a cartel are also involved in a (multi-player) prisoners' dilemma.^[11] 'Cooperating' typically means keeping prices at a pre-agreed minimum level. 'Defecting' means selling under this minimum level, instantly stealing business (and profits) from other cartel members. Anti-trust authorities want potential cartel members to mutually defect, ensuring the lowest possible prices for consumers.

In law

The theoretical conclusion of PD is one reason why, in many countries, plea bargaining is forbidden. Often, precisely the PD scenario applies: it is in the interest of both suspects to confess and testify against the other prisoner/suspect, even if each is innocent of the alleged crime.^{[citation needed]}

Multiplayer dilemmas

Many real-life dilemmas involve multiple players. Although metaphorical, Hardin's tragedy of the commons may be viewed as an example of a multi-player generalization of the PD: each villager makes a choice for personal gain or restraint. The collective reward for unanimous (or even frequent) defection is very low payoffs (representing the destruction of the "commons"). The commons are not always exploited: William Poundstone, in a book about the prisoner's dilemma (see References below), describes a situation in New Zealand where newspaper boxes are left unlocked. It is possible for people to take a paper without paying (defecting), but very few do, feeling that if they do not pay then neither will others, destroying the system. Subsequent research by Elinor Ostrom, winner of the 2009 Nobel Prize in Economics, showed that the tragedy of the commons is oversimplified, with the negative outcome driven by outside influences. Without complicating pressures, groups communicate and manage the commons among themselves for their mutual benefit, enforcing social norms to preserve the resource and achieve the maximum good for the group, an example of effecting the best-case outcome for PD.^[12]^[13]

Related games

Closed-bag exchange

Hofstadter^[14] once suggested that people often find problems such as the PD problem easier to understand when it is illustrated in the form of a simple game, or trade-off. One of several examples he used was "closed bag exchange":

Two people meet and exchange closed bags, with the understanding that one of them contains money, and the other contains a purchase. Either player can choose to honor the deal by putting into his or her bag what he or she agreed, or he or she can defect by handing over an empty bag.

In this game, defection is always the best course, implying that rational agents will never play. However, in this case both players cooperating and both players defecting actually give the same result, assuming there are no gains from trade, so chances of mutual cooperation, even in repeated games, are few.

Friend or Foe?

Friend or Foe? is a game show that aired from 2002 to 2005 on the Game Show Network in the United States. It is an example of the prisoner's dilemma game tested by real people, but in an artificial setting. On the game show, three pairs of people compete. As each pair is eliminated, it plays a game similar to the prisoner's dilemma to determine how the winnings are split. If they both cooperate (Friend), they share the winnings 50–50. If one cooperates and the other defects (Foe), the defector gets all the winnings and the cooperator gets nothing. If both defect, both leave with nothing. Notice that the payoff matrix is slightly different from the standard one given above, as the payouts for the "both defect" and the "cooperate while the opponent defects" cases are identical. This makes the "both defect" case a weak equilibrium, compared with being a strict equilibrium in the standard prisoner's dilemma. If you know your opponent is going to vote Foe, then your choice does not affect your winnings. In a certain sense, Friend or Foe has a payoff model between prisoner's dilemma and the game of Chicken.

The payoff matrix is

	Cooperate	Defect
Cooperate	1, 1	0, 2
Defect	2, 0	0, 0

This payoff matrix was later used on the British television programmes Shafted and Golden Balls. The latter show has been analyzed by a team of economists. See: Split or Steal? Cooperative Behavior When the Stakes are Large.

It was also used earlier in the UK Channel 4 gameshow Trust Me, hosted by Nick Bateman, in 2000.
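
The difference between a strict and a weak equilibrium can be checked directly from the two matrices. The sketch below is illustrative: it looks only at pure-strategy profiles, and the names `classify` and `pure_equilibria` are made up for this example. A profile is reported as an equilibrium if neither player can gain by switching alone, and as strict if each player would actually lose by switching.

```python
# (row move, column move) -> (row payoff, column payoff)
STANDARD_PD = {("C", "C"): (3, 3), ("C", "D"): (0, 5), ("D", "C"): (5, 0), ("D", "D"): (1, 1)}
FRIEND_OR_FOE = {("C", "C"): (1, 1), ("C", "D"): (0, 2), ("D", "C"): (2, 0), ("D", "D"): (0, 0)}

def other(action):
    return "D" if action == "C" else "C"

def classify(game, row, col):
    """Return 'strict', 'weak' or None for the pure-strategy profile (row, col)."""
    row_gain = game[(other(row), col)][0] - game[(row, col)][0]
    col_gain = game[(row, other(col))][1] - game[(row, col)][1]
    if row_gain > 0 or col_gain > 0:
        return None  # someone profits from deviating: not an equilibrium
    return "strict" if row_gain < 0 and col_gain < 0 else "weak"

def pure_equilibria(game):
    """Map each pure-strategy equilibrium profile to 'strict' or 'weak'."""
    result = {}
    for row, col in game:
        kind = classify(game, row, col)
        if kind is not None:
            result[(row, col)] = kind
    return result

if __name__ == "__main__":
    print("Standard PD:", pure_equilibria(STANDARD_PD))
    print("Friend or Foe:", pure_equilibria(FRIEND_OR_FOE))
```

For the standard matrix the only equilibrium is mutual defection, and it is strict. For the Friend or Foe? matrix, mutual defection is only weak; under this definition the two profiles in which exactly one player defects are also reported as weak equilibria, because the exploited player gets nothing either way.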

Notes

^ Dawkins, Richard (1989). The Selfish Gene. Oxford University Press. ISBN 0-19-286092-5. Page: 204 of Paperback edition
^ Shy, O., 1996, Industrial Organization: Theory and Applications, Cambridge, Mass.: The MIT Press.
^ For example see the 2003 study “Bayesian Nash equilibrium; a statistical test of the hypothesis” for discussion of the concept and whether it can apply in real economic or strategic situations (from Tel Aviv University).
^ The 2004 Prisoner's Dilemma Tournament Results show University of Southampton's strategies in the first three places, despite having fewer wins and many more losses than the GRIM strategy. (Note that in a PD tournament, the aim of the game is not to “win” matches; that can easily be achieved by frequent defection.) It should also be pointed out that even without implicit collusion between software strategies (exploited by the Southampton team) tit for tat is not always the absolute winner of any given tournament; it would be more precise to say that its long-run results over a series of tournaments outperform its rivals. (In any one event a given strategy can be slightly better adjusted to the competition than tit for tat, but tit for tat is more robust.) The same applies for the tit for tat with forgiveness variant, and other optimal strategies: on any given day they might not 'win' against a specific mix of counter-strategies. An alternative way of putting it is using the Darwinian ESS simulation. In such a simulation, tit for tat will almost always come to dominate, though nasty strategies will drift in and out of the population because a tit for tat population is penetrable by non-retaliating nice strategies, which in turn are easy prey for the nasty strategies. Richard Dawkins showed that here, no static mix of strategies forms a stable equilibrium and the system will always oscillate between bounds.
^ Le, S. and R. Boyd (2007) "Evolutionary Dynamics of the Continuous Iterated Prisoner's Dilemma" Journal of Theoretical Biology, Volume 245, 258–267.
^ Hammerstein, P. (2003). Why is reciprocity so rare in social animals? A protestant appeal. In: P. Hammerstein, Editor, Genetic and Cultural Evolution of Cooperation, MIT Press. pp. 83–94.
^ Contractarianism First published Sun Jun 18, 2000; substantive revision Wed Apr 4, 2007. Stanford Encyclopedia of Philosophy.
^ "Markets & Data". The Economist. 2007-09-27. http://www.economist.com/finance/displaystory.cfm?story_id=9867020.
^ George Ainslie (2001). Breakdown of Will. ISBN 0-521-59694-7.
^ This argument for the development of cooperation through trust is given in The Wisdom of Crowds, where it is argued that long-distance capitalism was able to form around a nucleus of Quakers, who always dealt honourably with their business partners (rather than defecting and reneging on promises, a phenomenon that had discouraged earlier long-term unenforceable overseas contracts). It is argued that dealings with reliable merchants allowed the meme for cooperation to spread to other traders, who spread it further until a high degree of cooperation became a profitable strategy in general commerce.
^ Nicholson, Walter (2000). Intermediate Microeconomics (8th ed.). Harcourt.
^ http://en.wikipedia.org/wiki/Tragedy_of_the_commons
^ http://volokh.com/2009/10/12/elinor-ostrom-and-the-tragedy-of-the-commons/
^ Hofstadter, Douglas R. (1985). Metamagical Themas: questing for the essence of mind and pattern. Bantam Dell Pub Group. ISBN 0-465-04566-9. – see Ch.29 The Prisoner's Dilemma Computer Tournaments and the Evolution of Cooperation.

References

Aumann, Robert (1959). “Acceptable points in general cooperative n-person games”, in R. D. Luce and A. W. Tucker (eds.), Contributions to the Theory of Games IV, Annals of Mathematics Study 40, 287–324, Princeton University Press, Princeton NJ.
Axelrod, R. (1984). The Evolution of Cooperation. ISBN 0-465-02121-2
Bicchieri, Cristina (1993). Rationality and Coordination. Cambridge University Press.
Kenneth Binmore, Fun and Games.
David M. Chess (1988). Simulating the evolution of behavior: the iterated prisoners' dilemma problem. Complex Systems, 2:663–670.
Dresher, M. (1961). The Mathematics of Games of Strategy: Theory and Applications Prentice-Hall, Englewood Cliffs, NJ.
Flood, M.M. (1952). Some experimental games. Research memorandum RM-789. RAND Corporation, Santa Monica, CA.
Kaminski, Marek M. (2004). Games Prisoners Play. Princeton University Press. ISBN 0-691-11721-7
Poundstone, W. (1992). Prisoner's Dilemma. Doubleday, New York, NY.
Greif, A. (2006). Institutions and the Path to the Modern Economy: Lessons from Medieval Trade. Cambridge University Press, Cambridge, UK.
Rapoport, Anatol and Albert M. Chammah (1965). Prisoner's Dilemma. University of Michigan Press.
S. Le and R. Boyd (2007). "Evolutionary Dynamics of the Continuous Iterated Prisoner's Dilemma". Journal of Theoretical Biology, Volume 245, 258–267.
A. Rogers, R. K. Dash, S. D. Ramchurn, P. Vytelingum and N. R. Jennings (2007) “Coordinating team players within a noisy iterated Prisoner’s Dilemma tournament” Theoretical Computer Science 377 (1–3) 243–259.
M.J. van den Assem, D. van Dolder and R.H. Thaler (2010). "Split or Steal? Cooperative Behavior When the Stakes are Large"

External links

Prisoner's Dilemma (Stanford Encyclopedia of Philosophy)
Effects of Tryptophan Depletion on the Performance of an Iterated Prisoner's Dilemma Game in Healthy Adults – Nature Neuropsychopharmacology
Is there a "dilemma" in Prisoner's Dilemma by Elmer G. Wiens
"Games Prisoners Play" – game-theoretic analysis of interactions among actual prisoners, including PD.
Iterated prisoner's dilemma game
Another version of the iterated prisoner's dilemma game
Iterated prisoner's dilemma game applied to Big Brother TV show situation.
The Bowerbird's Dilemma: the prisoner's dilemma in ornithology, a mathematical cartoon by Larry Gonick.
Examples of Prisoners' dilemma
Multiplayer game based on the prisoner's dilemma: play the prisoner's dilemma over IRC, by Axiologic Research.
Prisoner's Dilemma Party Game: a party game based on the prisoner's dilemma.
The Edge cites Robert Axelrod's book and discusses the success of U2 following the principles of IPD.
Classical and Quantum Contents of Solvable Game Theory on Hilbert Space

Topics in game theory

Definitions	Normal-form game · Extensive-form game · Cooperative game · Succinct game · Information set · Preference

Equilibrium concepts	Nash equilibrium · Subgame perfection · Bayesian-Nash · Perfect Bayesian · Trembling hand · Proper equilibrium · Epsilon-equilibrium · Correlated equilibrium · Sequential equilibrium · Quasi-perfect equilibrium · Evolutionarily stable strategy · Risk dominance · Pareto efficiency · Quantal response equilibrium · Self-confirming equilibrium · Strong Nash equilibrium · Markov perfect equilibrium

Strategies	Dominant strategies · Pure strategy · Mixed strategy · Tit for tat · Grim trigger · Collusion · Backward induction · Markov strategy

Classes of games	Symmetric game · Perfect information · Simultaneous game · Sequential game · Repeated game · Signaling game · Cheap talk · Zero-sum game · Mechanism design · Bargaining problem · Stochastic game · Large Poisson game · Nontransitive game · Global games

Games	Prisoner's dilemma · Traveler's dilemma · Coordination game · Chicken · Centipede game · Volunteer's dilemma · Dollar auction · Battle of the sexes · Stag hunt · Matching pennies · Ultimatum game · Rock-paper-scissors · Pirate game · Dictator game · Public goods game · Blotto games · War of attrition · El Farol Bar problem · Cake cutting · Cournot game · Deadlock · Diner's dilemma · Guess 2/3 of the average · Kuhn poker · Nash bargaining game · Screening game · Trust game · Princess and monster game · Monty Hall problem

Theorems	Minimax theorem · Nash's theorem · Purification theorem · Folk theorem · Revelation principle · Arrow's impossibility theorem

See also	Tragedy of the commons · Tyranny of small decisions · All-pay auction · List of games in game theory

Categories:

Game theory
Thought experiments

Wikimedia Foundation. 2010.

Academic Dictionaries and Encyclopedias
