Game theory - A Tutorial

Infinitely Repeated Prisoner’s Dilemma

Say that Alice's reward at step n is a(n), for n = 0, 1, 2, …. For a discount factor b with 0 < b < 1, we define Alice's total discounted reward as

A = a(0) + a(1)b + a(2)b^2 + a(3)b^3 + ….

The justification is that future rewards should be discounted. ($1.00 in ten years is worth quite a bit less than $1.00 today because of inflation.)
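As a rough illustration of the discounted sum above, here is a minimal Python sketch. The function name discounted_total, the value b = 0.9, and the 500-step truncation are assumptions made for this example only; they do not come from the tutorial. The sketch truncates the infinite series and compares it with the geometric-series value 1/(1 - b) for a constant reward of 1 per step.

```python
def discounted_total(rewards, b):
    """Return a(0) + a(1)*b + a(2)*b**2 + ... for a finite list of rewards."""
    return sum(a * b**n for n, a in enumerate(rewards))


# Assumed values for illustration only.
b = 0.9          # discount factor, 0 < b < 1
horizon = 500    # truncation of the infinite sum

# A constant reward of 1 at every step.
constant_stream = [1.0] * horizon

total = discounted_total(constant_stream, b)
closed_form = 1.0 / (1.0 - b)  # geometric series: 1 + b + b^2 + ...

print(total, closed_form)
```

With b close to 1 the later steps keep most of their weight, which is why the choice of b matters in the argument that follows.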

Folk Theorem: Pick any vector (A, B) in the interior of the set S shown in the figure. There is some discount factor b < 1 such that (A, B) corresponds to some subgame perfect equilibrium (SPE).

Here is the idea. The vector (A, B) corresponds to playing the four possible pairs of strategies according to some periodic pattern. Each player can then state the strategy “I will play that periodic pattern as long as you do; the first time you deviate from it, I will start playing D forever.” This is a threat strategy. It is credible because playing (D, D) forever is an SPE: if Alice decides to play D forever, Bob can do no better than to play D as well. Hence, each player's cost will be 4 at every step after the first deviation. Thus, even though Bob might decrease his cost to 1 for one step, his cost will rise to 4 at every later step if he deviates once. However, had he stuck with the periodic pattern, Bob’s cost would be B < 4. If the discount factor b is close enough to 1, the small one-step gain from deviating is outweighed by the increased future cost.
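The comparison in this argument can be sketched numerically. The costs 1 (one-step deviation) and 4 (punishment under (D, D)) come from the text; the cooperative cost of 2 per step, the function names, and the use of a constant pattern instead of a general periodic one are assumptions made for this example only. Under those assumptions, sticking costs 2/(1 - b) in total, while deviating at step 0 costs 1 + 4b/(1 - b).

```python
COST_COOPERATE = 2.0  # hypothetical per-step cost of the agreed pattern (not from the text)
COST_DEVIATE = 1.0    # one-step cost when deviating (from the text)
COST_PUNISH = 4.0     # per-step cost under (D, D) forever (from the text)


def cost_if_stick(b):
    """Total discounted cost of following the pattern forever."""
    return COST_COOPERATE / (1.0 - b)


def cost_if_deviate(b):
    """Total discounted cost of deviating at step 0, then being punished forever."""
    return COST_DEVIATE + b * COST_PUNISH / (1.0 - b)


def threshold():
    """Smallest b at which sticking is no worse than deviating.

    Solves c/(1-b) = d + p*b/(1-b), which gives b = (c - d)/(p - d).
    """
    return (COST_COOPERATE - COST_DEVIATE) / (COST_PUNISH - COST_DEVIATE)


for b in (0.1, 0.5, 0.9):
    print(b, cost_if_stick(b), cost_if_deviate(b))
print("threshold:", threshold())
```

With these assumed numbers the threat deters deviation once b is at least 1/3; a different cooperative cost would move that threshold.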

Note that this argument requires that both players do better at (A, B) than at the equilibrium (4, 4), that is, A < 4 and B < 4, for the threat to be effective.