Let’s say 2 people have 2 options, and which option they each pick is going to define how much stuff they will get with one another. The numbers represent something that they want, like money. Or points! Everybody likes points; they want as many points as they can get.
What makes the prisoner’s dilemma the prisoner’s dilemma is really this number being bigger than this number, this number being bigger than this number, and this number being bigger than this number. In a pattern like this. They can pick A or B, but they have no control over whether their opponent will pick A or B.
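To make that on-screen pattern concrete, here is a sketch using Axelrod's classic per-round values (the same numbers implied later in the video by the tournament totals). The specific numbers are just one standard choice; only the ordering matters.

```python
# One standard prisoner's-dilemma payoff table (Axelrod's values).
# Keys are (my_pick, their_pick); the value is MY payoff.
# A = the sharing/cooperating option, B = the selfish one.
PAYOFF = {
    ("A", "A"): 3,  # both pick A: decent for everyone
    ("A", "B"): 0,  # I pick A, they pick B: I get nothing
    ("B", "A"): 5,  # I pick B, they pick A: best for me
    ("B", "B"): 1,  # both pick B: bad for everyone
}

# "This number bigger than this number", three times, i.e.:
T = PAYOFF[("B", "A")]  # temptation to defect
R = PAYOFF[("A", "A")]  # reward for mutual cooperation
P = PAYOFF[("B", "B")]  # punishment for mutual defection
S = PAYOFF[("A", "B")]  # sucker's payoff
assert T > R > P > S    # the ordering that defines the dilemma
```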
So when faced with their choices, they see that option B always gets them more.
And it's the same for the other guy: going option B is always better. But because of the way it's set up, both going option B is the worst outcome for the group. And really not
one of the better situations for the individual. Because going option A is worse for the individual, but better for the other person, and best for the group, we might call it sharing or cooperating or whatever, depending on the situation. It looks like working together to get more
together. Going option B is always better for the individual,
but worse for the other person, so it looks like defecting, cheating, or betraying depending
on the situation. They're aiming for personal gain.
If they’re only going to play once, a player will always get more by defecting.
But if they are going to interact with somebody multiple times; if they play once, then they
play again and again, and we add up their scores, the strategy changes.
In one-off games, defecting gives a higher payout, no matter what the other person is doing.
And with multiple games, if the opponent cooperates and defects at random, or follows a set pattern,
always defecting still gives the best payout. Try always cooperating with them? No. Start
off cooperating then defect? No. Try to line up cooperation and defection? Doesn’t matter.
Always defecting is better. Because here, like in a one-off game, defecting
has no consequences, and it always gives a higher payout.
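A quick sketch of that claim (per-round payoffs assumed from Axelrod's tournament values, introduced later): against an opponent whose moves are fixed in advance, whether random or patterned, defecting every round can't lose.

```python
import random

# Payoffs for (my_move, their_move), using Axelrod's values.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def total(my_moves, their_moves):
    """My total payoff for a fixed plan against a fixed opponent sequence."""
    return sum(PAYOFF[pair] for pair in zip(my_moves, their_moves))

random.seed(0)
opponent = [random.choice("CD") for _ in range(200)]  # unresponsive opponent

always_defect = ["D"] * 200
always_coop   = ["C"] * 200
alternating   = ["CD"[i % 2] for i in range(200)]

# Each round, D pays strictly more than C against the same opponent
# move (5 > 3 and 1 > 0), so ALWAYS DEFECT beats any other fixed plan.
assert total(always_defect, opponent) > total(always_coop, opponent)
assert total(always_defect, opponent) > total(alternating, opponent)
```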
But if what a player picks changes depending on what the other player did, for example: this player starts off cooperating and always cooperates, unless their opponent defects. Then it switches to defecting and defects no matter what. You know, it sort of gets pissed off. We’ll call it GRUDGER. Any strategy that started off cooperating with GRUDGER, or always cooperated, would have gotten a higher score than ALWAYS DEFECT.
Because here defecting can have consequences. With multiple games there is an opportunity
to influence the other player for future games. ALWAYS DEFECT isn’t the best strategy anymore.
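A minimal sketch of that idea, with GRUDGER's logic as described and per-round payoffs assumed from the tournament values given later: against GRUDGER, cooperating the whole time beats ALWAYS DEFECT over 200 rounds.

```python
# Per-round payoffs as (mine, theirs), Axelrod's values assumed.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def grudger(own, opp):
    # Cooperate until the opponent defects once, then defect forever.
    return "D" if "D" in opp else "C"

def always_defect(own, opp):
    return "D"

def always_cooperate(own, opp):
    return "C"

def match(s1, s2, rounds=200):
    """Play two strategies against each other; return their totals."""
    h1, h2, score1, score2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = s1(h1, h2), s2(h2, h1)
        p1, p2 = PAYOFF[(m1, m2)]
        h1.append(m1); h2.append(m2)
        score1 += p1; score2 += p2
    return score1, score2

# ALWAYS DEFECT gets one exploit (5), then mutual defection: 5 + 199 * 1.
ad_score, _ = match(always_defect, grudger)
# ALWAYS COOPERATE never triggers the grudge: 200 * 3.
ac_score, _ = match(always_cooperate, grudger)
assert ad_score == 204 and ac_score == 600
```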
What is? In 1980 Robert Axelrod held a tournament where
anyone could submit a strategy. Each strategy did 200 rounds against each other strategy.
There were 14 strategies submitted. Plus the strategy 50/50 RANDOM.
These were the payoffs. So if two strategies cooperated with each other for all 200 rounds,
they would each get 600. If they both defected they would both get 200. If one cooperated
and the other defected the whole time, one would get 0 and the other 1000, the highest
and lowest possible scores. If they went back and forth, like this, they would each get
500. And these are the averaged results of the
tournament. The winner was a strategy called TIT FOR TAT.
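The payoff totals above can be reproduced with a quick sketch; the per-round payoffs 3/1/5/0 are what the stated totals imply over 200 rounds.

```python
# Per-round payoffs as (player 1's, player 2's).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def totals(moves1, moves2):
    """Total scores for two fixed 200-move sequences."""
    s1 = sum(PAYOFF[(a, b)][0] for a, b in zip(moves1, moves2))
    s2 = sum(PAYOFF[(a, b)][1] for a, b in zip(moves1, moves2))
    return s1, s2

C200, D200 = ["C"] * 200, ["D"] * 200
alt1 = ["CD"[i % 2] for i in range(200)]  # C, D, C, D, ...
alt2 = ["DC"[i % 2] for i in range(200)]  # D, C, D, C, ...

assert totals(C200, C200) == (600, 600)  # mutual cooperation
assert totals(D200, D200) == (200, 200)  # mutual defection
assert totals(C200, D200) == (0, 1000)   # lowest and highest possible
assert totals(alt1, alt2) == (500, 500)  # taking turns exploiting
```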
TIT FOR TAT cooperates on the first round and from then on it just copies what the other
person did last round. Why did it win? It’s a simple strategy, but one thing it does is reciprocate quickly against defectors. Any strategy that tries to take advantage of TIT FOR TAT gets instantly punished and put into a bad situation. So if the other strategy keeps defecting, or even if it tries to go back to cooperating, it will have gained less than it would have by just cooperating with TIT FOR TAT the whole time. And if TIT FOR TAT didn’t punish, TIT FOR TAT would have been worse off.
So we might say TIT FOR TAT is “retaliating”: it punishes defection. Which is good, because it can prevent some losses, and it can disincentivize an opponent from defecting. Another thing is, TIT FOR TAT is never the
first to defect. Players would want to be in this situation as much as they can. There is a temptation to defect when the other person is cooperating, but any responsive opponent would quickly defect too, and they would both end up here. It’s risky to defect. An easy solution is to ignore that temptation and just try to maximize long-term mutual cooperation. Start off cooperating, and then never defect unless you need to punish someone.
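That recipe is the whole of TIT FOR TAT. A minimal sketch, with the match loop and the 3/1/5/0 payoffs assumed from the tournament setup:

```python
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(own, opp):
    # Cooperate first, then copy the opponent's previous move.
    return "C" if not opp else opp[-1]

def match(s1, s2, rounds=200):
    """Play two strategies against each other; return their totals."""
    h1, h2, score1, score2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = s1(h1, h2), s2(h2, h1)
        p1, p2 = PAYOFF[(m1, m2)]
        h1.append(m1); h2.append(m2)
        score1 += p1; score2 += p2
    return score1, score2

# Against ALWAYS DEFECT it loses only the first round, then retaliates:
tft, ad = match(tit_for_tat, lambda own, opp: "D")
assert (tft, ad) == (199, 204)
# Against itself it locks in full cooperation:
assert match(tit_for_tat, tit_for_tat) == (600, 600)
```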
It can end up giving better gains, especially with opponents that do it too.
A strategy that never defects first kind of looks like it’s being nice, so we could say TIT FOR TAT is a nice strategy. And it seemed to be a good trait to have in this tournament,
the top 8 strategies were nice, and the bottom 7 were not.
The least successful nice strategy was GRUDGER. Never the first to defect, but once the other player defected, it never cooperated again no matter what. Which doesn’t really give a great payout.
TIT FOR TAT allows cooperation if the other person wants to cooperate again, so they can both cooperate going forward. The amount of punishment GRUDGER gives hurts the punisher
alongside the punished. OK, so we could say TIT FOR TAT is forgiving,
gives a quick punish then allows for mutual cooperation again. And since it’s just copying,
if that other strategy keeps defecting, so does TIT FOR TAT.
These seemed to be the traits that made TIT FOR TAT so good in this tournament.
It’s nice: it’s not tempted by this risky option. It’s retaliating: it disincentivizes being taken advantage of. And it’s forgiving: it will allow getting back to cooperation.
Because TIT FOR TAT is just copying, it can’t ever beat an opponent. It can either tie.
Or lose. Really just depends on whether the opponent defects in the very last round where
TIT FOR TAT can’t reciprocate. The opposite is true for ALWAYS DEFECT, it
can only tie. Or win, if the opponent ever tried cooperating. But that doesn’t matter. Which strategy wins is about total points, not score relative to any given opponent. TIT FOR TAT can run into problems though.
JOSS is a strategy that’s basically TIT FOR TAT, but sometimes it tries defecting. Against regular TIT FOR TAT, they would go: “Hey, you cheated.” “Hey, you cheated.” “Hey, YOU cheated.”
Back and forth: a sort of defection echo. Then, when JOSS tries defecting again mid-echo, it collapses into all-out mutual defection. There were a few strategies that could have
won this tournament if they had been entered. One was called FORGIVING TIT FOR TAT, or TIT
FOR TWO TATS. This strategy requires 2 defections before it retaliates. It would have prevented
the echo effects that hurt regular TIT FOR TAT and won the tournament.
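A sketch of the echo, with JOSS modeled as TIT FOR TAT that sneaks a defection 10% of the time when it would cooperate (a common reading of the strategy; the exact rate is an assumption), and TIT FOR TWO TATS waiting for two defections in a row:

```python
import random

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(own, opp):
    return "C" if not opp else opp[-1]

def joss(own, opp):
    move = "C" if not opp else opp[-1]
    if move == "C" and random.random() < 0.1:
        return "D"  # the occasional sneaky defection
    return move

def tit_for_two_tats(own, opp):
    # Retaliate only after two defections in a row.
    return "D" if opp[-2:] == ["D", "D"] else "C"

def match(s1, s2, rounds=200):
    h1, h2, score1, score2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = s1(h1, h2), s2(h2, h1)
        p1, p2 = PAYOFF[(m1, m2)]
        h1.append(m1); h2.append(m2)
        score1 += p1; score2 += p2
    return score1, score2

random.seed(1)
tft, j1 = match(tit_for_tat, joss)
# Echoes drag the pair far below the 1200 total of pure cooperation.
assert tft + j1 < 1200

random.seed(1)
f, j2 = match(tit_for_two_tats, joss)
# Absorbing lone defections avoids the echoes; this pair does better overall.
assert f + j2 > tft + j1
```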
It gained more by preventing scenarios like this than it lost by letting itself be taken advantage of once in a while.
Most strategies that tried to improve on TIT FOR TAT did so by being less nice, trying to find a way to capitalize on defection. Instead, the opposite was the case: not even punishing every defection ended up being better in the long run. At least in this environment.
Later Axelrod held a second tournament. This
time there were a lot more entries. And they didn’t do a set 200 rounds. That way nobody
would know when the interaction would end. See the footnotes below.
In this tournament, even though FORGIVING TIT FOR TAT was entered, regular TIT FOR TAT
won again. FORGIVING TIT FOR TAT didn’t win this time, because people knew about it.
A strategy called TESTER starts out cooperating
but tries defecting like this to see how the player reacts. If the opponent punishes, it
cooperates to apologize and prevent echo defections, then just becomes tit for tat for the rest
of the time. So this is what it would look like against TIT FOR TAT. But against easygoing
strategies like FORGIVING TIT FOR TAT, it can learn that it’s able to take advantage of them. FORGIVING TIT FOR TAT proved too forgiving. At least in this context.
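To see the “too forgiving” problem, here’s a sketch with a simple exploiter in the TESTER spirit (not Axelrod’s exact TESTER): it alternates defect, cooperate, so TIT FOR TWO TATS never sees two defections in a row and never punishes.

```python
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_two_tats(own, opp):
    # Retaliate only after two defections in a row.
    return "D" if opp[-2:] == ["D", "D"] else "C"

def alternator(own, opp):
    # Defect every other round: never two defections in a row.
    return "DC"[len(own) % 2]

def match(s1, s2, rounds=200):
    h1, h2, score1, score2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = s1(h1, h2), s2(h2, h1)
        p1, p2 = PAYOFF[(m1, m2)]
        h1.append(m1); h2.append(m2)
        score1 += p1; score2 += p2
    return score1, score2

exploiter, victim = match(alternator, tit_for_two_tats)
# 100 rounds of (5, 0) and 100 rounds of (3, 3):
assert (exploiter, victim) == (800, 300)
```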
Let’s change the rules a bit. Let’s say we’re in a reproduction situation.
These points aren’t just points, they represent resources that could be used for reproduction.
If a strategy gets lots of points, like TIT FOR TAT, it will reproduce more and we’ll put more of them into the next generation, the next tournament. If it gets fewer points, like 50/50 RANDOM, we’ll put fewer of them into the next generation.
TIT FOR TAT and other successful strategies reproduced well and followed upward arcs like
these. The not-so-successful strategies followed downward trends and went extinct. Exploitative strategies like HARRINGTON did well at the start, but as their victims went extinct, their populations declined as well. The really successful strategies were ones that could work well with other successful strategies, basically nice or otherwise cooperative strategies. They supported one another and were able to continue to reproduce.
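A toy version of that reproduction tournament: a replicator sketch with three strategies, where the per-match totals over 200 rounds come from the payoffs above, and an always-cooperator (AC) stands in for the exploiter’s “victims” (the three-strategy mix is an illustrative assumption, not Axelrod’s actual field).

```python
# M[a][b] = total that strategy a scores against b over a 200-round
# match, derived from the per-round payoffs (3, 1, 5, 0):
M = {
    "TFT": {"TFT": 600, "AD": 199, "AC": 600},
    "AD":  {"TFT": 204, "AD": 200, "AC": 1000},
    "AC":  {"TFT": 600, "AD": 0,   "AC": 600},
}

pop = {"TFT": 1 / 3, "AD": 1 / 3, "AC": 1 / 3}
for _ in range(200):
    # Each strategy's fitness is its score against the current mix;
    # its share of the next generation is proportional to that.
    fitness = {a: sum(M[a][b] * pop[b] for b in pop) for a in pop}
    mean = sum(fitness[a] * pop[a] for a in pop)
    pop = {a: pop[a] * fitness[a] / mean for a in pop}

# The exploiter thrives briefly on AC, then declines with its victims;
# the strategy that cooperates with other cooperators ends up on top.
assert pop["AD"] < 0.01
assert pop["TFT"] > pop["AC"] and pop["TFT"] > pop["AD"]
```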
OK, now let’s imagine another situation like this, but it’s a world of ALWAYS DEFECTORs.
It’s a cruel world with otherwise the same rules. Can a “nice” mutation establish itself? Can something like TIT FOR TAT invade a group of ALWAYS DEFECTORS? If it was just one individual, maybe not so well. As a nice strategy it’s constantly getting
taken advantage of by the native defectors. The natives get better scores with one another
than TIT FOR TAT gets with them. TIT FOR TAT has nobody to cooperate with and just comes
away as the worst reproducer. But if there were a couple TIT FOR TATS, then
they could gain more from one another than they lose to the defectors like they do in
the tournaments. And eventually they would end up taking over.
And once established, it would be really hard for a non-nice strategy to invade TIT FOR
TAT. Because TIT FOR TAT is retaliating, any non-nice strategy is going to get a lower
score with a TIT FOR TAT than TIT FOR TATs get with other TIT FOR TATs. Here the non-nice
strategies would get the lowest scores and be the worst reproducers.
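The arithmetic behind that invasion story can be sketched with the per-match totals used throughout (TFT vs TFT 600 each; ALWAYS DEFECT vs itself 200 each; TFT vs ALWAYS DEFECT 199 to 204 over 200 rounds); the population of 100 is an assumed size for illustration.

```python
def round_robin_scores(n_tft, n_ad):
    """Total score per individual when everyone plays everyone else once.

    Per-match totals over 200 rounds: TFT-TFT 600 each, AD-AD 200 each,
    TFT vs AD: 199 for TFT, 204 for ALWAYS DEFECT.
    """
    tft_total = (n_tft - 1) * 600 + n_ad * 199
    ad_total = n_tft * 204 + (n_ad - 1) * 200
    return tft_total, ad_total

# A lone TIT FOR TAT among 99 defectors is the worst reproducer...
lone, native = round_robin_scores(1, 99)
assert lone < native  # 19701 vs 19804

# ...but a pair of them already outscores the natives.
pair, native = round_robin_scores(2, 98)
assert pair > native  # 20102 vs 19808
```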
Anyway you can play around with these models all day.
Like what if there were random mistakes. Sometimes nice strategies accidentally defect or look
like they defect. Then there may be lots of defection echo problems and then variations
on FORGIVING TIT FOR TAT would dominate. Or what if the players were able to learn
and change their strategy? Then you might see, for example, cooperation spreading to
a bunch of defectors as they learn they can get more from it.
And so on. But the point is: for purely self-interested players, like reproducing cells, there is more to be gained by being cooperative, being nice and forgiving, if also retaliating.
And this would be separate from an “inclusive fitness, help them because they carry the same genes” sort of thing.
TIT FOR TAT does quite well in these model reproduction situations, it can invade other
strategies, and it’s hard for other strategies to invade it. But the way TIT FOR TAT works, it’s not factoring in a larger reproduction game or how much it’s gaining. It has no foresight and almost no memory. It just reacts to specific situations. So IF situations LIKE these were some part
of a cell’s history; if any cells that survived the gauntlet of time to still exist today,
did so at least partially in prisoner’s dilemma like situations.
Then like TIT FOR TAT, they don’t need to think about themselves or reproduction, to
be reproductively successful. They could just learn or have instinct to
be kind, to forgive, to feel cheated and want to retaliate. They could even just go “I’m
going to reciprocate whatever they do, bur bur bur”. Those actions are where the reproductive
success comes from. They don’t necessarily have to only be nice as a part of some sort
of selfish plan or selfish viewpoint.
Video’s over now. Oh, one more thing: ThisPlace was brought to you today by the letter G. For 10% off your first order of the letter G, enter promo code thisplace at checkout.