The Iterated Prisoner's Dilemma and The Evolution of Cooperation

Video Statistics and Information

Reddit Comments

It's time to playyyyy GOLDEN BALLS!!

https://www.youtube.com/watch?v=AbXQJikHSSg

👍︎︎ 2 👤︎︎ u/[deleted] 📅︎︎ Jul 02 2016 🗫︎ replies

One of the best lessons I ever received was when an MIT math geek saw me getting zero respect from some people at a party, took me aside, and said to me "did you know that tit-for-tat is the optimal solution to the prisoner's dilemma?".

At the same time, I had a zen teacher who kept saying to me "intensely_human, if someone hits you, you need to hit them back". This was because I kept coming to him for advice on all these different interpersonal situations, and he could tell that I wasn't playing tit-for-tat. I was just always cooperating regardless of what others were doing. Hence I was constantly getting into relations with other people that would suck my energy dry while feeding them.

But this zen master saying it to me didn't make it stick. I didn't know whether I could trust his advice. But when the MIT math geek, who had been taking game theory classes, took me aside and said that thing to me, it clicked.

Thanks MIT geek. You know who you are, and I owe you one.

👍︎︎ 5 👤︎︎ u/intensely_human 📅︎︎ Jul 02 2016 🗫︎ replies

.... Wtf. Reposting it within minutes of the last post

👍︎︎ 2 👤︎︎ u/gagnonca 📅︎︎ Jul 02 2016 🗫︎ replies

Awesome video!

👍︎︎ 1 👤︎︎ u/porcelainfog 📅︎︎ Jul 02 2016 🗫︎ replies

Fantastic illustration. Top-tier videos from an unrecognized youtuber

👍︎︎ 1 👤︎︎ u/Danyuhl 📅︎︎ Jul 02 2016 🗫︎ replies

This was really great, /u/jessejjang. A reminder of my most interesting second-year Anthropology classes. Game theory is such a fascinating topic. Hope to see more videos from you!

👍︎︎ 1 👤︎︎ u/AcornElf 📅︎︎ Jul 02 2016 🗫︎ replies

This guy needs more exposure. He has awesome videos.

👍︎︎ 1 👤︎︎ u/Fusion2k 📅︎︎ Jul 02 2016 🗫︎ replies

Captions

Let's say 2 people have 2 options, and which option they each pick is going to define how much stuff they each get. The numbers represent something that they want, like money. Or points! Everybody likes points; they want as many points as they can get. What makes the prisoner's dilemma the prisoner's dilemma is really this number being bigger than this number, this number being bigger than this number, and this number being bigger than this number. In a pattern like this.

They can pick A or B, but they have no control over whether their opponent will pick A or B. So when faced with their choices, they see that option B always gets them more. And it's the same for the other guy: going option B is always better. But because of the way it's set up, both going option B is the worst for the group. And really not one of the better situations for the individual.

Because going option A is worse for the individual, but better for the other person, and best for the group, we might call it sharing or cooperating or whatever, depending on the situation. It looks like working together to get more together. Going option B is always better for the individual, but worse for the other person, so it looks like defecting, cheating, or betraying, depending on the situation. They're aiming for personal gain.

If they're only going to play once, a player will always get more by defecting. But if they are going to interact with somebody multiple times, if they play once, then they play again and again, and we add up their scores, the strategy changes.

[INTRO]

In one-off games, defecting gives a higher payout; it doesn't matter what the other person is doing. And with multiple games, if the opponent cooperates and defects at random, or follows a set pattern, always defecting still gives the best payout. Try always cooperating with them? No. Start off cooperating then defect? No. Try to line up cooperation and defection? Doesn't matter. Always defecting is better.
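The "this number is bigger than this number" pattern can be made concrete with a small sketch. The per-round values 5, 3, 1 and 0 are an assumption here: they are the ones implied by the tournament totals quoted later in the transcript (600, 200, and 0 versus 1000 over 200 rounds). The labels T, R, P and S are the conventional game-theory names, not terms used in the video, and the function names are made up for illustration.

```python
# Conventional labels (not from the video):
# T = temptation, R = reward, P = punishment, S = sucker's payoff.
# Values are assumed from the tournament totals quoted in the transcript.
T, R, P, S = 5, 3, 1, 0

# PAYOFF[(my_option, their_option)] -> my points.
# Option "A" is cooperating, option "B" is defecting.
PAYOFF = {("A", "A"): R, ("A", "B"): S, ("B", "A"): T, ("B", "B"): P}


def is_prisoners_dilemma(t, r, p, s):
    """The ordering that makes the game a prisoner's dilemma."""
    return t > r > p > s


def best_reply(their_option):
    """The option that gives me the most points against a fixed choice."""
    return max("AB", key=lambda mine: PAYOFF[(mine, their_option)])


def group_total(option_1, option_2):
    """Points both players get together for one round."""
    return PAYOFF[(option_1, option_2)] + PAYOFF[(option_2, option_1)]


print(is_prisoners_dilemma(T, R, P, S))    # True
print(best_reply("A"), best_reply("B"))    # B B  (B is always better for me)
print(group_total("A", "A"), group_total("A", "B"), group_total("B", "B"))  # 6 5 2
```

Under these assumed values, option B is the best reply to either choice, yet both picking B gives the group the lowest total.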
Against random or fixed-pattern opponents, just like in a one-off game, defecting has no consequences, and it always gives a higher payout. But what a player picks can change depending on what the other player did. For example, this player starts off cooperating and always cooperates. Unless their opponent defects. Then they switch to defecting and defect no matter what. You know, it sort of gets pissed off. We'll call it GRUDGER. Any strategy that started off cooperating with GRUDGER, or always cooperated, would have gotten a higher score than ALWAYS DEFECT. Because here defecting can have consequences. With multiple games there is an opportunity to influence the other player for future games. ALWAYS DEFECT isn't the best strategy anymore. What is?

In 1980 Robert Axelrod held a tournament where anyone could submit a strategy. Each strategy did 200 rounds against each other strategy. There were 14 strategies submitted, plus the strategy 50/50 RANDOM. These were the payoffs, so if two strategies cooperated with each other for all 200 rounds, they would each get 600. If they both defected they would both get 200. If one cooperated and the other defected the whole time, one would get 0 and the other 1000, the highest and lowest possible scores. If they went back and forth, like this, they would each get 500. And these are the averaged results of the tournament.

The winner was a strategy called TIT FOR TAT. TIT FOR TAT cooperates on the first round and from then on it just copies what the other person did last round. Why did it win? It's a simple strategy, but one thing it does is reciprocate quickly against defectors. Any strategy that tries to take advantage of TIT FOR TAT gets instantly punished and put into a bad situation. So if the other strategy keeps defecting, or even if it tries to go back to cooperating, it would have gained less than it would have if it had just kept cooperating with TIT FOR TAT. And if TIT FOR TAT didn't punish, TIT FOR TAT would have been worse off.
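The 200-round totals quoted above, and the behaviour of GRUDGER and TIT FOR TAT, can be checked with a minimal sketch. This is not Axelrod's tournament code. The per-round payoffs 5/3/1/0 are inferred from the quoted totals, and all function names are illustrative.

```python
# Per-round payoffs (points for player a, points for player b),
# inferred from the totals quoted in the transcript.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


# Each strategy sees its own history and the opponent's history.
def always_cooperate(mine, theirs):
    return "C"


def always_defect(mine, theirs):
    return "D"


def tit_for_tat(mine, theirs):
    # Cooperate first, then copy the opponent's previous move.
    return theirs[-1] if theirs else "C"


def grudger(mine, theirs):
    # Cooperate until the opponent defects once, then defect forever.
    return "D" if "D" in theirs else "C"


def alternate_c_first(mine, theirs):
    return "C" if len(mine) % 2 == 0 else "D"


def alternate_d_first(mine, theirs):
    return "D" if len(mine) % 2 == 0 else "C"


def play(a, b, rounds=200):
    """Play one match and return the two total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = a(hist_a, hist_b)
        move_b = b(hist_b, hist_a)
        points_a, points_b = PAYOFF[(move_a, move_b)]
        score_a += points_a
        score_b += points_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


print(play(always_cooperate, always_cooperate))    # (600, 600)
print(play(always_defect, always_defect))          # (200, 200)
print(play(always_cooperate, always_defect))       # (0, 1000)
print(play(alternate_c_first, alternate_d_first))  # (500, 500)
print(play(always_defect, grudger))                # (204, 199)
print(play(always_defect, tit_for_tat))            # (204, 199)
```

Against GRUDGER or TIT FOR TAT, ALWAYS DEFECT ends up with 204 points, far below the 600 that steady cooperation would have earned, which is the sense in which defecting now has consequences.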
So we might say TIT FOR TAT is "retaliating": it punishes defection. Which is good because it can prevent some losses, and it can disincentivize an opponent from defecting.

Another thing is, TIT FOR TAT is never the first to defect. Players would want to be in this situation as much as they can; there is a temptation to defect when the other person is cooperating. But any responsive opponent would quickly defect and they would both end up here. It's risky to defect. An easy solution is to ignore that temptation and just try to maximize long-term mutual cooperation. Start off cooperating and then never defect unless you need to punish someone. It can end up giving better gains, especially with opponents that do it too. A strategy that never defects first kind of looks like it's being nice, so we could say TIT FOR TAT is a nice strategy. And it seemed to be a good trait to have in this tournament: the top 8 strategies were nice, and the bottom 7 were not.

The least successful nice strategy was GRUDGER. Never the first to defect, but once the other defected it never cooperated again, no matter what. Which doesn't really give a great payout. TIT FOR TAT allows cooperation if the other person wants to cooperate again, so they can both cooperate going forwards. The amount of punishment GRUDGER gives hurts the punisher alongside the punished. OK, so we could say TIT FOR TAT is forgiving: it gives a quick punish, then allows for mutual cooperation again. And since it's just copying, if that other strategy keeps defecting, so does TIT FOR TAT.

These seemed to be the traits that made TIT FOR TAT so good in this tournament. It's nice: it's not tempted by this risky option. It's retaliating: it disincentivizes being taken advantage of. And it's forgiving: it will allow getting back to cooperation.

Because TIT FOR TAT is just copying, it can't ever beat an opponent. It can either tie. Or lose.
Which of the two really just depends on whether the opponent defects in the very last round, where TIT FOR TAT can't reciprocate. The opposite is true for ALWAYS DEFECT: it can only tie. Or win, if the opponent ever tried cooperating. But that doesn't matter. Which strategy wins is about how many points they got, not their relative score to any given opponent.

But TIT FOR TAT can run into problems. JOSS is a strategy that's basically TIT FOR TAT, but sometimes it tries defecting. Against regular TIT FOR TAT, they would go: "Hey, you cheated." "Hey, you cheated." "Hey, YOU cheated." Back and forth, a sort of defection echo. Then when it tries defecting again it becomes all defection.

There were a few strategies that could have won this tournament if they had been entered. One was called FORGIVING TIT FOR TAT, or TIT FOR TWO TATS. This strategy requires 2 defections before it retaliates. It would have prevented the echo effects that hurt regular TIT FOR TAT and won the tournament. It gained more preventing scenarios like this than it lost letting itself be taken advantage of once in a while. Most strategies trying to improve TIT FOR TAT tried to do so by being less nice, trying to find a way to capitalize on defection. Instead the opposite was the case: not even punishing every defection ended up being better in the long run. At least in this environment.

Later Axelrod held a second tournament. This time there were a lot more entries. And they didn't do a set 200 rounds. That way nobody would know when the interaction would end. See the footnotes below. In this tournament, even though FORGIVING TIT FOR TAT was entered, regular TIT FOR TAT won again. FORGIVING TIT FOR TAT didn't win because people knew about it. A strategy called TESTER starts out cooperating but tries defecting like this to see how the player reacts. If the opponent punishes, it cooperates to apologize and prevent echo defections, then just becomes TIT FOR TAT for the rest of the time.
So this is what it would look like against TIT FOR TAT. But against easygoing strategies like FORGIVING TIT FOR TAT, it can learn that it's able to take advantage of them. FORGIVING TIT FOR TAT proved too forgiving. At least in this context.

Let's change the rules a bit. Let's say we're in a reproduction situation. These points aren't just points, they represent resources that could be used for reproduction. If a strategy gets lots of points, like TIT FOR TAT, it will reproduce more and we'll put more of them into the next generation, the next tournament. If it gets fewer points, like 50/50 RANDOM, we'll put fewer of them into the next generation. TIT FOR TAT and other successful strategies reproduced well and followed upward arcs like these. The not so successful strategies followed downward trends and went extinct. Exploitative strategies like HARRINGTON did well at the start, but as its victims "went extinct", its population declined as well. The really successful strategies were ones that could work well with other successful strategies, basically nice or otherwise cooperative strategies. They supported one another and were able to continue to reproduce.

OK, now let's imagine another situation like this, but it's a world of ALWAYS DEFECTORs. It's a cruel world with otherwise the same rules. Can a "nice" mutation establish itself? Can something like TIT FOR TAT invade a group of ALWAYS DEFECTORs? If it was just one individual, maybe not so well. As a nice strategy it's constantly getting taken advantage of by the native defectors. The natives get better scores with one another than TIT FOR TAT gets with them. TIT FOR TAT has nobody to cooperate with and just comes away as the worst reproducer. But if there were a couple of TIT FOR TATs, then they could gain more from one another than they lose to the defectors, like they do in the tournaments. And eventually they would end up taking over. And once established, it would be really hard for a non-nice strategy to invade TIT FOR TAT.
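The invasion argument above can be turned into a rough back-of-the-envelope calculation. The sketch assumes 200-round matches with the per-round payoffs 5/3/1/0 implied by the quoted tournament totals, so TIT FOR TAT earns 600 against itself and 199 against ALWAYS DEFECT (0 in round one, then 199 mutual defections), while ALWAYS DEFECT earns 204 against TIT FOR TAT and 200 against itself. It also assumes that a fraction `p` of every player's matches are against TIT FOR TAT, which is a simplification and not something the video specifies.

```python
from fractions import Fraction

# Match totals over 200 rounds under the assumed 5/3/1/0 payoffs.
TFT_VS_TFT, TFT_VS_ALLD = 600, 199
ALLD_VS_TFT, ALLD_VS_ALLD = 204, 200


def expected_scores(p):
    """Average match score of each type when a fraction p of
    matches are played against TIT FOR TAT (assumed mixing model)."""
    tft = p * TFT_VS_TFT + (1 - p) * TFT_VS_ALLD
    alld = p * ALLD_VS_TFT + (1 - p) * ALLD_VS_ALLD
    return tft, alld


# TIT FOR TAT out-scores the defectors once
#   p * (600 - 199) + 199 > p * (204 - 200) + 200,
# which is the same as p > 1 / 397.
threshold = Fraction(
    ALLD_VS_ALLD - TFT_VS_ALLD,
    (TFT_VS_TFT - TFT_VS_ALLD) - (ALLD_VS_TFT - ALLD_VS_ALLD),
)

print(threshold)                          # 1/397
print(expected_scores(Fraction(0)))       # TFT 199, defectors 200
print(expected_scores(Fraction(1, 100)))  # TFT is already ahead at 1%
print(expected_scores(Fraction(1)))       # TFT 600, a lone defector 204
```

Under these assumptions a lone TIT FOR TAT scores slightly worse than the natives, a small cluster is enough to tip the balance, and a lone defector in a TIT FOR TAT world scores 204 against the natives' 600.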
Because TIT FOR TAT is retaliating, any non-nice strategy is going to get a lower score with a TIT FOR TAT than TIT FOR TATs get with other TIT FOR TATs. Here the non-nice strategies would get the lowest scores and be the worst reproducers.

Anyway, you can play around with these models all day. Like, what if there were random mistakes? Sometimes nice strategies accidentally defect, or look like they defect. Then there may be lots of defection echo problems, and then variations on FORGIVING TIT FOR TAT would dominate. Or what if the players were able to learn and change their strategy? Then you might see, for example, cooperation spreading to a bunch of defectors as they learn they can get more from it. And so on.

But the point is: for purely self-interested players, like reproducing cells, there is more to be gained by being cooperative, being nice and forgiving. If also retaliating. And this would be beside an "inclusive fitness, help them because it carries the same genes" sort of thing. TIT FOR TAT does quite well in these model reproduction situations: it can invade other strategies, and it's difficult to be invaded. But the way TIT FOR TAT works, it's not factoring in a larger reproduction game or how much it's gaining. It has no foresight and almost no memory. It just reacts to specific situations.

So IF situations LIKE these were some part of a cell's history, if any cells that survived the gauntlet of time to still exist today did so at least partially in prisoner's dilemma like situations, then like TIT FOR TAT, they don't need to think about themselves or reproduction to be reproductively successful. They could just learn or have instinct to be kind, to forgive, to feel cheated and want to retaliate. They could even just go "I'm going to reciprocate whatever they do, bur bur bur". Those actions are where the reproductive success comes from. They don't necessarily have to only be nice as a part of some sort of selfish plan or selfish viewpoint.
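The defection echo and the "what if there were mistakes" variation mentioned in the transcript can be illustrated with one deliberate slip instead of random noise. In this sketch both players use the same strategy, and one player's move is flipped to a defection exactly once. The payoffs 5/3/1/0, the 200 rounds, the slip position and the function names are all assumptions chosen for illustration.

```python
# Per-round payoffs (points for player a, points for player b), assumed
# from the tournament totals quoted in the transcript.
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(theirs):
    # Cooperate first, then copy the opponent's previous move.
    return theirs[-1] if theirs else "C"


def tit_for_two_tats(theirs):
    # Retaliate only after two defections in a row.
    return "D" if theirs[-2:] == ["D", "D"] else "C"


def play_with_slip(strategy, rounds=200, slip_round=10):
    """Both players use `strategy`; player a's move is forced to "D"
    once, at the 0-based round `slip_round`, to model a single mistake."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for r in range(rounds):
        move_a = strategy(hist_b)
        move_b = strategy(hist_a)
        if r == slip_round:
            move_a = "D"  # the one accidental defection
        points_a, points_b = PAYOFF[(move_a, move_b)]
        score_a += points_a
        score_b += points_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


print(play_with_slip(tit_for_tat))       # (505, 505): the echo never stops
print(play_with_slip(tit_for_two_tats))  # (602, 597): cooperation resumes
```

With no slip both pairs would score 600 each. One mistake costs the TIT FOR TAT pair almost 100 points apiece, while the TIT FOR TWO TATS pair barely notices it.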
Video's over now. Oh, one more thing: ThisPlace was brought to you today by the letter G. For 10% off your first order of the letter G, enter promo code thisplace at checkout.

Info

Channel: This Place

Views: 602,891

Rating: 4.9332519 out of 5

Keywords: this place, this, place, environment, environmental, sustainability, thisplace, thisplacechannel, this place channel, iterated prisoner's dilemma, prisoner's dilemma, evolution of cooperation, cooperation, defection

Id: BOvAbjfJ0x0

Length: 9min 58sec (598 seconds)

Published: Sat Jul 02 2016