Triumph of the Golden Rule

We live in a world with other people. Almost every decision we make involves someone else in one way or another, and we face a constant choice regarding just how much we’re going to trust the person on the other side of this decision. Should we take advantage of them, go for the quick score and hope we never see them again – or should we settle for a more reasonable reward, co-operating in the hope that this peaceful relationship will continue long into the future?

We see decisions of this type everywhere, but what is less obvious is the best strategy for us to use to determine how we should act. The Golden Rule states that one should “do unto others as you would have them do unto you”. While it seems rather naive at first glance, if we run the numbers, we find something quite amazing.

A Dilemma

To study these kinds of decisions, we first have to define exactly what a “dilemma” is. Let’s say it involves two people, and each can independently decide to work together for a shared reward, or to screw the other one over and take it all for themselves. If you both decide to work together, you both get a medium-sized reward. If you take advantage of someone who trusts you, you get a big reward (and the other person gets nothing). If you’re both jerks and each try to take advantage of the other, you both get a tiny fraction of what you could have had. Let’s call these two people Alice and Bob – here’s how the four possible outcomes break down.

  • Alice cooperates, Bob cooperates: Everyone wins! A medium-sized reward to both for mutual cooperation.
  • Alice defects, Bob cooperates: Poor Bob. He decided to trust Alice, who screwed him over and took the big reward. Bob gets nothing.
  • Alice cooperates, Bob defects: Poor Alice. She decided to trust Bob, who took advantage of her and got the big reward. Alice gets nothing.
  • Alice defects, Bob defects: No honour among thieves… both take the low road and fight over the scraps of a small reward.

This specific ordering of rewards is known as the Prisoner’s Dilemma. It was first framed and studied by Merrill Flood and Melvin Dresher in 1950 while they were working at the RAND Corporation (Albert Tucker later gave it its name and its prison-sentence framing).
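To make the payoff structure concrete, here’s a minimal sketch in Python. The specific point values (5 for exploiting a cooperator, 3 each for mutual cooperation, 1 each for mutual defection, 0 for being exploited) are the ones Axelrod later used in his tournament, described below; the names here are purely illustrative.

    # Payoffs for one round of the dilemma, keyed by (Alice's move, Bob's move)
    # and giving (Alice's points, Bob's points). Values follow Axelrod's scoring.
    C, D = "cooperate", "defect"

    PAYOFFS = {
        (C, C): (3, 3),  # mutual cooperation: a medium-sized reward for both
        (D, C): (5, 0),  # Alice defects on a trusting Bob: big reward vs. nothing
        (C, D): (0, 5),  # Bob defects on a trusting Alice: nothing vs. big reward
        (D, D): (1, 1),  # mutual defection: both fight over the scraps
    }

    def play_round(alice_move, bob_move):
        """Return the (Alice, Bob) point split for a single round."""
        return PAYOFFS[(alice_move, bob_move)]

    print(play_round(C, D))  # (0, 5): poor Alice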

Sale, One Day Only!

Now of course the question is – if you’re in this situation, what is the best thing to do? First suppose that we’re never, ever going to see this other person again. This is a one-time deal. Absent any moral consideration, your most profitable option is to try to take advantage of the other person and hope they’re clueless enough to let you – capitalism at its finest. And notice that defecting pays off no matter what the other person does: if they cooperate, you grab the big reward instead of the medium one; if they defect too, you at least fight over the scraps instead of walking away with nothing. Cooperating, on the other hand, leaves you open to getting screwed. If each person is rational and acts in their own interest, both will try to one-up the other.

But there’s just one problem – if both people act this way, they each end up with far less than they would have by simply cooperating. This seems very strange, because the economic models banks and other institutions use to model human behavior assume exactly this kind of logic – the model of the rational consumer. Yet when both parties follow it, they arrive at nearly the worst possible outcome.

It seems there’s no truly satisfying strategy for a one-time deal – each choice leaves you open to losses in its own way. At this point it’s easy to throw up your hands, leave logic behind, and take a moral stance: you’ll cooperate because you’re a good person – or you’ll take advantage of the suckers because life just isn’t fair.

And this appears to leave us where we are today – some good people, some bad people, and the mythical invisible hand of the market to sort them all out. But there’s just one little issue: we live in a world with reputations, with friends, and with foes – there are no true one-time deals. The world is small, and people remember.

In it for the Long Run

So instead of thinking of a single dilemma, let’s think about what we should do if we get to play this game more than once. If someone screws you in the first round, you’ll remember – and probably won’t cooperate the next time. If you find someone who always cooperates, you can join them and work together for your mutual benefit – or decide that they’re an easy mark and take them for everything they’ve got.

But what is the best strategy? In an attempt to figure this out, in 1980 Robert Axelrod decided to have a contest. He sent the word out, and game theorists, scientists, and mathematicians all submitted entries for a battle royale to determine which strategy was the best.

Each entry was a computer program designed with a specific strategy for playing this dilemma repeatedly against the other clever entries. The programs would play the simple dilemma against each other, deciding whether to cooperate or defect, for 200 rounds per match. Five points for a successful deception (you defect, they cooperate), three points each for mutual cooperation, one point each if you both tried to screw each other (mutual defection), and no points if you were taken advantage of (you cooperate, they defect). Each program would play every other program as well as a copy of itself, and the program with the largest total score across all of its matches would win.
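Here’s a rough sketch of how such a round-robin tournament might be scored. The interface – each strategy is a function that sees the opponent’s past moves and returns "C" or "D" – is my own simplification, not Axelrod’s actual setup, but it follows the rules described above.

    from itertools import combinations_with_replacement

    ROUNDS = 200
    # My points for one round, keyed by (my move, their move).
    SCORE = {("D", "C"): 5, ("C", "C"): 3, ("D", "D"): 1, ("C", "D"): 0}

    def play_match(strategy_a, strategy_b, rounds=ROUNDS):
        """Play one 200-round match; each strategy sees the other's full history."""
        history_a, history_b = [], []
        score_a = score_b = 0
        for _ in range(rounds):
            move_a = strategy_a(history_b)  # decide based on the opponent's past moves
            move_b = strategy_b(history_a)
            score_a += SCORE[(move_a, move_b)]
            score_b += SCORE[(move_b, move_a)]
            history_a.append(move_a)
            history_b.append(move_b)
        return score_a, score_b

    def tournament(entries):
        """Round robin: every entry plays every other entry and a copy of itself."""
        totals = {name: 0 for name in entries}
        for name_a, name_b in combinations_with_replacement(entries, 2):
            a, b = play_match(entries[name_a], entries[name_b])
            if name_a == name_b:
                totals[name_a] += a  # a match against its own copy counts once
            else:
                totals[name_a] += a
                totals[name_b] += b
        return totals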

So what would some very simple programs be?

ALL-C (always cooperate) is just like it sounds. Cooperation is the only way, and this program never gets tired of being an upstanding guy.

ALL-D (always defect) is the counterpoint to this, and has one singular goal. No matter what happens, always, always, always try to screw the other person over.

RAND is the lucky dunce – don’t worry too much, just decide to cooperate or defect at random.
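In the hypothetical interface sketched above, these three strategies take only a line or two each – again, just a sketch of the idea, not anyone’s actual tournament code.

    import random

    # Each strategy receives the opponent's move history and returns "C" or "D".

    def all_c(opponent_history):
        """ALL-C: cooperation is the only way."""
        return "C"

    def all_d(opponent_history):
        """ALL-D: no matter what, always try to screw the other player over."""
        return "D"

    def rand(opponent_history):
        """RAND: the lucky dunce - a coin flip every round."""
        return random.choice(["C", "D"])

    # These plug straight into the sketch above, e.g. play_match(all_c, all_d)
    # or tournament({"ALL-C": all_c, "ALL-D": all_d, "RAND": rand}).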

You can predict how these strategies might do if they played against each other. Two ALL-C strategies would endlessly cooperate in a wonderful dance of mutual benefit. Two ALL-D strategies would continually fight, endlessly grinding against each other and gaining little. ALL-C pitted against ALL-D would fare about as well as a fluffy bunny in a den of wolves – eternally cooperating and hoping for reciprocation, but always getting the shaft with ALL-D profiting.

So an environment of ALL-C would be a cooperative utopia – unless a single ALL-D strategy came in and started bleeding them dry. But an environment made entirely of ALL-D would be a wasteland – no one would have any success due to the constant fighting. And the RAND strategy is, quite literally, just a coin flip.

Time to Think

So what should we do? Those simple strategies don’t seem to be very good at all. If we think about it, however, there’s a reason they do so poorly – they don’t remember. No matter what the other side does, they’ve already made up their minds. Intelligent strategies remember the previous actions of their opponents and act accordingly. The majority of programs submitted to Axelrod’s competition incorporated some sort of memory. For instance, if you can figure out you’re playing against ALL-C, it’s time to defect. Just like in the real world, these programs tried to build up some concept of “reputation” that would allow them to act in the most productive manner.

And so Axelrod’s competition was on. Programs from all over the world competed against each other, each trying to maximize their personal benefit. A wide variety of strategies were implemented from some of the top minds in this new field. Disk drives chattered, monitors flickered, and eventually a champion was crowned.

And the Winner Is…

When the dust settled, the winner was clear – and the victory was both surprising and inspiring. The eventual champion seemed to be a 90 lb weakling at first glance: a mere four lines of code submitted by Anatol Rapoport, a mathematical psychologist from the University of Toronto. It was called “Tit-for-Tat”, and it did exactly that: it started every game by cooperating, and from then on simply did whatever the other player had done on the previous turn. It cooperated with the “nice” strategies, butted heads with the “mean” ones, and came out on top, ahead of far more complex approaches.
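In the hypothetical interface used in the sketches above, Tit-for-Tat really does fit in a few lines (this is my own reconstruction, not Rapoport’s original code):

    def tit_for_tat(opponent_history):
        """Cooperate on the first move, then copy the opponent's previous move."""
        if not opponent_history:
            return "C"
        return opponent_history[-1]

    # Against ALL-D it loses only the opening round, then settles into mutual
    # defection; against ALL-C or another Tit-for-Tat it cooperates forever,
    # collecting a steady three points a round.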

The simplest and shortest strategy won – a program that enforced the Golden Rule. So what exactly made Tit-for-Tat so successful? Axelrod analyzed the results of the tournament and came up with a few principles of success.

  • Don’t get greedy. Tit-for-Tat never outscores its opponent in any single match – at best it ties. But it never allows itself to take a real beating either, so it avoids the brutal losses of two “evil” strategies grinding against each other. It actively seeks out win-win situations instead of gambling for the higher payoff.
  • Be nice. The single best predictor of whether a strategy would do well was if they were never the first to defect. Some tried to emulate Tit-for-Tat but with a twist – throwing in the occasional defection to up the score. It didn’t work.
  • Reciprocate, and forgive. Other programs tended to cooperate with Tit-for-Tat since it consistently rewarded cooperation and punished defection. And Tit-for-Tat easily forgives – no matter how many defections it has seen, if a program decides to cooperate, it will join them and reap the rewards.
  • Don’t get too clever. Tit-for-Tat is perfectly transparent, and it quickly becomes obvious that it is very, very difficult to beat. There are no secrets and no hypocrisy – Tit-for-Tat gets along very well with itself, unlike strategies built on deception.

The contest attracted so much attention that a second one was organized, and this time every single entrant was aware of the strategy and success of Tit-for-Tat. Sixty-two new entries arrived, all gunning for the top spot. And once again, Tit-for-Tat rose to the top. Axelrod used the results of these tournaments to develop ideas about how cooperative behaviour could evolve naturally, and eventually wrote a bestselling book called The Evolution of Cooperation. But his biggest accomplishment may be showing us that being nice does pay off – and giving us the numbers to prove it.

20 thoughts on “Triumph of the Golden Rule”

  1. This is an excellent description of why tit-for-tat strategies would evolve in social animals like humans.

    There is, however, one strategy that beat tit-for-tat in the 20th anniversary competition, developed by Southampton University. This one, which I like to call the “bee hive” strategy, involves multiple entries (about 60) to the game that can recognize each other after a few moves. Of the 60 entries, some were “takers” and some were “sacrificers”. When these two types met, the “sacrificer” would always cooperate and the “taker” would always defect, maximizing its points. When either played against a non-beehive entry, it would defect, thereby minimizing the opponent’s score. This sacrificial strategy took the top 3 places (as well as many of the bottom places).

    What this might predict in an evolutionary sense is that we might see people who are “takers” and some who are “sacrificers”, where the takers do even better than ‘tit-for-tat’ people in terms of reproductive success through acquiring resources, but only if they can find sacrificers. One might then expect this type of behaviour to manifest itself as someone with a natural ability to exploit the psychology of others such that poorer people will give to richer. Potential examples that fit this profile are televangelists, other wealthy religions (e.g., via tithing), and charismatic scam artists. Less likely would be leaders of companies or banks because, although they do get rich off of poorer people, they don’t directly convince people to give them money but rather get it by offering something of value – a good or service that comes at great cost – in exchange, often many layers removed from actually interfacing with the customers.

  2. You should check out some of Alker’s work on PD… asymmetric payoffs and players chatting with each other… there is a lot more to PD than just TFT.

  3. Very interesting, and certainly inspiring. The emphasis on the golden rule bothers me a little, though. The golden rule, taken literally, would be equivalent to ALL-C. Would you want the other player to defect? Of course not. So the golden rule would dictate you cooperate. The tit for tat strategy is closer to the Judeo-Christian “eye for an eye”, which is very, very different in real world practice. To me, the most interesting thing is that the best algorithm holds no grudges. No matter how many times the other player has defected, if their latest action was cooperation, tit for tat is happy to do so as well.

    Leech, the introduction of teamwork between programs is very interesting as well. I was wondering how this would work in more complex systems, and that gives some very interesting insights indeed.

    1. Hey Blake – Definitely agree with you, I think Tit-for-Tat can best be viewed as an “enforcer” of the Golden Rule. I’m working on another post describing some more complex strategies, and Leech is definitely a very, very interesting example.

  4. Blake is right. Tit-for-Tat is not quite the same thing as the Golden Rule. The Golden Rule implicitly acknowledges a moral order based on reciprocity (i.e. you reap what you sow); but it doesn’t explicitly prescribe reciprocity. Instead of saying: “Always follow the strategy of Tit-for-Tat,” the Golden Rule seems to be saying: “Always play the game as if the other player is following the strategy of Tit-for-Tat.” If you are about to play an iterated PD against a player who you know (or strongly suspect) will be using a Tit-for-Tat strategy, then it doesn’t matter whether you use Tit-for-Tat or All-C, since the outcome will be the same in either case — and the Golden Rule seems to be advocating an All-C strategy (or something akin to it).

    If you know (or assume) that the other player is always going to mimic your behavior, then the prudent move is to cooperate. In fact, if you know that the other player is playing Tit-for-Tat then it is NEVER a good idea for you to defect, since that will provoke a retaliatory defection, which might ignite a conflict spiral of retaliation, counter-retaliation, counter-counter-retaliation, and so forth. While Tit-for-Tat works brilliantly under conditions of perfect information (i.e. no possibility of misperception) and perfect control (i.e. no possibility of accidents), it can be dangerously escalatory under conditions of imperfect information and imperfect control. What happens if you misinterpret the other player’s actions as a defection? What happens if the other player accidentally defects? Tit-for-Tat demands a response in kind; but is this wise? The problem with Tit-for-Tat is that, while it may work ideally under laboratory conditions, it has a tendency to overreact under real world conditions.

    The Golden Rule goes hand in hand with the principle of “turning the other cheek” — i.e. not responding in kind to a provocation, but instead showing mercy (at least for the first offense). This is a violation of the core principle of perfect reciprocity that Tit-for-Tat is built on. But it may help prevent conflict spirals by acknowledging that accidents and misperceptions do happen from time to time, and shouldn’t automatically trigger a Tit-for-Tat retaliatory response.

  6. “Dashing Leech”, I couldn’t fault what you said until the very last sentence. Bankers and executives exploit and manipulate and deceive (i.e. they’re takers) more than any other people I can conceive of. Just because they don’t directly lie and manipulate the people they end up getting their money from doesn’t make their selfish greed any more tolerable. If anything, it makes it more atrocious.

  7. Is there a site where we can play online to find out what type we belong to?
    It is good to know theory, but in practice all of us are quite different. We all know E = mc², but how many of us put it to real use? I think theories are good for reading and feeling good about, that’s all. In practice most of us suck. We suck, even if we know the right theory.

  8. Actually the same thing was discussed in one of Richard Dawkins’ documentaries. All the things above are explained with live examples. :)

  9. The Golden Rule does not allow retaliation and is therefore not “tit for tat”, but instead a guarantee of being a sucker. For cooperation to emerge as a strategy in game theory, one should never lay one’s cards on the table and promise never to retaliate in kind, but always to do to the other what you hope they do to you – which means always being the hopeful cooperator.

    Game theory trashes the Golden Rule. But Christians are not supposed to worry about that since God will allegedly reward them for their suffering in Heaven.
