In This Virtual Fish Tank, You Make the Rules

In my last post, I discussed how individuals following simple rules can give rise to coordinated group behavior. The boid model created by Craig Reynolds used three rules – alignment, separation, and cohesion. But how much attention does each individual pay to each rule? In situations like migration, alignment might be the most important rule. If you’re being attacked by predators, sticking together and paying attention to the cohesion rule might keep you from being eaten.

Clearly, different situations call for different rule weightings. I was very interested in how different weightings might behave, so I decided to create a program that lets you change the relative importance of each rule at will. You can see a video of it in action below (the three sliders on the bottom left control the influence of the various rules).

Smaller screen, older computer, or just want something simpler?
Load the standard definition fish school simulation.
Big screen, fast computer, and want to give the fish some more room to swim?
Load the high definition fish school simulation.
   
  • Alignment: This slider adjusts how much each fish wants to head in the same direction as the other fish around it.
  • Cohesion: This slider adjusts how close each fish wants to be to its neighbors.
  • Separation: This slider adjusts how much each fish wants to space itself out from the others (see the sketch below for how these weights combine).
  • Click anywhere in the water to add a new fish.
  • Press the reset button to restart the simulation.
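If you’re curious what the sliders actually do, here’s a minimal sketch in Python (not the simulation’s actual code – the names are illustrative): each slider simply scales its rule’s output vector before the three are summed into a steering direction.

    import numpy as np

    # Sketch: each rule produces a vector meaning "go this way"; the sliders
    # just scale those vectors before they're combined. Names are illustrative.
    def blend(align_vec, cohere_vec, separate_vec, sliders):
        """sliders = (alignment, cohesion, separation), each in 0.0-1.0."""
        v = (sliders[0] * align_vec
             + sliders[1] * cohere_vec
             + sliders[2] * separate_vec)
        norm = np.linalg.norm(v)
        return v / norm if norm > 0 else v  # unit-length steering direction

    # Example: separation cranked to maximum, alignment turned off entirely.
    print(blend(np.array([1.0, 0.0]), np.array([0.5, 0.5]),
                np.array([-1.0, 0.0]), sliders=(0.0, 0.3, 1.0)))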
   

I hope you enjoy it – please leave me a comment if you have any questions or comments, or if you find any bugs.

Herds of Android Birds Mimic Ad Hoc Flocks

In winter, during the late afternoon before settling down to roost, flocks of thousands of starlings twist and turn, darkening the sky with strange curves that seem to move with a mind of their own. Flocks up to a million strong form for warmth, for security, and for social contact.

These seething clouds of birds have no leader, no planning, and yet produce a dance that seems choreographed in its precision and beauty. So how do they do it? Some strange ability, a skill unique in the animal kingdom? Well, perhaps not quite. Consider that humans seem to be able to move in crowds with the same ease, although some might say with a bit less beauty.

People seem to move in clumps, in the same direction, and manage not to trample each other or collide. The parallel holds across crowds of organisms: this order emerges consistently, with each animal somehow “knowing” what to do on a small scale, resulting in cohesive and sensible group movement. The question then becomes – what exactly are we doing unconsciously to organize like this?

The Boid Model

In 1986 Craig Reynolds produced a computer model describing how large groups of animals – schools, herds, and flocks – could move in unison with no central coordination. He called his creations “boids”, and imagined that each would follow some simple, sensible rules to navigate around. He also knew a single animal would never be able to keep track of every other animal in the group, and assumed that each boid would only pay attention to its immediate neighbors in the flock.

So if you were a boid, what rules would you follow?

  • Alignment: Look around you. Where is everyone else going? Probably a good idea to go there too. Alignment is a rule that finds the average direction your neighbors are heading, and tells you to head that way as well.
  • Cohesion: Predators look to pick off stragglers on the edge of the pack. Cohesion is a rule that finds the average position of your neighbors and tells you to move toward it, pulling you into the relative safety of the center of the pack.
  • Separation: When crowds get big, they can get dangerous. Animals can trample each other, and birds can collide. Separation is a rule that tells you to give your neighbors some space (all three rules are sketched in code below).
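To make these rules concrete, here’s a minimal sketch in Python of what each one computes for a single boid. This isn’t Reynolds’ original code – neighbor detection, speed limits, and fields of view are all simplified away – but it captures the heart of each rule.

    import numpy as np

    # A boid is reduced to a 2D position and velocity; each rule looks only
    # at the boid's immediate neighbors, exactly as Reynolds assumed.

    def alignment(my_velocity, neighbor_velocities):
        """Steer toward the average heading of your neighbors."""
        return np.mean(neighbor_velocities, axis=0) - my_velocity

    def cohesion(my_position, neighbor_positions):
        """Steer toward the average position of your neighbors."""
        return np.mean(neighbor_positions, axis=0) - my_position

    def separation(my_position, neighbor_positions, comfort_zone=2.0):
        """Steer away from any neighbor that's too close for comfort."""
        push = np.zeros(2)
        for p in neighbor_positions:
            offset = my_position - p
            dist = np.linalg.norm(offset)
            if 0 < dist < comfort_zone:
                push += offset / dist**2  # closer neighbors push harder
        return push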

How do we know how to do this? Well, the tautological answer is that it simply works. The beauty of these rules is that each of them is amazingly simple, and seem to make sense to us on an intuitive level. We tend to go with the crowd, stick together, and still try to give everyone a bit of personal space. But are these rules enough to produce the complex dance of the starlings, or are we missing some detail?

A Boid Dance

It turns out these simple rules produce patterns of movement that are anything but.

We can see that it looks flowing and wonderfully organic – but it’s not quite the same as the starling flocks. We can play with the boid model, tuning its various parameters to see how the resulting crowds might react.

  • A school of fish trying to avoid predators might be modeled best by weighting the cohesion rule very heavily – attempting to keep together at all costs while not worrying too much about personal space or about where precisely they’re heading, as long as it’s away from whatever is trying to eat them.
  • A flock of geese flying south for the winter will focus on heading in a certain direction (alignment), then space themselves out to ensure they can see in front of themselves (separation) while staying close enough, and on the same plane, to enjoy the aerodynamic benefits of flying in a group (cohesion) – producing the “flying V” we see so often in fall.
  • Migrating animals are primarily concerned with making sure they’re going in the same direction as everyone else (alignment) while ensuring that no one is trampled (separation) and that young and weak animals are protected in the center of the herd (cohesion). The wildebeest stampede that killed Simba’s father in The Lion King, for instance, was generated using the same theory as the boid model. (Rough weightings for all three scenarios are sketched below.)
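Putting rough numbers to those scenarios, they might translate into rule weightings like these. To be clear, the values below are invented for illustration – they reflect the relative emphasis described above, not anything measured from real animals.

    # Invented weightings reflecting each scenario's emphasis; illustrative only.
    PRESETS = {
        "fish schooling under attack": {"alignment": 0.2, "cohesion": 1.0, "separation": 0.1},
        "geese flying south":          {"alignment": 1.0, "cohesion": 0.4, "separation": 0.7},
        "migrating herd":              {"alignment": 1.0, "cohesion": 0.5, "separation": 0.8},
    }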

This simple model has a wide range of possible applications, and enough flexibility to produce an enormous variety of behavioural simulations. This gorgeous installation of light graffiti uses the rules we’ve just discussed to create hypnotizing patterns of artificial creatures.

The boid model has been used to animate realistic-looking behavior across a huge range of media. If you’ve seen a computer-generated crowd moving in a vaguely sensible manner anywhere, it likely uses the same basic theory. The power of the model comes from its construction – simple behaviors competing for influence to produce a complex outcome, not unlike our own consciousness or political systems. Not too shabby for a research project from over 20 years ago.

The Golden Rule in the Wild

In the previous post, we discussed the Prisoner’s Dilemma and saw how a simple strategy called Tit-for-Tat enforced the Golden Rule and won a very interesting contest. But does Tit-for-Tat always come out on top? The most confounding thing about the strategy is that it can never win – at best, it can only tie other strategies. Its success came from avoiding the bloody battles that other more deceptive strategies suffered.

The major criticism of Axelrod’s contest is its artificiality. In real life, some may say, you don’t get to encounter everyone, interact with them, and then have a tally run at the end to determine just how you did. Perhaps more deceptive strategies would do better in a more “natural” environment, where losing doesn’t just mean moving on to another opponent – it means your failures cause you to simply die off.

Artificial Nature

So now let’s look at the same game, with the same scoring system, only this time there’s a twist. Assume that this contest takes place in some sort of ecosystem that can only support a certain number of organisms, and they must compete with each other for the right to reproduce. There will be many different organisms, and each will belong to a certain species – that is, follow a specific strategy. We can then construct an artificial world where these strategies can battle it out in a manner that reflects the real world a bit better.

In order to determine supremacy, we’ll play a certain number of rounds of the game, called a generation. At the end of the generation, the scores are tallied for each strategy, and a new generation of strategies is produced – with a twist. Higher scoring strategies will produce more organisms representing them in the next generation, while lower scoring strategies will produce fewer. Repeat this for many generations, observe the trends, and we can see how these strategies fare as populations that grow and shrink, rather than as single players that live forever.
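One common way to implement that update is fitness-proportional reproduction: each strategy’s share of the next generation is proportional to the total points its members earned. The exact rule behind the simulations below isn’t specified, so treat this sketch (and its numbers) as illustrative.

    def next_generation(counts, avg_score, capacity=1000):
        """counts: organisms per strategy; avg_score: average score earned
        per organism this generation. The ecosystem holds a fixed number
        of organisms, handed out in proportion to total points earned."""
        fitness = {s: counts[s] * avg_score[s] for s in counts}
        total = sum(fitness.values())
        return {s: round(capacity * fitness[s] / total) for s in counts}

    # Illustrative numbers only - not results from the actual simulation.
    counts = {"ALL-C": 600, "RAND": 200, "Tit-for-Tat": 100, "ALL-D": 100}
    avg_score = {"ALL-C": 2.0, "RAND": 2.2, "Tit-for-Tat": 2.1, "ALL-D": 3.0}
    print(next_generation(counts, avg_score))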

So let’s look at an example. Suppose we have a population that consists of the following simple strategies:

Initial Population   Strategy      Description
60%                  ALL-C         Honest to a fault, this strategy always cooperates.
20%                  RAND          The lucky dunce, this strategy defects or cooperates at random.
10%                  Tit-for-Tat   This strategy mimics the previous move of the other player, every time.
10%                  ALL-D         The bad boy of the bunch, this strategy always defects.

So what will happen? Was Tit-for-Tat’s dominance a result of the structure of the contest, or is it hardier than some might think? A graph of the changing populations over 50 generations may be seen below.

It’s a hard world to start. ALL-C is immediately decimated by the deception of ALL-D and RAND, who surge ahead, while Tit-for-Tat barely hangs on. ALL-D’s relentless deception allows it to quickly take the lead, and it starts knocking off its former partner in crime, RAND. Tit-for-Tat remains on the ropes, barely keeping its population around 10% as ALL-C and RAND are quickly eliminated around it.

And then something very interesting happens. ALL-D runs out of easy targets, and turns to the only opponents left – Tit-for-Tat and itself. Tit-for-Tat begins a slow climb as ALL-D starts to eat itself, fighting over scraps. Slowly, steadily, Tit-for-Tat maintains its numbers by simply getting along with itself while the ALL-D organisms destroy each other. By 25 generations it’s all over – the easy resources exhausted, ALL-D is unable to adapt to the new environment, and Tit-for-Tat takes over.

This illustrates a very important concept – that of an evolutionarily stable strategy. ALL-D was well on its way to winning, but left itself open to invasion through constant infighting. ALL-C initially had the highest population but was quickly eaten away by more deceptive strategies. Tit-for-Tat, on the other hand, was able to get along with itself, and defended itself against outside invaders that did not cooperate in turn. An evolutionarily stable strategy is one that can persist in this manner – once a critical mass of players starts following it, it cannot easily be invaded or exploited by other strategies, and it does not undermine itself.

I Can’t Hear You

But there’s one critical weakness to Tit-for-Tat. We’re all aware of feuds that have gone on for ages, both sides viciously attacking the other in retaliation for the last affront, neither one precisely able to tell outsiders when it all started. And if we look at the strategy each side uses in a simplistic sense, it seems that they’re using Tit-for-Tat precisely. So how did it go so horribly wrong?

It went wrong because Tit-for-Tat has a horrible weakness – its memory is only one move long. If two Tit-for-Tat strategies somehow get stuck in a death spiral of defecting against each other, there’s no allowance in the strategy to realize this foolishness, and be the first to forgive. But how could this happen? Tit-for-Tat is never the first to defect after all, so why are both Tit-for-Tat strategies continually defecting?

The answer is that great force of nature, noise. A message read the wrong way, a shout misheard over the wind, an error in interpretation – any of these can be the impetus for that first defection. No matter that it was pointless and mistaken – once it happens, the dynamic has changed. While Tit-for-Tat’s greatest strength is that it never defects first, its greatest weakness is that it never forgives first either.
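We can watch this happen in a tiny sketch: two Tit-for-Tat players, with a single message garbled in round 3. One error is enough to lock them into an endless echo of alternating retaliation (and a second error at the wrong moment can turn that into permanent mutual defection).

    def tit_for_tat(seen):
        """Cooperate first, then mirror the opponent's last move."""
        return seen[-1] if seen else "C"

    a_seen, b_seen = [], []  # the moves each player has seen the other make
    for rnd in range(1, 9):
        a_move = tit_for_tat(a_seen)
        b_move = tit_for_tat(b_seen)
        a_seen.append(b_move)
        # Noise: in round 3, B mishears A's cooperation as a defection.
        b_seen.append("D" if rnd == 3 else a_move)
        print(rnd, a_move, b_move)  # from round 4 on, the defection echoes forever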

None of the population simulations we’ve seen so far include noise, and noise has a catastrophic effect on the effectiveness of Tit-for-Tat. Its success was built on never fighting among itself while deceptive strategies destroyed themselves by doing exactly that – but with noise, this advantage becomes a fatal weakness, as Tit-for-Tat’s refusal to be taken advantage of is turned against itself.

So what does a simulation including noise look like? You can see one below, and it contains an additional mystery strategy, Pavlov. Pavlov is very similar to Tit-for-Tat but slightly different – it forgives far more easily.

We see a similar pattern to our previous simulation. ALL-D has an initial population spike as it knocks off the easy targets, but Tit-for-Tat and Pavlov slowly climb to supremacy, with ALL-D eventually left eating scraps. But the influence of noise causes Tit-for-Tat to fight among itself, and Pavlov does what previously seemed impossible – it begins to win against Tit-for-Tat.

Puppy Love

So what is Pavlov and why does it work better in a noisy environment like the real world? Well, Ivan Pavlov was the man who discovered classical conditioning. You probably remember him as the guy who fed dogs while ringing a bell, and who then just rang the bell – and discovered that the dogs salivated expecting food.

The strategy is simple – if you win, keep doing it. If you lose, change your approach. Pavlov will always cooperate with ALL-C and Tit-for-Tat. If it plays ALL-D, however, it will hopefully cooperate, lose, get angry about it and defect, lose again, switch back to cooperation, and so on. Like a tiny puppy or the suitor of a crazy girlfriend, it can’t really decide what it wants to do, but it’s going to do its damnedest to try to succeed anyways. It manages to prevent the death spiral of two Tit-for-Tat strategies continually misunderstanding each other by obeying a very simple rule – if it hurts, stop doing it. While it may be slightly more vulnerable to deceptive strategies, it never gets stuck in these self-destructive loops of behavior.
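In code, Pavlov needs one more piece of memory than Tit-for-Tat: its own last move. Here’s a sketch of the win-stay, lose-shift rule, using the same scoring as before (3 or 5 points counts as a win, 0 or 1 as a loss – which is exactly the cases where the opponent cooperated or defected).

    def pavlov(my_moves, their_moves):
        """Win-stay, lose-shift."""
        if not my_moves:
            return "C"                          # start nice
        if their_moves[-1] == "C":              # last round paid 3 or 5: a win
            return my_moves[-1]                 # so stay with what worked
        return "D" if my_moves[-1] == "C" else "C"  # paid 0 or 1: shift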

So there’s a lesson here – life is noisy, and people will never get everything correct all the time. Tit-for-Tat works very well for a wide variety of situations, but has a critical weakness where neither player in a conflict is willing or able to forgive. So the next time you’re in a situation like that, step back, use your head, and switch strategies – it’s what this little puppy would want you to do, anyways.

Triumph of the Golden Rule

We live in a world with other people. Almost every decision we make involves someone else in one way or another, and we face a constant choice regarding just how much we’re going to trust the person on the other side of this decision. Should we take advantage of them, go for the quick score and hope we never see them again – or should we settle for a more reasonable reward, co-operating in the hope that this peaceful relationship will continue long into the future?

We see decisions of this type everywhere, but what is less obvious is the best strategy for us to use to determine how we should act. The Golden Rule states that one should “do unto others as you would have them do unto you”. While it seems rather naive at first glance, if we run the numbers, we find something quite amazing.

A Dilemma

In order to study these types of decisions, we have to define what exactly we’re talking about. So what is a “dilemma”? Let’s say it involves two people – each can decide to work together for a shared reward, or screw the other one over and take it all for themselves. If you both decide to work together, you both get a medium-sized reward. If you decide to take advantage of someone but they trust you, you’ll get a big reward (and the other person gets nothing). If you’re both jerks and decide to try to take advantage of each other, you both get a tiny fraction of what you could have. Let’s call these two people Alice and Bob – here’s a table to make things a bit more clear.

                 Alice cooperates                          Alice defects
Bob cooperates   Everyone wins! A medium-sized reward      Poor Bob. He decided to trust Alice,
                 to both for mutual co-operation.          who screwed him and got a big reward.
                                                           Bob gets nothing.
Bob defects      Poor Alice. She decided to trust Bob,     No honour among thieves… both Bob and
                 who took advantage of her and got a       Alice take the low road, and fight over
                 big reward. Alice gets nothing.           the scraps of a small reward.

This specific order of rewards is referred to as the Prisoner’s Dilemma, and was formalized and studied by Melvin Dresher and Merrill Flood in 1950 while working for the RAND Corporation.

Sale, One Day Only!

Now of course the question is – if you’re in this situation, what is the best thing to do? First suppose that we’re never, ever going to see this other person again. This is a one-time deal. Absent any moral consideration, your best option for the most profit is to attempt to take advantage of the other person and hope that they are clueless enough to let you – capitalism at its finest. You could attempt to cooperate, but that leaves you open to the other party screwing you. If each person is rational and acts in their own interest, each will attempt to one-up the other.

But there’s just one problem – if both people act in this way, they both get much less than they would if they simply cooperated. This seems very strange, as the economic models banks and other institutions use to predict human behavior assume exactly this type of logic – the model of the rational consumer. But it leads to nearly the worst possible outcome if both parties take this approach.

It seems that there is no clear ideal strategy for a one-time deal. Each choice leaves you open to possible losses in different ways. At this point it’s easy to throw up your hands, leave logic behind, and take a moral stance. You’ll cooperate because you’re a good person – or you’ll take advantage of the suckers because life just isn’t fair.

And this appears to leave us where we are today – some good people, some bad people, and the mythical invisible hand of the market to sort them all out. But there’s just one little issue. We live in a world with reputations, with friends, and with foes – there are no true “one time” deals. The world is small, and people remember.

In it for the Long Run

So instead of thinking of a single dilemma, let’s think about what we should do if we get to play this game more than once. If someone screws you in the first round, you’ll remember – and probably won’t cooperate the next time. If you find someone who always cooperates, you can join them and work together for your mutual benefit – or decide that they’re an easy mark and take them for everything they’ve got.

But what is the best strategy? In an attempt to figure this out, in 1980 Robert Axelrod decided to have a contest. He sent the word out, and game theorists, scientists, and mathematicians all submitted entries for a battle royale to determine which strategy was the best.

Each entry was a computer program designed with a specific strategy for playing this dilemma multiple times against other clever entries. The programs would play this simple dilemma, deciding whether to cooperate or defect against each other, for 200 rounds. Five points for a successful deception (you defect, they cooperate), three points each for mutual cooperation, one point each if you both tried to screw each other (mutual defection), and no points if you were taken advantage of (you cooperate, they defect). Each program would play every other program as well as a copy of itself, and the program with the largest total score over all the rounds would win.
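That entire scoring system fits in a small lookup table. Here it is sketched in Python – the payoffs as just described, not Axelrod’s actual tournament code:

    # Payoffs as (my points, their points); 'C' = cooperate, 'D' = defect.
    PAYOFF = {
        ("D", "C"): (5, 0),  # successful deception
        ("C", "C"): (3, 3),  # mutual cooperation
        ("D", "D"): (1, 1),  # mutual defection
        ("C", "D"): (0, 5),  # you were taken advantage of
    }

    def score_round(my_move, their_move):
        return PAYOFF[(my_move, their_move)]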

So what would some very simple programs be?

ALL-C (always cooperate) is just like it sounds. Cooperation is the only way, and this program never gets tired of being an upstanding guy.

ALL-D (always defect) is the counterpoint to this, and has one singular goal. No matter what happens, always, always, always try to screw the other person over.

RAND is the lucky dunce – don’t worry too much, just decide to cooperate or defect at random.
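All three are simple enough to write out in full. Sketching each as a function from the opponent’s move history to a decision (‘C’ to cooperate, ‘D’ to defect):

    import random

    def all_c(opponent_history):
        return "C"                  # cooperation is the only way

    def all_d(opponent_history):
        return "D"                  # always, always try to screw the other person

    def rand(opponent_history):
        return random.choice("CD")  # don't worry, just flip a coin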

You can predict how these strategies might do if they played against each other. Two ALL-C strategies would endlessly cooperate in a wonderful dance of mutual benefit. Two ALL-D strategies would continually fight, endlessly grinding against each other and gaining little. ALL-C pitted against ALL-D would fare about as well as a fluffy bunny in a den of wolves – eternally cooperating and hoping for reciprocation, but always getting the shaft with ALL-D profiting.

So an environment of ALL-C would be a cooperative utopia – unless a single ALL-D strategy came in, and started bleeding them dry. But an environment entirely made of ALL-D would be a wasteland – no one would have any success due to constant fighting. And the RAND strategy is literally no better than a coin flip.

Time to Think

So what should we do? Those simple strategies don’t seem to be very good at all. If we think about it however, there’s a reason they do so poorly – they don’t remember. No matter what the other side does, they’ve already made up their minds. Intelligent strategies remember previous actions of their opponents, and act accordingly. The majority of programs submitted to Axelrod’s competition incorporated some sort of memory. For instance, if you can figure out you’re playing against ALL-C, it’s time to defect. Just like in the real world, these programs tried to figure out some concept of “reputation” that would allow them to act in the most productive manner.

And so Axelrod’s competition was on. Programs from all over the world competed against each other, each trying to maximize their personal benefit. A wide variety of strategies were implemented from some of the top minds in this new field. Disk drives chattered, monitors flickered, and eventually a champion was crowned.

And the Winner Is…

When the dust settled, the winner was clear – and the victory was both surprising and inspiring. The eventual champion seemed a 90 lb weakling at first glance: a mere four lines of code submitted by Anatol Rapoport, a mathematical psychologist from the University of Toronto. It was called “Tit-for-Tat”, and it did exactly that. It started every game by cooperating, and from then on simply did whatever the other player had done on their previous turn. It cooperated with the “nice” strategies, butted heads with the “mean” strategies, and managed to come out on top ahead of far more complex approaches.
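Rapoport’s original was written for Axelrod’s tournament framework, but the logic genuinely fits in a few lines of anything. In the same function-of-history style as the earlier sketches:

    def tit_for_tat(opponent_history):
        if not opponent_history:     # first move: cooperate
            return "C"
        return opponent_history[-1]  # then mirror whatever they did last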

The simplest and shortest strategy won, a program that precisely enforced the Golden Rule. But what precisely made Tit-for-Tat so successful? Axelrod analyzed the results of the tournament and came up with a few principles of success.

  • Don’t get greedy. Tit-for-Tat can never beat another strategy. But it never allows itself to take a beating, ensuring it skips the brutal losses of two “evil” strategies fighting against each other. It actively seeks out win-win situations instead of gambling for the higher payoff.
  • Be nice. The single best predictor of whether a strategy would do well was if they were never the first to defect. Some tried to emulate Tit-for-Tat but with a twist – throwing in the occasional defection to up the score. It didn’t work.
  • Reciprocate, and forgive. Other programs tended to cooperate with Tit-for-Tat since it consistently rewarded cooperation and punished defection. And Tit-for-Tat easily forgives – no matter how many defections it has seen, if a program decides to cooperate, it will join them and reap the rewards.
  • Don’t get too clever. Tit-for-Tat is perfectly transparent, and it becomes obvious that it is very, very difficult to beat. There are no secrets, and no hypocrisy – Tit-for-Tat gets along very well with itself, unlike strategies biased toward deception.

The contest attracted so much attention that a second one was organized, and this time every single entry was aware of the strategy and success of Tit-for-Tat. Sixty-three new entries arrived, all gunning for the top spot. And once again, Tit-for-Tat rose to the top. Axelrod used the results of these tournaments to develop ideas about how cooperative behaviour could evolve naturally, and eventually wrote a bestselling book called The Evolution of Cooperation. But his biggest accomplishment may be showing us that being nice does pay off – and giving us the numbers to prove it.