Our Modular Minds

I believe that ultimately human consciousness can be described by a program. Now this doesn’t mean we’re all in the Matrix, simply that our mind is a giant seething logical machine with values that are manipulated by rules. There is no strange new science in the sense of new specialties that must be discovered in order for the mind to be understood, but a progression in a new kind of science as Wolfram dubbed it – the study of how complexity arises.

A List of Rules

When I first heard that you could program a computer as a child I was amazed. A strange wonder that I could only spend a few shared minutes with at school, something that could draw, add, and write far faster than I could ever dream of – and I could tell it what to do? I wasn’t quite sure how to inform it to bend to my every wish, so I started with Turtle (actually called LOGO I later found) upon my teacher’s recommendation. I fed Turtle long lists of instructions – move forward, draw a line, turn left, repeated in all and any ways I could think of. He would draw glowing green shapes across my screen, and never tired.

The Need for Modularity

The only problem was that while the Turtle seemed infallible, I certainly could not say the same. I was making the classic beginner’s mistake – I would write one giant chunk of code that was supposed to cause my turtle to dance in precisely the way I wanted. Any little mistake would send it wildly off course and I would end up with a mess that barely looked like the original design at the best of times. I later learned that using a programming concept called “modules” could help me isolate these errors and make code more efficient and reusable. Just like a company could have a manufacturing and engineering division which could communicate with standardized blueprints, a program could have different modules that would exchange data in a standardized manner. A modular program is more stable since mistakes are typically limited in influence to the module they’re contained in, and each module can be modified independently, with only the understanding that it is supposed to behave and communicate in a certain manner.

Damage as Evidence

So is our mind modular? Well, if it wasn’t, we could assume that a brain injury affecting a certain part of the brain would have a consistent and general impact across all of our consciousness. The only problem is that we generally only see a nonspecific mental decline like this from a nonspecific trauma, say impact blows to the head over a long period of time. Injuries in specific areas seem to be correlated with deficits in certain mental abilities – while leaving others totally intact.

A stroke can basically be thought of as an incident where blood flow is drastically affected in a certain specific area of the brain. This subregion of the brain is unable to function due to lack of blood flow, and very strange things can occur.

Howard Engel is a Canadian novelist who had a stroke. Upon waking one morning, he found that the morning paper seemed to be written in some strange script, an alphabet he could not understand. Everything else appeared normal, except his visual cortex had been damaged in a specific area which prevented him from visually parsing letters and words. As a writer, he despaired – it seemed that his livelihood had been lost. Soon he realized a critical distinction which gave him hope – he may be unable to read visually, but could he write? Howard sat down and traced these strange-looking symbols, his pen gliding over the bizarre shapes over and over. And eventually, the concepts came back to him. In a strange sense, he could now read again. Years and years of writing had associated certain movements of his hand with letters and concepts. Instead of words in his head put to paper by hand movements and a pen, he had to move concepts in the opposite direction – moving his hand over shapes written previously by others, the concepts echoed back up his motor cortex.

And it worked. There was irreparable damage to his visual cortex, a critical module malfunctioning. So he hacked his brain, redistributing resources from his motor cortex, which had been trained by his constant writing to recognize the same symbols and concepts necessary for reading. Howard now traces the shapes he sees on the inside of his front teeth with his tongue. His speed has steadily increased, and he says he can now read about half of the subtitles in a foreign film before they flash off the screen.

It doesn’t seem too strange to suggest that there are different localized modules in the brain for motor control, visual interpretation, and other concepts easily identifiable with different aspects of the physical world. But are there modules with finer distinctions, working on different parts of our mental experience rather than different parts of the physical world?

The Wason Selection Task

The Wason selection task is a very interesting experiment in the field of psychological reasoning. Before I spoil it for you by talking too much, let’s just do it right now. Look at the following cards.

Assume the cards have a number on one side, and a color on the other. What cards need to be flipped over to make sure that all even numbers have red backs? Make sure you’ve picked a card or cards.

Got it? Now a bit of unsettling but ego-salvaging news. When this experiment is done with undergraduates, only 10 to 20 percent get it right. The correct answer is to flip over two cards: the number 4 to make sure it has a red back, and the blue card to make sure that it doesn’t have an even number on the other side. Most people suggest flipping over the 4 and the red card – this is wrong, as it doesn’t matter if the red card has an odd number on the other side.
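If you want to convince yourself of that answer, the check can be done mechanically. This is only a sketch under assumptions: the text names the 4, the red card and the blue card, so the odd card (3 here) and the sets of possible hidden faces are stand-ins I made up. A card needs flipping only if some possible hidden face would break the rule "every even number has a red back".

```python
# Hypothetical card set: only 4, red and blue are named in the text; the 3 is assumed.
VISIBLE = [3, 4, "red", "blue"]
NUMBERS = [3, 4]           # assumed possible hidden numbers
COLORS = ["red", "blue"]   # assumed possible hidden colors

def violates(number, color):
    """The rule is broken only by an even number on a card that is not red."""
    return number % 2 == 0 and color != "red"

def must_flip(face):
    """A card must be flipped if some possible hidden face would break the rule."""
    if isinstance(face, int):
        return any(violates(face, color) for color in COLORS)
    return any(violates(number, face) for number in NUMBERS)

cards_to_flip = [face for face in VISIBLE if must_flip(face)]
print(cards_to_flip)  # [4, 'blue']
```

The red card drops out because nothing on its other side can break the rule, which is exactly the step most of us get wrong.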

Now let’s mix it up a bit. Instead of numbers and colors, let’s try people and social activities. Assume the cards have a drink on one side, and a person of a certain age on the other. What cards need to be flipped over to make sure everyone drinking beer is old enough to do so?

Now the answer flows quickly and easily, and almost everyone gets it correct. We need to flip over the beer card to make sure that the person on the other side is old enough, and we need to flip over the card showing the underage drinker to make sure they’re playing by the rules.

The weird thing is that both examples are logically equivalent. Instead of numbers and colors, we’ve just used people and drinks. But something very important has occurred, and it happens time and time again as these tests are administered. It seems that people are fast and accurate at solving this task only if it is described as a test of social obligations. Both versions can be described identically with logic – but that doesn’t appear to matter to our mind. We appear to have a module dedicated to social reasoning and conflict, and can only solve these problems quickly if they involve determining whether someone is cheating or breaking social conventions. This ancient module would hold significant survival value – a general logic verification module not quite so much.

Modules Upon Modules

There appears to be significant evidence for a modular mind, not just in terms of divisions between senses such as sight or hearing and other actions like movement but also more abstract modules that deal with concepts such as social rules. Stroke victims can literally rewire their brains, passing concepts upward into their consciousness through paths never intended to be used in such a strange manner, duplicating the work of other modules lost to injury. These modules live in a strange world of physical interaction and abstract mental space, a huge interconnected mass with no clear outline behind it. The big question now becomes: is a sufficiently complex system able to understand itself, and are we that system?

The Golden Rule in the Wild

In the previous post, we discussed the Prisoner’s Dilemma and saw how a simple strategy called Tit-for-Tat enforced the Golden Rule and won a very interesting contest. But does Tit-for-Tat always come out on top? The most confounding thing about the strategy is that it can never win – at best, it can only tie other strategies. Its success came from avoiding the bloody battles that other more deceptive strategies suffered.

The major criticism of Axelrod’s contest is its artificiality. In real life, some may say, you don’t get to encounter everyone, interact with them, and then have a tally run at the end to determine just how you did. Perhaps more deceptive strategies would do better in a more “natural” environment where losing doesn’t mean you get another chance at another opponent, but that your failures cause you to simply die off.

Artificial Nature

So now let’s look at the same game, with the same scoring system, only this time there’s a twist. Assume that this contest takes place in some sort of ecosystem that can only support a certain number of organisms, and they must fight among each other for the right to reproduce. There will be many different organisms, and they will all be members of a certain species, or specific strategy. We can then construct an artificial world where these strategies can battle it out in a manner that seems to reflect the real world a bit better.

In order to determine supremacy, we’ll play a certain number of rounds of the game, called a generation. At the end of the generation, the scores are tallied for each strategy, and a new generation of strategies is produced – with a twist. Higher scoring strategies will produce more organisms representing them in the next generation, while lower scoring strategies will produce fewer. Repeat this for many generations, observe the trends, and we can see how these strategies do as part of a population that can grow and shrink, rather than a single strategy that lives forever.

So let’s look at an example. Suppose we have a population that consists of the following simple strategies:

Share | Strategy    | Description
60%   | ALL-C       | Honest to a fault, this strategy always cooperates.
20%   | RAND        | The lucky dunce, this strategy defects or cooperates at random.
10%   | Tit-for-Tat | This strategy mimics the previous move of the other player, every time.
10%   | ALL-D       | The bad boy of the bunch, this strategy always defects.


So what will happen? Was Tit-for-Tat’s dominance a result of the structure of the contest, or is it hardier than some might think? A graph of the changing populations over 50 generations may be seen below.

It’s a hard world to start. ALL-C immediately starts being decimated by the deception of ALL-D and RAND who start surging ahead, while Tit-for-Tat barely hangs on. ALL-D’s relentless deception allows it to quickly take the lead, and it starts knocking off its former partner in crime, RAND. Tit-for-Tat remains on the ropes, barely keeping its population around 10% as ALL-C and RAND are quickly eliminated around it.

And then something very interesting happens. ALL-D runs out of easy targets, and turns to the only opponents left – Tit-for-Tat and itself. Tit-for-Tat begins a slow climb as ALL-D begins to eat itself fighting over scraps. Slowly, steadily, Tit-for-Tat builds its numbers by simply getting along with itself while allowing ALL-D to destroy itself. By 25 generations it’s all over – the easy resources exhausted, ALL-D was unable to adapt to the new environment and Tit-for-Tat takes over.

This illustrates a very important concept – that of an evolutionarily stable strategy. ALL-D was well on its way to winning, but left itself open to invasion by constant infighting. ALL-C initially had the highest population but was quickly eaten away by more deceptive strategies. Tit-for-Tat on the other hand was able to get along with itself, and defended itself against outside invaders that did not cooperate in turn. An evolutionarily stable strategy is something that can persist in this manner – once a critical mass of players start following it, it cannot be easily invaded or exploited by other strategies, including itself.
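The generation-by-generation process can be sketched in a few lines. This is a minimal sketch under assumptions, not the simulation behind the graph: it uses the 5/3/1/0 tournament scoring, 200-round matches, and a simple rule where each strategy's share of the next generation is proportional to its current share times its average score against the current population. That update rule, the fixed random seed and all of the names are my own choices for illustration.

```python
import random

# Points for (my move, their move): (my points, their points).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

rng = random.Random(0)  # fixed seed so RAND behaves the same on every run

# Each strategy looks at the opponent's past moves and returns "C" or "D".
STRATEGIES = {
    "ALL-C": lambda opp: "C",
    "RAND": lambda opp: rng.choice("CD"),
    "TFT": lambda opp: opp[-1] if opp else "C",
    "ALL-D": lambda opp: "D",
}

def average_payoff(a, b, rounds=200):
    """Average per-round score of strategy a in one match against strategy b."""
    hist_a, hist_b, total = [], [], 0
    for _ in range(rounds):
        move_a = STRATEGIES[a](hist_b)
        move_b = STRATEGIES[b](hist_a)
        total += PAYOFF[(move_a, move_b)][0]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return total / rounds

def evolve(shares, generations=50):
    """Return the population shares for generation 0 through `generations`."""
    names = list(shares)
    table = {(a, b): average_payoff(a, b) for a in names for b in names}
    history = [dict(shares)]
    for _ in range(generations):
        # Fitness: expected score against a randomly chosen member of the population.
        fitness = {a: sum(shares[b] * table[(a, b)] for b in names) for a in names}
        mean = sum(shares[a] * fitness[a] for a in names)
        # Assumed reproduction rule: share grows in proportion to relative fitness.
        shares = {a: shares[a] * fitness[a] / mean for a in names}
        history.append(shares)
    return history

history = evolve({"ALL-C": 0.6, "RAND": 0.2, "TFT": 0.1, "ALL-D": 0.1})
print({name: round(share, 3) for name, share in history[-1].items()})
```

Under these assumptions the sketch shows the same broad shape as the story above: ALL-D surges while ALL-C collapses, then Tit-for-Tat takes over once the easy targets are gone. The exact numbers depend on the update rule and the seed.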

I Can’t Hear You

But there’s one critical weakness to Tit-for-Tat. We’re all aware of feuds that have gone on for ages, both sides viciously attacking the other in retaliation for the last affront, neither one precisely able to tell outsiders when it all started. And if we look at the strategies each use in a simplistic sense, it seems that they’re using Tit-for-Tat precisely. So how did it go so horribly wrong?

It went wrong because Tit-for-Tat has a horrible weakness – its memory is only one move long. If two Tit-for-Tat strategies somehow get stuck in a death spiral of defecting against each other, there’s no allowance in the strategy to realize this foolishness, and be the first to forgive. But how could this happen? Tit-for-Tat is never the first to defect after all, so why are both Tit-for-Tat strategies continually defecting?

The answer is that great force of nature, noise. A message read the wrong way, a shout misheard over the wind, an error in interpretation – all can be the impetus for this first initial defection. No matter that it was pointless and incorrect, the strategy has changed. While Tit-for-Tat’s greatest strength is that it never defects first, its greatest weakness is that it never forgives first either.

All of these simulations we’ve seen so far do not include noise, and it can have a catastrophic effect on the effectiveness of Tit-for-Tat. Its success was built on the strategy of never fighting among itself and allowing other deceptive strategies to destroy themselves by doing the same – but with noise, this advantage becomes a fatal weakness as Tit-for-Tat’s inability to be taken advantage of is turned against itself.
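A single slip is enough to see the problem. The sketch below pits two copies of Tit-for-Tat against each other and flips exactly one of player A's moves; the function names and the choice of which round to corrupt are assumptions for illustration.

```python
def tit_for_tat(opponent_history):
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"

def play_with_slip(strategy, rounds=8, slip_round=2):
    """Two copies of one strategy play; player A's move is flipped once, at slip_round."""
    hist_a, hist_b = [], []
    for r in range(rounds):
        move_a = strategy(hist_b)
        move_b = strategy(hist_a)
        if r == slip_round:
            move_a = "D" if move_a == "C" else "C"  # the single noisy move
        hist_a.append(move_a)
        hist_b.append(move_b)
    return "".join(hist_a), "".join(hist_b)

a, b = play_with_slip(tit_for_tat)
print(a)  # CCDCDCDC
print(b)  # CCCDCDCD
```

After the slip the two players trade retaliations forever and never get back to mutual cooperation. One more badly timed slip, where a player defects in a round when the other is already retaliating, would lock both into permanent mutual defection.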

So what does a simulation including noise look like? You can see one below, and it contains an additional mystery strategy, Pavlov. Pavlov is very similar to Tit-for-Tat but slightly different – it forgives far more easily.

We see a similar pattern to our previous simulation. ALL-D has an initial population spike as it knocks off the easy targets, but Tit-for-Tat and Pavlov slowly climb to supremacy with ALL-D eventually eating scraps. But the influence of noise causes Tit-for-Tat to fight among itself, and Pavlov does what previously seemed impossible – it begins to win against Tit-for-Tat.

Puppy Love

So what is Pavlov and why does it work better in a noisy environment like the real world? Well, Ivan Pavlov was the man who discovered classical conditioning. You probably remember him as the guy who fed dogs while ringing a bell, and who then just rang the bell – and discovered that the dogs salivated expecting food.

The strategy is simple – if you win, keep doing it. If you lose, change your approach. Pavlov will always cooperate with ALL-C and Tit-for-Tat. If it plays ALL-D however, it will hopefully cooperate, lose, get angry about it and defect, lose again, switch back to cooperation, and so on. Like a tiny puppy or the suitor of a crazy girlfriend, it can’t really decide what it wants to do, but it’s going to do its damnedest to try to succeed anyways. It manages to prevent the death spiral of two Tit-for-Tat strategies continually misunderstanding each other by obeying a very simple rule – if it hurts, stop doing it. While it may be slightly more vulnerable to deceptive strategies, it never gets stuck in these self-destructive loops of behavior.
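Here is a minimal sketch of that rule. I am assuming that "winning" means scoring 3 or 5 points in the previous round and "losing" means scoring 0 or 1, with the 5/3/1/0 tournament payoffs; the helper names and the round chosen for the slip are also just for illustration.

```python
# Points for (my move, their move): (my points, their points).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def pavlov(my_history, opp_history):
    """Win-stay, lose-shift: repeat the last move after 3 or 5 points, switch otherwise."""
    if not my_history:
        return "C"
    last = my_history[-1]
    won = PAYOFF[(last, opp_history[-1])][0] >= 3  # assumed threshold for a "win"
    return last if won else ("D" if last == "C" else "C")

def all_d(my_history, opp_history):
    return "D"

def play(strategy_a, strategy_b, rounds=8, slip_round=None):
    """Play a short match; optionally flip player A's move once to simulate noise."""
    hist_a, hist_b = [], []
    for r in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        if r == slip_round:
            move_a = "D" if move_a == "C" else "C"
        hist_a.append(move_a)
        hist_b.append(move_b)
    return "".join(hist_a), "".join(hist_b)

print(play(pavlov, pavlov, slip_round=2))  # ('CCDDCCCC', 'CCCDCCCC')
print(play(pavlov, all_d))                 # ('CDCDCDCD', 'DDDDDDDD')
```

Two Pavlovs shake off the slip after one round of mutual defection and go back to cooperating, while against ALL-D the puppy keeps flip-flopping exactly as described.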

So there’s a lesson here – life is noisy, and people will never get everything correct all the time. Tit-for-Tat works very well for a wide variety of situations, but has a critical weakness where neither player in a conflict is willing or able to forgive. So the next time you’re in a situation like that, step back, use your head, and switch strategies – it’s what this little puppy would want you to do, anyways.

Triumph of the Golden Rule

We live in a world with other people. Almost every decision we make involves someone else in one way or another, and we face a constant choice regarding just how much we’re going to trust the person on the other side of this decision. Should we take advantage of them, go for the quick score and hope we never see them again – or should we settle for a more reasonable reward, co-operating in the hope that this peaceful relationship will continue long into the future?

We see decisions of this type everywhere, but what is less obvious is the best strategy for us to use to determine how we should act. The Golden Rule states that one should “do unto others as you would have them do unto you”. While it seems rather naive at first glance, if we run the numbers, we find something quite amazing.

A Dilemma

In order to study these types of decisions, we have to define what exactly we’re talking about. Let’s define just what a “dilemma” is. Let’s say it has two people – and they can individually decide to work together for a shared reward, or screw the other one over and take it all for themselves. If you both decide to work together, you both get a medium-sized reward. If you decide to take advantage of someone but they trust you, you’ll get a big reward (and the other person gets nothing). If you’re both jerks and decide to try to take advantage of each other, you both get a tiny fraction of what you could have. Let’s call these two people Alice and Bob – here’s a table to make things a bit more clear.

               | Alice cooperates | Alice defects
Bob cooperates | Everyone wins! A medium-sized reward to both for mutual co-operation. | Poor Bob. He decided to trust Alice, who screwed him and got a big reward. Bob gets nothing.
Bob defects    | Poor Alice. She decided to trust Bob, who took advantage of her and got a big reward. Alice gets nothing. | No honour among thieves… both Bob and Alice take the low road, and fight over the scraps of a small reward.

This specific order of rewards is referred to as the Prisoner’s Dilemma, and was formalized and studied by Melvin Dresher and Merrill Flood in 1950 while working for the RAND Corporation.

Sale, One Day Only!

Now of course the question is – if you’re in this situation, what is the best thing to do? First suppose that we’re never, ever going to see this other person again. This is a one time deal. Absent any moral consideration, your best option for the most profit is to attempt to take advantage of the other person and hope that they are clueless enough to let you, capitalism at its finest. You could attempt to cooperate, but that leaves you open to the other party screwing you. If each person acts in their own interest and is rational, they will attempt to one-up the other.

But there’s just one problem – if both people act in this way, they both get much less than they would if they simply cooperated. This seems very strange, as the economic models banks and other institutions use to model human behavior assume this type of logic – the model of the rational consumer. But this leads to nearly the worst possible option if both parties take this approach.
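The trap is easy to see with numbers. The point values below (5 for a successful deception, 3 each for cooperating, 1 each for mutual defection, 0 for being taken advantage of) are stand-ins for "big", "medium", "tiny" and "nothing"; any values in that order would behave the same way. The function name is made up for this sketch.

```python
# Points for (your move, their move): (your points, their points).
# "C" = cooperate, "D" = defect. The exact numbers are illustrative.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def best_reply(their_move):
    """The move that earns you the most, given what the other person does."""
    return max("CD", key=lambda mine: PAYOFF[(mine, their_move)][0])

for their_move in "CD":
    print("they play", their_move, "-> your best reply is", best_reply(their_move))

both_defect = PAYOFF[("D", "D")][0]
both_cooperate = PAYOFF[("C", "C")][0]
print(both_defect, "<", both_cooperate)  # 1 < 3
```

Whatever the other person does, defecting pays more for you, yet when both sides follow that reasoning each ends up with 1 point instead of 3.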

It seems that there is no clear ideal strategy for a one time deal. Each choice leaves you open to possible losses in different ways. At this point it’s easy to toss up your hands, leave logic behind, and take a moral stance. You’ll cooperate because you’re a good person – or you’ll take advantage of the suckers because life just isn’t fair.

And this appears to leave us where we are today – some good people, some bad people, and the mythical invisible hand of the market to sort them all out. But there’s just one little issue. We live in a world with reputations, with friends, and with foes – there are no true “one time” deals. The world is small, and people remember.

In it for the Long Run

So instead of thinking of a single dilemma, let’s think about what we should do if we get to play this game more than once. If someone screws you in the first round, you’ll remember – and probably won’t cooperate the next time. If you find someone who always cooperates, you can join them and work together for your mutual benefit – or decide that they’re an easy mark and take them for everything they’ve got.

But what is the best strategy? In an attempt to figure this out, in 1980 Robert Axelrod decided to have a contest. He sent the word out, and game theorists, scientists, and mathematicians all submitted entries for a battle royale to determine which strategy was the best.

Each entry was a computer program designed with a specific strategy for playing this dilemma multiple times against other clever entries. The programs would play this simple dilemma, deciding whether to cooperate or defect against each other, for 200 rounds. Five points for a successful deception (you defect, they cooperate), three points each for mutual cooperation, one point each if you both tried to screw each other (mutual defection), and no points if you were taken advantage of (you cooperate, they defect). Each program would play every other program as well as a copy of itself, and the program with the largest total score over all the rounds would win.

So what would some very simple programs be?

ALL-C (always cooperate) is just like it sounds. Cooperation is the only way, and this program never gets tired of being an upstanding guy.

ALL-D (always defect) is the counterpoint to this, and has one singular goal. No matter what happens, always, always, always try to screw the other person over.

RAND is the lucky dunce – don’t worry too much, just decide to cooperate or defect at random.

You can predict how these strategies might do if they played against each other. Two ALL-C strategies would endlessly cooperate in a wonderful dance of mutual benefit. Two ALL-D strategies would continually fight, endlessly grinding against each other and gaining little. ALL-C pitted against ALL-D would fare about as well as a fluffy bunny in a den of wolves – eternally cooperating and hoping for reciprocation, but always getting the shaft with ALL-D profiting.
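Those predictions are easy to check with the scoring described above. This is a minimal sketch of a 200-round match, not Axelrod's actual tournament code; the function names are my own.

```python
# Points for (move_a, move_b): (points to a, points to b).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def all_c(opponent_history):
    return "C"

def all_d(opponent_history):
    return "D"

def play(strategy_a, strategy_b, rounds=200):
    """Play an iterated match and return the two total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_b)
        move_b = strategy_b(hist_a)
        points_a, points_b = PAYOFF[(move_a, move_b)]
        score_a += points_a
        score_b += points_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

print(play(all_c, all_c))  # (600, 600)
print(play(all_d, all_d))  # (200, 200)
print(play(all_c, all_d))  # (0, 1000)
```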

So an environment of ALL-C would be a cooperative utopia – unless a single ALL-D strategy came in, and started bleeding them dry. But an environment entirely made of ALL-D would be a wasteland – no one would have any success due to constant fighting. And the RAND strategy is literally no better than a coin flip.

Time to Think

So what should we do? Those simple strategies don’t seem to be very good at all. If we think about it however, there’s a reason they do so poorly – they don’t remember. No matter what the other side does, they’ve already made up their minds. Intelligent strategies remember previous actions of their opponents, and act accordingly. The majority of programs submitted to Axelrod’s competition incorporated some sort of memory. For instance, if you can figure out you’re playing against ALL-C, it’s time to defect. Just like in the real world, these programs tried to figure out some concept of “reputation” that would allow them to act in the most productive manner.

And so Axelrod’s competition was on. Programs from all over the world competed against each other, each trying to maximize their personal benefit. A wide variety of strategies were implemented from some of the top minds in this new field. Disk drives chattered, monitors flickered, and eventually a champion was crowned.

And the Winner Is…

When the dust settled, the winner was clear – and the victory was both surprising and inspiring. The eventual champion seemed to be a 90 lb weakling at first glance, a mere four lines of code submitted by Anatol Rapoport, a mathematical psychologist from the University of Toronto. It was called “Tit-for-Tat”, and it did exactly that. It started every game by cooperating – and then did exactly what the other player did in their last turn. It cooperated with the “nice” strategies, butted heads with the “mean” strategies, and managed to come out on top ahead of far more complex approaches.

The simplest and shortest strategy won, a program that precisely enforced the Golden Rule. But what precisely made Tit-for-Tat so successful? Axelrod analyzed the results of the tournament and came up with a few principles of success.

  • Don’t get greedy. Tit-for-Tat can never beat another strategy. But it never allows itself to take a beating, ensuring it skips the brutal losses of two “evil” strategies fighting against each other. It actively seeks out win-win situations instead of gambling for the higher payoff.
  • Be nice. The single best predictor of whether a strategy would do well was if they were never the first to defect. Some tried to emulate Tit-for-Tat but with a twist – throwing in the occasional defection to up the score. It didn’t work.
  • Reciprocate, and forgive. Other programs tended to cooperate with Tit-for-Tat since it consistently rewarded cooperation and punished defection. And Tit-for-Tat easily forgives – no matter how many defections it has seen, if a program decides to cooperate, it will join them and reap the rewards.
  • Don’t get too clever. Tit-for-Tat is perfectly transparent, and it becomes obvious that it is very, very difficult to beat. There are no secrets, and no hypocrisy – Tit-for-Tat gets along very well with itself, unlike strategies biased toward deception.
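The first principle can be checked directly. The sketch below is my own short Python rendering of Tit-for-Tat, not Rapoport's original program, played over 200 rounds with the tournament scoring against ALL-C, ALL-D and a copy of itself.

```python
# Points for (move_a, move_b): (points to a, points to b).
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(opponent_history):
    """Cooperate first, then copy whatever the opponent did last turn."""
    return opponent_history[-1] if opponent_history else "C"

def all_c(opponent_history):
    return "C"

def all_d(opponent_history):
    return "D"

def play(strategy_a, strategy_b, rounds=200):
    """Return the total scores of both strategies after an iterated match."""
    hist_a, hist_b, score_a, score_b = [], [], 0, 0
    for _ in range(rounds):
        move_a, move_b = strategy_a(hist_b), strategy_b(hist_a)
        points_a, points_b = PAYOFF[(move_a, move_b)]
        score_a, score_b = score_a + points_a, score_b + points_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

results = {name: play(tit_for_tat, opponent)
           for name, opponent in [("ALL-C", all_c), ("ALL-D", all_d), ("TFT", tit_for_tat)]}
print(results)  # {'ALL-C': (600, 600), 'ALL-D': (199, 204), 'TFT': (600, 600)}
```

In every match Tit-for-Tat's score is at most equal to its opponent's. It wins tournaments on its total across many matches, not by beating anyone head to head.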

The contest attracted so much attention that a second one was organized, and this time every single entry was aware of the strategy and success of Tit-for-Tat. Sixty-three new entries arrived, all gunning for the top spot. And once again, Tit-for-Tat rose to the top. Axelrod used the results of these tournaments to develop ideas about how cooperative behaviour could evolve naturally, and eventually wrote a bestselling book called The Evolution of Cooperation. But his biggest accomplishment may be showing us that being nice does pay off – and giving us the numbers to prove it.

Human Evolution and Frameshift Mutations

How did humans evolve from early primates? How did “human like” traits such as a smaller jaw relative to apes and hairlessness pop up when they don’t appear in the wild in any real frequency? The typical explanation for why humans have smaller jaws than early primates is that our diets changed and our brains got bigger, pressures that caused a smaller jaw. But there’s another way to look at this – what if our diets changed and our brains got bigger due to proto-human society dealing and adapting to an increasingly frequent and nearly catastrophic mutation of the jaw?

Myosin Heavy Chain 16

The human and chimpanzee genomes have both been mapped, so we are able to make comparisons between them. This is extremely useful, as chimpanzees and humans shared a common ancestor, but their genetic lines split apart approximately 7 million years ago. So examining the differences may tell us something about how humans evolved.


There is a protein called myosin heavy chain 16 (aka MYH16) which in chimpanzees and other non-human primates is expressed almost exclusively in their powerful jaw muscles. These strong jaws are an adult trait – a logically complex one that would be more sensitive to random mutations.

And that’s exactly what seems to have happened. Non-human primates have DNA that codes for the complete MYH16 protein. The corresponding part of human DNA is missing a random chunk – which causes a frameshift mutation.

Frameshift Mutations

What is a frameshift mutation? Well, first let’s find out how we build proteins. We have a strand of messenger RNA (imagine a long tape with letters on it) which a ribosome (hell, imagine a tiny elf) uses to produce proteins. The critical thing to consider is that a ribosome builds a protein by reading three nucleotides at a time, and these three nucleotides code for a certain amino acid. These amino acids are chained together to produce proteins. Some combinations of three nucleotides can also act as “punctuation marks”.

"Wait, did you say there's three million more pages after this?"

So our wee elf looks closely at the long tape of letters, and starts off with the first three. His “frame”, the little chunk he works on, is three letters long. This frame is an instruction to build a certain amino acid, which he makes. He then goes along the tape, three letters at a time, making an amino acid each time that he sticks onto the last. This will eventually create a long chain of amino acids that we call a protein. But each frame doesn’t need to code for just an amino acid – it can also code for other instructions (those “punctuation marks”) starting or stopping this chaining process.

Now you may have guessed what a frameshift mutation is by now – it’s where a single letter in our tape disappears, or a new random one gets thrown in, causing our frame to get shifted slightly. This means that the resulting triplets after this error will be horribly wrong. It’s like the difference between


if one were to speak in sentences containing only three letter words. The first sentence makes sense if we parse three letters at a time. The two others have a random letter removed, and a random letter added in. If we parse them three letters at a time, the sentence turns into garbage halfway through! The resulting nonsense (or malformed protein) is a result of a random insertion or deletion of information (nucleotides) and our “frame”, the manner in which we interpret it.
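The effect is easy to reproduce. The sentence below is one I made up for illustration; the helper simply chops a string into three-letter frames, the same way the elf reads his tape.

```python
def read_in_frames(letters, size=3):
    """Split a string into consecutive frames of `size` letters."""
    return [letters[i:i + size] for i in range(0, len(letters), size)]

original = "THECATATETHEREDHAT"                 # THE CAT ATE THE RED HAT
deleted = original[:7] + original[8:]           # one letter removed
inserted = original[:7] + "X" + original[7:]    # one stray letter added

print(" ".join(read_in_frames(original)))  # THE CAT ATE THE RED HAT
print(" ".join(read_in_frames(deleted)))   # THE CAT AET HER EDH AT
print(" ".join(read_in_frames(inserted)))  # THE CAT AXT ETH ERE DHA T
```

Everything before the change still reads fine; everything after it is shifted by one letter and falls apart.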


So a frameshift mutation occurred in early humans that affected the production of the protein MYH16. This protein is involved in the strong powerful jaws that other primates have, but not humans. We often think of mutations as a simple little “blip” in the genetic code, but the way our bodies parse this code can cause cascading effects. Instead of MYH16 having a slightly different amino acid in a random spot from a random mutation, the specified amino acids after the mutation will change completely!

So you might think that we’ll have some odd protein that’s mostly normal, and the parts after the mutation affected by the frameshift will be wonky. But – and this is an important but – the triplets code for “punctuation marks” too, remember? In this MYH16 mutation, it turns out that this frameshift caused a punctuation mark (aka a stop codon) to just pop up – so the protein is cut off far sooner than it should be! Not too good for any traits relying on that protein.
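A toy version shows how a stop codon can pop up. The codon meanings below are a small, real subset of the standard genetic code, but the mRNA sequence is invented for illustration and has nothing to do with the actual MYH16 gene.

```python
# A small subset of the standard genetic code (mRNA codons).
CODON_TABLE = {
    "AUG": "Met", "GCU": "Ala", "UUA": "Leu", "GGC": "Gly", "AAA": "Lys",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}

def translate(mrna):
    """Read the tape three letters at a time until a stop codon appears."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "???")  # ??? = not in this small table
        if amino_acid == "STOP":
            break
        protein.append(amino_acid)
    return protein

normal = "AUGGCUUUAGGCAAAUAA"        # made-up sequence: AUG GCU UUA GGC AAA UAA
mutant = normal[:6] + normal[7:]     # delete a single nucleotide

print(translate(normal))  # ['Met', 'Ala', 'Leu', 'Gly', 'Lys']
print(translate(mutant))  # ['Met', 'Ala']
```

Deleting one letter turns UUA GGC into UAG GC..., and UAG is a stop codon, so the elf puts down his tools after only two amino acids.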

Look at the differences between these gorilla and human skulls below. The large bony ridges on the gorilla skull on the left are where the larger jaw muscles attach – otherwise they would literally tear off of the skull. You can also see how the gorilla skull seems “empty” on the sides – that’s because it is filled with large jaw muscles, reducing space available for the brain. The red tinted parts are where the jaw muscles attach – you can see how much more “anchoring” a gorilla’s jaw muscle requires.


And this is where it gets interesting. This mutation in our human ancestors happened approximately 2.4 million years ago, right before our ancestors stopped looking like other primates and started looking like us. If you lacked the protein that operated a powerful jaw muscle, you could not carry a large jawbone around and use it effectively. If you can’t carry a large jawbone around, there is strong selection pressure for those with smaller jaws to survive. If your jaw gets smaller, then the loading of the jaw on the skull decreases – bony ridges disappear, and the skull can get larger and lighter since it doesn’t need to be as strong. A larger and lighter skull can accommodate a bigger brain.

It appears that a random mutation, flipping a single bit of genetic information, has beautifully complex cascading results. Viewing the world as a hostile agent of noise and fury, winding down to an eventual death by entropy, is wrong. You can fold a piece of paper, give it to a child, and have them cut crude holes in it with cheap scissors – and when you unfold it, the snowflake is beautiful.

So too can randomness be folded and twisted by logical structures in biology and physics – and the result is our amazing world.

Chimpanzees and Neoteny

One of the biggest “human” questions is “where did we come from?”. While the mechanisms of evolution are well established, the route humanity took to get to its present state is not as well determined. It’s the difference between knowing the rules of chess and being able to figure out the personality and play style of a grandmaster from a few snapshots of a very long game in progress.

One proposed mechanism for the evolution of humans from other primates is neoteny, where juvenile traits are retained and adult adaptations lost. This has been observed in foxes subject to behavioural selection. For instance, look at this young chimpanzee.


This picture is from a 1926 study by the Swiss zoologist Adolf Naef. He describes it as “the most human-like picture of an animal, of any that is known to me.” The little guy does seem to have a rather regal and refined air about him, but we can’t just wave our hands and call it case closed at this point. Can we look at the development of a chimpanzee and see if there are any quantifiable parallels?

Bone structure is a great place to start. Chimpanzees, like humans, have a skeleton that changes shape and size as the organism matures.


The two skulls on the far left are those of an infant chimpanzee (top) and an infant human (bottom). Bone structure and shape are very similar, with the classic huge head and tiny cute face we seem programmed to love. The two skulls in the middle are of an adolescent chimpanzee (top) and an adult human (bottom). We can see the jaw start to lengthen in both, and their overall similarity. The final picture on the top right is of an adult chimpanzee, who has a significantly larger and more powerful bite than any adult human.

So what does this show us? Well, humans and chimpanzees appear to have very similar development in terms of bone structure as they grow up, except that humans just seem to… stop at a certain point. There are a multitude of theories as to why this happens, but they all seem to follow the pattern of certain behaviours being selected for which affect the balance of hormones in the body that control the development of adult features. This is called neoteny.

Now neoteny doesn’t mean that every single part of the entire animal becomes more juvenile, or that the animal becomes less complex overall. It’s a selective reduction in complexity – traits that appear later in the animal’s development (i.e. adolescence) become less likely to appear.

So how did humans get their unique features? It’s very difficult to select for traits like a bigger brain or hairlessness when those traits don’t appear in the wild in any real frequency to begin with. Viewing human evolution through this lens seems to indicate that change would be very slow, and very hard to do.


But what if instead of selecting for a simple trait, we (or the species as a whole) select for a behaviour? The neat thing about selecting for this is that hormones have a strong influence on behaviour. So we are partly selecting for certain hormone levels or actions. These hormones also share logical relationships with other hormones, and act in many different parts of the body, not just the parts of the brain influencing behaviour.

If we put significant selection pressure on a species, we are effectively increasing the mutation rate (i.e. “mutant” creatures tend to be selected more). Increases in mutation rates would be more likely to affect more logically complex proteins arising later in life involved in the development of adolescent features (due to more references to more parts of the mutating DNA) rather than less logically complex proteins that would be involved in juvenile features.

As a result, we now have a mechanism for how these bizarre traits that we simply don’t see in the wild can become so common, so quickly, and also a predicted side effect – neoteny.

But how could this end up as an advantage? It seems that mutations are destroying those adult adaptations that made the organism successful in the first place. But what if the world changes simply because you and others like you live in it? We like to think of physical strength as the be all and end all of “dominance”, but I think this is only true if you’re “one chimp against the world”. A chimp who can more accurately figure out social structure and how to manipulate his place in it could be far more successful in breeding than a chimp who is simply stronger than average.

A chimpanzee’s ability to learn is drastically reduced upon reaching maturity. But baby chimps…


Baby chimps will eagerly mimic a human caretaker – sticking out their tongues, opening their mouths wide, or making their best effort at a kissy face. Not only is the basic mechanism of learning there (imitation), it appears to be very focused on social relationships. And this ability decreases with age! It seems that the retention of juvenile traits is not the burden it appears at first.

So the origin of humanity? Well, it’s still up in the air. But I think it’s incredibly likely that we literally changed ourselves – that living together created environmental pressures (namely social ones) that selected for behaviour in an incredibly complex manner, where the ability to learn and social skills were valued and led to reproductive success. All too often we look for outside pressures in evolution, when some of the most magnificent examples (like the plumage and mating rituals of birds of paradise) are simply a result of everyone agreeing to play an elaborate game.

Clever as a Fox

Sometimes we see things so often that we simply forget to ask “why are they like that?” For instance, let’s take a closer look at domestic animals. Dogs, cats, horses, cows, pigs – animals that we live with, and who couldn’t live without us.

Common Traits

What do all these domestic animals have in common?

[Images: puppy, cat, dog, cow, horse, pig]

Now this isn’t a particularly subtle example, but that’s kind of the point. You can see that all of these domestic animals have large white patches – they’ve lost pigment in their coats in some areas. Why do we care? Well, this is something that is extremely common among domesticated animals, but very rare among wild animals. I hear you saying “but what about zebras, or any other wild animal with white patches?”. What we’re referring to here is slightly different. A zebra will always have that patterning, whereas what we’re looking at here is depigmentation – the loss of color in certain areas in an animal that is “normally” colored.

What else is common among domestic animals but rare in the wild? Well, things like dwarf and giant varieties, floppy ears, and non-seasonal mating. Charles Darwin, in Chapter One of On the Origin of Species, noted that “not a single domestic animal can be named which has not in some country drooping ears”. A very significant observation when you consider that there is only a single wild animal with drooping ears – the elephant.

So perhaps something weird is going on here. Why do animals as different as cats and dogs have these common traits? It seems to arise simply from being around humans!

The Hypothesis


The Russian geneticist Dmitri Belyaev provided a very interesting potential explanation. Genetics at the time was preoccupied with easily measurable traits that could be passed on – if you bred dogs, you could pick the biggest puppies, breed them, and they would produce bigger dogs on average. Fine. But that is selection of a single simple trait, something that likely did not require that many genes to “switch” in order for the puppies to be bigger.

But what if you were selecting for something more complicated? What if, instead of selecting for a simple trait like size or eye color, you selected for something more vague like behaviour – in this case, the very behaviour that made these animals more likely to be around humans? We can call it tamability, or lack of aggressiveness, or whatever – the point is, we are selecting for those animals who will behave in a manner we want around us. A wolf who does not display aggressive behaviour might be able to grab a few scraps of food from the garbage pile of an early human settlement, rather than being driven off.

And if we were selecting a complicated behaviour, rather than a simple trait, it seems likely that it will require more change in the animal’s genetic code. And since the genetic code is a tangled web where a small bit of DNA can be referenced in many areas of the body – perhaps selecting for a common behaviour would also cause other common traits to arise in animals that are otherwise different.

It’s like giving your car a paint job versus trying to make it go faster – the paint job is easy, but trying to make it faster could lead to your car exhibiting other traits you didn’t directly request, like consuming more gas during regular driving. This could be common across all your project cars. One is a low level trait (the paint, the size of puppy) that can be encompassed in a tiny bit of information (color, size), the other is a high level trait (speed, tamability) that must involve a wide variety of sub-systems changing as well.

The Experiment

Now if you were a Soviet scientist in the late 1950s, you probably worked on something awesome like a giant robot that shot nuclear missiles, or a flying submarine. Not Dmitri Belyaev. No, he lost his job as head of the Department of Fur Animal Breeding at the Central Research Laboratory of Fur Breeding in Moscow in 1948 because he was committed to the theories of classical genetics rather than the very fashionable (and totally wrong) theories of Lysenkoism.

So instead, he started breeding foxes. Well, it was technically an experiment to study animal physiology, but that was more of a ruse to get his Lysenkoism-loving bosses off his back while he could study genetics and his theories of selecting for behaviour.


He started out with 130 silver foxes. Like foxes in the wild, their ears are erect, the tail is low slung, and the fur is silver-black with a white tip on the tail. Tameness was selected for rigorously – only about 5% of males and 20% of females were allowed to breed each generation.


At first, all foxes bred were classified as Class III foxes. They are tamer than the calmest farm-bred foxes, but flee from humans and will bite if stroked or handled.


The next generation of foxes were deemed Class II foxes. Class II foxes will allow humans to pet them and pick them up, but do not show any emotionally friendly response to people. If you are a cat owner, you would call the experiment a success at this point.


Later generations produced Class I foxes. They are eager to establish human contact, and will wag their tails and whine. Domesticated features were noted to occur with increasing frequency.


Forty years after the start of the experiment, 70 to 80 percent of the foxes are now Class IE – the “domesticated elite”. When raised with humans, they are affectionate devoted animals, capable of forming strong bonds with their owner.

These “elite” foxes also exhibit domestic features such as depigmentation (1,646% increase in frequency), floppy ears (35% increase in frequency), short tails (6,900% increase in frequency), and other traits also seen frequently in domesticated animals.
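Percentage increases this large are easier to picture as multipliers. A quick Python sketch of the arithmetic, using only the three percentages quoted above (the function name is my own):

```python
def fold_change(percent_increase):
    """Convert a percentage increase into a multiplier on the original frequency."""
    return 1 + percent_increase / 100


# Percentages as quoted in the text above.
increases = {"depigmentation": 1646, "floppy ears": 35, "short tails": 6900}

for trait, pct in increases.items():
    print(f"{trait}: {pct}% increase is {fold_change(pct):.2f} times the original frequency")
```

So a 6,900% increase means short tails became 70 times as common, while floppy ears went up by roughly a third.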

The Results

Belyaev passed away in 1985, but he was able to witness the early success of his hypothesis, that selecting for behaviour can cause cascading changes throughout the entire organism. For instance, the current explanation for the loss of pigment is that melanin (a compound that acts to color the coat of the animal) shares a common pathway with adrenaline (a compound that increases the “fight or flight” instinct of an animal). Reduction of adrenaline (by selecting for tame animals) inadvertently reduces melanin (causing the observed depigmentation effects).
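The shared-pathway idea can be sketched as a toy simulation in Python. This is my own illustration under loud assumptions, not a model of the real experiment: each fox carries a single made-up "hormone" number, a low value makes it both tamer and more likely to have white patches, and cubs inherit the average of their parents plus some noise. The only figures taken from the experiment are the breeding fractions (about 5% of males and 20% of females); the threshold, noise levels, and population size are invented.

```python
import random

PATCH_THRESHOLD = -2.0  # assumed: hormone levels below this produce white patches


def summarize(foxes):
    """Return (mean hormone level, fraction of foxes with white patches)."""
    levels = [h for _, h in foxes]
    mean_level = sum(levels) / len(levels)
    patched = sum(h < PATCH_THRESHOLD for h in levels) / len(levels)
    return mean_level, patched


def simulate(generations=10, pop_size=1000, seed=1):
    """Breed only the tamest foxes each generation and track the side effects."""
    rng = random.Random(seed)
    foxes = [("M" if i % 2 == 0 else "F", rng.gauss(0, 1)) for i in range(pop_size)]
    history = [summarize(foxes)]
    for _ in range(generations):
        # Tameness is scored with noise; lower hormone means tamer on average.
        scored = [(-h + rng.gauss(0, 1), sex, h) for sex, h in foxes]
        males = sorted((f for f in scored if f[1] == "M"), reverse=True)
        females = sorted((f for f in scored if f[1] == "F"), reverse=True)
        sires = males[: max(1, len(males) * 5 // 100)]       # tamest 5% of males
        dams = females[: max(1, len(females) * 20 // 100)]   # tamest 20% of females
        foxes = []
        for i in range(pop_size):
            sire, dam = rng.choice(sires), rng.choice(dams)
            # Cub's hormone level: parental average plus assumed random variation.
            child = (sire[2] + dam[2]) / 2 + rng.gauss(0, 0.7)
            foxes.append(("M" if i % 2 == 0 else "F", child))
        history.append(summarize(foxes))
    return history


for gen, (mean_level, patched) in enumerate(simulate()):
    print(f"generation {gen}: mean hormone {mean_level:+.2f}, white patches {patched:.1%}")
```

Notice that nothing in the code ever selects for white patches. They become more common only because they ride along with the hormone level that tameness depends on, which is the cascading effect described above in miniature.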

So if Belyaev is right, genetics is not just a slow process that works on tiny incremental tweaks. Complicated environmental pressures can result in complicated genetic results, in a stunningly quick period of time. Where do I think we’re going with this?

Well, designer pets for one. Following the collapse of the Soviet Union, the project ran into serious financial trouble in the late 1990s. They had to cut down the number of foxes drastically, and the project survived primarily on funding obtained from selling the tame foxes as exotic pets. Imagine a menagerie of dwarf exotic animals, who crave human attention and form bonds with people. It would be obscenely profitable.

And the out-there thought for the day? We’re doing this to ourselves. We don’t encourage people to act aggressively all day to everyone they meet. We reward certain behaviours more than other behaviours. My unprovable conjecture? Humanity is selecting itself for certain behaviours, and the traits we think of as fundamentally human (loss of hair, retention of juvenile characteristics relative to other primates) are a side effect of this self-selection.


Here are some great videos with footage of the tame foxes.

From NOVA – Dogs and More Dogs (starts at about 17:30)

“Suddenly, it all started to make sense. As Belyaev bred his foxes for tameness, over the generations their bodies began producing different levels of a whole range of hormones. These hormones, in turn, set off a cascade of changes that somehow triggered a surprising degree of genetic variation.

Just the simple act of selecting for tameness destabilized the genetic make up of these animals in such a way that all sorts of stuff that you would never normally see in a wild population suddenly appeared.” (Full transcript)