I think one reason it was difficult is, yet again, that D’Arcy Thompson’s transformations change one adult form into another adult form. As I emphasized in Chapter 8, that is not how genes in evolution work. Every individual animal has a developmental history. It starts as an embryo and grows, by disproportionate growth of different parts of the body, into an adult. Evolution is not a genetically controlled distortion of one adult form into another; it is a genetically controlled alteration in a developmental program. Julian Huxley (grandson of T.H. and brother of Aldous) recognized this when, soon after publication of the first edition of D’Arcy Thompson’s book, he modified the ‘method of transformations’ to study the way early embryos turn into later embryos or adults. That’s all I want to say about D’Arcy Thompson’s method of transformations here. I’ll return to the topic in the final chapter, to make a related point.

  Comparative evidence has always, as I suggested at the beginning of this chapter, told even more compellingly than fossil evidence in favour of the fact of evolution. Darwin himself took a similar view, at the end of his chapter in On the Origin of Species on the ‘Mutual Affinities of Organic Beings’:

  Finally, the several classes of facts which have been considered in this chapter, seem to me to proclaim so plainly, that the innumerable species, genera, and families of organic beings, with which this world is peopled, have all descended, each within its own class or group, from common parents, and have all been modified in the course of descent, that I should without hesitation adopt this view even if it were unsupported by other facts or arguments.

  MOLECULAR COMPARISONS

  What Darwin didn’t – couldn’t – know is that the comparative evidence becomes even more convincing when we include molecular genetics, in addition to the anatomical comparisons that were available to him.

  Just as the vertebrate skeleton is invariant across all vertebrates while the individual bones differ, and just as the crustacean exoskeleton is invariant across all crustaceans while the individual ‘tubes’ vary, so the DNA code is invariant across all living creatures, while the individual genes themselves vary. This is a truly astounding fact, which shows more clearly than anything else that all living creatures are descended from a single ancestor. Not just the genetic code itself, but the whole gene/protein system for running life, which we dealt with in Chapter 8, is the same in all animals, plants, fungi, bacteria, archaea and viruses. What varies is what is written in the code, not the code itself. And when we look comparatively at what is written in the code – the actual genetic sequences in all these different creatures – we find the same kind of hierarchical tree of resemblance. We find the same family tree – albeit much more thoroughly and convincingly laid out – as we did with the vertebrate skeleton, the crustacean skeleton, and indeed the whole pattern of anatomical resemblances through all the living kingdoms.

  If we want to work out how closely related any pair of species is – say, how close a hedgehog is to a monkey – the ideal would be to look at the complete molecular texts of every gene of both species, and compare every jot and tittle, as a biblical scholar might compare two scrolls or fragments of Isaiah. But it is time-consuming and expensive. The Human Genome Project took about ten years, representing many person-centuries. Although it would now be possible to achieve the same result in a fraction of the time, it would still be a large and expensive undertaking, as would the hedgehog genome project. Like the Apollo moon landings, and like the Large Hadron Collider (which has just been switched on in Geneva as I write – the gigantic scale of this international endeavour moved me to tears when I visited), the complete deciphering of the human genome is one of those achievements that makes me proud to be human. I am delighted that the chimpanzee genome project has now been successfully accomplished, and the equivalent for various other species. If the present rate of progress continues (see ‘Hodgkin’s Law’ below), it will soon be economically feasible to sequence the genome of every pair of species whose closeness of cousinship we might want to measure. Meanwhile, for the most part we have to resort to sampling particular parts of their genomes, and it works pretty well.

  We can sample by picking out a few choice genes (or proteins, whose sequences are directly translated from genes) and comparing them across species. I’ll come to that in a moment. But there are other ways of doing a kind of crude, automatic sampling, and the technologies to do that have been around for longer. An early method, which works surprisingly well, exploits the immune system of rabbits (you could actually use any animal you like, but rabbits do the job nicely). As part of the body’s natural defence against pathogens, the rabbit’s immune system manufactures antibodies against any foreign protein that enters the bloodstream. Just as you could tell that I have had whooping cough by looking at the antibodies in my blood, so you can tell what a rabbit has been exposed to in the past by looking at its immune response in the present. The antibodies present in the rabbit constitute a history of the natural shocks to which its flesh has been heir – including artificially injected proteins. If you inject, say, a chimpanzee protein into a rabbit, the antibodies that it makes will subsequently attack the same protein if it is injected again. But suppose your second injection is of the equivalent protein, not from a chimpanzee but from a gorilla? The rabbit’s prior exposure to the chimpanzee protein will have partially forearmed it against the gorilla version, but the reaction will be weaker. And it will also have forearmed it against the kangaroo version of the protein, but the reaction will be weaker still, given that the kangaroo is much less closely related to the chimpanzee that did the priming than the gorilla is. The strength of the rabbit’s immune response to a subsequent injection of a protein is a measure of the resemblance of that protein to the original to which the rabbit was first exposed. It was by this method, using rabbits, that Vincent Sarich and Allan Wilson, at the University of California at Berkeley, demonstrated in the 1960s that humans and chimpanzees are much more closely related to each other than anybody had previously realized.

  There are also methods that use the genes themselves, comparing them across species directly rather than comparing the proteins they encode. One of the oldest and most effective of these methods is called DNA hybridization. DNA hybridization is usually what lies behind those statements one often sees along the lines of: ‘Humans and chimpanzees share 98 per cent of their genes.’ There is some confusion, by the way, about exactly what is meant by percentage figures such as these. Ninety-eight per cent of what is identical? The exact figure depends on how large the units are that we are counting. A simple analogy makes this clear, and it does so in an interesting way, because the differences between the analogy and the real thing are as revealing as the similarities. Suppose we have two versions of the same book and we want to compare them. Perhaps it is the book of Daniel, and we want to compare the canonical version with an ancient scroll that has just been discovered in a cave overlooking the Dead Sea. What percentage of the chapters of the two books are identical? Probably zero, for it takes only one discrepancy, anywhere in the whole chapter, for us to say the two are not identical. What percentage of their sentences are identical? The percentage will now be much higher. Even higher will be the percentage of words that are identical, because words have fewer letters than sentences – fewer opportunities to bust the identity. But a word resemblance is still broken if any one letter in the word differs. Therefore, if you line the two texts up side by side and compare them letter by letter, the percentage of identical letters will be even higher than the percentage of identical words. So an estimate like ‘98 per cent in common’ doesn’t mean anything unless we specify the size of the unit we are comparing. Are we counting chapters, words, letters or what? And the same is true when we compare DNA from two species. If you are comparing whole chromosomes, the percentage shared is zero, because it only takes one tiny difference, somewhere along the chromosomes, to define the chromosomes as different.
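
  The dependence of the percentage on the size of the unit can be seen in a few lines of Python. This is only an illustrative sketch: the two ‘versions’ are invented sentences, not real scroll text, and the function names are hypothetical.

```python
import re

def percent_identical(units_a, units_b):
    """Percentage of units that match exactly, compared position by position."""
    pairs = list(zip(units_a, units_b))
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)

def split_sentences(text):
    """Split on whitespace that follows a full stop (crude, but enough here)."""
    return re.split(r"(?<=\.)\s+", text)

# Two invented versions of the same passage; one letter differs in each sentence.
version_a = "The king made a great feast. The wise men came in."
version_b = "The king made a great beast. The wise man came in."

by_sentence = percent_identical(split_sentences(version_a), split_sentences(version_b))
by_word = percent_identical(version_a.split(), version_b.split())
by_letter = percent_identical(version_a, version_b)

print(f"sentences: {by_sentence:.1f}%  words: {by_word:.1f}%  letters: {by_letter:.1f}%")
```

  The same two changed letters leave no sentence identical, spoil two of the eleven words, and spoil only two of the fifty characters, so the score rises as the unit shrinks.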

  The often-quoted figure of about 98 per cent for the shared genetic material of humans and chimps actually refers neither to numbers of chromosomes nor to numbers of whole genes, but to numbers of DNA ‘letters’ (technically, base pairs) that match each other within the respective human and chimp genes. But there is a pitfall. If you do the lining up naïvely, a missing letter (or an added letter), as opposed to a mistaken letter, will cause all subsequent letters to mismatch, because they will then all be staggered, one step out (until there is a mistake in the other direction to bring them back into step again). It is clearly unfair to let the estimate of discrepancies be inflated in this way. A scholar’s eye, scanning two scrolls of Daniel, automatically copes with this, in a way that is hard to quantify. How can we do it with DNA? This is where we leave our analogy with books and scrolls and go straight to the real thing because, as it happens, the real thing – DNA – is easier to understand than the analogy!
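
  The pitfall of naïve lining up can be shown with a toy example. The sketch below is an assumption-laden illustration, not the hybridization method described next: the sequences are invented, and the ‘gap-aware’ score uses a longest-common-subsequence count as a simple stand-in for proper sequence alignment.

```python
def naive_matches(a, b):
    """Count matching letters when the sequences are compared position by position."""
    return sum(x == y for x, y in zip(a, b))

def lcs_length(a, b):
    """Length of the longest common subsequence, which tolerates missing letters."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

# Invented 16-letter sequence, and a copy with a single letter missing.
original = "GATTACAGGCTTAACG"
one_missing = original[:4] + original[5:]

longest = max(len(original), len(one_missing))
naive_percent = 100.0 * naive_matches(original, one_missing) / longest
gap_aware_percent = 100.0 * lcs_length(original, one_missing) / longest

print(f"naive: {naive_percent:.2f}%  gap-aware: {gap_aware_percent:.2f}%")
```

  One missing letter throws most of the later positions out of step, so the naïve score falls below half, whereas the gap-aware score charges the deletion only once.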

  If you gradually heat DNA, there comes a point – somewhere around 85°C – when the bonding between the two strands of the double helix breaks, and the two helices separate. You can think of 85°C, or whatever the temperature turns out to be, as a ‘melting point’. If you let it cool again, each single helix spontaneously joins up again with another single helix, or fragment of single helix, wherever it finds one with which it can pair, using the ordinary base-pairing rules of the double helix. You might think that this would always be the partner from which it lately separated, and with which, of course, it is perfectly matched. Indeed it could be, but it usually isn’t as tidy as that. Fragments of DNA will find other fragments with which they can pair, and they will usually not be exactly their original partners. And indeed, if you add separated fragments of DNA from another species, fragments of the single strands are quite capable of joining up with fragments of single strands from the wrong species, in just the same way as they will join up with single strands from the right species. Why should they not? It is the remarkable conclusion of the Watson–Crick molecular biology revolution that DNA is just DNA. It doesn’t ‘care’ whether it is human DNA, chimp DNA or apple DNA. Fragments will happily pair off with complementary fragments wherever they find them. Nevertheless, the strength of bonding is not always equal. Single-stranded lengths of DNA bond more tightly with matching single strands than they do with less similar single strands. This is because more of the ‘letters’ of the DNA (Watson and Crick’s ‘bases’) find themselves opposite partners with which they cannot pair. The bonding of the strands is therefore weakened – like a zip fastener with some teeth missing.

  How shall we measure this strength of bonding, after fragments from different species have found each other and united? By an almost ludicrously simple method. We measure the ‘melting point’ of the bonds. You remember I said that the melting point of double-stranded DNA is about 85°C. This is true of normal, properly matched double-stranded DNA, as when a strand of human DNA is ‘melted’ away from a complementary strand of human DNA. But when the bonding is weaker – as when a human strand has bonded with a chimpanzee strand – a slightly lower temperature is sufficient to break the bond. And when human DNA has bonded with DNA from a more distant cousin like a fish or a toad, an even lower temperature suffices to separate them. The difference between the melting point when a strand is bonded to one of its own kind, and the melting point when it is bonded to a strand from another species, is our measure of the genetic distance between the two species. As a rule of thumb, each decrease by 1° Celsius in the ‘melting point’ is approximately equivalent to a drop of 1 per cent in the number of DNA letters matched (or an increase of 1 per cent in the number of missing teeth in the zip fastener).
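
  The rule of thumb is simple enough to write down directly. In this minimal sketch the function name is hypothetical and the 83°C reading is an invented value, chosen only to show how a two-degree drop would correspond to about 98 per cent of letters matching; it is not a measured result.

```python
def estimated_percent_match(same_species_melting_c, hybrid_melting_c):
    """Apply the rule of thumb: each 1 degree C drop ~ 1 per cent fewer matched letters."""
    drop = same_species_melting_c - hybrid_melting_c
    return 100.0 - drop

# About 85 degrees C for properly matched DNA (from the text);
# 83 degrees C is an illustrative hybrid reading, not real data.
print(estimated_percent_match(85.0, 83.0))
```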

  There are complications in the method, which I haven’t gone into, and tricky problems, which have ingenious solutions. For instance, if you mix human with chimp DNA, much of the fragmented human DNA will bond with other human DNA fragments, and much of the chimp DNA will bond with its own kind. How do you separate off the hybrid DNA, whose ‘melting point’ is what you really want to measure, from the ‘same-kind’ DNA? The answer is by a clever trick involving previous radioactive labelling. But the details would take us too far off our path. The main point here is that DNA hybridization is the technique that leads scientists to figures like 98 per cent for the genetic similarity between humans and chimpanzees, and it yields predictably lower percentages as you move to more distantly related pairs of animals.

  The newest method of measuring the similarity between a pair of matching genes from different species is the most direct, and the most expensive: actually read the sequence of letters in the genes themselves, using the same methods as were used for the Human Genome Project. Although it is still expensive to compare the entire genome, you can get a good approximation by comparing a sample of genes, and this is now increasingly done.

  Whichever technique we use for measuring similarity between two species, whether it is rabbit antibodies, or melting points, or direct sequencing, the next step is pretty much the same. Having obtained a single number representing the similarity between each pair of species, we then place the figures in a table. Take a set of species and write their names, in the same order, as both the column headings and the row headings. Then place the percentage similarities in the appropriate cells. The table will be triangular (half of a square) because, for example, the percentage similarity between human and dog will be the same as the similarity between dog and human. So if you filled in all of a square table each of the two halves either side of the diagonal would mirror the other.
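
  A small sketch of such a table, with invented scores (the numbers are placeholders for illustration, not measurements), shows why only one triangle needs to be stored:

```python
species = ["human", "chimpanzee", "dog"]

# Hypothetical similarity scores; each pair is stored once.
scores = {
    ("human", "chimpanzee"): 98.0,
    ("human", "dog"): 85.0,
    ("chimpanzee", "dog"): 85.0,
}

def similarity(a, b):
    """Look up a pair in either order, since the table is symmetric."""
    if a == b:
        return 100.0
    return scores[(a, b)] if (a, b) in scores else scores[(b, a)]

def triangular_rows(names):
    """Build the lower triangle: row i holds scores against the earlier species only."""
    return [[similarity(row, col) for col in names[:i]] for i, row in enumerate(names)]

for name, row in zip(species, triangular_rows(species)):
    print(f"{name:<12}", row)
```

  For n species the triangle holds n(n − 1)/2 cells, which is three cells for the three species used here.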

  Now, what sort of results should we expect? On the evolution model we should predict that you’ll find yourself putting a high score in the cell connecting human and chimpanzee; a lower score in the cell connecting human and dog. The human/dog cell should theoretically have an identical resemblance score to the chimpanzee/dog cell because humans and chimpanzees have exactly the same degree of relation to dogs. It should be identical, too, to the monkey/dog cell and the lemur/dog cell. This is because humans, chimpanzees, monkeys and lemurs are all connected to the dog via their common ancestor, an early primate (which probably looked a bit like a lemur). The same score should show up in the human/cat, chimpanzee/cat, monkey/cat and lemur/cat cells, because cats and dogs are related to all primates via the shared ancestor of all carnivores. There should be a much lower score – ideally equally low – in all the cells uniting, say, a squid with any mammal. And it shouldn’t matter which mammal you choose, since all are equally distant from a squid.
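
  The reasoning behind these predictions is that the expected score for a pair depends only on where their most recent common ancestor sits in the tree. The sketch below illustrates this with a nested-tuple tree; the exact branching order among the primates is an assumption made for the example, and the function names are hypothetical.

```python
# Assumed tree for illustration: each tuple is a branching point.
primates = ((("human", "chimpanzee"), "monkey"), "lemur")
carnivores = ("dog", "cat")
tree = ((primates, carnivores), "squid")

def leaves(node):
    """Return the set of species names found under a node."""
    if isinstance(node, str):
        return {node}
    return set().union(*(leaves(child) for child in node))

def common_ancestor(node, a, b):
    """Return the smallest subtree that contains both species."""
    for child in node:
        if not isinstance(child, str) and {a, b} <= leaves(child):
            return common_ancestor(child, a, b)
    return node

# Every primate meets the dog (and the cat) at the same branching point,
# so all of these cells are predicted to hold the same score.
for primate in ["human", "chimpanzee", "monkey", "lemur"]:
    same_node = common_ancestor(tree, primate, "dog") == common_ancestor(tree, "human", "cat")
    print(primate, "meets dog at the human/cat ancestor:", same_node)
```

  Any mammal paired with the squid traces back to the root of this tree, which is why all those cells are predicted to be equally low.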

  These are strong theoretical expectations, but there is no reason why, in practice, they should not be violated. If they were violated, it would be evidence against evolution. What actually happens turns out to be – within statistical margins of error – just what we should expect on the assumption that evolution has happened. This is another way of saying that, if you put the genetic distances between pairs of species on the limbs of a tree, everything adds up in a satisfying way. Of course the adding up is not quite perfect. Numerical expectations in biology are seldom realized with better than approximate accuracy.

  Comparative DNA (or protein) evidence can be used to decide – on the evolutionary assumption – which pairs of animals are closer cousins than which others. What turns this into extremely powerful evidence for evolution is that you can construct a tree of genetic resemblances separately for each gene in turn. And the important result is that every gene delivers approximately the same tree of life. Once again, this is exactly what you would expect if you were dealing with a true family tree. It is not what you would expect if a designer had surveyed the whole animal kingdom and picked and chosen – or ‘borrowed’ – the best proteins for the job, wherever in the animal kingdom they might be found.

  The earliest large-scale study along these lines was done by a group of geneticists in New Zealand led by Professor David Penny. Penny’s group took five genes which, although not identical across all mammals, were similar enough to have earned the same name in all. The details don’t matter, but for the record the five genes were those for haemoglobin A, haemoglobin B (haemoglobins give blood its red colour), fibrinopeptide A, fibrinopeptide B (fibrinopeptides are used in clotting blood) and cytochrome C (which plays an important role in cellular biochemistry). They chose eleven mammals to compare: rhesus monkey, sheep, horse, kangaroo, rat, rabbit, dog, pig, human, cow and chimpanzee.

  Penny and his colleagues thought statistically. They wanted to calculate the probability that, purely by chance, two molecules would yield the same family tree, if evolution wasn’t true. So they tried to imagine all possible trees that could terminate in eleven descendants. It’s a surprisingly large number. Even if you limit yourself to ‘binary trees’ (that is, trees with branches that only bifurcate – no tri-furcating or higher-furcating), the total number of possible trees is more than 34 million. The scientists patiently looked at every one of the 34 million trees and compared each one with the other 33,999,999 trees. No, of course they didn’t! It would take too much computer time. They did, however, devise a clever statistical approximation, a shortcut equivalent to that mammoth calculation.
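
  The figure of more than 34 million can be checked with a short calculation. The sketch below assumes that the count refers to unrooted binary trees with labelled tips, for which the standard formula is the product of the odd numbers from 3 up to 2n − 5; the text does not say which kind of tree was counted, but this is the count that matches the quoted figure. The function name is hypothetical.

```python
def unrooted_binary_trees(n_species):
    """Number of distinct unrooted binary trees with n labelled tips: 3 * 5 * ... * (2n - 5)."""
    count = 1
    for odd in range(3, 2 * n_species - 4, 2):
        count *= odd
    return count

# Eleven mammals, as in the study described above.
print(unrooted_binary_trees(11))
```

  Under this assumption eleven species give 34,459,425 trees. Counting rooted trees instead would multiply that by 19, giving 654,729,075.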