The Ghost in the Machine
* This is a simplified account of a somewhat controversial subject. For details, see, for instance, Gregory [3] and Kottenhoff [4].
Our perceptual habits are as stubborn as our motor habits. It is as difficult to alter our way of seeing the world as it is to alter our signature or accent of speech; each habit is governed by its own canon of rules. The mechanisms which determine our vision and hearing are part of our perceptual equipment, but operate as quasi-independent functional holons, hierarchically ordered along the entwined trees of the nervous system.
The next step upward in the hierarchy leads to the baffling phenomena of pattern-recognition -- or, to put it differently, to the question how we abstract and recognise universals. When you listen to a gramophone record of an opera with, say, fifty instruments in the orchestra and four voices singing, and then look at the record with a magnifying glass, the whole magic is reduced to the single wavy, spiral curve of the groove. This poses a problem similar to that of how we interpret language (cf. Chapter II). The airwaves, too, which carry the opera into the ear, have only a single variable: variations of pressure in time. The individual instruments and voices have all been superimposed on each other: violin, flute, soprano, and what have you, have been scrambled together into an acoustic porridge, and the mixture threaded out into a kind of long noodle -- a single modulation pulse which makes the eardrum vibrate faster and slower with varying intensity. These vibrations are broken down in the inner ear into a sequence of pure tones, and that sequence is all that is transmitted to the brain. Any information regarding the individual instruments whose production has gone into the porridge seems to be irretrievably lost. Yet as we listen, we do not hear a succession of pure tones; we hear an ensemble of instruments and voices, each with its characteristic timbre. How this dismantling and reassembling operation is performed we understand only very imperfectly to date,* and no textbook of psychology seems to deem the matter worthy of discussion. But we know at least that the timbre of an instrument is determined by the series of partials which accompany the fundamental, and by the energy-distribution among them; together they provide the characteristic tonal spectrum of the instrument in question. 
We identify the sound of a violin or flute by reconstructing this spectrum -- that is, by picking out and bracketing together its partials, which were drowned among thousands of other partials in the composite air-pulse. In other words, we abstract a stable pattern from the acoustic flux -- we fish out of it the timbre of the flute -- and of course the timbres of a number of other instruments. These are the listener's stable auditory holons. They in turn combine, on the higher levels of the hierarchy, into patterns of melody, harmony, counterpoint, according to more complex rules of the game. (Melody, for instance, is a pattern quite different from timbre, extracted from the same medley of sounds by tracing different variables: rhythm and pitch.)
* See The Act of Creation, pp. 516 ff.
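The point about timbre can be made concrete with a small sketch. It is an illustration under stated assumptions, not a model of the ear: the sample rate, the two fundamentals, and the partial-amplitude tables labelled FLUTE and VIOLIN are invented for the example. Two tones are superimposed into a single variable, and the energy-distribution among the partials of each is then fished out of the mixture again.

```python
import math

SAMPLE_RATE = 8000   # samples per second (assumed for the sketch)
N_SAMPLES = 800      # a tenth of a second, so every partial below completes whole cycles

# Hypothetical energy-distributions: relative amplitudes of the fundamental
# and of the next two partials. They are not measurements of real instruments.
FLUTE = [1.0, 0.2, 0.05]
VIOLIN = [1.0, 0.7, 0.5]

def tone(fundamental, partial_amps):
    """Synthesize one 'instrument': a fundamental plus its partials."""
    return [
        sum(a * math.sin(2 * math.pi * fundamental * (k + 1) * t / SAMPLE_RATE)
            for k, a in enumerate(partial_amps))
        for t in range(N_SAMPLES)
    ]

def amplitude_at(signal, freq):
    """Amplitude of the pure tone at `freq` contained in `signal` (one DFT bin)."""
    re = sum(s * math.cos(2 * math.pi * freq * t / SAMPLE_RATE)
             for t, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * t / SAMPLE_RATE)
             for t, s in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)

def spectrum(signal, fundamental, n_partials):
    """Bracket together the partials belonging to one fundamental."""
    return [round(amplitude_at(signal, fundamental * (k + 1)), 3)
            for k in range(n_partials)]

# The 'acoustic porridge': both instruments summed into one pressure curve.
mixture = [a + b for a, b in zip(tone(440, FLUTE), tone(330, VIOLIN))]

print("partials on 440 Hz:", spectrum(mixture, 440, 3))
print("partials on 330 Hz:", spectrum(mixture, 330, 3))
```

The sketch is given the fundamentals in advance and its partials never coincide; the listener enjoys neither advantage, which is why the real operation remains so imperfectly understood.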
Melody, timbre, counterpoint, are patterns in time -- as phonemes, words and phrases are patterns in time. None of them makes sense -- musical, linguistic, semantic sense -- if considered as a linear chain of elementary units. The message of the air-pressure pulses can only be de-coded by identifying the wheels within wheels, the simpler patterns integrated into more complex patterns like arabesques in an oriental carpet. The process, as already mentioned, is made to appear more mysterious by the fact that time has only a single dimension. But a single variable is sufficient to encode all music ever written -- provided there is a human nervous system to de-code it. Without it the vibrations caused by the gramophone needle are just so much moving air.
However, the recognition of patterns in space presents a no less difficult problem. How does one recognise a face, a landscape, a printed word, at a glance? Even the identification of a single letter, written by various hands, in various sizes, and appearing in various positions on the retina, and hence on the optical cortex, presents an almost intractable problem for the physiologist. In order to identify the input, the brain must activate some memory-trace; but we cannot have memory-traces which match all and every conceivable variation of writing the letter /f/ -- not to mention several thousand ideograms, if one happened to be Chinese. Some very complex scanning process must be involved which first identifies characteristic simpler features in the complex whole (visual holons -- like loops, triangles, etc.); then abstracts the relations between these features; and then the relations between the relations. Our eyes are in fact constantly engaged in a variety of different types of scanning motions, of which we are unaware; and experiments show that when scanning activities are prevented, the visual field disintegrates. Scanning the visual field means translating what is simultaneously given in space into a succession of impulses in time -- as the TV camera transcribes its visual field into a succession of impulses in time, which are then re-translated by the receiving set into the image on the screen. And vice versa, when we listen to speech or music, the nervous system extracts patterns in time by bracketing together the present with the reverberations of the immediate past, and with memories of the distant past, into one complex process occurring in the specious present in the three-dimensional brain. It constantly transposes temporal into spatial patterns, and spatial events into temporal sequences. In Lashley's classic dictum: 'spatial and temporal order appear to be almost completely interchangeable in cerebral action'. [5]
Thus at the series of relay stations through which the input-stream must pass, it is subjected to filtering, scanning and analysing processes, which strip it of irrelevancies, extract stable configurations from the flux of sensations, analyse and identify patterns of events in space and time. A decisive stage is the transition from the perceptual to the cognitive levels of the hierarchy -- from sight and sound to meaning. The sounds of the syllables /fiu/ and /lañ/ mean nothing. They are nonsense-syllables, unrelated to each other. But a relation instantly emerges when we learn that /fiu/ means 'boy' in Hungarian, and /lañ/ means 'girl'. Once we have invested the sound of a syllable with meaning, it cannot be divested of it.
The meaning we attach to these sound-patterns is agreed by the conventions of language. But man has an irrepressible tendency to read meaning into the buzzing confusion of sights and sounds impinging on his senses; and where no agreed meaning can be found, he will provide it out of his own imagination. He sees a camel in the cloud, a face hidden in the foliage of a tree, a butterfly or an anatomical detail in the ink-blot of the Rorschach test; he hears messages conveyed by the booming of the church bells or the rattling of carriage wheels. The sensorium extracts meaning from the chaotic environment as the digestive system extracts energy from food. If we look at a Byzantine mosaic floor, we do not perceive it as an assembly of individual stone-fragments; we automatically combine the fragments into sub-assemblies -- ears, noses, draperies; and these sub-assemblies into individual figures; and these into a composite whole. And when the artist draws a human face, he follows the reverse procedure: he first roughs in the outline of the whole, then sketches in eyes, mouth, ears, as quasi-independent sub-structures, perceptual holons which can be schematised according to certain tricks and formulae.
The hierarchic principle is inherent in our modes of perception; but it can be refined by learning and practice. When an art student acquires an elementary knowledge of anatomy, it improves not the skill of his fingers, but the skill of his eye. Constable made a study of the various types of cloud formation and classified them into categories; he developed a visual 'cloud vocabulary' which enabled him to see and paint skies as nobody had done before. The trained eye of the bacteriologist or of the X-ray specialist enables him to identify the objects he is looking for, where the layman only sees shadowy blurs.
If Nature abhors the void, the mind abhors what is meaningless. Show a person an ink-blot, and he will start at once to organise it into a hierarchy of shapes, tentacles, wheels, masks, a dance of figures. When the Babylonians began to chart the stars, they first of all grouped them together into constellations of lions, virgins, archers, and scorpions -- shaped them into sub-assemblies, celestial holons. The first calendar-makers wove the linear thread of time into the hierarchic pattern of solar days, lunar months, stellar years, Olympic cycles. Similarly, the Greek astronomers broke up homogeneous space into the hierarchy of the eight heavenly spheres, each equipped with its clockwork of epicycles.
We cannot help interpreting Nature as an organisation of parts-within-parts, because all living matter and all stable inorganic systems have a part-within-part architecture, which lends them articulation, coherence and stability; and where the structure is not inherent or discernible, the mind provides it by projecting butterflies into the ink-blot and camels into the clouds.
To sum up: in motor hierarchies an implicit intention or generalised command is particularised, spelled out, step by step, in its descent to the periphery. In the perceptual hierarchy we have the opposite process: the input of the receptor organs on the organism's periphery is more and more 'de-particularised', stripped of irrelevancies during its ascent to the centre. The output hierarchy concretises, the input hierarchy abstracts. The former operates by means of triggering devices, the latter by means of filtering or scanning devices. When I intend to write the letter R, a trigger activates a functional holon, an automatised pattern of muscle contractions which produces the letter R in my particular handwriting. When I read, a scanning device in my visual cortex identifies the letter R regardless of the particular hand that wrote it. Triggers release complex outputs by means of a simple coded signal. Scanners function the opposite way: they convert complex inputs into a simple coded signal.
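The symmetry between triggers and scanners can be put in the form of a toy sketch. Everything named in it is hypothetical: the stroke lists, the two 'hands', and the feature table are invented for the illustration and make no claim about how the cortex does it. The trigger is a one-to-many mapping, from a simple code to a particularised sequence; the scanner is many-to-one, from differing particulars back to the same simple code.

```python
# Hypothetical motor programmes: the same letter spelled out in two hands.
PROGRAMMES = {
    ("R", "author"): ["tall downstroke", "small loop", "long kick"],
    ("R", "headline"): ["thick stem", "wide bowl", "straight leg"],
}

# Hypothetical scanner table: each particular stroke is reduced to the
# feature it has in common with its counterparts in other hands.
FEATURE_OF = {
    "tall downstroke": "vertical", "thick stem": "vertical",
    "small loop": "loop", "wide bowl": "loop",
    "long kick": "diagonal", "straight leg": "diagonal",
}

# The skeletonised design that counts as 'R-ness' in this sketch.
DESIGNS = {("vertical", "loop", "diagonal"): "R"}

def trigger(letter, hand):
    """Output hierarchy: a simple coded signal releases a complex, particular output."""
    return list(PROGRAMMES[(letter, hand)])

def scan(strokes):
    """Input hierarchy: a complex input is stripped down to a simple coded signal."""
    features = tuple(FEATURE_OF[stroke] for stroke in strokes)
    return DESIGNS[features]

for hand in ("author", "headline"):
    strokes = trigger("R", hand)
    print(hand, strokes, "->", scan(strokes))
```

The two hands produce different stroke sequences, yet the scanner returns the same code for both; the particulars of each hand are discarded on the way up.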
VI
A MEMORY FOR FORGETTING
Mais où sont les neiges d'antan? François Villon
'I've a grand memory for forgetting, David,' remarks Alan Breck in Kidnapped. He speaks for all of us. Our fond memories are the dregs left in the wineglass, the dehydrated sediments of perceptions whose flavour has gone. I hasten to add that there are of course exceptions to this -- memories of almost hallucinatory vividness of scenes or episodes which have some special emotional significance. I shall call this the 'vivid fragment' or 'picture-strip' type of memory -- as distinct from 'abstractive' memory -- and come back to it later in this chapter.
Abstractive Memory
The bulk of what we are able to remember of our own life history, and of the knowledge we have acquired in its course, is of the 'abstractive' type. Take a simple example: you watch a television play. The exact words of each actor are forgotten by the time he speaks his next line, and only the meaning remains; the next morning you only remember the sequence of scenes which constituted the story; after a year you only remember that it was about a tangle between two men and a woman on a desert island. The original input has been stripped, skeletonised. Similarly with books one has read, and episodes one has lived through. As time passes, memory is more and more reduced to an outline, a condensed abstract of the original experience. The play you saw a month ago has been abstracted by a series of steps, each of which condenses particulars into more generalised schemata; it has been reduced to a formula. The playwright's imagination made an idea branch out into a structure divided into three acts, each divided into scenes, each consisting of smaller divisions -- exchanges, phrases, words. Memory-formation reverses the process, makes the tree gradually shrink back into its roots, as in a trick film played backward.
The word 'abstract' has, in common usage, two main connotations: it is the opposite of 'concrete' in the sense that it refers to a general concept rather than a particular instance; and in the second place, an 'abstract' is a summary or condensation of the essence of a longer document, such as civil servants prepare for their superiors. Memory is abstractive in both senses.
This is, as I have already said, not the full story. If it were, we should be computers, not people. But for the moment let us consider this abstractive mechanism a little further. Memory-formation is a process continuous with perception. It has been said that if a visitor wanted to see Stalin, he had to pass through seventeen gates, from the outer Kremlin gates to the door of the innermost sanctum, and at each successive gate he was submitted to a more thorough screening. We have seen that the sensory intake is subjected to a similar scrutiny before being admitted to awareness. At every gateway of the perceptual hierarchy it is analysed, classified, stripped of all detail that is irrelevant for the purpose in hand. We recognise the letter R written in an almost illegible scrawl as 'the same thing' as a huge printed R in a newspaper headline, by a scanning process which disregards all details as irrelevant and only retains the basic geometrical R-design -- the 'R-ness' of the R -- as worth signalling to higher quarters. The signal can then be encoded in a kind of simple Morse. It contains all the information that matters -- 'it's an R' -- in condensed, skeletonised form, but the wealth of detail is of course lost. The scanning process is indeed the exact reverse of the triggering process.
Even those few among the multitude of stimuli constantly impinging on our senses, which have successfully passed all screenings and thus achieved the status of a consciously perceived event, must usually submit to a further rigorous stripping before being deemed worthy to be admitted to permanent memory storage; and with the passing of time even this skeletonised abstract is subject to further decay. Anybody who tries to write a detailed chronicle of his doings during the week before last must be painfully surprised at the rate of decay, and the amount of detail irretrievably lost.
This impoverishment of lived experience is unavoidable. It is partly a matter of parsimony -- although the storage capacity of the brain is probably much greater than most people make use of in their lifetime; but the decisive factor is that the processes of generalisation and abstraction imply by definition the sacrifice of particulars. And if, instead of abstracting universals like 'R' or 'tree' or 'dog', memory were a collection of all our particular experiences of 'R's' and 'trees' and 'dogs' -- a store of lantern-slides and tape-recordings -- it would be completely useless: since no sensory input can be identical in all respects with any stored slide or recording, we would never be able to identify an R or recognise a dog or understand a spoken sentence. We could not even find our way through that immense store of particularised items. Abstractive memory, on the other hand, implies a system of stored knowledge, hierarchically ordered with headings, sub-headings and cross-references like the entries in a Thesaurus or the subject catalogue of a library. Some volume may have got into the wrong place, and some flashy jacket designs might stick out and catch the eye, but on the whole the order holds.
A Speculative View
Fortunately there are compensations for the unavoidable impoverishment of lived experience in the abstractive process.
In the first place the scanning process can acquire a higher degree of sophistication through learning and experience. To the novice, all red wines taste alike, and all Japanese males look the same. But he can train himself to superimpose more delicate scanners on the coarser ones, as Constable trained himself to discriminate between diverse types of clouds, and classified them into sub-categories. Thus we learn to abstract finer and finer nuances -- to make the perceptual hierarchy grow new twigs, as it were.
In the second place, memory is not based on a single abstractive hierarchy, but on a variety of interlocking hierarchies -- such as those of vision, taste and hearing. It is like a forest of separate trees but with entwined branches -- or like our library catalogue with cross-references between different subjects. Thus the recognition of a taste is often dependent on cues provided by smell, though we may not be aware of it. But there are more subtle cross-connections. You can recognise a tune played on a violin although you have previously only heard it played on the piano; on the other hand, you can recognise the sound of a violin, although the last time a quite different tune was played on it. We must therefore assume that melody and timbre have been abstracted and stored independently by separate hierarchies within the same sense modality, but with different criteria of relevance. One abstracts melody and filters out everything else as irrelevant, the other abstracts the timbre of the instrument and treats melody as irrelevant. Thus not all the details discarded in the process of stripping the input are irretrievably lost, because details stripped off as irrelevant according to the criteria of one hierarchy may have been retained and stored by another hierarchy with different criteria of relevance.
The recall of the experience would then be made possible by the co-operation of several interlocking hierarchies, which may include different sense modalities, for instance sight and sound, or different branches within the same modality. Each by itself would provide one aspect only of the original experience -- a drastic impoverishment. Thus you may remember the words only of the aria 'Your Tiny Hand is Frozen', but have lost the melody. Or you may remember the melody only, having forgotten the words. Finally, you may recognise Caruso's voice on a gramophone record, without remembering what you last heard him sing. But if two or all three of these factors are represented in the memory store, the reconstruction of the experience in recall will of course be more complete.
The process could be compared to multi-colour printing by the superimposition of several colour-blocks. The painting to be reproduced -- the original experience -- is photographed through different colour-filters on blue, red, and yellow plates, each of which retains only those features that are 'relevant' to it: i.e., those which appear in its own colour, and ignores all other features; then they are recombined into a more or less faithful reconstruction of the original input. Each hierarchy would then have a different 'colour' attached to it, the colour symbolising its criteria of relevance. Which memory-forming hierarchies will be active at any given time depends, of course, on the subject's general interests and momentary state of mind.
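The colour-printing comparison can be spelled out in a short sketch, offered only as an illustration of the analogy. The tiny 'painting' and its intensity values are invented, and the function names are hypothetical. Each plate retains a single colour component and ignores the rest; superimposing all three plates restores the original, while a missing plate leaves an impoverished reconstruction.

```python
COLOURS = ("blue", "red", "yellow")

# A hypothetical two-by-two 'painting': each point has an intensity per colour.
original = [
    [{"blue": 3, "red": 0, "yellow": 1}, {"blue": 0, "red": 2, "yellow": 2}],
    [{"blue": 1, "red": 1, "yellow": 0}, {"blue": 2, "red": 3, "yellow": 3}],
]

def make_plate(picture, colour):
    """One hierarchy: keep only the features 'relevant' to this colour."""
    return [[point[colour] for point in row] for row in picture]

def recombine(plates):
    """Superimpose the available plates; a missing colour contributes nothing."""
    any_plate = next(iter(plates.values()))
    return [
        [{c: plates[c][y][x] if c in plates else 0 for c in COLOURS}
         for x in range(len(row))]
        for y, row in enumerate(any_plate)
    ]

stored = {c: make_plate(original, c) for c in COLOURS}

full = recombine(stored)                                      # all three hierarchies active
partial = recombine({c: stored[c] for c in ("blue", "red")})  # the yellow plate was never made

print("faithful reconstruction:", full == original)
print("without the yellow plate:", partial[0][0])
```

Which plates exist at all corresponds, in the analogy, to which memory-forming hierarchies happened to be active when the experience was laid down.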