In Professor Arthur R. Jensen's 'The Nature of Intelligence and its Relation to Learning' (Melbourne Studies in Education, 1978) some seven pages are devoted to an attempt to show that 'intelligence' or 'general mental ability' can be identified with what is known as the g factor, a product of the factor analysis of correlations between a number of 'intelligence' tests. The theory is that all intelligence tests produce slightly different results because each one is contaminated to some extent by irrelevant factors, and these factors are not constant; but if one can analyse what all the tests test for in common then that will be 'intelligence' expressed as g. Many critics of intelligence testing, however, have held that the common factor between the tests is an illusion, that the correlations between the tests and the g these produce are only products of the way the tests are constructed - that the psychologists are finding the proof that they have themselves hidden there. Tests are designed to correlate with other tests;
substantial correlations of items with tests as a whole [and, mutatis mutandis, between the test as a whole and other tests] are built in by simply eliminating items and subtests which have low correlations with the test as a whole and including items which would otherwise be highly suspect (e.g. Wechsler's information test) because they have a high correlation with the test as a whole. Terman and Merrill state flatly: 'Tests that had a low correlation with the whole were dropped even if they were satisfactory in other respects.' (Stanford-Binet Intelligence Scale, p. 33)[1]
Similarly, Professor Jensen himself says 'If a test item is not significantly loaded on g (i.e. the first principal component) it does not measure what we mean operationally by intelligence and should be excluded from any test so labelled.'[2] The deck has been stacked in favour of g.
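The selection effect the critics describe can be illustrated with a toy simulation. The sketch below is not drawn from any real test battery. It assumes two statistically independent hypothetical abilities, twelve invented items (eight drawing on the first ability and four on the second), and an arbitrary item-total cutoff of 0.5. All names and parameters are illustrative assumptions.

```python
import random
import statistics


def pearson(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def mean_inter_item_r(items):
    """Average correlation over every distinct pair of items."""
    pairs = [
        pearson(items[i], items[j])
        for i in range(len(items))
        for j in range(i + 1, len(items))
    ]
    return statistics.fmean(pairs)


def simulate(n_people=2000, cutoff=0.5, seed=0):
    """Build a 12-item toy test, then drop items with low item-total correlation."""
    rng = random.Random(seed)
    # Two independent, purely hypothetical abilities per simulated person.
    ability_a = [rng.gauss(0, 1) for _ in range(n_people)]
    ability_b = [rng.gauss(0, 1) for _ in range(n_people)]

    items = []
    for k in range(12):
        # Assumption: items 0-7 draw on ability A, items 8-11 on ability B.
        source = ability_a if k < 8 else ability_b
        items.append([s + rng.gauss(0, 1) for s in source])

    # Total score per person, then the item-selection rule quoted in the text:
    # keep only items that correlate well with the test as a whole.
    totals = [sum(scores) for scores in zip(*items)]
    kept = [item for item in items if pearson(item, totals) >= cutoff]

    return {
        "kept": len(kept),
        "before": mean_inter_item_r(items),
        "after": mean_inter_item_r(kept),
    }


if __name__ == "__main__":
    print(simulate())
```

Under these assumptions the items drawing on the minority ability fall below the cutoff, and the retained items correlate with one another more strongly on average than the full set did. The example shows only that the mechanism is arithmetically possible; it says nothing about how far any actual test was shaped by it.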
In order to show that the measurements taken in intelligence tests have some relation to an inherent quality of the mind of the person being tested it is necessary to connect them to a third thing capable of independent measurements - a criterion variable. Tests have been supported by showing that they were linked with - that they correlated with - success in school, income, or social position. Correlations of a greater or lesser extent can be demonstrated in all these areas. All these measures, however, introduce new difficulties. The relation between correlation and causation is not clear. Does a high IQ lead to high income, or - something approaching the reverse - does the IQ test measure the qualities displayed by the well-off? I have not space here to go into the arguments dealing with these relationships in any detail, and mention them only to show how useful it would be for a psychometrician to be able to produce a scale that was clearly independent both of the social prejudices of the test designer and the societal complications of the testee. Professor Jensen believes he has found such a scale in the phylogenetic hierarchy.
'The mistaken notion that the g measured by intelligence tests resides in the specific item content and is therefore only an index of culture-specific learnings is most strongly contradicted by the study of the evolution and phylogeny of intelligence.'[3]
An examination of his thinking in this regard must begin with an agreement to correct one crude error. Professor Jensen uses the term 'phyla' throughout as though it were equivalent to species:
the degree of complexity and abstractedness of what can be learned, given any amount of time and training, shows quite distinct differences between phyla.[4]
A phylum is, after the division into the animal and plant kingdoms, the largest grade in the taxonomic hierarchy. Has any work been done on the comparative intelligences of such phyla as blue-green algae and slime moulds? All the examples Professor Jensen cites come from the Vertebrata, a subphylum of the phylum Chordata. That difficulty cleared aside, the argument to evolution is that
(i) the phylogenetic hierarchy corresponds in its order to
(ii) the position of the species on a scale of its success at increasingly g-rated tasks, which in turn corresponds to
(iii) the order of intelligence in which animals are placed by virtually universal consent (of humans), and that therefore
(iv) 'The g factor is thus not just peculiar to individual differences among persons within a particular culture, but is continuous with broader biological aspects of neural organization reflected, as well, in individual differences within other primate species, and even in the evolutionary differences in behavioural capacities between various species. In this sense intelligence is as much a biological reality, fashioned by evolution, as are the morphological features of organisms.'[5]
Evolution has been used to reify intelligence (and if calling intelligence as much a biological reality as a thumb does not reify it, it is difficult to think of a form of words that would).
The trouble with this argument is that it does not consist of three premises and a conclusion; it is one premise four times.
There is no such thing as a 'phylogenetic hierarchy' in the sense that Professor Jensen appeals to. There is no such thing as 'a phyletic scale' or 'phylogenetic statuses' or 'phyletic levels' except as a convenience for scientists. The study of phylogeny is designed to establish the relations of descent between species - when species diverged from a common ancestral root, how they have developed since, what processes produced the features displayed by each species now. When phylogeny is reduced to a diagram it is characteristically expressed as a tree showing the branching lines of descent and, as with a tree, it is pointless to expect continuity between the ends of the branches, let alone ranking. Professor Jensen seems to regard it as if it were something more akin to a ladder - as if there were a single line of descent and fish were a less evolved form of birds, birds of rats, rats of cats, and so on. It is possible with more or less certainty to calculate when each phylum, class and species diverged from the line of common descent; unsurprisingly, those that diverged from the line of humanity recently resemble us more than those that went their own way from the beginning. It is also certainly possible to arrange a selection of species in the order of their neurological complexity; but that is all that you are doing when you do that. You are not describing the shape of nature.
If Professor Jensen's scale existed it would be possible to explain the fact that fish have smaller brains than men by their relative positions on the phylogenetic scale. This is not the way natural selection works. Morphological characteristics, and convergence and divergence between species, develop in response to the demands of different environments. It makes little sense to rank them as if they were all attempting the same thing with more or less success. Writers on evolution can talk of higher forms succeeding lower forms of the same genus because there is continuity between earlier forms and those to which they give rise. Our ancestors were closer to the primordial protozoa, were the result of a shorter period of development, and have been preserved in lower geological strata. The same considerations do not apply to comparisons between existing species, where the use of 'higher' and 'lower' represents no more than an understandable anthropomorphism, grading species by their nearness to us.
Moving on to the order Professor Jensen feels is established by the tests he refers to ('It is possible to give such diverse species as fish, birds, rats, cats, dogs, monkeys, and apes essentially equivalent forms of the same test problems')[6] we note that the process of establishing that the tests are g-loaded is a complex one. Since no reliability coefficient has been calculated on a large number of tests administered to a representative population of fish, g cannot be indicated directly. It is inferred from the tests because
For example, both normal children and children with varying degrees of mental retardation have been given the battery of tests that Kohler used with chimpanzees. Exactly the same rank order of difficulty of these problems emerged for human children as for chimpanzees and lower primates (Viaud, 1960, pp. 44-5). The complexity factor common to these experimental animal problems, that differentiates species of primates, rank-orders human children the same as do standard IQ tests. Thus, the g factor of IQ tests reflects the same kind of ability to deal with complexity that is measured by the animals [sic] tests which most clearly reveal phylogenetic differences in behavioural adaptive capacity. [7]
The immediate difficulty here is that Professor Jensen has given only a very partial citation for the evidence on which his assertions in this area rest: 'Many of these tests have been described by Viaud, 1960'. Viaud gives no details of any test applied to fish, no details of any single test that has been applied to more than three of the seven species Professor Jensen mentions, and, most seriously, makes no reference whatsoever to any correlation between any test applied to both humans and animals and a standard IQ test. Viaud says only that
In 1935 Gottschaldt studied a group of one hundred hospitalized children between the ages of two and ten. He divided them into four groups in descending order of intelligence; normal, feeble-minded, imbecilic, and idiotic, and set them the same, or nearly the same, problems Kohler had set his chimpanzees. The results were comparable, and so was the order of difficulty which emerged ... [8]
It has been commented before [9] that when dealing with Professor Jensen's citations it is a good idea to go back to the originals and check what they actually say (helpful, even, to track down the raw data - cf. Burt) [10]. The wisdom of such a course is shown here. 'Exactly the same' is not exactly the same thing as 'comparable'. More importantly, a careless reader might have come away from Professor Jensen's phrasing with the impression that Viaud's account dealt in some way with standard intelligence tests - that the children and the primates mentioned in the line before the citation were the same children and the same primates mentioned in the line after. A division by undescribed means into normal, feeble-minded, imbecilic and idiots is plainly not a standard IQ test, and no possible deductions can be made from it about g.
If one does not wish to believe that Professor Jensen has unwarrantably extended the inferences that can be made from the data he cites, one must assume that he rests his case on other data he has omitted to cite. This is unfortunate, because his argument does hinge on the point. It is not contested, after all, that there are differences in the mode of mental functioning between animal species, or that a dog is able to perform a wider range of operations than a fish but fewer than a person. What is at issue is whether the governing factor in every case is the possession of a greater or lesser amount of the same quality, and whether, if this is so, this quality is the same quality that is measured by IQ tests. Some bridge needs to be established between IQ tests and animals; and that we are not given.
Should we accept that other tests similar to those described by Viaud, but involving a linkage with IQ, have been carried out and have produced data to support Professor Jensen's conclusions, it is still necessary to draw attention to the methodological problems involved in grading animals against humans. Some are physical; Viaud's observation that 'Dogs and cats generally have little difficulty in solving roundabout problems, but fail when faced with prehension problems, i.e. they cannot haul in the goal object by pulling a string' [11] is capable of a number of interpretations. We may have disposed of culture bias only to be faced with species bias. It may be that it is possible to overcome this, but the point needs to be demonstrated. Other problems have to do with the mode of analysis. The area such tests cover - the area of pronounced mental retardation and below - is an area in which all standard IQ tests are admittedly least accurate, having been standardized virtually invariably on populations that did not include the retarded. Even if Gottschaldt had given his hospital children a recognized IQ test the results would contain a high degree of inaccuracy. Looking at it from the other direction it is, as Professor Eysenck comments, very difficult to extract evidence of g from a restricted population of the 'very bright, or rather dull' [12] and for purposes of IQ testing all animals are definitely in the latter group. A still more general difficulty is that we are comparing a factor that differentiates between individual humans to a factor that differentiates between species; we are not matching like with like. What is the within-species variance on these tests? Are the IQs of hens normally distributed? The point has some bearing on whether what we call 'intelligence' in humans displays any of the same characteristics as 'intelligence' in animals. Does intelligence correlate with place in the pecking order?
It is here that the argument falls back, as psychometric theory inevitably does, on an appeal to common sense. Professor Jensen says that 'There is virtually universal assent that some animals are more intelligent than others. By what criteria do we judge the dog to be more intelligent than the chicken, the monkey more intelligent than the dog, and the chimpanzee more intelligent than the monkey? Zoologists, ethologists, and comparative psychologists have amassed a good deal of information regarding this question.'[13] They have done nothing of the sort. What he appears to mean is not that these groups have collected information on the criteria we use to rank animals - that is a function of our psychology or social structure (what are the relative rankings of the horse and the dog in England and Arabia?); they have collected differences between animal groups that correlate with our rankings and can be used to validate and refine them. It is exactly this transition from 'intelligence' as a piece of common sense, what everybody knows, to 'intelligence' as a product of intelligence tests that is objectionable. Professor Jensen is saying that when we attribute intelligence to animals we are referring to the same thing that is being measured by zoologists, and the coincidence proves that both exist. A similar use of common assent occurs in the argument that because most people say that lawyers are more intelligent than dustmen, and because lawyers score higher on average than dustmen on IQ tests, this shows that IQ tests measure intelligence; the correlation is taken as validation both of the tests and the social scale. An alternative hypothesis would be that the tests and the general belief correlate because both spring from a common process whereby a society evolves and establishes stabilizing structures of thought.
Whether, however, lawyers score higher because their work requires more brains or because the test measures class values, the end result is in accordance with 'common sense' and those who contest it are seen to be denying what everybody knows - that there is a hierarchy, that some people are rich in intelligence, some poor, and most in between. It is this wholesale assumption of 'common sense' in psychometrics that represents its main strength and at the same time most undermines its claims to be treated as a science.
Brief note must be taken of the 'further evidence of the biological basis of intelligence'[14] that Professor Jensen mentions. The questions of the heritability of IQ and the correlations between the IQs of close relations have been discussed at length elsewhere, and I shall not go into them here. He also mentions, however, correlations between g and anatomical and electrophysiological brain measurements.
The relationship between brain size and IQ has been greatly played down in most recent psychology textbooks. But a thorough and methodologically sophisticated recent review of all the evidence relevant to human brain size and intelligence concludes that the best estimate of the within-sex correlations between brain size and IQ is about +0.30, taking proper account of physical stature, birthweight, and other correlated variables (Van Valen, 1974).[15]
Here, again, Professor Jensen's use of his sources stretches the bounds of academic civility. Van Valen's paper is on the relation of brain size and intelligence - which he nowhere defines - not IQ; only four out of the eight surveys he considers relevant used IQ tests. It is in fact a mark against Van Valen that he considered the earlier data usable. Pearson's researches in 1906 - the Stanford-Binet test was first standardized in 1913 - are surely in psychometric terms rudimentary rumblings and in general suspect because of the researcher's known prejudices ('In a painstaking statistical study of the inferiority of Jewish immigrants, he concluded that while their average mental ability was somewhat lower than that of native-born Englishmen, the clearest difference was that Jewish children were innately dirtier than Gentile ones.') [16] One of the besetting sins of psychometrics is that it continues to hoard its references long after they have gone thoroughly rotten, as if the transformation into number raised experiments into a sphere where their methodology could not date. (It is a minor foible of Professor Jensen's to extend the shelf life of his references by using secondary citations. It is more acceptable to cite 'a recent review' than Pearson; and would his comparison between retarded children and chimpanzees sound as convincing if it was made clear that the experiments took place in the Germany of 1935?)
The data Van Valen uses, whatever its flaws, record an observed correlation between head size and intelligence of 0.1. The methodological sophistication admired by Professor Jensen consists in the methods Van Valen uses to raise this figure to 0.3.
The observed correlation of 0.1 is between poor measures of intelligence and poor measures of brain size. Any real relation between intelligence and brain size will be diluted by the random noise introduced by inadequate measurement. This loss of information can be quantified. Let i denote intelligence, b brain size, and c external cranial size. Assume that c is correlated with i only through its relationship with b. Then $\rho_{ic} = \rho_{ib}\rho_{bc}$. For example, we can take $\rho_{ic} = 0.1$ and $\rho_{bc} = 0.5$. Then $\rho_{ib} = 0.2$.... We can apply the same equation to loss of information because of poor measures of intelligence. If loss here $(1 - \rho^2)$ is 0.5, $\rho_{ib}$ rises to about 0.3 [17].
Professor E. J. Williams of the Statistics Department of Melbourne University comments that:
The recorded correlations, even when statistically significant, are small in magnitude, suggesting that about one per cent of the variability in intelligence is associated with head size. Van Valen's attempt to establish that the 'real' association is higher than that observed is invalid, since it does not really take account of the sampling errors in the estimates of other correlations (although it refers to them).[18]
It might also be noted that the attempt relies on assuming exactly what Van Valen is being required to prove - namely, that intelligence correlates with brain size. When we take into consideration that he has also given no grounds for his assumed correlation of 0.7 between intelligence tests and intelligence, it is difficult not to conclude that the most recent psychology textbooks were quite right. Professor Jensen's later citation of 'such already well established facts, for example, as the correlation (of about .30) between brain size and IQ (Van Valen, 1974)'[19] exemplifies his belief that propositions extracted none too scrupulously from flimsy data may be firmly established by constant repetition.
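The arithmetic behind the adjustment, and one way of reading Professor Williams's objection to it, can be sketched numerically. The block below reproduces the two quoted steps (0.1 to 0.2, then to about 0.3) by dividing the observed correlation by each assumed intermediate correlation. The function name `disattenuate` and the alternative values tried for the brain-cranium correlation are illustrative assumptions, not figures from Van Valen's paper.

```python
import math


def disattenuate(observed_r, *path_correlations):
    """Divide an observed correlation by each assumed intermediate correlation."""
    corrected = observed_r
    for r in path_correlations:
        corrected /= r
    return corrected


# Figures taken from the quoted passage.
observed = 0.1                 # measured intelligence vs external cranial size
r_brain_cranium = 0.5          # the passage's example value for brain vs cranium
r_test = math.sqrt(1 - 0.5)    # "loss (1 - rho^2) is 0.5" implies about 0.71

step_one = disattenuate(observed, r_brain_cranium)           # the quoted 0.2
step_two = disattenuate(observed, r_brain_cranium, r_test)   # the quoted "about 0.3"

# Share of variability associated with head size at the observed correlation.
variance_share = observed ** 2

# Hypothetical alternatives for the brain-cranium correlation (assumptions
# chosen only to show how strongly the result depends on this input).
sensitivity = {
    alt: round(disattenuate(observed, alt, r_test), 2)
    for alt in (0.4, 0.5, 0.7, 0.9)
}

if __name__ == "__main__":
    print(step_one, round(step_two, 3), variance_share)
    print(sensitivity)
```

With the quoted inputs the adjusted figure is about 0.28, and the square of the observed 0.1 gives the one per cent of variability mentioned in the letter. Moving the assumed brain-cranium correlation between 0.4 and 0.9 moves the result between roughly 0.35 and 0.16, so the 'real' correlation is only as trustworthy as the intermediate correlations fed into it.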
I do not, of course, expect the undermining of Professor Jensen's assertions in this particular area to have any effect on his general credibility. His position in the modern pantheon as the man of science reluctantly driven to state unpleasant truths is not based on his performance in debates of this sort and will not be affected. The discussion will not make progress until it is recognized that psychometrics represents a numerical embodiment of one view of the proper construction of society.
[1] N. Block and G. Dworkin, The IQ Controversy (New York, Pantheon Books, 1976), p. 463.
[2] A. R. Jensen, 'The Current Status of the IQ Controversy', Australian Psychologist, vol. 13, no. 1, March 1978, p. 11.
[3] A. R. Jensen, 'The Nature of Intelligence and its Relation to Learning', Melbourne Studies in Education 1978, p. 117.
[4] Ibid., p. 119.
[5] Ibid., p. 120.
[6] Ibid., p. 118.
[7] Ibid., p. 120.
[8] G. Viaud, Intelligence: its Evolution and Forms (London, Hutchinson, 1960), p. 44.
[9] See, for example, L. Hudson, The Cult of the Fact (New York, Harper Torch, 1973), pp. 115-20; and L. Kamin, The Science and Politics of IQ (Penguin, 1977), pp. 178-207.
[10] D. Dorfman, 'The Cyril Burt Question: New Findings', Science, 201, 4362, 29 September 1978.
[11] Viaud, op. cit., p. 29.
[12] H. J. Eysenck, The Inequality of Man (London, Fontana, 1975), p. 51.
[13] A. R. Jensen, 'The Nature of Intelligence and its Relation to Learning', p. 118.
[14] Ibid., p. 120.
[15] Ibid., pp. 120-1.
[16] J. Blum, Pseudoscience and Mental Ability (New York, Monthly Review Press, 1978), p. 50.
[17] L. Van Valen, 'Brain Size and Intelligence in Man', American Journal of Physical Anthropology, vol. 40, p. 418.
[18] In a letter to the author, 28 March 1979.
[19] A. R. Jensen, 'The Current Status of the IQ Controversy', p. 11.
Republished with thanks from S. Murray-Smith (Ed.), Melbourne Studies in Education, 1979, Melbourne University Press, Melbourne, pp. 174-183