Individual score development

Individual score development over the first five tests — October 2019

Eleven years after the first study of individual score development, another attempt at this difficult and labour-intensive task is undertaken, this time by looking at the first five tests taken by a randomly selected sample of candidates. The procedure is as follows:

A candidate is selected at random by a software random number generator, and it is verified whether that candidate has taken at least five tests. If so, the candidate is added to the sample. This is repeated until 23 candidates have been found. The (intraindividual) chronological order of their test submissions is then established to determine which are the first five and what their order is (this is currently fairly labour-intensive, which is why the sample size is kept small). The median scores in protonorms are computed across the 23 candidates for the first, second, third, fourth, and fifth test, respectively. The results are as follows:
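The procedure above can be sketched in Python. This is only an illustration under stated assumptions, not the software actually used for the report: the data layout (a dictionary mapping a candidate identifier to a list of (date, protonorm score) pairs, with dates as ISO strings) and the function names are hypothetical, and drawing without replacement is assumed because the text does not say how repeated draws of the same candidate are handled.

```python
import random
from statistics import median


def sample_candidates(tests_by_candidate, sample_size=23, min_tests=5, seed=None):
    """Draw candidates in random order until sample_size of them
    with at least min_tests tests have been found."""
    rng = random.Random(seed)
    pool = list(tests_by_candidate)
    rng.shuffle(pool)  # assumption: each candidate is drawn at most once
    sample = []
    for candidate in pool:
        if len(tests_by_candidate[candidate]) >= min_tests:
            sample.append(candidate)
            if len(sample) == sample_size:
                break
    return sample


def medians_by_test_order(tests_by_candidate, sample, n_tests=5):
    """Median protonorm score across the sample for the 1st, 2nd, ... test."""
    result = []
    for position in range(n_tests):
        scores = []
        for candidate in sample:
            # Establish the chronological order of this candidate's tests.
            ordered = sorted(tests_by_candidate[candidate], key=lambda t: t[0])
            scores.append(ordered[position][1])
        result.append(median(scores))
    return result
```

Called with the default arguments, the first function mirrors the sample of 23 candidates with at least five tests each, and the second yields one median per test position, as in the table that follows.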

Test order	Med. prot.	I.Q.	Graph (I.Q. 134 = 0)
1	413	140	******
2	381	135	*
3	441	145	***********
4	387	135	*
5	395	137	***

Conclusion

This looks very roughly like an increase from the first to the third test, possibly followed by a decline, but the raggedness suggests that this may not be significant and might begin to look smoother when averaged over many more cases. This will be investigated later on. A smoother impression of the increase may be obtained now by averaging the first and second tests (397), and the third and fourth tests (414).
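The two averaged values can be checked directly against the table of medians. The snippet below is a small arithmetic check; the variable names are chosen here for illustration.

```python
# Median protonorm scores for tests 1 to 5, copied from the table above.
medians = [413, 381, 441, 387, 395]

# Mean of the first and second tests, and of the third and fourth tests.
first_pair = (medians[0] + medians[1]) / 2
second_pair = (medians[2] + medians[3]) / 2

print(first_pair, second_pair)  # 397.0 414.0
```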

Remark on the minimum of 5 tests

It should be noted that this study may not be representative of the whole group of high-range candidates, because candidates had to have taken at least five tests to be included in the sample. On average, candidates take only 2.63 tests per person (latest statistic). To better reveal the actual development over the first two or the first three tests, the minimum for inclusion in the sample should be set at 2 or 3, respectively. This will be done later on.

Remark on disturbing factors

Possible factors that disturb these results are (1) the fact that candidates take tests other than those of I.Q. Tests for the High Range before and in between these five tests, and (2) the different time spans over which the five tests were taken; in some cases they were taken within one and the same calendar month, in other cases over the course of up to sixteen years.

Remark on intraindividual variation in scores

The spread of scores per individual is not considered in this report, but is such an interesting phenomenon (as is its possible relation to conscientiousness or other traits) that it will be studied later on, as progress in automation makes it easier to compute it for a sizeable number of cases.

Individual score development over the years

Chronologic score development — June 2008

Used for this report are scores from 15 candidates who have each taken at least 10 tests, counting only first attempts, and considering the total scores and not the subscores in cases of tests that give subscores. The numbers of scores per candidate thus counted vary from 10 to 20, and the total number of scores is 180.

Among these candidates are a few high scorers, and a larger number of more moderate to somewhat low scorers. There are no really low scorers, as such candidates never take this many tests per person.

For each candidate, the median of the candidate's scores in protonorms is computed for each year of test-taking, to reveal the individual score development over the years. Then, for each year, the median of these yearly medians is taken across the candidates to obtain a more objective impression of the development. Since not all candidates have the same number of test-taking years, this is done separately for those with at least 2 years (all 15), and for the subsets of candidates with at least 3, 4, and 5 years; there are too few candidates and tests beyond five test-taking years to get an objective impression there. In parentheses behind each value is the number of scores on which that value is based.
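The two-step median described above can be sketched as follows. This is a minimal illustration, not the method actually used: the data layout (a dictionary mapping a candidate to a list of (calendar year, protonorm score) pairs) and the function names are hypothetical, and it is assumed that a "year of test-taking" is a calendar year in which the candidate took at least one test, numbered in chronological order.

```python
from collections import defaultdict
from statistics import median


def yearly_medians(scores):
    """Median score per test-taking year, in chronological order.

    scores: list of (calendar_year, protonorm_score) pairs for one candidate.
    """
    by_year = defaultdict(list)
    for year, score in scores:
        by_year[year].append(score)
    return [median(by_year[year]) for year in sorted(by_year)]


def development(candidates, n_years):
    """Median across candidates of their yearly medians, using only
    candidates with at least n_years test-taking years."""
    per_candidate = [yearly_medians(s) for s in candidates.values()]
    subset = [m for m in per_candidate if len(m) >= n_years]
    return [median(m[i] for m in subset) for i in range(n_years)]
```

Calling `development` with `n_years` set to 2, 3, 4, and 5 would correspond to the four columns of the table below; the function raises `statistics.StatisticsError` if no candidate has that many test-taking years.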

Size of candidate set	15	13	11	7
Median year 1	410 (44)	413 (30)	413 (23)	410 (11)
Median year 2	428 (49)	439.5 (43)	459 (31)	459 (18)
Median year 3		420 (39)	443.5 (37)	447 (24)
Median year 4			433 (22)	426.5 (18)
Median year 5				453 (12)

The development appears to be an increase from the first to the second year, and then a more or less shallow decline for the next several years, while staying above the first-year level.

The size of the increase ranges from protonorm 410 to 428 (4 IQ points in the current norming) in the set with the most data, to protonorm 410 to 459 (10 IQ points) in the set with the least data.

The nature or cause of this increase, from the viewpoint of behavioural science, is that candidates modify their behaviour to maximize their score; in other words that they learn to take the tests better from the first to the second year. Whether the increase represents a "real" increase of intelligence is an intuitive question that lies outside the scope of this report.

Spread of scores

Apart from this chronological development, scores also vary incidentally from test to test, as a result of the fact that some tests suit a candidate better than others (individual aptitude profile), that some tests are done with more effort than others (individual consistency in effort), that the norms for some tests are still preliminary, and other forms of "error of measurement".

This incidental variation does not represent a learning process or increase of intelligence over the years. To show the size of the variation, for each candidate the range and quartile deviation of their scores in protonorms are computed. The median values of these are:

Range: 140 (about 26 IQ points in the current norming)
Quartile deviation: 27 (about 5 IQ points)
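As an illustration of how such figures can be obtained, the sketch below computes the range and quartile deviation per candidate and then takes the median of each across candidates. It assumes that the quartile deviation is half the interquartile range, and it uses the "inclusive" quartile method of Python's `statistics.quantiles`; the report does not state which quartile convention was used, so results from other conventions may differ slightly. The function names and data layout are hypothetical.

```python
from statistics import median, quantiles


def spread(scores):
    """Return (range, quartile deviation) for one candidate's scores."""
    q1, _, q3 = quantiles(scores, n=4, method="inclusive")
    score_range = max(scores) - min(scores)
    quartile_deviation = (q3 - q1) / 2  # half the interquartile range
    return score_range, quartile_deviation


def median_spread(candidates):
    """Median range and median quartile deviation across candidates.

    candidates: dict mapping a candidate to a list of protonorm scores.
    """
    spreads = [spread(s) for s in candidates.values()]
    return median(r for r, _ in spreads), median(q for _, q in spreads)
```

Under this definition, the middle 50% of a candidate's scores lies in a band that is twice the quartile deviation wide.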

So if one takes this many tests (10 to 20 by this strict counting, which means several to many more from the candidate's viewpoint), one's scores can be expected to lie in a range of about 26 IQ points, with the middle 50% of the scores in a band of about 10 IQ points (twice the quartile deviation). This is roughly what has already been informally observed since the mid-1990s (regarding scores on tests by others than Paul Cooijmans). This may be something to take into account in IQ society admission policies (for instance, requiring that a certain proportion of the candidate's scores be above the pass level, or requiring more than one qualifying score from candidates who have taken many tests), although one does not always know how many tests a person has taken. On the other hand, very few people take so many tests that through this phenomenon alone they qualify for a society while really being, say, 10 or more IQ points below the pass level ("really" meaning "as judged by the median of their scores"). Only about 1 in 100 candidates are as prolific in test-taking as those in this report.
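The first admission rule mentioned above (a certain proportion of scores at the pass level) could be expressed as follows. This is a hypothetical sketch only: the function names, the default proportion of one half, and the choice to count a score equal to the pass level as qualifying are assumptions made here, not policies proposed in this report.

```python
def proportion_at_or_above(scores, pass_level):
    """Fraction of a candidate's scores that reach the pass level."""
    return sum(score >= pass_level for score in scores) / len(scores)


def qualifies(scores, pass_level, min_proportion=0.5):
    """True if at least min_proportion of the scores reach the pass level."""
    return proportion_at_or_above(scores, pass_level) >= min_proportion
```

With scores of 130, 135, 140, and 150 and a pass level of 140, half of the scores qualify, so the candidate would pass under a required proportion of 0.5 but not under 0.75.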

It can be noted that the spread of scores differs per individual; some have their scores close together, others far apart. It is tempting to think this is related to personality aspects, such as individual aptitude profile (an uneven profile gives a wider spread of scores) and conscientiousness (lower conscientiousness gives a wider spread of scores, as such candidates sometimes simply bungle a test), but that has not been formally studied yet.