Raw score trends over the years

© October 2019 Paul Cooijmans

Score trends per item type category

For each test, the correlation is computed between the median scores per year and the year numbers. Per item type category, the mean of those correlations is computed across the tests in that category. A separation is made between older and newer tests to reveal the phenomenon that score inflation took place mainly in the period 1995-2004 (this was obvious upon a preliminary visual inspection). The tests overlap in time so no sharp separation can be made, but roughly speaking, the older tests have much of their scores from 1995-2004, and the newer tests from 2005-present.
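The computation described above can be sketched as follows. This is a minimal illustration, not the procedure actually used for the table: it assumes the correlation is a plain Pearson r, and the function names, the data layout (lists of year and raw score pairs), and the demo numbers are all hypothetical.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, median


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences.

    Assumption: a Pearson r is meant; the text only says "correlation".
    Raises ZeroDivisionError if either sequence is constant.
    """
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def score_trend(submissions):
    """Correlation between year numbers and the median raw score per year.

    `submissions` is a list of (year, raw_score) pairs for one test.
    """
    by_year = defaultdict(list)
    for year, score in submissions:
        by_year[year].append(score)
    years = sorted(by_year)
    medians = [median(by_year[y]) for y in years]
    return pearson_r(years, medians)


def mean_trend_per_category(tests):
    """Mean of the per-test correlations within each item type category.

    `tests` maps a test name to a (category, submissions) pair.
    """
    trends = defaultdict(list)
    for category, submissions in tests.values():
        trends[category].append(score_trend(submissions))
    return {category: mean(rs) for category, rs in trends.items()}


# Made-up data for illustration only: one test with rising yearly medians,
# one with falling yearly medians, both in the same category.
demo_tests = {
    "Test A": ("Spatial", [(2001, 9), (2001, 10), (2001, 11),
                           (2002, 12), (2003, 13), (2003, 15)]),
    "Test B": ("Spatial", [(2001, 20), (2002, 18), (2003, 16)]),
}
demo_result = mean_trend_per_category(demo_tests)
```

In this sketch the older and newer tests would simply be passed in as two separate `tests` mappings, giving one mean correlation per category for each group.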

One may speculate that the earlier inflation was caused by the rise of the Internet as a tool for research and fraud. From 2005 on, various methods have been applied to counteract fraud, and they have been effective.

Test type               Mean r, newer tests   Mean r, older tests
Heterogeneous            .01                   -.07
Verbal (non-analogy)    -.23                    .03
Logical                 -.21                   -.18
Numerical               -.09                    .32
Verbal analogy          -.12                    .13
Spatial                 -.07                    .76
Overall means           -.142                   .198

In hindsight, almost all of the raw score inflation (positive correlations) took place on homogeneous Spatial, Numerical, and Verbal analogy tests before 2005. Other test types have never shown serious inflation, and on the newer tests there is no inflation on the whole but instead a steady decline of raw scores, as if people are getting stupider.