Scores from this test are sometimes reported as "I.Q." with a standard deviation of 24, and sometimes as raw scores out of 36. This report deals with the "I.Q.'s". In a few cases they were reported with a standard deviation of 16; those have been converted to a standard deviation of 24, as that is the most commonly used scale for this test.
Note that the testees reporting "I.Q.'s" are not the same individuals as those reporting raw scores (although a few report both, so there is a small overlap). The scores in this report are therefore from a different group than those in the report dealing with R.A.P.M. raw scores.
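The rescaling between standard deviations mentioned above can be sketched as follows. This is a minimal illustration, assuming a mean of 100 and a plain linear rescaling of the deviation from that mean; the function name `rescale_iq` is hypothetical and not taken from the report.

```python
def rescale_iq(score, sd_from, sd_to, mean=100.0):
    """Convert a deviation score from one standard deviation to another.

    Assumes both scales share the same mean (100 by default).
    """
    z = (score - mean) / sd_from  # distance from the mean in SD units
    return mean + z * sd_to


# A score of 132 reported with SD 16 lies two SDs above the mean,
# which is 148 on the SD 24 scale used in this report.
print(rescale_iq(132, 16, 24))         # 148.0

# The same function relates 155 on the SD 24 scale to the SD 15 scale.
print(round(rescale_iq(155, 24, 15)))  # 134
```

The second call is consistent with the later remark that 155 corresponds to about I.Q. 134, if that figure is read as being on a scale with a standard deviation of 15.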
n = 50
n = 7
|Test name||n||r|
|Test For Genius - Revision 2004||6||0.81|
|Verbal section of Test For Genius - Revision 2004||6||0.78|
|Spatial section of Test For Genius - Revision 2004||6||0.65|
|Short Test For Genius||8||0.07|
|Qoymans Multiple-Choice #2||5||-0.01|
|Space, Time, and Hyperspace||18||-0.09|
|Analogies of Long Test For Genius||10||-0.11|
|The Final Test||8||-0.13|
|Qoymans Multiple-Choice #1||8||-0.16|
|Association subtest of Long Test For Genius||10||-0.21|
|Intelligence Quantifier by assessment||8||-0.25|
|Qoymans Multiple-Choice #4||7||-0.32|
|Cooijmans Intelligence Test - Form 1||8||-0.42|
|Genius Association Test||5||-0.52|
|Long Test For Genius||10||-0.55|
|Lieshout International Mesospheric Intelligence Test||5||-0.57|
|Cooijmans Intelligence Test - Form 2||8||-0.64|
|Qoymans Multiple-Choice #3||8||-0.70|
Weighted average of correlations: -0.132
Conservatively estimated minimum g loading: -0.36
|Test name||n||r|
|Culture Fair Numerical Spatial Examination - Final version||6||0.46|
|Logima Strictica 36||10||0.25|
|Raven's Advanced Progressive Matrices (raw)||5||0.24|
|Strict Logic Sequences Exam I||5||-0.05|
|Strict Logic Spatial Exam 48||5||-0.21|
|Nonverbal Cognitive Performance Examination||7||-0.26|
|Cattell Culture Fair||12||-0.27|
Weighted average of correlations: 0.019
Please be aware that correlations with these external tests are in most cases affected (depressed, typically) by one or more of the following: 1) Little overlap with the object test because of the much lower ceilings and inherent ceiling effects of the tests used in regular psychology; 2) Candidates reporting scores selectively, for instance only the higher ones while withholding lower ones; 3) Candidates reporting, or having been reported by psychometricians, incorrect scores.
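The report does not state how its "weighted average of correlations" is computed. One plausible reading, shown here purely as an assumption, is a mean of the correlations weighted by the number of score pairs n. The sketch below applies that reading to the external tests listed above; the function name `weighted_mean_r` is hypothetical.

```python
# (test name, n, r) as listed for the external tests in this report
rows = [
    ("Culture Fair Numerical Spatial Examination - Final version", 6, 0.46),
    ("Logima Strictica 36", 10, 0.25),
    ("Raven's Advanced Progressive Matrices (raw)", 5, 0.24),
    ("Strict Logic Sequences Exam I", 5, -0.05),
    ("Strict Logic Spatial Exam 48", 5, -0.21),
    ("Nonverbal Cognitive Performance Examination", 7, -0.26),
    ("Cattell Culture Fair", 12, -0.27),
]


def weighted_mean_r(rows):
    """Mean of correlations weighted by n (an assumed weighting scheme)."""
    total_n = sum(n for _, n, _ in rows)
    return sum(n * r for _, n, r in rows) / total_n


print(round(weighted_mean_r(rows), 3))
```

With the rounded values listed, this simple n-weighting gives roughly 0.002 rather than the reported 0.019, so the report presumably uses a different weighting, unrounded correlations, or data not listed here.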
These are estimated g factor loadings, but against homogeneous tests containing only particular item types, as opposed to non-compound heterogeneous tests. Although tending to surprise the lay person, it is not uncommon for tests to have high loadings on item types they do not actually contain themselves. Such loadings reflect the empirical fact that most tests for mental abilities measure primarily g, regardless of their contents; that the major part of test score variance is caused by g, and only a minor part by factors germane to particular item types. It is of key importance to understand that this is a fact of nature, a natural phenomenon, and not something that was built into the tests by the test constructors.
|Type||g loading of Raven's Advanced Progressive Matrices (I.Q.) on that type|
Balanced g loading = -0.29
Yes, this test has a negative g loading in its upper range. That means higher scores correspond to lower ability levels. This is of course not true for the full range of the test, but only for a tiny segment. It is in fact known from many factor-analytical studies that the Raven matrices tests are among the highest g loaded tests; but such studies deal with the normal range of I.Q., between plus and minus about two standard deviations from 100.
The correlation may also have been influenced by the fact that Raven scores are often reported with an artificial ceiling at 156, the 99th centile (see above; many 156s), but that influence may go both ways; in other words, without the artificial ceiling the negative correlation might have been even larger!
Also, in the report on R.A.P.M. raw scores (which are not influenced by reporting with an artificial ceiling), a similar negative correlation is found, so apparently it is not the score reporting by psychologists that causes it to be negative.
Further study of this negative correlation, and of the point where it starts, is better based on known raw scores. Many different sets of norms exist for this test (for different countries and ages, plus renormings to correct for the Flynn effect, which has been large on this test), so one cannot rely on the "I.Q.'s" to correspond consistently to particular raw scores.
|Personal detail||n||r|
|Mother's educational level||21||0.27|
|Father's educational level||21||0.23|
|Disorders (parents and siblings)||22||0.10|
|Observed associative horizon||4||-0.08|
|Year of birth||53||-0.18|
|Gifted Adult's Inventory of Aspergerisms||9||-0.24|
|P.S.I.A. System factor||6||-0.25|
|P.S.I.A. Deviance factor||9||-0.31|
|P.S.I.A. Ethics factor||9||-0.35|
Remark: As one would expect for a test with negative g loading, its relation to personal details is largely opposite to what it is for most positively g loaded high-range tests.
Correlation of this test with national average I.Q.'s published by Lynn and Vanhanen:
The number of score pairs on which each estimated g factor loading is based is given in parentheses. The Upward and Downward values are calculated including the pertinent score itself. It is normal for g factor loadings to go down when the range is restricted like this, but careful study of the Upward and Downward columns may reveal scores below or above which the test loses validity altogether.
|Score||Upward (n)||Downward (n)|
|115||-0.39 (190)||NaN (0)|
|145||-0.46 (185)||NaN (0)|
|150||-0.45 (131)||0.74 (10)|
|155||-0.50 (113)||0.69 (35)|
|156||-0.50 (113)||0.33 (102)|
|160||-0.54 (43)||0.33 (130)|
|165||NaN (0)||-0.21 (177)|
|170||NaN (0)||-0.21 (177)|
|175||NaN (0)||-0.34 (184)|
|180||NaN (0)||-0.34 (185)|
|187||NaN (0)||-0.39 (190)|
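The Upward and Downward columns can be illustrated with a small sketch. It assumes each testee contributes a pair (score on this test, score on some reference measure) and uses a plain Pearson correlation as a stand-in for the report's g loading estimate, whose exact method is not given here. The data and the function names `pearson` and `upward_downward` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; NaN when it is undefined (n < 2 or no variance)."""
    n = len(xs)
    if n < 2:
        return math.nan
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return math.nan
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def upward_downward(pairs, cutoff):
    """Return ((r_up, n_up), (r_down, n_down)) for one cutoff score.

    Upward uses pairs with score >= cutoff, Downward uses pairs with
    score <= cutoff, so the cutoff score itself is included in both.
    """
    def corr(subset):
        return pearson([x for x, _ in subset], [y for _, y in subset]), len(subset)

    up = [(x, y) for x, y in pairs if x >= cutoff]
    down = [(x, y) for x, y in pairs if x <= cutoff]
    return corr(up), corr(down)


# Hypothetical data: the reference score rises with the test score up to
# 150 and falls above it.
pairs = [(140, 130), (145, 135), (150, 140), (156, 138), (160, 132), (165, 128)]
(r_up, n_up), (r_down, n_down) = upward_downward(pairs, 150)
print(n_up, n_down)       # 4 3
print(r_up < 0 < r_down)  # True
```

In this made-up example the Downward value is positive and the Upward value negative at a cutoff of 150, which is the kind of pattern the table is meant to expose.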
It is chilling how accurately this test fails to measure mental ability above about 155 (that is, about I.Q. 134 on a scale with a standard deviation of 15). The fact that the female median is the same as the male median on this test, while with high-range tests it is mostly somewhat lower, is consistent with this; the valid ceiling of the test is too low to allow for a sex difference in the high range to show up (that is, too low to allow males to outscore females). This is typical for tests used in regular psychometrics, which are mostly deliberately constructed to be sex-equal by leaving out problems that show a large sex difference (which in effect also means that no truly difficult problems are allowed to be included, as on difficult problems males score higher than females).
This test is one of those that have suffered the most from the Flynn effect, the rise of raw scores over the years. That is probably one of the reasons why psychologists have often reported its scores with an artificial ceiling at the 99th centile. They see before their eyes that far too many have perfect or near-perfect scores, and that the silly norm tables attach astronomical "I.Q.'s" to them, which seem all the more ridiculous because they are given on an idiotic scale with 24 points per standard deviation.
They sense this cannot be right and give all those high scorers "156 or higher" or "higher than 155" or "99th percentile". It is understandable, but it destroys information.
In any case, it seems I.Q.'s from this test are not suitable for admission to higher-I.Q. societies.