Statistics of The Alchemist Test
© Paul Cooijmans
Scores on The Alchemist Test as of 19 September 2022
Contents type: Numerical, logical. Period: 2014-present
Score | # scores (* = 1 score) |
0 | *** |
2 | *** |
4 | * |
7 | *** |
8 | * |
9 | * |
10 | *** |
11 | ** |
12 | ** |
13 | * |
14 | * |
15 | ** |
18 | *** |
19 | *** |
20 | ** |
22 | * |
23 | * |
25 | * |
26 | * |
27 | * |
Correlation of The Alchemist Test with other mental ability tests
(Test index) Test name | n | r |
(43) Test For Genius - Revision 2010 | 7 | 0.96 |
(234) Strict Logic Sequences Exam I (Jonathan Wai) | 4 | 0.95 |
(114) Dicing with death | 6 | 0.95 |
(42) The Marathon Test | 8 | 0.93 |
(119) A Relaxing Test | 5 | 0.93 |
(260) Tests by Nikolaos Soulios (aggregate) | 4 | 0.91 |
(26) Verbal section of Test For Genius - Revision 2004 | 8 | 0.91 |
(111) Test For Genius - Revision 2016 | 9 | 0.90 |
(32) Spatial section of The Marathon Test | 13 | 0.89 |
(45) Numerical and spatial sections of The Marathon Test | 13 | 0.89 |
(104) The Final Test - Revision 2013 | 5 | 0.88 |
(0) Test of the Beheaded Man | 12 | 0.86 |
(27) Spatial section of Test For Genius - Revision 2004 | 9 | 0.85 |
(21) Psychometric Qrosswords | 7 | 0.84 |
(1) Cartoons of Shock | 7 | 0.84 |
(36) Reflections In Peroxide | 17 | 0.84 |
(103) Problems In Gentle Slopes of the second degree | 19 | 0.83 |
(39) Combined Numerical and Spatial sections of Test For Genius - Revision 2010 | 9 | 0.83 |
(47) Psychometrically Activated Grids Acerbate Neuroticism | 7 | 0.82 |
(31) Numerical section of The Marathon Test | 13 | 0.82 |
(48) Narcissus' last stand | 10 | 0.81 |
(23) Gliaweb Riddled Intelligence Test - Revision 2011 | 15 | 0.81 |
(18) The Nemesis Test | 9 | 0.80 |
(106) Cooijmans Intelligence Test - Form 4 | 17 | 0.80 |
(113) The Piper's Test | 8 | 0.80 |
(33) Problems In Gentle Slopes of the first degree | 8 | 0.78 |
(109) The Bonsai Test - Revision 2016 | 12 | 0.77 |
(30) Verbal section of The Marathon Test | 9 | 0.76 |
(16) Lieshout International Mesospheric Intelligence Test | 16 | 0.75 |
(2) Cooijmans Intelligence Test - Form 3 | 18 | 0.75 |
(108) Verbal section of Test For Genius - Revision 2016 | 9 | 0.74 |
(44) Associative LIMIT | 12 | 0.70 |
(37) Problems In Gentle Slopes of the third degree | 9 | 0.67 |
(19) Numerical section of Test For Genius - Revision 2010 | 19 | 0.67 |
(12) Cooijmans On-Line Test - Two-barrelled version | 8 | 0.67 |
(11) Isis Test | 11 | 0.65 |
(112) Combined Numerical and Spatial sections of Test For Genius - Revision 2016 | 13 | 0.65 |
(40) Reason Behind Multiple-Choice - Revision 2008 | 19 | 0.64 |
(215) Tests by Jason Betts (aggregate) | 5 | 0.64 |
(3) Qoymans Multiple-Choice #5 | 20 | 0.63 |
(105) Space, Time, and Hyperspace - Revision 2016 | 13 | 0.63 |
(110) Cooijmans Intelligence Test 5 | 12 | 0.61 |
(28) The Test To End All Tests | 10 | 0.61 |
(117) The Hammer Of Test-Hungry - Revision 2013 | 4 | 0.58 |
(4) A Paranoiac's Torture: Intelligence Test Utilizing Diabolic Exactitude | 17 | 0.57 |
(10) Genius Association Test | 12 | 0.56 |
(216) Tests by Ivan Ivec (aggregate) | 7 | 0.54 |
(25) The Sargasso Test | 11 | 0.54 |
(35) Only idiots | 6 | 0.46 |
(5) Daedalus Test | 12 | 0.43 |
(242) Unknown and miscellaneous tests | 14 | 0.41 |
(24) Reason - Revision 2008 | 20 | 0.37 |
(46) Labyrinthine LIMIT | 7 | 0.35 |
Weighted average of correlations: 0.714 (N = 574, weighted sum = 409.99)
Conservatively estimated minimum g loading: 0.85
Ranking in the above table is based on the unrounded correlations. All available data are present in this table; no tests are left out except those with fewer than 4 score pairs. All known pairs are used, including possible floor or ceiling scores and outliers.
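The weighted average reported above can be illustrated with a minimal sketch, assuming each correlation is weighted by its number of score pairs n. The function name `weighted_mean_r` is hypothetical. The sample uses the rounded r values of the first three table rows, so it is illustrative only; the reported 409.99 was computed from unrounded correlations.

```python
def weighted_mean_r(rows):
    """rows: list of (n, r) pairs. Returns (N, weighted sum, weighted mean)."""
    total_n = sum(n for n, _ in rows)
    weighted_sum = sum(n * r for n, r in rows)
    return total_n, weighted_sum, weighted_sum / total_n

# First three rows of the correlation table (rounded r, illustrative only).
sample = [(7, 0.96), (4, 0.95), (6, 0.95)]
total_n, weighted_sum, mean_r = weighted_mean_r(sample)
print(total_n, round(weighted_sum, 2), round(mean_r, 3))

# The reported totals are consistent with the reported average.
reported_mean = 409.99 / 574
print(round(reported_mean, 3))  # 0.714
```

Weighting by n means that correlations resting on more score pairs count more heavily in the average than those based on only a few pairs.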
Estimated loadings of The Alchemist Test on particular item types
These are estimated g factor loadings, but computed against homogeneous tests (containing only particular item types) as opposed to non-compound heterogeneous tests. Although this tends to surprise the lay person, it is not uncommon for a test to have high loadings on item types it does not itself contain. Such loadings reflect the empirical fact that most tests for mental abilities measure primarily g, regardless of their contents: the major part of test score variance is caused by g, and only a minor part by factors specific to particular item types. It is of key importance to understand that this is a natural phenomenon, and not something that was built into the tests by the test constructors.
Type | n | g loading of The Alchemist Test on that type |
Verbal | 80 | 0.84 |
Numerical | 36 | 0.87 |
Spatial | 51 | 0.88 |
Logical | 32 | 0.63 |
Heterogeneous | 227 | 0.86 |
N = 426
Compound tests have been left out of this table to avoid overlap.
Balanced g loading = 0.82
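The formula behind the balanced g loading is not stated here. As a hedged check, the sketch below compares two candidate summaries of the table above: the unweighted mean of the five type loadings and the mean weighted by n. With the published (rounded) loadings, the unweighted mean rounds to the reported 0.82, which suggests, but does not confirm, that "balanced" means every item type counts equally.

```python
# Loadings and pair counts copied from the table above (rounded values).
loadings = {
    "Verbal": 0.84,
    "Numerical": 0.87,
    "Spatial": 0.88,
    "Logical": 0.63,
    "Heterogeneous": 0.86,
}
pairs = {
    "Verbal": 80,
    "Numerical": 36,
    "Spatial": 51,
    "Logical": 32,
    "Heterogeneous": 227,
}

# Assumption: "balanced" = every item type gets equal weight.
unweighted = sum(loadings.values()) / len(loadings)

# Alternative for comparison: weight each type by its number of pairs.
total_pairs = sum(pairs.values())
weighted = sum(pairs[t] * loadings[t] for t in loadings) / total_pairs

print(total_pairs)           # 426
print(round(unweighted, 2))  # 0.82
print(round(weighted, 2))    # 0.84
```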
National medians for The Alchemist Test
Country | n | median score |
Canada | 3 | 20.0 |
China | 3 | 19.0 |
Korea_South | 3 | 14.0 |
United_States | 7 | 12.0 |
Italy | 3 | 9.0 |
South_Africa | 2 | 8.5 |
For reasons of privacy, only countries with 2 or more candidates are included in this table. Ranking is based on the medians, and then alphabetically.
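The inclusion and ranking rule just described can be sketched as follows. The rows are taken from the table above, except for one clearly hypothetical single-candidate entry that is added only to show the privacy filter at work; the names `MIN_CANDIDATES` and `rank_countries` are likewise assumptions for illustration.

```python
MIN_CANDIDATES = 2  # privacy threshold stated in the text

def rank_countries(rows):
    """rows: list of (country, n, median). Returns the rows to publish, ranked."""
    visible = [row for row in rows if row[1] >= MIN_CANDIDATES]
    # Highest median first; ties broken alphabetically by country name.
    return sorted(visible, key=lambda row: (-row[2], row[0]))

rows = [
    ("United_States", 7, 12.0),
    ("South_Africa", 2, 8.5),
    ("Italy", 3, 9.0),
    ("Canada", 3, 20.0),
    ("Korea_South", 3, 14.0),
    ("China", 3, 19.0),
    ("Hypothetical_Country", 1, 25.0),  # not real data; excluded by the filter
]

ranked = rank_countries(rows)
for country, n, median in ranked:
    print(f"{country} | {n} | {median} |")
```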
Correlation with national I.Q.'s of The Alchemist Test
Correlation of this test with national average I.Q.'s published by Lynn and Vanhanen:
Correlation of The Alchemist Test with personal details
Personalia | n | r |
Observed associative horizon | 4 | 0.88 |
Observed behaviour | 9 | 0.71 |
P.S.I.A. Orderly - Revision 2007 | 8 | 0.62 |
P.S.I.A. True - Revision 2007 | 8 | 0.57 |
P.S.I.A. Ethics factor - Revision 2007 | 8 | 0.43 |
P.S.I.A. Rational - Revision 2007 | 8 | 0.39 |
Educational level | 34 | 0.34 |
P.S.I.A. Introverted - Revision 2007 | 8 | 0.33 |
P.S.I.A. System factor - Revision 2007 | 8 | 0.20 |
P.S.I.A. Antisocial - Revision 2007 | 8 | 0.17 |
P.S.I.A. Neurotic - Revision 2007 | 8 | 0.16 |
P.S.I.A. Cold - Revision 2007 | 8 | 0.13 |
Sex | 36 | 0.13 |
Year of birth | 35 | 0.10 |
Father's educational level | 31 | 0.04 |
Mother's educational level | 32 | -0.03 |
Gifted Adult's Inventory of Aspergerisms | 6 | -0.06 |
P.S.I.A. Rare - Revision 2007 | 8 | -0.07 |
P.S.I.A. Cruel - Revision 2007 | 8 | -0.11 |
P.S.I.A. Deviance factor - Revision 2007 | 8 | -0.18 |
P.S.I.A. Just - Revision 2007 | 8 | -0.19 |
Disorders (parents and siblings) | 33 | -0.25 |
P.S.I.A. Extreme - Revision 2007 | 8 | -0.26 |
P.S.I.A. Aspergoid - Revision 2007 | 8 | -0.32 |
Disorders (own) | 34 | -0.40 |
Estimated g factor loadings upward and downward of particular scores
The number of score pairs on which each estimated g factor loading is based is given in parentheses. The goal is to verify the hypothesis that g becomes less important, that is, accounts for a smaller proportion of the variance, at higher I.Q. levels. The mere fact of restricting the range like this also depresses the g loading compared to computing it over the test's full range, so it would be normal for both values to be lower than the test's full-range g loading.
Raw score | Upward g (N) | Downward g (N) |
0 | 0.85 (574) | NaN (0) |
6 | 0.72 (419) | 0.63 (59) |
12 | 0.76 (150) | 0.82 (364) |
18 | 0.46 (86) | 0.84 (439) |
30 | NaN (0) | 0.85 (574) |
Reliability
Error
Scores by age
Age class | n | median score |
65 to 69 | 1 | 11.0 |
50 to 54 | 1 | 2.0 |
45 to 49 | 1 | 12.0 |
40 to 44 | 2 | 9.0 |
35 to 39 | 4 | 23.0 |
30 to 34 | 6 | 18.0 |
25 to 29 | 9 | 12.0 |
22 to 24 | 4 | 16.0 |
20 or 21 | 1 | 4.0 |
18 or 19 | 3 | 7.0 |
17 | 2 | 10.5 |
16 | 1 | 9.0 |
N = 35
Scores by year taken
Year taken | n | median score |
2014 | 6 | 8.5 |
2015 | 3 | 9.0 |
2016 | 1 | 2.0 |
2017 | 2 | 6.0 |
2018 | 3 | 15.0 |
2019 | 4 | 10.0 |
2020 | 3 | 14.0 |
2021 | 8 | 15.0 |
2022 | 6 | 20.0 |
r (year taken × median score) = 0.77 (N = 36)
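How this correlation was computed is not stated. As a hedged illustration, a plain Pearson correlation over the nine yearly medians in the table above, not weighted by n, comes out at 0.77 after rounding. The N = 36 then refers to the number of scores underlying the medians, not to the number of data points in the correlation. The helper `pearson_r` is written out here for self-containment.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Year taken and median score, copied from the table above.
years = [2014, 2015, 2016, 2017, 2018, 2019, 2020, 2021, 2022]
medians = [8.5, 9.0, 2.0, 6.0, 15.0, 10.0, 14.0, 15.0, 20.0]

r = pearson_r(years, medians)
print(round(r, 2))  # 0.77
```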
Robustness and overall test quality
Correlations of sections with total score
Section | r |
Numerical | 0.90 |
Logical | 0.92 |
Correlations between sections (internal consistency versus profile information)
Ideal values for correlations between sections are around .5, a compromise between the test's ability to yield a "profile" and its ability to provide an indication of general intelligence. If the correlation is too high (.8 or higher), the sections measure basically the same thing, so there is almost no profile information in them; if it is too low (.2 or lower), the sections are so different that there is little point in combining them into a measure of general intelligence.
Section histograms
Prop. = proportion of candidates outscored in this section. In parentheses: the proportion outscored by any possible score higher than the present score but lower than the next-higher score in the table.
Numerical
Score | Prop. | # scores (* = 1 score) |
0 | 0.042 (0.083) | *** |
2 | 0.153 (0.222) | ***** |
3 | 0.250 (0.278) | ** |
4 | 0.292 (0.306) | * |
5 | 0.347 (0.389) | *** |
6 | 0.417 (0.444) | ** |
7 | 0.472 (0.500) | ** |
8 | 0.556 (0.611) | **** |
9 | 0.694 (0.778) | ****** |
10 | 0.806 (0.833) | ** |
12 | 0.889 (0.944) | **** |
13 | 0.958 (0.972) | * |
14 | 0.986 (1.000) | * |
Logical
Score | Prop. | # scores (* = 1 score) |
0 | 0.083 (0.167) | ****** |
1 | 0.194 (0.222) | ** |
2 | 0.278 (0.333) | **** |
3 | 0.347 (0.361) | * |
4 | 0.389 (0.417) | ** |
5 | 0.472 (0.528) | **** |
6 | 0.542 (0.556) | * |
7 | 0.583 (0.611) | ** |
9 | 0.653 (0.694) | *** |
10 | 0.736 (0.778) | *** |
11 | 0.833 (0.889) | **** |
12 | 0.917 (0.944) | ** |
13 | 0.958 (0.972) | * |
14 | 0.986 (1.000) | * |
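The Prop. columns above can be reproduced from the star counts under one assumption that is inferred from the numbers, not stated in the text: candidates tied at a score count for one half when computing the proportion outscored. The value in parentheses then counts all candidates at or below that score. The function name `proportions_outscored` is hypothetical.

```python
def proportions_outscored(counts):
    """counts: dict mapping score -> number of candidates with that score.

    Returns dict mapping score -> (prop, prop_between), rounded to 3 decimals.
    """
    total = sum(counts.values())
    below = 0  # candidates with a strictly lower score
    result = {}
    for score in sorted(counts):
        tied = counts[score]
        prop = (below + tied / 2) / total    # ties count for one half
        prop_between = (below + tied) / total  # for scores up to the next row
        result[score] = (round(prop, 3), round(prop_between, 3))
        below += tied
    return result

# Star counts of the Numerical section, copied from the table above.
numerical = {0: 3, 2: 5, 3: 2, 4: 1, 5: 3, 6: 2, 7: 2,
             8: 4, 9: 6, 10: 2, 12: 4, 13: 1, 14: 1}

result = proportions_outscored(numerical)
for score, (prop, prop_between) in result.items():
    print(f"{score} | {prop:.3f} ({prop_between:.3f}) |")
```

Run on the Numerical counts, this prints the same values as the table, for example 0.042 (0.083) for a score of 0 and 0.694 (0.778) for a score of 9.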
Item analysis
Item statistics are not published as that would help candidates. To detect bad items, answers and comments from candidates are studied, as well as, for each problem, the correlation with total score on the remaining problems (item-rest correlation) and the proportion of candidates getting it wrong (hardness of the item). Possible bad items are revised, replaced, or removed, possibly resulting in a revised version of the test.
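The two item statistics named above can be sketched as follows, assuming items are scored 0 (wrong) or 1 (right). The response matrix is invented purely for illustration and is not data from this test; `item_statistics` and `pearson_r` are hypothetical helper names.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation; returns NaN if either sequence has no variance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)

def item_statistics(responses):
    """responses: one list of 0/1 item scores per candidate.

    Returns one (hardness, item-rest correlation) tuple per item.
    """
    n_candidates = len(responses)
    n_items = len(responses[0])
    stats = []
    for i in range(n_items):
        item = [row[i] for row in responses]
        # Total score on the remaining problems, excluding item i itself.
        rest = [sum(row) - row[i] for row in responses]
        hardness = item.count(0) / n_candidates  # proportion getting it wrong
        stats.append((hardness, pearson_r(item, rest)))
    return stats

# Invented responses of 5 candidates to 4 items (illustration only).
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

stats = item_statistics(responses)
for number, (hardness, r_rest) in enumerate(stats, start=1):
    print(f"Item {number} | hardness {hardness:.2f} | item-rest r {r_rest:.2f} |")
```

Excluding the item itself from the total avoids inflating the correlation, since an item always correlates positively with a total that contains it. A low or negative item-rest correlation would flag an item for closer inspection.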