Contents type: Verbal, numerical, spatial, logical. Period: 2007-2026
| Score | Candidates (one * per candidate) |
|---|---|
| 7 | * |
| 16 | * |
| 20 | * |
| 22 | * |
| 23 | *** |
| 25 | ** |
| 26 | *** |
| 27 | * |
| 28 | *** |
| 29 | ****** |
| 29.5 | * |
| 30 | *********** |
| 30.5 | * |
| 31 | *** |
| 32 | ***** |
| 33 | **** |
| 34 | ******* |
| 35 | ** |
| 35.5 | *** |
| 36 | **** |
| 36.5 | * |
| 37 | **** |
| 38 | ***** |
| 39 | **** |
| 40 | *** |
| 40.5 | * |
| 41 | ** |
| 42 | ***** |
| 43 | * |
| 44 | ** |
| 45 | * |
| 46 | ** |
| Test name | n | r | p value |
|---|---|---|---|
| Spatial Insight Test | 4 | 0.98 | 0.09 |
| Ultra Test (Ronald K. Hoeflin) | 4 | 0.95 | 0.10 |
| The Marathon Test - Revision 2024 | 5 | 0.94 | 0.06 |
| Gliaweb Recycled Intelligence Test | 14 | 0.90 | 0.001 |
| Test of Shock and Awe | 6 | 0.89 | 0.05 |
| Qoymans Multiple-Choice #3 (batch scored by Paul Cooijmans) | 5 | 0.88 | 0.08 |
| Culture Fair Numerical Spatial Examination - Final version (Etienne Forsström) | 7 | 0.84 | 0.04 |
| Tests by Iakovos Koukas (aggregate) | 5 | 0.84 | 0.10 |
| Non-Verbal Cognitive Performance Examination (Xavier Jouve) | 5 | 0.82 | 0.10 |
| Epiq Tests (aggregate) | 10 | 0.80 | 0.02 |
| The Piper's Test | 25 | 0.80 | 0.00008 |
| Dicing with death | 24 | 0.79 | 0.0002 |
| Divine Psychometry (Matthew Scillitani) | 15 | 0.78 | 0.004 |
| Narcissus' last stand | 31 | 0.76 | 0.00003 |
| International High IQ Society tests (aggregate) | 5 | 0.76 | 0.12 |
| Verbal section of Test For Genius - Revision 2016 | 34 | 0.76 | 0.00001 |
| Space, Time, and Hyperspace | 6 | 0.75 | 0.10 |
| Test For Genius - Revision 2016 | 34 | 0.74 | 0.00002 |
| Verbal section of The Marathon Test | 26 | 0.74 | 0.0002 |
| The Smell Test | 18 | 0.73 | 0.002 |
| De Laatste Test - Herziening 2019 | 6 | 0.73 | 0.10 |
| Three Sonnets (Heinrich Siemens) | 6 | 0.73 | 0.10 |
| The Test To End All Tests | 32 | 0.72 | 0.00006 |
| The Final Test | 16 | 0.71 | 0.006 |
| The Marathon Test | 22 | 0.71 | 0.001 |
| Cooijmans Intelligence Test - Form 4 | 44 | 0.70 | 0.000004 |
| Problems In Gentle Slopes of the first degree | 15 | 0.70 | 0.01 |
| Tests by Xavier Jouve, other than those listed separately (aggregate) | 4 | 0.69 | 0.24 |
| Associative LIMIT | 39 | 0.69 | 0.00002 |
| Only idiots | 16 | 0.68 | 0.009 |
| Titan Test (Ronald K. Hoeflin) | 9 | 0.67 | 0.06 |
| Test of the Beheaded Man | 41 | 0.67 | 0.00002 |
| A Relaxing Test (David Miller) | 19 | 0.67 | 0.005 |
| Cooijmans Intelligence Test - Form 2 | 12 | 0.67 | 0.03 |
| Psychometric Qrosswords | 17 | 0.64 | 0.01 |
| Genius Association Test | 43 | 0.64 | 0.00003 |
| Gliaweb Riddled Intelligence Test - Revision 2011 | 34 | 0.64 | 0.0002 |
| Numerical section of The Marathon Test | 28 | 0.63 | 0.001 |
| Strict Logic Spatial Examination 48 (Jonathan Wai) | 7 | 0.63 | 0.12 |
| Logima Strictica 24 (Robert Lato) | 4 | 0.63 | 0.27 |
| The Gate | 8 | 0.62 | 0.10 |
| Tests by Ivan Ivec (aggregate) | 13 | 0.62 | 0.03 |
| Numerical and spatial sections of The Marathon Test | 26 | 0.62 | 0.002 |
| Spatial section of The Marathon Test | 27 | 0.62 | 0.002 |
| Numerologica (Andrei Udriște) | 5 | 0.61 | 0.22 |
| Reflections In Peroxide | 34 | 0.61 | 0.0005 |
| Random Feickery (Brandon Feick) | 13 | 0.61 | 0.04 |
| Combined Numerical and Spatial sections of Test For Genius - Revision 2016 | 36 | 0.60 | 0.0004 |
| Qoymans Multiple-Choice #5 | 51 | 0.60 | 0.00002 |
| Test For Genius - Revision 2010 | 11 | 0.60 | 0.06 |
| Raven's Advanced Progressive Matrices (I.Q.) | 4 | 0.60 | 0.30 |
| The Bonsai Test - Revision 2016 | 45 | 0.58 | 0.0001 |
| Problems In Gentle Slopes of the third degree | 44 | 0.58 | 0.0002 |
| Qoymans Multiple-Choice #4 | 16 | 0.55 | 0.03 |
| The Nemesis Test | 37 | 0.55 | 0.0009 |
| Strict Logic Sequences Examination II (Jonathan Wai) | 6 | 0.55 | 0.22 |
| Verbal section of Test For Genius - Revision 2004 | 24 | 0.54 | 0.01 |
| Gliaweb Riddled Intelligence Test (old version) | 5 | 0.54 | 0.28 |
| Space, Time, and Hyperspace - Revision 2016 | 37 | 0.54 | 0.001 |
| Cooijmans Intelligence Test - Form 3 | 50 | 0.54 | 0.0002 |
| Tests by Paul Laurent Miranda (aggregate) | 4 | 0.53 | 0.37 |
| The LAW - Letters And Words | 6 | 0.52 | 0.25 |
| Cooijmans Intelligence Test 5 | 38 | 0.51 | 0.002 |
| Test For Genius - Revision 2004 | 17 | 0.51 | 0.04 |
| Words | 6 | 0.49 | 0.27 |
| Reason Behind Multiple-Choice - Revision 2008 | 50 | 0.49 | 0.0006 |
| Numerical section of Test For Genius - Revision 2010 | 44 | 0.49 | 0.001 |
| A Paranoiac's Torture: Intelligence Test Utilizing Diabolic Exactitude | 35 | 0.49 | 0.005 |
| Wechsler Adult Intelligence Scales | 13 | 0.49 | 0.10 |
| Lieshout International Mesospheric Intelligence Test | 46 | 0.48 | 0.001 |
| Psychometrically Activated Grids Acerbate Neuroticism | 15 | 0.48 | 0.07 |
| Cartoons of Shock | 22 | 0.47 | 0.03 |
| Tests by Theodosis Prousalis (aggregate) | 7 | 0.46 | 0.26 |
| Isis Test | 34 | 0.45 | 0.01 |
| Tests by Mislav Predavec (aggregate) | 11 | 0.44 | 0.16 |
| The Mental Inventor (Neurolus Psychometrics) | 4 | 0.43 | 0.44 |
| Miscellaneous tests | 32 | 0.43 | 0.02 |
| The Alchemist Test (Anas El Husseini) | 25 | 0.42 | 0.04 |
| Reason - Revision 2008 | 50 | 0.42 | 0.003 |
| Labyrinthine LIMIT | 16 | 0.41 | 0.11 |
| Letters | 7 | 0.40 | 0.34 |
| Spatial section of Test For Genius - Revision 2004 | 26 | 0.38 | 0.06 |
| Laaglandse Aanlegtest - Herziening 2016 | 5 | 0.36 | 0.46 |
| The Final Test - Revision 2013 | 12 | 0.32 | 0.28 |
| Association subtest of Long Test For Genius | 4 | 0.30 | 0.60 |
| Strict Logic Sequences Examination I (Jonathan Wai) | 15 | 0.29 | 0.28 |
| 916 Test (Laurent Dubois) | 4 | 0.28 | 0.64 |
| De Golfstroomtest - Herziening 2019 | 5 | 0.26 | 0.60 |
| Raven's Advanced Progressive Matrices (raw) | 6 | 0.26 | 0.57 |
| Daedalus Test | 20 | 0.24 | 0.28 |
| Reason | 8 | 0.24 | 0.52 |
| Analogies of Long Test For Genius | 4 | 0.23 | 0.68 |
| Numbers | 12 | 0.22 | 0.46 |
| Bonsai Test | 6 | 0.21 | 0.64 |
| Tests by Arne Andre Gangvik (aggregate) | 4 | 0.20 | 0.71 |
| Problems In Gentle Slopes of the second degree | 29 | 0.19 | 0.30 |
| Combined Numerical and Spatial sections of Test For Genius - Revision 2010 | 15 | 0.19 | 0.48 |
| Reason Behind Multiple-Choice | 7 | 0.15 | 0.71 |
| Logima Strictica 36 (Robert Lato) | 11 | 0.13 | 0.68 |
| Cooijmans On-Line Test - Two-barrelled version | 20 | 0.12 | 0.62 |
| Odds | 5 | 0.03 | 0.94 |
| The Hammer Of Test-Hungry - Revision 2013 | 11 | 0.02 | 0.94 |
| Tests by Jason Betts (aggregate) | 11 | -0.14 | 0.64 |
| Cattell Culture Fair | 7 | -0.22 | 0.60 |
| Tests by James Dorsey (aggregate) | 7 | -0.23 | 0.57 |
| Tests by Nikolaos Soulios (aggregate) | 13 | -0.23 | 0.42 |
| Gliaweb Raadselachtig Analogieproefwerk | 6 | -0.31 | 0.48 |
| Tests by Greg Grove (aggregate) | 6 | -0.56 | 0.22 |
| Test of Inductive Reasoning / J.C.T.I. (Xavier Jouve) | 6 | -0.68 | 0.12 |
Weighted mean of correlations: 0.535 (N = 1956)
Estimated g factor loading: 0.73
The ranking in the table above is based on the unrounded correlations. All available data are included; the only tests left out are those with fewer than 4 score pairs. All known pairs are used, including possible floor or ceiling scores and outliers.
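The two summary figures above are consistent with the estimated g factor loading being the square root of the weighted mean correlation (the square root of 0.535 is about 0.73). The sketch below illustrates that relationship. Weighting each correlation by its number of score pairs is an assumption, and `sample` is a hypothetical handful of (n, r) rows, not the full table.

```python
import math

def weighted_mean_correlation(pairs):
    """Return the mean of r over (n, r) rows, weighting each r by its n.

    Weighting by n is an assumption; the report does not state the weights.
    """
    pairs = list(pairs)
    total_n = sum(n for n, _ in pairs)
    return sum(n * r for n, r in pairs) / total_n

def estimated_g_loading(mean_r):
    """Square root of a (non-negative) mean correlation."""
    return math.sqrt(mean_r)

# Hypothetical subset of rows, for illustration only.
sample = [(14, 0.90), (25, 0.80), (44, 0.70), (51, 0.60)]
sample_mean = weighted_mean_correlation(sample)
sample_loading = estimated_g_loading(sample_mean)

# Using the reported weighted mean directly:
print(round(estimated_g_loading(0.535), 2))  # 0.73
```

Because the tabulated correlations are rounded, recomputing the weighted mean from the table may differ slightly from the reported 0.535.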
These are estimated g factor loadings, but against homogeneous tests (containing only particular item types) as opposed to non-compound heterogeneous tests. Although this tends to surprise the lay person, it is not uncommon for tests to have high loadings on item types they do not themselves contain. Such loadings reflect the empirical fact that most tests for mental abilities measure primarily g, regardless of their contents; that is, the major part of test score variance is caused by g, and only a minor part by factors germane to particular item types. It is of key importance to understand that this is a fact of nature, a natural phenomenon, and not something that was built into the tests by the test constructors.
| Type | n | g loading of The Sargasso Test on that type |
|---|---|---|
| Verbal | 309 | 0.78 |
| Numerical | 115 | 0.68 |
| Spatial | 168 | 0.71 |
| Logical | 84 | 0.53 |
| Heterogeneous | 819 | 0.77 |
N = 1495
Compound tests have been left out of this table to avoid overlap.
Balanced g loading = 0.69
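The balanced figure matches the unweighted mean of the five loadings in the table, so that each content type counts equally regardless of its n. That this is how the balanced g loading is defined is an assumption inferred from the numbers; the sketch below only shows the arithmetic, alongside the n-weighted mean for comparison.

```python
# (n, g loading) per content type, copied from the table above.
loadings = {
    "Verbal": (309, 0.78),
    "Numerical": (115, 0.68),
    "Spatial": (168, 0.71),
    "Logical": (84, 0.53),
    "Heterogeneous": (819, 0.77),
}

total_n = sum(n for n, _ in loadings.values())

# Unweighted mean: every content type counts equally (assumed definition).
balanced = sum(g for _, g in loadings.values()) / len(loadings)

# n-weighted mean, for comparison: dominated by the largest groups.
weighted = sum(n * g for n, g in loadings.values()) / total_n

print(total_n)             # 1495
print(round(balanced, 2))  # 0.69
print(round(weighted, 2))  # 0.74
```

The unweighted mean comes out lower than the n-weighted one here because the two largest groups (Heterogeneous and Verbal) also have the highest loadings.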
| Country | n | Median score |
|---|---|---|
| Romania | 3 | 39.0 |
| United Kingdom | 5 | 38.0 |
| Sweden | 5 | 37.0 |
| Spain | 5 | 36.0 |
| Germany | 5 | 34.0 |
| Italy | 3 | 34.0 |
| France | 3 | 33.0 |
| United States | 25 | 33.0 |
| Finland | 5 | 30.0 |
| South Korea | 5 | 30.0 |
| Greece | 4 | 25.0 |
Total number of countries: 32
For reasons of privacy, only countries with 3 or more candidates are included in this table. Ranking is by median score, then alphabetical.
| Personalia | n | r | p value |
|---|---|---|---|
| Observed associative horizon | 6 | 0.62 | 0.17 |
| PSIA Deviance factor - Revision 2007 | 27 | 0.56 | 0.005 |
| PSIA Extreme - Revision 2007 | 27 | 0.55 | 0.005 |
| PSIA Introverted - Revision 2007 | 27 | 0.53 | 0.008 |
| PSIA Rare - Revision 2007 | 27 | 0.47 | 0.02 |
| PSIA System factor - Revision 2007 | 27 | 0.45 | 0.02 |
| PSIA True - Revision 2007 | 27 | 0.44 | 0.03 |
| PSIA Ethics factor - Revision 2007 | 27 | 0.42 | 0.03 |
| PSIA Orderly - Revision 2007 | 27 | 0.41 | 0.04 |
| PSIA Rational - Revision 2007 | 27 | 0.41 | 0.04 |
| PSIA Cold - Revision 2007 | 27 | 0.33 | 0.09 |
| Educational level | 87 | 0.27 | 0.01 |
| PSIA Just - Revision 2007 | 27 | 0.27 | 0.17 |
| PSIA Aspergoid - Revision 2007 | 27 | 0.16 | 0.40 |
| Father's educational level | 80 | 0.02 | 0.87 |
| Sex | 94 | 0.02 | 0.87 |
| Observed behaviour | 21 | 0.00 | 1.00 |
| Disorders (parents and siblings) | 86 | -0.01 | 0.94 |
| Cooijmans Inventory of Neo-Marxist Attitudes | 15 | -0.02 | 0.94 |
| PSIA Neurotic - Revision 2007 | 27 | -0.06 | 0.76 |
| Mother's educational level | 82 | -0.09 | 0.44 |
| Year of birth | 93 | -0.09 | 0.40 |
| PSIA Antisocial - Revision 2007 | 27 | -0.10 | 0.62 |
| Disorders (own) | 88 | -0.23 | 0.03 |
| Gifted Adult's Inventory of Aspergerisms | 21 | -0.26 | 0.24 |
| PSIA Cruel - Revision 2007 | 27 | -0.27 | 0.16 |
Notice: A correlation is generally considered significant if its p value is 0.05 or less.
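The report does not say how the p values are computed. Several tabulated values are reproduced by a two-sided normal approximation with z = |r| × √(n − 1); the sketch below uses that approximation purely as an assumption, and `approx_p_value` is a hypothetical helper name.

```python
import math

def approx_p_value(r, n):
    """Two-sided p value for a correlation r over n pairs.

    Assumes the normal approximation z = |r| * sqrt(n - 1); this is an
    inference from the tabulated values, not a documented method.
    """
    z = abs(r) * math.sqrt(n - 1)
    # For a standard normal variable, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2))

def is_significant(r, n, alpha=0.05):
    """Apply the rule from the notice above: significant if p <= alpha."""
    return approx_p_value(r, n) <= alpha

print(round(approx_p_value(0.98, 4), 2))   # 0.09
print(round(approx_p_value(0.90, 14), 3))  # 0.001
```

Since the tabulated r values are rounded, recomputed p values can differ from the printed ones in the last digit.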
The goal of estimating g factor loadings for restricted ranges is to test the hypothesis that g becomes less important, that is, accounts for a smaller proportion of the variance, at higher I.Q. levels. Restricting the range in this way by itself also depresses the g loading compared to computing it over the test's full range, so it would be normal for these values to be lower than the test's full-range g loading.
| Score range | Estimated g factor loading |
|---|---|
| Below 1st quartile (30.0) | 0.61 (N = 772) |
| Below median (33.5) | 0.62 (N = 989) |
| Above median (33.5) | 0.56 (N = 904) |
| Above 3rd quartile (38.0) | 0.60 (N = 353) |
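The depressing effect of range restriction can be illustrated with the standard formula for direct range restriction on one variable (Thorndike's Case 2), applied here to a plain correlation. This is a sketch of the general statistical effect only; the report does not state that this formula is used, and the standard deviation ratio of 0.5 is a hypothetical value.

```python
import math

def restricted_correlation(r_full, sd_ratio):
    """Correlation expected after direct range restriction.

    r_full   -- correlation over the full range
    sd_ratio -- restricted standard deviation divided by the full-range
                standard deviation (1.0 means no restriction)
    """
    return r_full * sd_ratio / math.sqrt(
        1 - r_full ** 2 + (r_full * sd_ratio) ** 2
    )

# Hypothetical example: halving the spread of scores.
full = 0.73
halved = restricted_correlation(full, 0.5)
print(halved < full)  # True
```

With `sd_ratio` equal to 1.0 the function returns the full-range value unchanged, and any ratio below 1.0 gives a lower correlation.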
Remark: These reliability coefficients are low for a stand-alone I.Q. test (.9 or higher is normal) but logical if one considers that the test consists exclusively of "bad" items that were previously removed from other tests.
| Age class | n | Median score |
|---|---|---|
| 65 to 69 | 3 | 37.0 |
| 60 to 64 | 3 | 35.5 |
| 50 to 54 | 5 | 38.0 |
| 45 to 49 | 11 | 34.0 |
| 40 to 44 | 10 | 36.0 |
| 35 to 39 | 10 | 32.8 |
| 30 to 34 | 10 | 30.0 |
| 25 to 29 | 21 | 34.0 |
| 22 to 24 | 9 | 32.0 |
| 20 or 21 | 5 | 31.0 |
| 18 or 19 | 2 | 29.0 |
| 17 | 1 | 30.0 |
| 16 | 1 | 30.0 |
| 15 | 1 | 28.0 |
| 13 | 1 | 32.0 |
| Age unknown | 1 | 26.0 |
N = 94
| Year taken | n | Median score | Protonorm |
|---|---|---|---|
| 2007 | 7 | 33.0 | 395 |
| 2008 | 8 | 37.0 | 461 |
| 2009 | 1 | 30.0 | 370 |
| 2010 | 4 | 29.0 | 347 |
| 2011 | 1 | 39.0 | 495 |
| 2012 | 4 | 35.5 | 431 |
| 2013 | 1 | 29.0 | 347 |
| 2014 | 4 | 35.0 | 421 |
| 2015 | 3 | 25.0 | 302 |
| 2016 | 1 | 40.5 | 522 |
| 2017 | 4 | 33.0 | 395 |
| 2018 | 5 | 37.0 | 461 |
| 2019 | 4 | 33.3 | 401 |
| 2020 | 9 | 31.0 | 384 |
| 2021 | 6 | 31.0 | 384 |
| 2022 | 8 | 36.0 | 440 |
| 2023 | 5 | 29.0 | 347 |
| 2024 | 9 | 30.5 | 377 |
| 2025 | 9 | 40.0 | 506 |
| 2026 | 1 | 38.0 | 485 |
N = 94
Item statistics are not published because that would help candidates. To detect bad items, answers and comments from candidates are studied, as well as, for each problem, the correlation with the total score on the remaining problems (item-rest correlation) and the proportion of candidates getting it wrong (hardness of the item). Possibly bad items are revised, replaced, or removed, which may result in a revised version of the test.
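The two item statistics named above can be computed as in the following minimal sketch. It assumes items are scored 0 (wrong) or 1 (right) and uses a small hypothetical response matrix; the function names are illustrative.

```python
import math

def pearson(x, y):
    """Pearson correlation; returns NaN if either variable is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)

def item_statistics(responses):
    """Return (item-rest correlation, hardness) for each item.

    responses -- one list per candidate, holding 0/1 item scores.
    """
    stats = []
    for j in range(len(responses[0])):
        item = [row[j] for row in responses]
        # Total score on the remaining problems, excluding item j itself.
        rest = [sum(row) - row[j] for row in responses]
        # Hardness: proportion of candidates getting the item wrong.
        hardness = 1 - sum(item) / len(item)
        stats.append((pearson(item, rest), hardness))
    return stats

# Hypothetical data: 4 candidates, 3 items.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
stats = item_statistics(responses)
for number, (item_rest, hardness) in enumerate(stats, start=1):
    print(number, round(item_rest, 2), hardness)
```

Excluding the item from its own total avoids inflating the correlation, which matters most in short tests.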