Qoymans Multiple-Choice #1 statistics

Introduction

This test was created in 2001 to do something completely different for once. I had always regarded multiple-choice tests as inferior (and still do), so I made an extreme one. Although the apparently "easy", one-sided verbal test was extremely popular, the data gathered by it was of low quality, and the test had a low g loading and low reliability. I therefore later combined it with another test of the same type into a larger test, removing or revising bad items. I also added a "pass" option to each question, which guaranteed half a point. The eventual result is the Qoymans Multiple-Choice #5, which is much better than the earlier ones.

Scores on Qoymans Multiple-Choice #1 as of 19 November 2024

Contents type: Verbal. Period: 2001-2002

n: 188
Median: 26.0
Quartile deviation: 7.0
Maximum possible: 60
Minimum possible: -60
Male median: 27.0 (n = 134)
Female median: 22.0 (n = 45)
Unknown sex median: 18.0 (n = 9)
Proportion of males among candidates: 0.713
Hardness: 0.29
Resolution: 7.0

3	*
6	**
7	**
8	*
9	**
10	****
11	***
12	****
13	********
14	*
15	*******
16	***
17	*******
18	*****
19	*******
20	***
21	****
22	********
23	*********
24	*****
25	*******
26	************
27	*****
28	*********
29	*******
30	****
31	********
32	****
33	******
34	***
35	****
36	******
37	****
38	**
39	*****
40	****
41	**
43	***
44.5	*
45	***
46	*
47	*
48	*
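
The median and quartile deviation reported above can be recomputed from this tally. The following Python sketch expands the tally into individual scores and derives both values. It assumes that "quartile deviation" means half the interquartile range and that Python's default quartile rule is acceptable; the report states neither, and the names TALLY, expand, and quartile_deviation are illustrative only.

```python
from statistics import median, quantiles

# Tally copied from the score distribution above: score -> number of candidates.
TALLY = {
    3: 1, 6: 2, 7: 2, 8: 1, 9: 2, 10: 4, 11: 3, 12: 4, 13: 8, 14: 1,
    15: 7, 16: 3, 17: 7, 18: 5, 19: 7, 20: 3, 21: 4, 22: 8, 23: 9, 24: 5,
    25: 7, 26: 12, 27: 5, 28: 9, 29: 7, 30: 4, 31: 8, 32: 4, 33: 6, 34: 3,
    35: 4, 36: 6, 37: 4, 38: 2, 39: 5, 40: 4, 41: 2, 43: 3, 44.5: 1,
    45: 3, 46: 1, 47: 1, 48: 1,
}


def expand(tally):
    """Turn a score -> count tally into a sorted list of individual scores."""
    return sorted(score for score, count in tally.items() for _ in range(count))


def quartile_deviation(scores):
    """Half the interquartile range, (Q3 - Q1) / 2."""
    # Assumption: Python's default ("exclusive") quartile rule.
    q1, _, q3 = quantiles(scores, n=4)
    return (q3 - q1) / 2


scores = expand(TALLY)
print(len(scores), median(scores), quartile_deviation(scores))
# prints: 188 26.0 7.0
```

Under these assumptions the tally reproduces the reported n, median, and quartile deviation.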

Scores by males

n = 134

3	*
6	*
7	**
8	*
9	*
10	**
11	**
12	***
13	****
15	****
16	*
17	****
18	**
19	***
20	*
21	***
22	**
23	******
24	*****
25	*****
26	***********
27	*****
28	*********
29	******
30	***
31	*******
32	***
33	******
34	***
35	***
36	****
37	***
38	**
39	****
40	****
41	**
43	**
45	**
46	*
47	*

Scores by females

n = 45

9	*
10	*
11	*
12	*
13	****
14	*
15	***
16	*
17	**
18	**
19	**
20	**
21	*
22	******
23	**
25	**
26	*
29	*
30	*
32	*
35	*
36	**
37	*
39	*
43	*
44.5	*
45	*
48	*

Correlation of Qoymans Multiple-Choice #1 with other mental ability tests

Test name	n	r
Short Test For Genius	4	0.83
Qoymans Multiple-Choice #3 (batch scored by Jonathan Wai)	12	0.75
Test of Shock and Awe	5	0.72
916 Test (Laurent Dubois)	4	0.61
Bonsai Test	7	0.57
Tests by Greg Grove (aggregate)	11	0.52
Chimera High Ability Riddle Test (Bill Bultas)	4	0.51
International High IQ Society tests (aggregate)	15	0.48
Evens	12	0.46
Tests by Xavier Jouve, other than those listed separately (aggregate)	7	0.41
Cooijmans On-Line Test	7	0.40
Analogies #1	14	0.39
Association subtest of Long Test For Genius	13	0.37
Qoymans Multiple-Choice #2	27	0.36
Raven's Advanced Progressive Matrices (raw)	4	0.34
Scholastic Aptitude Test (old)	9	0.31
Miscellaneous tests	31	0.31
Sigma Test (Melão Hindemburg)	6	0.27
Analogies of Long Test For Genius	12	0.26
Qoymans Multiple-Choice #3 (batch scored by Paul Cooijmans)	7	0.25
Numbers	9	0.24
The Final Test	14	0.21
European I.Q. Test	4	0.13
The Test To End All Tests	9	0.12
Mega Test (Ronald K. Hoeflin)	5	0.09
Wechsler Adult Intelligence Scales	8	0.03
Qoymans Multiple-Choice #4	10	-0.01
Genius Association Test	8	-0.03
Qoymans Automatic Test #2	8	-0.04
Non-Verbal Cognitive Performance Examination (Xavier Jouve)	5	-0.06
Tests by Nicolas Elenas (aggregate)	10	-0.07
Raven's Advanced Progressive Matrices (I.Q.)	9	-0.08
Cooijmans Intelligence Test - Form 1	13	-0.09
New York High I.Q. Society tests	16	-0.10
Long Test For Genius	9	-0.14
Space, Time, and Hyperspace	18	-0.16
Graduate Record Examination (prior to October 2001)	5	-0.16
Odds	4	-0.19
Encephalist - R (Xavier Jouve)	5	-0.20
Omega Contemplative Items Pool (Tommy Smith)	10	-0.21
Qoymans Automatic Test #1	7	-0.24
W-87 (International Society for Philosophical Enquiry)	4	-0.25
Queendom tests	8	-0.29
Logima Strictica 36 (Robert Lato)	8	-0.40
Cito-toets	4	-0.48
Spatial section of Test For Genius - Revision 2004	4	-0.58
Test For Genius - Revision 2004	4	-0.66
Cattell Culture Fair	4	-0.88
Verbal section of Test For Genius - Revision 2004	4	-0.92
F.N.A. (Xavier Jouve)	4	-0.95

Weighted average of correlations: 0.128 (N = 441)

Estimated g factor loading: 0.36

The ranking in the table above is based on the unrounded correlations. All available data is present in this table; no tests are left out except those with fewer than 4 score pairs. All known pairs are used, including possible floor or ceiling scores and outliers.
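
The weighted average of correlations can be approximated from the table. The Python sketch below assumes that each correlation is weighted by its number of score pairs; the report does not state the weighting scheme, so this is a plausible reading and not a confirmed method. The names PAIRS and weighted_mean_r are illustrative only.

```python
# (n, r) pairs copied from the correlation table above, in table order.
PAIRS = [
    (4, 0.83), (12, 0.75), (5, 0.72), (4, 0.61), (7, 0.57),
    (11, 0.52), (4, 0.51), (15, 0.48), (12, 0.46), (7, 0.41),
    (7, 0.40), (14, 0.39), (13, 0.37), (27, 0.36), (4, 0.34),
    (9, 0.31), (31, 0.31), (6, 0.27), (12, 0.26), (7, 0.25),
    (9, 0.24), (14, 0.21), (4, 0.13), (9, 0.12), (5, 0.09),
    (8, 0.03), (10, -0.01), (8, -0.03), (8, -0.04), (5, -0.06),
    (10, -0.07), (9, -0.08), (13, -0.09), (16, -0.10), (9, -0.14),
    (18, -0.16), (5, -0.16), (4, -0.19), (5, -0.20), (10, -0.21),
    (7, -0.24), (4, -0.25), (8, -0.29), (8, -0.40), (4, -0.48),
    (4, -0.58), (4, -0.66), (4, -0.88), (4, -0.92), (4, -0.95),
]


def weighted_mean_r(pairs):
    """Average of correlations, each weighted by its number of score pairs."""
    total_n = sum(n for n, _ in pairs)
    return sum(n * r for n, r in pairs) / total_n


total_pairs = sum(n for n, _ in PAIRS)
print(total_pairs, round(weighted_mean_r(PAIRS), 3))
```

The pair counts sum to the reported N of 441, and the rounded table values give about 0.127. The small gap to the reported 0.128 is plausibly due to the table showing correlations rounded to two decimals.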

Estimated loadings of Qoymans Multiple-Choice #1 on particular item types

These are estimated g factor loadings, but computed against homogeneous tests (containing only particular item types) as opposed to non-compound heterogeneous tests. Although this tends to surprise the lay person, it is not uncommon for tests to have high loadings on item types they do not contain themselves. Such loadings reflect the empirical fact that most tests for mental abilities measure primarily g, regardless of their contents: the major part of test score variance is caused by g, and only a minor part by factors germane to particular item types. It is of key importance to understand that this is a fact of nature, a natural phenomenon, and not something that was built into the tests by the test constructors.

Type	n	g loading of Qoymans Multiple-Choice #1 on that type
Verbal	134	0.52
Numerical	25	0.53
Spatial	22	-0.48
Logical	7	0.63
Heterogeneous	73	0.45

N = 261

Compound tests have been left out of this table to avoid overlap.

Balanced g loading = 0.33
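
The report does not define the balanced g loading. As a hedged observation, an unweighted mean of the five type loadings in the table reproduces the reported value, whereas weighting by n does not. The Python sketch below shows both; treating "balanced" as "each item type counts equally" is an assumption, and the names TYPE_LOADINGS, balanced_loading, and n_weighted_loading are illustrative only.

```python
# Item type -> (n, loading), copied from the table above.
TYPE_LOADINGS = {
    "Verbal": (134, 0.52),
    "Numerical": (25, 0.53),
    "Spatial": (22, -0.48),
    "Logical": (7, 0.63),
    "Heterogeneous": (73, 0.45),
}


def balanced_loading(table):
    """Unweighted mean: every item type counts equally (assumed definition)."""
    loadings = [g for _, g in table.values()]
    return sum(loadings) / len(loadings)


def n_weighted_loading(table):
    """Mean weighted by number of score pairs, shown for comparison only."""
    total_n = sum(n for n, _ in table.values())
    return sum(n * g for n, g in table.values()) / total_n


print(round(balanced_loading(TYPE_LOADINGS), 2))    # prints: 0.33
print(round(n_weighted_loading(TYPE_LOADINGS), 2))  # prints: 0.42
```

The difference between the two shows how heavily the large verbal sample would dominate an n-weighted figure.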

National medians for Qoymans Multiple-Choice #1

Country	n	median score
Sweden	3	43.0
Italy	3	34.0
United States	18	30.0
Netherlands	4	27.0

For reasons of privacy, only countries with 3 or more candidates are included in this table. Ranking is based on the medians, then alphabetical.

Correlation with national I.Q.'s of Qoymans Multiple-Choice #1

Correlation of this test with national average I.Q.'s published by Lynn and Vanhanen, later Lynn and Becker:

r = 0.06 (n = 41)

Correlation of Qoymans Multiple-Choice #1 with personal details

Personalia	n	r
Father's educational level	14	0.46
Educational level	16	0.43
Observed behaviour	13	0.32
Sex	179	0.15
Mother's educational level	14	0.01
Observed associative horizon	7	-0.03
Disorders (parents and siblings)	15	-0.22
Year of birth	140	-0.29
Disorders (own)	17	-0.34
Gifted Adult's Inventory of Aspergerisms	9	-0.65

Correlation with personal details of Qoymans Multiple-Choice #1 - within females

Personalia	n	r
Year of birth	35	-0.34

Correlation with personal details of Qoymans Multiple-Choice #1 - within males

Personalia	n	r
Father's educational level	13	0.40
Educational level	14	0.40
Observed behaviour	12	0.24
Observed associative horizon	6	0.23
Mother's educational level	13	0.22
Disorders (own)	15	-0.24
Year of birth	105	-0.28
Disorders (parents and siblings)	14	-0.43
Gifted Adult's Inventory of Aspergerisms	7	-0.62

Estimated g factor loadings for restricted ranges

The number of score pairs on which each estimated g factor loading is based is given in parentheses. The goal is to test the hypothesis that g becomes less important, that is, accounts for a smaller proportion of the variance, at higher I.Q. levels. Restricting the range in this way by itself also depresses the g loading compared to computing it over the test's full range, so it is normal for these values to be lower than the test's full-range g loading.

Below 1st quartile (raw 18.0)	0.08 (20)
Below median (raw 26.0)	0.37 (110)
Above median (raw 26.0)	0.25 (343)
Above 3rd quartile (raw 32.0)	0.45 (205)

Reliability

Split-half (odd-even) = 0.83
Split-half (1, 2… 5, 6… vs. 3, 4… 7, 8…) = 0.75
Cronbach's alpha = 0.80
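
For readers unfamiliar with these coefficients, the Python sketch below shows the textbook computations of an odd-even split-half reliability and Cronbach's alpha. It is not the code used for this report: the item responses are not published, so the DEMO matrix is hypothetical, the +1/-1 item scoring is an assumption, and it is also an assumption that the split-half value is stepped up with the Spearman-Brown formula.

```python
from math import sqrt
from statistics import pvariance


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_odd_even(responses):
    """Odd-even split-half reliability; rows are candidates, columns items."""
    odd = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    r_half = pearson(odd, even)
    # Spearman-Brown step-up from half-test length to full-test length.
    return 2 * r_half / (1 + r_half)


def cronbach_alpha(responses):
    """Cronbach's alpha from item variances and total-score variance."""
    k = len(responses[0])
    item_vars = [pvariance([row[i] for row in responses]) for i in range(k)]
    total_var = pvariance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical responses (+1 right, -1 wrong): 5 candidates by 4 items.
DEMO = [
    [1, 1, 1, 1],
    [1, 1, 1, -1],
    [1, -1, 1, -1],
    [-1, -1, 1, -1],
    [-1, -1, -1, -1],
]
print(split_half_odd_even(DEMO), cronbach_alpha(DEMO))
```

The values printed for DEMO say nothing about this test; they only illustrate the arithmetic. The second split reported above would use the same functions with a different assignment of items to halves.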

Error

Standard error = 4.5 raw score points
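
In classical test theory the standard error of measurement is the standard deviation of scores multiplied by the square root of one minus the reliability. The Python sketch below applies that textbook formula. The report does not say which standard deviation or which reliability coefficient it used, so the inputs here are illustrative: the standard deviation of 10.0 is a hypothetical placeholder, and the function name is an assumption.

```python
from math import sqrt


def standard_error_of_measurement(sd, reliability):
    """Classical formula: SEM = SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * sqrt(1.0 - reliability)


# Hypothetical raw-score SD of 10.0 combined with a reliability of 0.80.
sem = standard_error_of_measurement(10.0, 0.80)
print(round(sem, 2))
```

A higher reliability or a smaller score spread both shrink the standard error, which is why the figure is reported in raw score points next to the reliability coefficients.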

Scores by age

Age class	n	Median score
75 to 79	2	25.5
55 to 59	2	42.5
50 to 54	3	29.0
45 to 49	4	27.5
40 to 44	12	36.0
35 to 39	17	27.0
30 to 34	20	30.5
25 to 29	22	27.5
22 to 24	11	19.0
20 or 21	14	31.0
18 or 19	16	23.0
17	9	30.0
16	2	28.5
15	2	16.0
14	4	21.0
9	1	6.0

N = 141

Scores by age - within females

Age class	n	Median score
75 to 79	1	25.0
45 to 49	1	22.0
40 to 44	4	36.0
35 to 39	6	27.0
30 to 34	4	30.0
25 to 29	5	19.0
22 to 24	4	18.0
20 or 21	2	24.0
18 or 19	4	19.0
17	2	12.0
16	1	26.0

N = 34

Scores by age - within males

Age class	n	Median score
75 to 79	1	26.0
55 to 59	2	42.5
50 to 54	3	29.0
45 to 49	3	33.0
40 to 44	8	35.5
35 to 39	11	27.0
30 to 34	16	30.5
25 to 29	17	29.0
22 to 24	7	24.0
20 or 21	12	31.0
18 or 19	12	26.0
17	7	31.0
16	1	31.0
15	2	16.0
14	4	21.0
9	1	6.0

N = 107

Scores by year taken

Year taken	n	median score	protonorm
2001	67	29.0	385
2002	120	23.0	340
2004	1	38.0	433

N = 188

Robustness and overall test quality

Robustness (score trend by time measured in months) = r_{raw scores × months} = -0.22 (n = 188)
Quality (new method) = 3.8

Item analysis

Item statistics are not published as that would help candidates. To detect bad items, answers and comments from candidates are studied, as well as, for each problem, the correlation with total score on the remaining problems (item-rest correlation) and the proportion of candidates getting it wrong (hardness of the item). Possible bad items are revised, replaced, or removed, possibly resulting in a revised version of the test.
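
The two item statistics named above can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used for this test: the response matrix DEMO is hypothetical (no real item data is published), the +1/-1 scoring is an assumption, and the function names are illustrative only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns NaN when either list has no variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def item_statistics(responses, wrong=-1):
    """Return one (hardness, item_rest_r) tuple per item.

    hardness    = proportion of candidates getting the item wrong
    item_rest_r = correlation of the item with the total on the other items
    """
    totals = [sum(row) for row in responses]
    stats = []
    for i in range(len(responses[0])):
        item = [row[i] for row in responses]
        rest = [total - value for total, value in zip(totals, item)]
        hardness = sum(1 for value in item if value == wrong) / len(item)
        stats.append((hardness, pearson(item, rest)))
    return stats


# Hypothetical responses (+1 right, -1 wrong): 5 candidates by 4 items.
DEMO = [
    [1, 1, 1, 1],
    [1, 1, 1, -1],
    [1, -1, 1, -1],
    [-1, -1, 1, -1],
    [-1, -1, -1, -1],
]
for number, (hardness, r) in enumerate(item_statistics(DEMO), start=1):
    print(number, hardness, round(r, 2))
```

An item with a low or negative item-rest correlation, or with an extreme hardness, would be a candidate for closer inspection.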
