Statistics of Test of Shock and Awe

Introduction

Remark: This test is no longer used in its own right but is part of Cartoons of Shock, which later became part of the Bonsai Test - Revision 2016. These statistics are from the period before the test was incorporated therein.

Scores on Test of Shock and Awe as of 11 September 2024

Contents type: Verbal, numerical. Period: 2003-2005

n: 23
Median: 20.0
Quartile deviation: 3.5
Range: 22
Maximum possible: 33
Male median: 20.0 (n = 21)
Female median: 17.0 (n = 2)
Proportion of males among candidates: 0.913
Hardness: 0.43
Resolution: 0.38

5	*
10	*
12	*
13	*
14	*
15	**
16	*
17	**
20	**
21	**
22	****
24	**
25	*
26	**
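
The summary figures above can be cross-checked against the frequency table. The following is a minimal Python sketch, assuming that the quartile deviation is half the interquartile range and that the default quartile convention of Python's `statistics` module applies; the report does not state which quartile convention was used, and the name `FREQUENCIES` is introduced here purely for illustration.

```python
import statistics

# Score -> number of candidates, transcribed from the frequency table above.
FREQUENCIES = {5: 1, 10: 1, 12: 1, 13: 1, 14: 1, 15: 2, 16: 1, 17: 2,
               20: 2, 21: 2, 22: 4, 24: 2, 25: 1, 26: 2}

# Expand the table into one entry per candidate.
scores = sorted(s for s, count in FREQUENCIES.items() for _ in range(count))

n = len(scores)
median = statistics.median(scores)
q1, _, q3 = statistics.quantiles(scores, n=4)  # assumed quartile convention
quartile_deviation = (q3 - q1) / 2             # assumed definition: half the IQR

print(n, median, quartile_deviation)
```

Under these assumptions the sketch gives n = 23, a median of 20 and a quartile deviation of 3.5, in agreement with the figures listed above.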

Correlation of Test of Shock and Awe with other mental ability tests

Test name	n	r
Cartoons of Shock	11	0.92
Cooijmans Intelligence Test - Form 1	4	0.90
The Sargasso Test	6	0.89
Qoymans Multiple-Choice #3	6	0.89
Spatial Insight Test	6	0.87
Analogies #1	7	0.87
The Final Test	17	0.79
Non-Verbal Cognitive Performance Examination (Xavier Jouve)	6	0.77
Sigma Test (Melão Hindemburg)	4	0.77
The Nemesis Test	6	0.73
Qoymans Multiple-Choice #1	5	0.72
Space, Time, and Hyperspace	11	0.65
Long Test For Genius	7	0.63
Analogies of Long Test For Genius	8	0.63
Reason	6	0.59
Wechsler Adult Intelligence Scales	4	0.57
Qoymans Multiple-Choice #4	9	0.57
Miscellaneous tests	9	0.56
Strict Logic Sequences Exam I (Jonathan Wai)	7	0.55
Test of the Beheaded Man	6	0.55
Bonsai Test	9	0.54
Association subtest of Long Test For Genius	7	0.53
The Test To End All Tests	10	0.48
Qoymans Multiple-Choice #5	5	0.47
Reason Behind Multiple-Choice - Revision 2008	5	0.44
Test For Genius - Revision 2004	9	0.42
Reason Behind Multiple-Choice	5	0.42
Cooijmans Intelligence Test - Form 2	7	0.41
Verbal section of Test For Genius - Revision 2004	9	0.41
Spatial section of Test For Genius - Revision 2004	9	0.39
Reason - Revision 2008	5	0.37
Logima Strictica 36 (Robert Lato)	6	0.37
Associative LIMIT	7	0.37
Numbers	7	0.35
Genius Association Test	14	0.31
Lieshout International Mesospheric Intelligence Test	9	0.29
Mega Test (Ronald K. Hoeflin)	5	0.22
Odds	4	0.21
Tests by Greg Grove (aggregate)	6	0.13
Titan Test (Ronald K. Hoeflin)	6	-0.05
Strict Logic Sequences Exam II (Jonathan Wai)	4	-0.28

Weighted average of correlations: 0.532 (N = 293)

Estimated g factor loading: 0.73

Ranking in the above table is based on the unrounded correlations. All available data is present in this table; no tests are left out except those with fewer than 4 score pairs. All known pairs are used, including possible floor/ceiling scores or outliers.
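
As a rough cross-check, the weighted average reported above can be approximated from the rounded table values. The sketch below assumes that each correlation is weighted by its number of score pairs, and that the estimated g factor loading is the square root of that weighted average; neither formula is stated in this report, so both are assumptions. Because the table shows rounded correlations, small deviations from the published figures are to be expected.

```python
import math

# (n, r) pairs transcribed from the correlation table above, in table order.
PAIRS = [
    (11, 0.92), (4, 0.90), (6, 0.89), (6, 0.89), (6, 0.87), (7, 0.87),
    (17, 0.79), (6, 0.77), (4, 0.77), (6, 0.73), (5, 0.72), (11, 0.65),
    (7, 0.63), (8, 0.63), (6, 0.59), (4, 0.57), (9, 0.57), (9, 0.56),
    (7, 0.55), (6, 0.55), (9, 0.54), (7, 0.53), (10, 0.48), (5, 0.47),
    (5, 0.44), (9, 0.42), (5, 0.42), (7, 0.41), (9, 0.41), (9, 0.39),
    (5, 0.37), (6, 0.37), (7, 0.37), (7, 0.35), (14, 0.31), (9, 0.29),
    (5, 0.22), (4, 0.21), (6, 0.13), (6, -0.05), (4, -0.28),
]

total_n = sum(n for n, _ in PAIRS)

# Assumed weighting: each r counts in proportion to its number of score pairs.
weighted_mean = sum(n * r for n, r in PAIRS) / total_n

# Assumed relation between the weighted mean and the g loading estimate.
g_estimate = math.sqrt(weighted_mean)

print(total_n, round(weighted_mean, 3), round(g_estimate, 2))
```

With these inputs the total number of pairs comes to 293, and the two derived values land close to the reported 0.532 and 0.73.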

Estimated loadings of Test of Shock and Awe on particular item types

These are estimated g factor loadings, but computed against homogeneous tests (containing only particular item types) as opposed to non-compound heterogeneous tests. Although this tends to surprise the lay person, it is not uncommon for tests to have high loadings on item types they do not themselves contain. Such loadings reflect the empirical fact that most tests for mental abilities measure primarily g, regardless of their contents: the major part of test score variance is caused by g, and only a minor part by factors germane to particular item types. It is of key importance to understand that this is a fact of nature, a natural phenomenon, and not something that was built into the tests by the test constructors.

Type	n	g loading of Test of Shock and Awe on that type
Verbal	97	0.77
Numerical	18	0.63
Spatial	35	0.73
Logical	11	0.70
Heterogeneous	70	0.78

N = 231

Compound tests have been left out of this table to avoid overlap.

Balanced g loading = 0.72
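
The report does not say how the balanced g loading is computed. The sketch below only illustrates that the published value is consistent with an unweighted mean of the five type loadings, in which every item type counts equally regardless of its n; this reading is an assumption, and the n-weighted mean is shown for comparison.

```python
# Type -> (n, g loading), transcribed from the table above.
LOADINGS = {
    "Verbal": (97, 0.77),
    "Numerical": (18, 0.63),
    "Spatial": (35, 0.73),
    "Logical": (11, 0.70),
    "Heterogeneous": (70, 0.78),
}

total_n = sum(n for n, _ in LOADINGS.values())

# Assumed reading of "balanced": every item type gets the same weight.
balanced = sum(g for _, g in LOADINGS.values()) / len(LOADINGS)

# For comparison: each type weighted by its number of score pairs.
weighted = sum(n * g for n, g in LOADINGS.values()) / total_n

print(total_n, round(balanced, 2), round(weighted, 2))
```

The unweighted mean rounds to 0.72, while the n-weighted mean is higher because the Verbal and Heterogeneous types dominate the pair counts.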

National medians for Test of Shock and Awe

Country	n	median score
Finland	4	18.0
United States	7	17.0

For reasons of privacy, only countries with 3 or more candidates are included in this table. Ranking is based on the medians, and then alphabetical.

Correlation with national I.Q.'s of Test of Shock and Awe

Correlation of this test with national average I.Q.'s published by Lynn and Vanhanen, later Lynn and Becker:

r = 0.31 (n = 21)

Correlation of Test of Shock and Awe with personal details

Personalia	n	r
Educational level	18	0.66
Sex	23	0.09
Disorders (own)	19	0.07
Observed behaviour	7	0.06
Father's educational level	16	0.05
Disorders (parents and siblings)	18	-0.28
Mother's educational level	16	-0.33
Year of birth	23	-0.39
Gifted Adult's Inventory of Aspergerisms	9	-0.51

Estimated g factor loadings for restricted ranges

The number of score pairs on which each estimated g factor loading is based is given in parentheses. The goal is to test the hypothesis that g becomes less important, that is, accounts for a smaller proportion of the variance, at higher I.Q. levels. Restricting the range in this way in itself also depresses the g loading compared to computing it over the test's full range, so it is normal for these values to be lower than the test's full-range g loading.

Below 1st quartile	0.30 (49)
Below median	0.66 (148)
Above median	0.55 (160)
Above 3rd quartile	0.40 (90)
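
The depressing effect of range restriction mentioned above can be illustrated with a small simulation. This is a sketch on synthetic data only, not on the candidates' scores: it assumes two normally distributed variables with a fixed underlying correlation (`TRUE_R`, an illustrative value), and compares the correlation over the full range with the correlation within the lower and upper halves.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the run is repeatable
TRUE_R = 0.73           # illustrative value only, not an estimate

# Synthetic (x, y) pairs whose population correlation is TRUE_R.
pairs = []
for _ in range(20000):
    x = rng.gauss(0, 1)
    y = TRUE_R * x + math.sqrt(1 - TRUE_R ** 2) * rng.gauss(0, 1)
    pairs.append((x, y))

pairs.sort()            # order by x, which plays the role of the test score
half = len(pairs) // 2

full_r = pearson(*zip(*pairs))
lower_r = pearson(*zip(*pairs[:half]))   # below the median of x
upper_r = pearson(*zip(*pairs[half:]))   # above the median of x

print(round(full_r, 2), round(lower_r, 2), round(upper_r, 2))
```

In this simulation the relation between the two variables is the same everywhere, yet both half-range correlations come out lower than the full-range one. A lower value in a restricted range is therefore not by itself evidence that g matters less there; it is the comparison between the restricted ranges that is informative.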

Reliability

Split-half (odd-even) = 0.93
Split-half (1, 2… 5, 6… vs. 3, 4… 7, 8…) = 0.89
Cronbach's alpha = 0.87

Error

Standard error = 2.0 raw score points
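
The report does not state how the standard error is derived. A common formula from classical test theory is the standard deviation multiplied by the square root of one minus the reliability. The sketch below applies it under two assumptions: that the sample standard deviation of the 23 raw scores from the frequency table is used, and that Cronbach's alpha serves as the reliability coefficient.

```python
import math
import statistics

# Score -> number of candidates, transcribed from the frequency table of raw scores.
FREQUENCIES = {5: 1, 10: 1, 12: 1, 13: 1, 14: 1, 15: 2, 16: 1, 17: 2,
               20: 2, 21: 2, 22: 4, 24: 2, 25: 1, 26: 2}
scores = [s for s, count in FREQUENCIES.items() for _ in range(count)]

ALPHA = 0.87                     # Cronbach's alpha as reported above

sd = statistics.stdev(scores)    # sample SD (n - 1 denominator), an assumption
sem = sd * math.sqrt(1 - ALPHA)  # classical standard error of measurement

print(round(sd, 2), round(sem, 1))
```

Under these assumptions the result is close to 2.0 raw score points, consistent with the figure above, though other combinations of standard deviation and reliability coefficient may have been used.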

Scores by age

Age class	n	Median score
50 to 54	3	16.0
45 to 49	3	22.0
40 to 44	4	21.0
35 to 39	4	23.0
30 to 34	1	17.0
25 to 29	4	17.5
22 to 24	1	14.0
17	1	15.0
16	1	17.0
14	1	10.0

N = 23

Scores by year taken

Year taken	n	median score	protonorm
2003	7	21.0	433
2004	8	18.0	400
2005	6	20.5	424
2008	1	24.0	493
2012	1	5.0	262

r_{year taken × median score} = -0.71 (N = 23)

Robustness and overall test quality

Robustness by month = 0.81 (r_{raw scores × months} = -0.31)
Quality = 0.629

Item analysis

Item statistics are not published as that would help candidates. To detect bad items, answers and comments from candidates are studied, as well as, for each problem, the correlation with total score on the remaining problems (item-rest correlation) and the proportion of candidates getting it wrong (hardness of the item). Possible bad items are revised, replaced, or removed, possibly resulting in a revised version of the test.
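
The two item statistics described above can be sketched as follows. The response matrix is entirely hypothetical (actual item data are not published), and the names `RESPONSES` and `item_statistics` are introduced only for this illustration; items are assumed to be scored 1 for correct and 0 for wrong.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical responses: one row per candidate, one column per item (1 = correct).
RESPONSES = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 0, 1],
]

def item_statistics(responses):
    """Return a (hardness, item-rest correlation) pair for every item."""
    stats = []
    for i in range(len(responses[0])):
        item = [row[i] for row in responses]
        rest = [sum(row) - row[i] for row in responses]  # total on the remaining items
        hardness = 1 - sum(item) / len(item)             # proportion getting it wrong
        stats.append((hardness, pearson(item, rest)))
    return stats

stats = item_statistics(RESPONSES)
for number, (hardness, item_rest) in enumerate(stats, start=1):
    print(f"item {number}: hardness {hardness:.2f}, item-rest r {item_rest:.2f}")
```

An item with a low or negative item-rest correlation would be a candidate for closer inspection, since it does not rank candidates in the same way as the rest of the test.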

Correlations between sections (internal consistency versus profile information)

Verbal × Numerical

0.50

Ideal values for correlations between sections are around .5, a compromise between the test's ability to yield a "profile" and its ability to provide an indication of general intelligence. With too high a correlation (like .8 or higher), the sections measure basically the same thing, so there is almost no profile information in them; with too low a correlation (like .2 or lower), the sections are so different that there is little point in combining them into a measure of general intelligence.

Section frequency tables

Prop. = proportion of candidates outscored in this section. In parentheses is the proportion outscored for any possible scores higher than the present score but lower than the next-higher score in the table.

Verbal

Score	Prop.	# scores (* = 1 score)
3	0.022 (0.043)	*
4	0.087 (0.130)	**
6	0.196 (0.261)	***
9	0.326 (0.391)	***
11	0.413 (0.435)	*
12	0.522 (0.609)	****
13	0.674 (0.739)	***
14	0.761 (0.783)	*
15	0.826 (0.870)	**
16	0.891 (0.913)	*
17	0.957 (1.000)	**

Numerical

Score	Prop.	# scores (* = 1 score)
2	0.022 (0.043)	*
4	0.087 (0.130)	**
6	0.152 (0.174)	*
7.5	0.196 (0.217)	*
8	0.326 (0.435)	*****
9	0.717 (1.000)	*************
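
The proportions in the two section tables are consistent with counting every lower score in full and every equal score for one half. The report does not state this rule, so the sketch below treats it as an assumption; the function name `outscored_table` is hypothetical. The example reproduces the rows of the Numerical table from its score counts.

```python
def outscored_table(frequencies):
    """Return (score, prop, prop_between, count) rows from a score -> count mapping."""
    total = sum(frequencies.values())
    rows, below = [], 0
    for score in sorted(frequencies):
        count = frequencies[score]
        # Assumed rule: all lower scores plus half of the tied scores are outscored.
        prop = (below + count / 2) / total
        below += count
        # Value for unobserved scores between this row and the next one.
        prop_between = below / total
        rows.append((score, round(prop, 3), round(prop_between, 3), count))
    return rows

# Score -> number of candidates, transcribed from the Numerical table above.
NUMERICAL = {2: 1, 4: 2, 6: 1, 7.5: 1, 8: 5, 9: 13}

for score, prop, between, count in outscored_table(NUMERICAL):
    print(f"{score}\t{prop:.3f} ({between:.3f})\t{'*' * count}")
```

The same function applied to the counts of the Verbal table gives the proportions listed there as well.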
