Statistics of Female Intelligence Test

Remark on the norms

Above are the preliminary norms for this test. In hindsight, the test has no significant correlation with I.Q. test scores and does not measure g in this small sample; it even has a very slight negative g loading (meaning that higher scores correspond to lower I.Q.'s, but with a very shallow gradient). This may be a result of the test's construction: it consisted only of tasks for those ability types at which females are known to outscore males. Making a test female-friendly appears to impede g loading. Leaving out either or both of the two outlying scores (-152 and 120) does not significantly change the test's lack of correlation with I.Q. scores.

Scores on Female Intelligence Test as of 19 September 2019

Contents type: Verbal, numerical, spatial, speeded mental task, clerical accuracy, dexterity/motoric.
Period: 2003-2015

n: 10
Median: 58.5
Quartile deviation: 36.0
Range: 272
Maximum possible: 145
Minimum possible: -200
Male median: 65.0 (n = 6)
Female median: 32.5 (n = 4)
Resolution: 1.22

-152	*
-8	*
3	*
47	*
55	*
62	*
67	*
75	*
76	*
120	*
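The summary figures above can be recomputed from the ten listed scores. The sketch below assumes that the quartile deviation is half the distance between the medians of the lower and upper halves of the ordered scores; the report does not state its quartile method, but this convention reproduces the published value. The names `scores` and `quartile_deviation` are illustrative only.

```python
from statistics import median

# The ten raw scores as listed in the report
scores = [-152, -8, 3, 47, 55, 62, 67, 75, 76, 120]


def quartile_deviation(values):
    """Half the distance between the medians of the lower and upper halves.

    Assumed convention (not stated in the report); for an odd number of
    values the middle value belongs to neither half.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[len(ordered) - half:]
    return (median(upper) - median(lower)) / 2


summary = {
    "n": len(scores),
    "median": median(scores),
    "quartile_deviation": quartile_deviation(scores),
    "range": max(scores) - min(scores),
}
print(summary)
```

Other quartile conventions (for instance, interpolated percentiles) give a different quartile deviation for the same data, so the choice of method matters with so few scores.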

Note: This test was designed to be supervised and timed, with reference aids prohibited. In practice, though, it has always been administered unsupervised, so the possibility of fraud existed (for instance, using more time than reported, or using reference aids). Where relevant, score reports mention that the test was self-timed. Leaving out possibly fraudulent outliers does not significantly affect most of the statistics in this report.

Scores by males

n = 6

-152	*
47	*
55	*
75	*
76	*
120	*

Scores by females

n = 4

-8	*
3	*
62	*
67	*
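As a quick check, the medians by sex and the male/female participation ratio quoted elsewhere in this report follow directly from the two score lists. This is a minimal sketch; the variable names are illustrative.

```python
from statistics import median

# Scores as listed in the report, split by sex
male_scores = [-152, 47, 55, 75, 76, 120]
female_scores = [-8, 3, 62, 67]

male_median = median(male_scores)
female_median = median(female_scores)

# Number of male candidates per female candidate
participation_ratio = len(male_scores) / len(female_scores)

print(male_median, female_median, participation_ratio)
```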

Correlation of Female Intelligence Test with other tests by Paul Cooijmans

(Test index) Test name	n	r
(11) Isis Test	2	1.00
(16) Lieshout International Mesospheric Intelligence Test	2	1.00
(66) Test For Genius - Revision 2004	2	1.00
(28) The Test To End All Tests	3	0.97
(7) The Final Test	3	0.93
(10) Genius Association Test	3	0.92
(21) Psychometric Qrosswords	3	0.81
(35) Intelligence Quantifier by assessment	4	0.41
(26) Verbal section of Test For Genius - Revision 2004	3	0.03
(79) Association subtest of Long Test For Genius	3	-0.30
(15) Letters	3	-0.37
(41) The LAW - Letters And Words	3	-0.74
(29) Words	3	-0.94
(75) Analogies of Long Test For Genius	3	-0.97
(3) Qoymans Multiple-Choice #5	2	-1.00
(24) Reason - Revision 2008	2	-1.00
(40) Reason Behind Multiple-Choice - Revision 2008	2	-1.00
(53) Qoymans Multiple-Choice #3	2	-1.00
(56) Short Test For Genius	2	-1.00
(57) Space, Time, and Hyperspace	2	-1.00
(63) Long Test For Genius	2	-1.00
(68) Numbers	2	-1.00

Weighted average of correlations: -0.131 (weighted sum: -7.34)

Conservatively estimated minimum g loading: -0.36

Ranking in the above table is based on the unrounded correlations. All available data are present in this table; no tests are left out except for those with fewer than 2 score pairs. All known pairs are used, including possible floor/ceiling scores or outliers.
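The weighted average of correlations for the tests by Paul Cooijmans can be reproduced from the table by weighting each r by its number of score pairs n. The sketch below assumes that this is the weighting used (the figures agree) and works from the rounded correlations as printed, whereas the report uses unrounded values.

```python
# (n, r) pairs copied from the table of tests by Paul Cooijmans, in table order
pairs = [
    (2, 1.00), (2, 1.00), (2, 1.00),
    (3, 0.97), (3, 0.93), (3, 0.92), (3, 0.81),
    (4, 0.41), (3, 0.03),
    (3, -0.30), (3, -0.37), (3, -0.74), (3, -0.94), (3, -0.97),
] + [(2, -1.00)] * 8  # the eight tests with n = 2 and r = -1.00

# Weight each correlation by its number of score pairs
weighted_sum = sum(n * r for n, r in pairs)
total_pairs = sum(n for n, _ in pairs)
weighted_average = weighted_sum / total_pairs

print(round(weighted_sum, 2), total_pairs, round(weighted_average, 3))
```

Note that a correlation over only 2 score pairs is always exactly 1.00 or -1.00 (unless one variable is constant), so those rows add weight without adding much information.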

Correlation of Female Intelligence Test with tests by others

(Test index) Test name	n	r
(220) Cattell Culture Fair	2	1.00
(229) Mega Test	2	1.00
(236) International High IQ Society Miscellaneous tests	4	0.87
(242) Unknown and miscellaneous tests	6	-0.01
(235) Nonverbal Cognitive Performance Examination	3	-0.09
(231) Mysterium Entrance Exam	3	-0.43
(225) Logima Strictica 36	2	-1.00
(241) Ultra Test	2	-1.00

Weighted average of correlations: 0.079 (weighted sum: 1.89)

Please be aware that correlations with these external tests are in most cases affected (depressed, typically) by one or more of the following: (1) Little overlap with the object test because of the much lower ceilings and inherent ceiling effects of the tests used in regular psychology; (2) Candidates reporting scores selectively, for instance only the higher ones while withholding lower ones; (3) Candidates reporting, or having been reported by psychometricians, incorrect scores.

Estimated loadings of Female Intelligence Test on particular item types

These are estimated g factor loadings, but against homogeneous tests (containing only particular item types) as opposed to non-compound heterogeneous tests. Although tending to surprise the lay person, it is not uncommon for tests to have high loadings on item types they do not actually contain themselves. Such loadings reflect the empirical fact that most tests for mental abilities measure primarily g, regardless of their contents; that the major part of test score variance is caused by g, and only a minor part by factors germane to particular item types. It is of key importance to understand that this is a fact of nature, a natural phenomenon, and not something that was built into the tests by the test constructors.

Type	g loading of Female Intelligence Test on that type
Verbal	-0.30
Numerical	-1.00
Spatial	0.00
Logical	-1.00
Heterogeneous	0.00

Compound tests have been left out of this table to avoid overlap.

Balanced g loading = -0.46
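The balanced g loading coincides with the unweighted mean of the five type loadings in the table. That is an observation from the published figures, not a documented definition, so the sketch below should be read under that assumption.

```python
from statistics import mean

# Estimated loadings per item type, as printed in the table
type_loadings = {
    "Verbal": -0.30,
    "Numerical": -1.00,
    "Spatial": 0.00,
    "Logical": -1.00,
    "Heterogeneous": 0.00,
}

# Assumed definition: plain average, each type counted once
balanced_g = mean(list(type_loadings.values()))
print(round(balanced_g, 2))
```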

National medians for Female Intelligence Test

Country	n	median score
United_Kingdom	2	58.5
United_States	3	3.0

For reasons of privacy, only countries with 2 or more candidates are included in this table. Ranking is by median, then alphabetical.

Correlation with national I.Q.'s of Female Intelligence Test

Correlation of this test with national average I.Q.'s published by Lynn and Vanhanen:

r = 0.00 (n = 9)

Correlation of Female Intelligence Test with personal details

Personalia	n	r
Observed associative horizon	4	0.79
P.S.I.A. Cruel	3	0.57
P.S.I.A. System factor	3	0.43
P.S.I.A. Orderly	3	0.39
P.S.I.A. Just	3	0.20
Disorders (own)	7	0.15
Disorders (parents and siblings)	7	0.07
Sex	10	0.04
P.S.I.A. True	3	-0.02
Educational level	7	-0.07
P.S.I.A. Antisocial	3	-0.10
P.S.I.A. Rare	3	-0.15
P.S.I.A. Ethics factor	4	-0.17
Gifted Adult's Inventory of Aspergerisms	3	-0.27
P.S.I.A. Deviance factor	4	-0.29
P.S.I.A. Introverted	3	-0.37
P.S.I.A. Cold	3	-0.38
P.S.I.A. Extreme	3	-0.39
P.S.I.A. Aspergoid	3	-0.43
Year of birth	10	-0.45
P.S.I.A. Neurotic	3	-0.45
P.S.I.A. Rational	3	-0.46
Father's educational level	6	-0.80
Mother's educational level	6	-0.90
Observed behaviour	3	-0.92
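Of the rows above, the correlation with sex is the only one that can be recomputed from data shown in this report. The sketch below uses the point-biserial formula, which is equivalent to a Pearson correlation with a two-valued variable, and assumes that sex is coded male = 1, female = 0; the report does not state the coding, and the opposite coding would only flip the sign.

```python
from math import sqrt
from statistics import mean, pstdev

# Scores as listed in the report, split by sex
male_scores = [-152, 47, 55, 75, 76, 120]
female_scores = [-8, 3, 62, 67]
all_scores = male_scores + female_scores

# Proportions of the two groups
p_male = len(male_scores) / len(all_scores)
p_female = 1 - p_male

# Point-biserial r: difference of group means in units of the population
# standard deviation, scaled by sqrt(p * q). Assumed coding: male = 1.
r_sex = (
    (mean(male_scores) - mean(female_scores))
    / pstdev(all_scores)
    * sqrt(p_male * p_female)
)
print(round(r_sex, 2))
```

Under this coding the small positive value reflects the slightly higher male mean, which is pulled down heavily by the single score of -152.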

Estimated g factor loadings upward and downward of particular scores

The number in parentheses is the number of score pairs on which that estimated g factor loading is based. The goal is to test the hypothesis that g becomes less important, that is, accounts for a smaller proportion of the variance, at higher I.Q. levels. Merely restricting the range like this also depresses the g loading compared to computing it over the test's full range, so it would be normal for both values to be lower than the test's full-range g loading.

Raw score	Upward g (n)	Downward g (n)
-200	-0.36 (56)	NaN (0)
22.5	-0.69 (27)	-1.00 (14)
58.5	0.00 (4)	-0.25 (37)
145	NaN (0)	-0.36 (56)

Reliability

Split-half (odd-even) = 0.97
Split-half (other division) = 0.10

This low reliability excludes sizeable correlations of the test with any other variable.

(Cronbach's alpha cannot be meaningfully computed for this test because the items (sections) give scores outside the range of 0 to 1.)

Error

Standard error = 49.7 raw score points

This huge error implies that scores on this test are almost meaningless.

Scores by age

Age class	n	median score
50 to 54	2	32.5
45 to 49	1	55.0
40 to 44	1	67.0
35 to 39	2	61.0
25 to 29	2	34.0
18 or 19	2	-16.0

Scores by year taken

Year taken	n	median score
2003	5	67.0
2004	1	-8.0
2006	1	62.0
2010	1	120.0
2011	1	-152.0
2015	1	3.0

r_{year taken × median score} = -0.29 (n = 10)
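The correlation between year taken and median score can be reproduced from the six rows of the table with an ordinary, unweighted Pearson correlation. This is a sketch under the assumption that each year counts once regardless of how many candidates took the test that year; on that reading, the reported n = 10 refers to the number of candidates rather than the number of rows. The helper `pearson_r` is illustrative.

```python
from math import sqrt

# (year taken, median score) rows as printed in the table
year_medians = [
    (2003, 67.0),
    (2004, -8.0),
    (2006, 62.0),
    (2010, 120.0),
    (2011, -152.0),
    (2015, 3.0),
]


def pearson_r(xs, ys):
    """Plain Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


years = [year for year, _ in year_medians]
medians = [med for _, med in year_medians]
r_year = pearson_r(years, medians)
print(round(r_year, 2))
```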

Robustness and overall test quality

Robustness by chronological rank = 0.85
Robustness by month = 0.84 (r_{raw scores × months} = -0.41)
Quality = 0.642

The fact that overall quality is more or less reasonable despite the test's inability to measure intelligence is due to the very high values for resolution and robustness; used for admission, this test has caused no inflation of qualifiers. If one wanted to admit candidates based on some sort of mental test but without the under-representation of women that one sees in the high range, an approach like this might work, and in fact some television games and quizzes seem to purposely subject candidates to g-deprived tasks like these in order to have more females among their finalists and winners. Do notice that the male/female participation ratio is 1.5 on this test, while it is otherwise about 11 on high-range tests, so these tasks do draw more female candidates, and also seem to allow for a more equal male/female ratio beyond pass levels like the 99.9th centile.

Correlations of sections with total score

Section	r
1. Word fluency	.87
2. Arithmetic (mental)	.07
3. Reading (answering questions about a text)	.04
4. Writing (hand-copying)	.95
5. Spelling and grammar (correcting errors)	.94
6. Searching letters (clerical checking)	.06
7. Matching figures (finding matching figures)	-.01
8. Drawing (dexterity)	.15
9. Maze (dexterity)	.14
10. Speed (reverse of time used)	.46

Some information on scores per section

1. Word fluency

Ranges from 38 to 63. The large range of scores in this section probably explains the high correlation with total score.

2. Arithmetic

Ranges from 43 to 48.

3. Reading

Ranges from 8 to 10.

4. Writing

Ranges from -133 to 0. The large range of scores in this section probably explains the very high correlation with total score.

5. Spelling and grammar

Ranges from -54 to 14. The large range of scores in this section probably explains the very high correlation with total score.

6. Searching letters

Ranges from -10 to 0.

7. Matching figures

Ranges from 1 to 10.

8. Drawing

Ranges from -2 to 0.

9. Maze

Ranges from -11 to 0.

10. Speed

Ranges from -90 to -21. The optimum seems to be just over -60 (so just under one hour). The shortest realistic time is about 50 minutes (-50).