To facilitate statistical processing, data from high-range I.Q. tests need to be stored in a structured way. Below, an example of such a structure is described in general terms.
The tests themselves, their correct answers, and the programs that need to be written to combine and process data from these sections are not considered here.
| | Section 1. | Section 2. | Section 3. | Section 4. | Section 5. |
|---|---|---|---|---|---|
| # Units | 1 | 1 | 1 for each test | 1 for each test | 1 |
| # Records | 1 for each test or personal datum | 1 for each candidate | 1 for each test submission | 1 for each possible score | 1 for each norm |
| Fields | Contain descriptive information on the test or personal datum | Contain personal information and test scores of the candidate | Contain (personal) information on that test submission, and the submission's item scores | Contain the norm for that score | Contain corresponding values on other scales for that norm |
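The layout in the table can be sketched in code. The following is only an illustrative outline in Python, not a prescribed implementation: all class names, field names, and example values (`FieldDescription`, `Candidate`, `Submission`, `Database`, "Test A", the norm values) are hypothetical and chosen for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class FieldDescription:
    """Section 1: one record per test or personal datum."""
    name: str
    description: str = ""


@dataclass
class Candidate:
    """Section 2: one record per candidate."""
    candidate_id: int
    personal: dict = field(default_factory=dict)
    tests: dict = field(default_factory=dict)  # test name -> score (or True if only "taken")


@dataclass
class Submission:
    """Section 3: one record per test submission."""
    candidate_id: int
    item_scores: list = field(default_factory=list)


@dataclass
class Database:
    descriptions: dict = field(default_factory=dict)  # Section 1: a single unit
    candidates: dict = field(default_factory=dict)    # Section 2: a single unit
    submissions: dict = field(default_factory=dict)   # Section 3: test name -> list of Submission
    score_norms: dict = field(default_factory=dict)   # Section 4: test name -> {raw score: norm}
    norm_scales: dict = field(default_factory=dict)   # Section 5: norm -> {scale name: value}


# Invented example data, for illustration only.
db = Database()
db.descriptions["Test A"] = FieldDescription("Test A", "hypothetical test")
db.candidates[1] = Candidate(candidate_id=1, tests={"Test A": 2})
db.submissions["Test A"] = [Submission(candidate_id=1, item_scores=[1, 0, 1])]
db.score_norms["Test A"] = {0: 0.0, 1: 0.5, 2: 1.0, 3: 1.5}
```

Sections 3 and 4 are keyed by test name here, which mirrors the "1 unit for each test" entries in the table.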
For each field (test or personal datum) that occurs in the Candidate records (2.), this section (1.) has a record containing descriptive information on that field, needed when producing statistical reports. Examples of data fields of the records of this section:
These fields are only filled in where applicable; some records may not need all of these fields.
For each test candidate that occurs in the Test submission records (3.), this section (2.) has a record containing fields with personal information and a field for each test that has been taken. The latter fields indicate at least whether the test has been taken, but may also contain the score instead (even though the score is redundant, as it can also be fetched from the Test submission records, where it is in turn redundant as well). The Candidate records theoretically have several hundred fields (because that many tests exist that may have been taken), but in most records no more than a few dozen fields are occupied (because most candidates have taken no more than a few dozen tests).
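Because most of the several hundred possible test fields are empty in any one record, a sparse representation is one way to store a Candidate record. The sketch below is an assumption about how this could be done, not a description of an existing program; the names `ALL_TESTS`, `has_taken`, and `expand`, and all values, are invented.

```python
# Hypothetical list of known tests; a real list would have several hundred entries.
ALL_TESTS = ["Test A", "Test B", "Test C", "Test D"]

# Sparse candidate record: only the tests actually taken are stored.
candidate = {
    "candidate_id": 17,
    "scores": {"Test B": 23, "Test D": 9},
}


def has_taken(record, test):
    """Return True if the candidate record has an entry for this test."""
    return test in record["scores"]


def expand(record, all_tests):
    """Return the dense view: one field per known test, None where not taken."""
    return {test: record["scores"].get(test) for test in all_tests}
```

The dense view produced by `expand` corresponds to the "one field per test" layout, while the stored record only occupies space for the tests taken.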
Redundant information like the test scores may be included to speed up processing. Statistical computations tend to be complex and require large numbers of simple calculations to be performed recursively. If basic data like scores have to be computed dynamically whenever needed, this may slow down the most complex computations greatly.
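As a minimal sketch of this trade-off, the following stores a submission's raw score next to its item scores so that it is computed only once. It assumes, purely for illustration, that the raw score is the sum of the item scores; the record layout and the function name `raw_score` are hypothetical.

```python
# Invented submission record; "score" is the redundant, stored raw score.
submission = {"candidate_id": 17, "item_scores": [1, 0, 1, 1, 0, 1], "score": None}


def raw_score(sub):
    """Return the stored raw score, computing and storing it on first use.

    ASSUMPTION: the raw score is the sum of the item scores.
    """
    if sub["score"] is None:
        sub["score"] = sum(sub["item_scores"])
    return sub["score"]
```

After the first call, later calls read the stored value instead of re-summing the items, which is the kind of saving that matters when the value is needed many times inside a larger computation.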
Caution: Candidate records are privacy-sensitive, and this cannot be resolved by anonymizing the records, because a candidate's combination of scores on a number of tests is as unique and identifying as a fingerprint. Three scores, and probably even two, suffice to uniquely identify a candidate, even in the absence of any personal information at all.
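The identifying power of score combinations can be checked on one's own data. The sketch below uses a small invented data set and a hypothetical helper, `unique_fraction`, to measure what fraction of records is singled out by a given set of tests; it illustrates the idea and proves nothing about real data.

```python
from collections import Counter

# Invented, anonymized records: test name -> raw score, no personal data.
records = [
    {"Test A": 12, "Test B": 30, "Test C": 7},
    {"Test A": 12, "Test B": 28, "Test C": 7},
    {"Test A": 15, "Test B": 30, "Test C": 9},
    {"Test A": 15, "Test B": 28, "Test C": 7},
]


def unique_fraction(records, tests):
    """Fraction of records whose scores on `tests` match no other record.

    Only records that contain all of the given tests are considered.
    """
    keys = [tuple(r[t] for t in tests) for r in records if all(t in r for t in tests)]
    if not keys:
        return 0.0
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(keys)
```

In this invented data set, no record is unique on "Test A" alone, but every record is unique on the pair "Test A" and "Test B".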
This is a complex section which contains a unit (equivalent to a table) for each test. Each unit therefore corresponds to a record in Section 1, a unit in Section 4, and a field in Section 2. Each unit contains a record for each submission to that test. These records contain fields such as:
This complex section contains a unit for each test. Such a unit contains a record for every possible raw score on the test in question, providing a norm, either from a table or by means of a formula, or a combination thereof. Every norm corresponds to a record in Section 5.
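A unit of this kind could be built as follows. This is a sketch under stated assumptions: raw scores are taken to be the integers from 0 to a maximum, table entries are given precedence over the formula, and the function name `build_norm_unit` and all norm values are invented.

```python
def build_norm_unit(max_score, table=None, formula=None):
    """Return {raw score: norm} for every raw score from 0 to max_score.

    ASSUMPTIONS: raw scores are integers 0..max_score; an entry in `table`
    takes precedence, and `formula` fills in the scores the table lacks.
    """
    table = table or {}
    unit = {}
    for raw in range(max_score + 1):
        if raw in table:
            unit[raw] = table[raw]
        elif formula is not None:
            unit[raw] = formula(raw)
        else:
            raise ValueError(f"no norm available for raw score {raw}")
    return unit


# Invented example: tabulated norms at the extremes, a formula in between.
unit = build_norm_unit(5, table={0: 0.0, 5: 3.0}, formula=lambda raw: 0.5 * raw)
```

Raising an error when neither source covers a score keeps the "record for every possible raw score" property explicit rather than silently leaving gaps.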
This section contains a single unit that holds, for each norm (whichever type of norm one uses), a record containing that norm and the corresponding values on a few other scales, such as proportions outscored with regard to certain populations, I.Q., and so on.
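How such a record might be filled depends entirely on the type of norm used. Purely as an illustration, the sketch below assumes that the norm is a z-score, that the reference population is normally distributed, and that I.Q. is expressed with mean 100 and standard deviation 15. None of these assumptions comes from the text above, and the function name `scale_record` is hypothetical.

```python
from statistics import NormalDist


def scale_record(z, iq_sd=15.0):
    """Build one Section 5 record for a norm expressed as a z-score.

    ASSUMPTIONS: normal distribution in the reference population;
    I.Q. scale with mean 100 and standard deviation `iq_sd`.
    """
    return {
        "norm": z,
        "proportion_outscored": NormalDist().cdf(z),
        "iq": 100.0 + iq_sd * z,
    }


# One record per norm value; the norm values here are invented.
norm_scales = {z: scale_record(z) for z in (0.0, 1.0, 2.0)}
```

With a different norm type, or with empirically tabulated proportions, the record would simply be looked up rather than computed.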