Item scoring recommendations

© Paul Cooijmans

On item scoring

When creating and scoring psychometric tests, one should strive to make items that are atomic and can be scored dichotomously, typically with 0 for wrong and 1 for right. In practice, this means that complex problems with multiple aspects have to be split up into several atomic items that are scored separately. This has the following advantages:

  1. The respective aspects of complex problems can be item-analysed separately, giving insight into their functioning;
  2. Complex problems with multiple aspects naturally receive greater weight in the test due to the separate scoring of their aspects;
  3. Test reliability can be computed using Cronbach's alpha, which is considered the best indicator of reliability.
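
As an illustration of splitting a complex problem into atomic items, here is a minimal Python sketch. The problem number, the item labels "7a" and "7b", the answer key, and the function name `score_atomic` are all hypothetical and serve only to show the principle: each aspect of the problem becomes its own item with a score of 0 or 1.

```python
def score_atomic(answers, key):
    """Return one dichotomous score (0 or 1) per atomic item, in key order."""
    return [1 if answers.get(item) == correct else 0
            for item, correct in key.items()]

# Hypothetical problem 7 asks for the next two numbers of a sequence.
# Instead of one item worth 0, 1 or 2 points, it is split into two atomic items.
key = {"7a": 13, "7b": 21}
answers = {"7a": 13, "7b": 20}   # candidate has the first aspect right only

item_scores = score_atomic(answers, key)   # one 0/1 entry per atomic item
raw_score = sum(item_scores)               # the problem contributes up to 2 points
```

Because each aspect is a separate item, each can be item-analysed on its own, and the problem as a whole still carries the weight of two items.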

To further explain the last point: Cronbach's alpha requires uniform scoring of all items (such as from 0 to 1) and becomes meaningless when some items are scored differently from others. Cronbach's alpha may even exceed 1 in the latter case. In the ideal case that all items are strictly dichotomous (0 or 1) and none give fractional values, a simplified formula for Cronbach's alpha can be used that gives the same result but is mathematically simpler (one of the "Kuder-Richardson" formulas).
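
The equivalence of the two formulas for dichotomous items can be checked with a small Python sketch. It assumes the usual textbook definitions: alpha computed from the item variances and the variance of the total scores, and the Kuder-Richardson formula commonly numbered 20, in which each item variance is replaced by p(1 - p), with p the proportion of candidates answering the item correctly. Population variances are used throughout; the score matrix is made up for illustration and the function names are not taken from any particular library.

```python
from statistics import pvariance

def cronbach_alpha(scores):
    """scores: one row per candidate, one column per item."""
    k = len(scores[0])
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def kr20(scores):
    """Kuder-Richardson formula 20; only valid for strictly 0/1 items."""
    n = len(scores)
    k = len(scores[0])
    # Proportion of candidates answering each item correctly
    p = [sum(row[i] for row in scores) / n for i in range(k)]
    pq_sum = sum(pi * (1 - pi) for pi in p)
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - pq_sum / total_var)

# Made-up data: five candidates, four dichotomous items
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

alpha = cronbach_alpha(scores)
kr = kr20(scores)
```

For these data both functions return 0.8, apart from floating-point rounding, because the population variance of a 0/1 item equals p(1 - p).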

For tests with non-uniform item scoring, that is, where different items yield different amounts of credit, Cronbach's alpha cannot be used, and reliability is computed using the split-half method instead, which is robust against non-uniform and non-dichotomous scoring.
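
A minimal Python sketch of one common form of the split-half method follows. It rests on assumptions not stated above: the test is split into odd- and even-numbered items, the two half scores are correlated using Pearson's r, and the correlation is stepped up to full test length with the Spearman-Brown formula. Other ways of splitting are possible, and the data and function names are made up for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def split_half_reliability(scores):
    """Odd-even split (an assumed choice), corrected with Spearman-Brown."""
    half_a = [sum(row[0::2]) for row in scores]   # 1st, 3rd, ... item
    half_b = [sum(row[1::2]) for row in scores]   # 2nd, 4th, ... item
    r = pearson(half_a, half_b)
    return 2 * r / (1 + r)

# Made-up data: five candidates, four items worth up to 1, 3, 2 and 5 points
scores = [
    [1, 3, 2, 5],
    [1, 2, 2, 3],
    [0, 2, 1, 3],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]

reliability = split_half_reliability(scores)
```

Since only the two half scores enter the computation, it does not matter that the individual items carry different amounts of credit.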

