Item scoring recommendations

© Paul Cooijmans

On item scoring

When creating and scoring psychometric tests, one should strive to make items that are atomic and can be scored dichotomously, typically with 0 for wrong and 1 for right. In practice, this means that complex problems that have multiple aspects have to be split up into several atomic items that are scored separately. This has the following advantages:

  1. The respective aspects of complex problems can be item-analysed separately, giving insight into their functioning;
  2. Complex problems with multiple aspects naturally receive greater weight in the test due to the separate scoring of their aspects;
  3. Test reliability can be computed using Cronbach's alpha, which is considered the best indicator of reliability.
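As a minimal sketch of what splitting a complex problem into atomic items can look like in practice, the Python fragment below scores a problem with three aspects as three separate dichotomous items. The item labels ("7a", "7b", "7c"), the answer key, and the function name score_items are all hypothetical and serve only as illustration.

```python
# Hypothetical answer key: problem 7 has three aspects, scored as items 7a, 7b, 7c.
answer_key = {"6": "B", "7a": "12", "7b": "triangle", "7c": "red", "8": "D"}


def score_items(answers, key):
    """Return a 0/1 score per atomic item (1 = right, 0 = wrong or missing)."""
    return {item: int(answers.get(item) == correct) for item, correct in key.items()}


# Hypothetical candidate who gets two of the three aspects of problem 7 right.
candidate = {"6": "B", "7a": "12", "7b": "square", "7c": "red", "8": "A"}

item_scores = score_items(candidate, answer_key)
raw_score = sum(item_scores.values())
print(item_scores)
print(raw_score)
```

Because problem 7 contributes three items to the raw score, it carries three times the weight of a single-aspect problem without any special weighting rule, and each aspect can be item-analysed on its own.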

To further explain the last point: Cronbach's alpha requires uniform scoring of all items (such as from 0 to 1) and becomes meaningless when some items are scored differently from others. Cronbach's alpha may even exceed 1 in the latter case. In the ideal case that all items are strictly dichotomous (0 or 1) and none give fractional values, a simplified formula can be used that gives the same result as Cronbach's alpha but is mathematically simpler (Kuder-Richardson formula 20, one of the "Kuder-Richardson" formulas).
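The relation between the two formulas can be sketched in Python as follows. This is an illustration under stated assumptions, not a reference implementation: the function names and the small data matrix are hypothetical, and population variances are used throughout (mixing population and sample variances would make the two results differ slightly).

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """scores: one row per test-taker, one column per item."""
    k = len(scores[0])
    item_vars = [pvariance([row[i] for row in scores]) for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def kr20(scores):
    """Kuder-Richardson formula 20; valid only for strictly 0/1 items."""
    n = len(scores)
    k = len(scores[0])
    # p is the proportion of test-takers answering each item correctly.
    p = [sum(row[i] for row in scores) / n for i in range(k)]
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(pi * (1 - pi) for pi in p) / total_var)


# Hypothetical data: 5 test-takers, 4 dichotomous items.
data = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

print(round(cronbach_alpha(data), 3))  # 0.8
print(round(kr20(data), 3))            # 0.8
```

For a 0/1 item, the population variance equals p(1 - p), which is why the simplified formula reproduces the general one on dichotomous data.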

For tests with non-uniform item scoring, that is, where different items yield different amounts of credit, Cronbach's alpha cannot be used, and reliability is computed using the split-half method instead, which is robust against non-uniform and non-dichotomous scoring.
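A split-half computation might look like the Python sketch below. The text does not say how the test is split or whether a correction is applied, so two assumptions are made here: an odd-even split of the items, and the Spearman-Brown correction to estimate full-length reliability from the correlation between the halves. The function names and the data are hypothetical.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov / (ss_x * ss_y) ** 0.5


def split_half_reliability(scores):
    """Odd-even split (an assumption), followed by Spearman-Brown correction."""
    half_a = [sum(row[0::2]) for row in scores]  # 1st, 3rd, 5th, ... items
    half_b = [sum(row[1::2]) for row in scores]  # 2nd, 4th, 6th, ... items
    r = pearson_r(half_a, half_b)
    return 2 * r / (1 + r)


# Hypothetical data: items worth at most 1, 2, 1 and 3 points respectively.
mixed = [
    [1, 2, 1, 3],
    [1, 2, 1, 1],
    [1, 1, 0, 2],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

print(round(split_half_reliability(mixed), 3))
```

The method only correlates the two half-test totals, so it places no requirement on how individual items are scored. The result does depend on which split is chosen.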

