Classical analysis of item scores

16 important questions on Classical analysis of item scores

When you plot the distribution of the scores of the total sample on a given item with a scaled answer mode, which concepts describe this distribution?

  • location
  • dispersion
  • shape

What do we call the place on the scale where the distribution of the item scores is centered?

the location

What do we call the scatter of the item scores in a distribution?

the dispersion

What can be determined using the location of the item score distribution?

  • the classical item difficulty, in maximum performance tests, or
  • the classical item attractiveness, in typical performance tests

How do you calculate the mean of a dichotomously scored item, and how is this expressed?

  • sum all the scores and divide by the size of the total sample
  • the mean is written as p, the proportion of test takers who answered the item correctly
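
As a minimal sketch of the calculation above, assuming an item scored 1 (correct) or 0 (incorrect): the function name `item_mean` and the response data are hypothetical and only serve as illustration.

```python
def item_mean(scores):
    """Mean of one item's scores: sum of the scores divided by the sample size."""
    return sum(scores) / len(scores)

# Hypothetical responses of 10 test takers to one item (1 = correct, 0 = incorrect).
responses = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]

# For a dichotomous item the mean equals p, the proportion answering correctly.
p = item_mean(responses)
print(p)  # 0.7
```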

What are the steps of item analysis?

  • lower bound of the test reliability is determined
  • items are rewritten or removed
  • stop when a criterion is reached, for instance:
      – a certain reliability (e.g., 0.8)
      – a certain number of items (e.g., 25)
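
The steps above can be sketched as a simple loop. This is one possible reading, not a prescribed procedure, and it rests on several assumptions: Cronbach's alpha is used as the lower bound of the reliability, items are only removed (never rewritten), and the item removed at each step is the one whose removal raises alpha the most. The names `cronbach_alpha`, `item_analysis`, `target_alpha` and `min_items`, as well as the data, are hypothetical.

```python
from statistics import pvariance

def cronbach_alpha(data):
    """Cronbach's alpha for rows of item scores (one row per test taker)."""
    k = len(data[0])
    item_vars = [pvariance([row[j] for row in data]) for j in range(k)]
    total_var = pvariance([sum(row) for row in data])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def item_analysis(data, target_alpha=0.8, min_items=2):
    """Remove items one at a time until a stopping criterion is reached."""
    items = list(range(len(data[0])))

    def alpha_for(cols):
        return cronbach_alpha([[row[j] for j in cols] for row in data])

    alpha = alpha_for(items)
    # Stop at the target reliability or at the minimum number of items.
    while alpha < target_alpha and len(items) > min_items:
        # Alpha that would result from removing each remaining item.
        trial = {j: alpha_for([i for i in items if i != j]) for j in items}
        worst = max(trial, key=trial.get)
        if trial[worst] <= alpha:
            break  # no single removal improves the estimate
        items.remove(worst)
        alpha = trial[worst]
    return items, alpha

# Hypothetical data: 6 test takers, 4 dichotomous items; the last item is noisy.
data = [
    [1, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

kept, alpha = item_analysis(data)
print(kept, round(alpha, 2))  # [0, 1, 2] 0.84
```

In this toy data set alpha starts at roughly 0.58 and passes the 0.8 criterion once the fourth item is dropped.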

In a maximum performance test, what do we call items that many test takers answer incorrectly?

difficult items

In a typical performance test, what do we call items on which many test takers give a low response?

unattractive items

How do we determine the item difficulty/attractiveness?

the item difficulty and the item attractiveness are equal to the item mean

When creating a test, there are two sets of guidelines. Describe them.

  • General measurement instrument:
      – "something for everyone"
      – i.e., include a proportional number of easy/difficult or attractive/unattractive items
  • Instrument for cut-off decisions (e.g., hiring a new employee):
      – only consider items with a difficulty/attractiveness that is relevant for the required decision
      – e.g., if selecting a manager, don't ask simple arithmetic questions
      – e.g., if selecting highly depressed subjects, don't ask items that are too attractive ("I sometimes feel sad")

What results in high test reliability?

  • Large item correlations result in high reliability
  • Items with larger variances contribute more to the reliability

What do we call the concept that describes how well a given item can distinguish between people who differ on the underlying construct, i.e., how well a given item can predict the construct?

item discrimination

What does it mean when items discriminate well?

They can distinguish well between people that differ on the underlying construct

When using the item-test correlation, why is the correlation biased upwards?

  • because the test score includes the item itself, you are partly correlating the item with itself, so even bad items get some positive correlation

How can you correct the upward bias of the item-test correlation?

use the item-rest correlation: correlate the item with the sum of the remaining items
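
A small sketch may make the difference concrete. It assumes the test score is the plain sum of the item scores and uses the Pearson correlation; the helper names `pearson` and `item_test_and_item_rest` and the data are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def item_test_and_item_rest(data, j):
    """Return (item-test, item-rest) correlations for item j."""
    item = [row[j] for row in data]
    total = [sum(row) for row in data]           # test score, includes item j
    rest = [t - i for t, i in zip(total, item)]  # test score without item j
    return pearson(item, total), pearson(item, rest)

# Hypothetical data: 6 test takers, 4 dichotomous items; the last item is noisy.
data = [
    [1, 1, 1, 0],
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]

r_test, r_rest = item_test_and_item_rest(data, 3)
print(round(r_test, 2), round(r_rest, 2))  # 0.26 -0.13
```

For the noisy item the item-test correlation is still positive, whereas the item-rest correlation turns out slightly negative once the item's own contribution is taken out of the total.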

Why are items with a high variance more useful than items with a low variance?

items with a high variance distinguish better between test takers who differ in ability/behavior.
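
For a dichotomously scored item the variance follows directly from the proportion correct: it equals p(1 - p), which is largest at p = 0.5 and shrinks as an item becomes very easy or very difficult. The sketch below illustrates this; the function name `dichotomous_variance` and the example item are hypothetical.

```python
from statistics import pvariance

def dichotomous_variance(p):
    """Variance of a 0/1 item whose proportion correct is p."""
    return p * (1 - p)

# A very difficult, a medium and a very easy item.
for p in (0.1, 0.5, 0.9):
    print(p, round(dichotomous_variance(p), 2))

# Check against the variance computed from hypothetical raw scores (p = 0.9).
easy_item = [1] * 9 + [0]
print(round(pvariance(easy_item), 2))  # 0.09
```

An item that nearly everyone answers the same way has little variance and therefore gives little information about differences between test takers.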
