European Survey Research Association
 

ESRA2009

Warsaw 2009: Presentations and short courses


IRT models in the assessment of the rater effect – the case of Polish external examinations

Session: IRT: Item Response Theory in Survey Methodology (II)

Authors:

  • Henryk Szaleniec; Centralna Komisja Egzaminacyjna, Poland
  • Dorota Weziak-Bialowolska; Warsaw School of Economics, Poland

Abstract:

The aim of the paper is to present the Polish experience in assessing the rater effect in external examinations. The rater effect was measured by (1) examining the dispersion of rater severity/leniency and (2) checking whether the size of the rater effect was correlated with selected characteristics of the raters.
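As a rough illustration of the two measures named above, the sketch below computes the spread of rater severity estimates and their correlation with one rater characteristic. The severity values, the characteristic (years of marking experience), and the function names are all hypothetical and invented for this example; they are not taken from the study.

```python
import math


def sample_sd(values):
    """Sample standard deviation: one simple summary of dispersion."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def pearson_r(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical rater severity estimates in logits (positive = more severe).
severity = [-0.4, 0.1, 0.3, -0.2, 0.2]
# Hypothetical rater characteristic: years of marking experience.
experience = [2, 10, 15, 5, 8]

print("dispersion of severity (SD):", sample_sd(severity))
print("correlation with experience:", pearson_r(severity, experience))
```

A larger standard deviation would indicate that raters differ more in severity; the correlation only describes a linear association with the chosen characteristic.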
An IRT model, namely the Many-Facet Rasch Model, was used to explore and measure the rater effect. First, the difficulty of the tasks was estimated from real data: a 10% sample of the whole population of students. Then, with the task difficulties anchored, the abilities of the students and the severity levels of the raters were estimated. The data came from a purpose-designed study launched by the Polish Central Examination Office in 2008. The examiners rated real exam sheets in two separate modes: once at home, on the paper version of the exam, and once in the examination center, using a tailor-made rating system (e-marking). Each task/exam was rated at least five times.
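For readers unfamiliar with the model, the sketch below shows how a Many-Facet Rasch Model in its common rating-scale form combines student ability, task difficulty, and rater severity into score probabilities. The abstract does not state which parameterization was used, so the rating-scale form, the function names, and all numeric values here are assumptions made for illustration only.

```python
import math


def mfrm_category_probs(ability, difficulty, severity, thresholds):
    """Probabilities of scores 0..K for one student, task, and rater.

    Assumed rating-scale form of the many-facet Rasch model:
    log(P_k / P_(k-1)) = ability - difficulty - severity - thresholds[k-1]
    All parameters are in logits.
    """
    eta = ability - difficulty - severity
    # Log-numerator of each category; score 0 is the reference (0.0).
    log_num = [0.0]
    for tau in thresholds:
        log_num.append(log_num[-1] + eta - tau)
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_num)
    weights = [math.exp(v - top) for v in log_num]
    total = sum(weights)
    return [w / total for w in weights]


def expected_score(ability, difficulty, severity, thresholds):
    """Model-expected score for one student, task, and rater."""
    probs = mfrm_category_probs(ability, difficulty, severity, thresholds)
    return sum(k * p for k, p in enumerate(probs))


# Hypothetical values: one task scored 0..3, difficulty treated as anchored.
anchored_difficulty = 0.0
thresholds = [-1.0, 0.0, 1.0]
student_ability = 0.5

lenient = expected_score(student_ability, anchored_difficulty, -0.5, thresholds)
severe = expected_score(student_ability, anchored_difficulty, 0.5, thresholds)
print("expected score, lenient rater:", lenient)
print("expected score, severe rater:", severe)
```

With the task difficulty held fixed, as in the anchoring step described above, the same student performance receives a lower expected score from a more severe rater, which is the effect the model is meant to isolate. The sketch only evaluates the model for given parameters; estimating them from ratings requires dedicated software.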
The aim of that research was to test the hypothesis that the mode of rating (paper exam versus e-marking) influences rater severity/leniency. It was also examined whether the mode of rating affects the results.
Additionally, it was examined whether the rater effect is related to the aptitude towards change (novelty). This aptitude was measured with several 5-point Likert-scale statements and quantified by confirmatory factor analysis.