A group of 91 students second bachelor of the University of Hasselt in the track physiotherapy had the following task at the end of this year:
– They had to formulate a clinical research question based on their experience as a physiotherapist;
– Then they searched for a relevant scientific paper and formulated an answer to the research question based on the article.
– At last they had to evaluate the article and point out the strengths and weaknesses of their study.
Normally all these papers are evaluated by one or two tutors. These tutors judge the paper by giving a ‘passed’ or ‘failed’ and provided feedback. You can imagine, this results in a substantial workload, especially when more than one task per student needs to be marked.
The tutor was inspired by a presentation about the D-PAC project. At first, the tutor was a bit skeptic. However, the possibilities of the tool were tempting enough to conduct an experiment in which peers would have to judge and comment the papers using the D-PAC tool. Next to this, the tutors evaluate the papers on their traditional manner. Afterwards the judgments and feedback of the students could be compared with the judgement and the feedback of the tutors.
Based on the pairwise comparison data we calculated the Scale Separation Reliability (SSR) for the student evaluations. The SSR was .80 and can be seen as a very reliable scale. To achieve this, 91 students had made 910 comparisons in total, in other words, every paper was compared 20 times.
The feedback students provided was of high quality. The results of a survey conducted by the students supported this statement. Students perceived the D-PAC peer feedback as relevant, honest and legitimate. Because almost every assessor gave feedback on almost every paper they had to compare, each student received feedback of 15 à 20 peers. Students indicated this as an added value of the D-PAC method.
If we compare outcomes of the students’ assessment and the tutors’ pass/fail decisions, we see a high resemblance. As Figure 1 shows, 12 students were given a fail by the tutor (red dots) and they all are located on the left side of the rank order. We can conclude that students can, by using pairwise comparison, evaluate their peers papers as good as tutors on their traditional manner.
However, as you can see, some blue dots remain on the left hand side, meaning that they were judged by students to be of poor quality, whereas tutors considerd them passed. Therefore, the coming year, the tutor will check the 40% lowest ranked papers to verify whether they failed. As such, using this combination of peer review and feedback together with a final check by the tutor, the workload of the tutor is reduced by at least 60% while ensuring the quality of the decision and the feedback.