Which video is better? D-PAC allows educators to assess videos in an easy and credible way

Different types of media such as video, audio, or images are increasingly used for the assessment of students’ competences. However, as they allow for a large variation in performance between students, the process of grading is rather difficult. The online tool D-PAC aims to support educators in the assessment of video and images.

 

In D-PAC, students can easily upload their work in any media type (text, audio, image, video), after which the work is presented in randomly selected pairs to the assessors. The only task for assessors is to choose which of the two is better, using their own expertise. Assessors find it easy to make such comparative judgements because they are not forced to score each work on a (long) list of criteria. Each work is presented multiple times to multiple assessors, resulting in a scale on which students’ work is ranked according to its quality.

 

‘Working with D-PAC was really easy and fast.’

Ivan Waumans, KDG University College

Recently, D-PAC was used in a Bachelor’s programme in Multimedia and Communication Technology to assess students’ animation skills. Students received an audio fragment of a radio play by ‘Het Geluidshuis’ and had to accompany it with animation. A group of 9 assessors evaluated the quality of the animations. The assessors differed in background and expertise: 3 people from Het Geluidshuis, 2 expert animators, 2 alumni, and 2 teachers.

 

For Ivan Waumans, coordinator of the course, working with D-PAC was really easy and fast. ‘About 2 hours after I sent the login information to the assessors I got an email from one of them saying: Done!’ Assessors valued that they could do the evaluations from their homes or offices. Some assessors did all the comparisons in one session, whereas others spread their comparisons over a few days. None of them had any trouble using or understanding D-PAC. The only difficulty the assessors experienced was having to choose between 2 videos of equal quality. Ivan had to reassure them that it was OK to just pick one of them, because the tool generates the same ability score for videos of equal quality. Ability scores represent the likelihood that a particular video will win against others. Based upon these scores, the tool provides a rank order in which videos are ordered from poor to high quality.
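
The text does not say how D-PAC computes ability scores. A common model for comparative judgement is the Bradley-Terry model, in which the probability that one video wins depends on the difference between the two ability scores. The sketch below illustrates that idea under this assumption; the function names and the comparison data are made up for illustration and are not taken from D-PAC.

```python
import math

def win_probability(ability_a, ability_b):
    """Bradley-Terry probability that A is preferred over B."""
    return 1.0 / (1.0 + math.exp(-(ability_a - ability_b)))

def fit_abilities(items, comparisons, iterations=500, learning_rate=0.1):
    """Estimate ability scores by gradient ascent on the log-likelihood.

    comparisons: list of (winner, loser) pairs. All scores start at 0 and
    each update is zero-sum, so the scores stay centred around 0.
    """
    abilities = {item: 0.0 for item in items}
    for _ in range(iterations):
        gradient = {item: 0.0 for item in items}
        for winner, loser in comparisons:
            surprise = 1.0 - win_probability(abilities[winner], abilities[loser])
            gradient[winner] += surprise
            gradient[loser] -= surprise
        for item in items:
            abilities[item] += learning_rate * gradient[item]
    return abilities

# Made-up judgements (winner, loser). B and C are of equal quality,
# so the assessors split evenly between them.
comparisons = (
    [("A", "B")] * 2 + [("B", "A")]
    + [("A", "C")] * 2 + [("C", "A")]
    + [("B", "C"), ("C", "B")]
    + [("B", "D")] * 2 + [("D", "B")]
    + [("C", "D")] * 2 + [("D", "C")]
)
abilities = fit_abilities(["A", "B", "C", "D"], comparisons)
ranking = sorted(abilities, key=abilities.get)  # from poor to high quality
```

In this toy data, B and C each win once against the other and have the same record against A and D, so they end up with practically the same ability score, while A lands at the top of the ranking and D at the bottom.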

 

Video: Assessors evaluated the quality of animations using pairwise comparisons in D-PAC

 

‘After explaining comparative judgement, students accepted their grade’

Ivan and his team assigned grades to the animations based upon the rank order and ability scores. As there were gaps between ability scores, the final grades were not equally distributed over the rank order. For instance, the top 2 videos got 18/20 and 16/20. Teachers were happy with this more objective grading system. ‘When I look at certain videos and their grade, I notice that I would have given a higher or lower grade depending on my personal taste or the relation with the students’, Ivan explained. In his experience, including external people in the evaluation eliminated this bias. Only 2 students were a bit disappointed about the grade they received, but after the procedure of comparative judgement had been explained to them, they accepted their grade. The fact that 9 people contributed to ranking the videos, instead of only one teacher, convinced them the grade was fair.

 

More information

D-PAC allows educators to assess students’ performance in video or images in a more reliable and credible way, without increasing the workload of teachers.

Want to find out more? Send us an e-mail.

 

Media & Learning Newsletter

This blog has been published in the newsletter of Media & Learning.

Our northern neighbours value our expertise

Under the motto ‘professionalising together’, the university of applied sciences Zuyd (Heerlen, Sittard, Maastricht) picked up the D-PAC story. Intrigued by our knowledge and experience in professional assessment and peer assessment, they also want to inspire and stimulate others within their institution.

In collaboration with Dominique Sluijsmans (lector Professioneel Beoordelen, Zuyd) and Judith van Hooijdonk (I-team, Zuyd), a blog about D-PAC as a tool for Technology Enhanced Learning (TEL) was published earlier this week.

The blog page of ICT in Onderwijs en Onderzoek @ Zuyd also features other very interesting news items.

Thank you, I-team, for being willing to network with us and for us: we are curious about the reactions to this engaging blog.

Peer assessment in D-PAC reduces workload for tutors!

A group of 91 second-year bachelor students in physiotherapy at the University of Hasselt were given the following task at the end of this year:

    – They had to formulate a clinical research question based on their experience as a physiotherapist;
    – Then they searched for a relevant scientific paper and formulated an answer to the research question based on the article;
    – Finally, they had to evaluate the article and point out the strengths and weaknesses of the study.

Normally all these papers are evaluated by one or two tutors. These tutors judge each paper by giving a ‘passed’ or ‘failed’ and by providing feedback. As you can imagine, this results in a substantial workload, especially when more than one task per student needs to be marked.

The tutor was inspired by a presentation about the D-PAC project. At first, the tutor was a bit sceptical. However, the possibilities of the tool were tempting enough to conduct an experiment in which peers would judge and comment on the papers using the D-PAC tool. In parallel, the tutors evaluated the papers in their traditional manner. Afterwards, the judgements and feedback of the students could be compared with the judgements and feedback of the tutors.

Based on the pairwise comparison data we calculated the Scale Separation Reliability (SSR) for the student evaluations. The SSR was .80, which can be regarded as very reliable. To achieve this, the 91 students made 910 comparisons in total. Because each comparison involves two papers, every paper was compared 20 times.
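
The arithmetic behind these numbers, and the general idea of a separation reliability, can be sketched as follows. The SSR formula used here (the share of the observed variance in ability scores that is not measurement error) is a commonly used definition and an assumption on our part, since the text does not spell out the calculation; the ability scores and standard errors in the example are invented.

```python
from statistics import mean, pvariance

def comparisons_per_representation(total_comparisons, representations):
    """Each comparison involves two representations, hence the factor 2."""
    return 2 * total_comparisons / representations

def scale_separation_reliability(abilities, standard_errors):
    """SSR = (observed variance - mean squared standard error) / observed variance."""
    observed_variance = pvariance(abilities)
    error_variance = mean(se ** 2 for se in standard_errors)
    return (observed_variance - error_variance) / observed_variance

# Numbers from the text: 910 comparisons over 91 papers by 91 students.
per_paper = comparisons_per_representation(910, 91)  # 20 comparisons per paper
per_student = 910 / 91                                # 10 comparisons per student on average

# Invented ability scores and standard errors, for illustration only.
example_ssr = scale_separation_reliability([-2.0, -1.0, 0.0, 1.0, 2.0], [0.5] * 5)
```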

The feedback students provided was of high quality. The results of a survey completed by the students supported this statement. Students perceived the D-PAC peer feedback as relevant, honest and legitimate. Because almost every assessor gave feedback on almost every paper they had to compare, each student received feedback from 15 to 20 peers. Students indicated this as an added value of the D-PAC method.

If we compare the outcomes of the students’ assessment with the tutors’ pass/fail decisions, we see a high resemblance. As Figure 1 shows, 12 students were given a fail by the tutor (red dots) and they are all located on the left side of the rank order. We can conclude that students can, by using pairwise comparison, evaluate their peers’ papers as well as tutors do in their traditional manner.

Figure 1: Rank order of the papers (red dots: failed by the tutor; blue dots: passed by the tutor)

However, as you can see, some blue dots remain on the left-hand side, meaning that students judged these papers to be of poor quality, whereas tutors considered them passed. Therefore, in the coming year, the tutor will check the 40% lowest-ranked papers to verify whether they failed. As such, by combining peer review and feedback with a final check by the tutor, the workload of the tutor is reduced by at least 60% while the quality of the decision and the feedback is ensured.
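
As a rough illustration of this final check, the sketch below selects the lowest-ranked 40% of a rank order for the tutor. The function name, the placeholder paper labels and the rounding to the nearest whole paper are assumptions made for this example.

```python
def papers_for_tutor_check(ranked_papers, share=0.40):
    """Return the lowest-ranked share of papers (ranked from poorest to best)."""
    count = round(share * len(ranked_papers))
    return ranked_papers[:count]

# 91 placeholder labels standing in for the real rank order.
ranked_papers = [f"paper_{i:02d}" for i in range(1, 92)]
to_check = papers_for_tutor_check(ranked_papers)

# Share of papers the tutor no longer has to read.
reduction = 1 - len(to_check) / len(ranked_papers)
```

With 91 papers, the tutor would read 36 of them, which leaves roughly 60% of the papers to the peer assessment alone.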

Testimonial professor architecture

The following film is a testimonial from an architecture professor who used D-PAC for a peer assessment of mood boards. Because the film is in Dutch, a short summary of the main findings is given below.





Summary
60 students were divided into groups of five. Each group had to create two mood boards resulting in 20 mood boards. These mood boards were uploaded to the D-PAC tool and the students made ten comparisons at home in which they judged the mood boards of their peers and provided feedback.

These comparisons resulted in a ranking from the poorest to the best mood board, so each group had two mood boards in the ranking. The students had to continue with their mood board that was ranked highest. To do so, they could use the feedback to improve their design.

The teacher used the rank order and the feedback from the students to discuss the results with the group. He reported a large time saving because all the students had already seen the mood boards and formed their opinions. Where the discussion of the mood boards normally lasted a whole day, it now lasted one hour using the rank order. According to the professor, this did not sacrifice the quality of the discussion; on the contrary.

Further, the professor indicated that he saved time afterwards: the results of the peer assessment needed no processing because they were generated automatically by the tool.

Also according to the professor, the learning effect for students of looking at their peers’ work and formulating reasons why one was better than the other was not to be underestimated.

SOS scoring ‘briefing notes’? College of Europe Bruges tried D-PAC!

In the last months, several D-PAC try-outs have run. In these try-outs, assessments are set up in diverse organizations. For the organizations, the aim is to experiment with D-PAC. For us as a team, the try-outs are valuable to gain information on different aspects of D-PAC: the user-friendliness of the tool, how the tool can be embedded in real-life situations, and how information from D-PAC is used.
A few weeks ago, a try-out ran at the College of Europe in Bruges. A team of four lecturers used D-PAC to assess students’ competences regarding ‘briefing notes’. The try-out was especially interesting for the D-PAC team given the small number of assessors and the fact that the rank order would be used for students’ final mark for a course. The reliability of the rank order was sufficient at 0.71 (see Table 1).

Table 1: General statistics

Number of representations: 84
Number of comparisons: 620
Number of assessors: 4
Time per comparison: 373 seconds
Reliability: 0.71
Misfitting judges: 0
Misfitting representations: 2

During the try-out, one assessor fell behind with the comparisons that had to be made. At that point, we noticed that the reliability of the rank order was already sufficient with 510 comparisons. The reliability of the rank order did not increase when the 100 comparisons of this assessor were added. Curious about why this was, we investigated the progress of the reliability over time. Figure 1 shows our findings, suggesting that for this specific try-out, 10 comparisons per representation were needed to reach a reliability of 0.70. Moreover, the value of 0.70 turned out to be a ceiling that would be difficult to exceed.
Figure 1: Progress of reliability over time

Additionally, the team was interested in how the lecturers used the rank order to define the final marks. The head teacher explained that they discussed the first and the last representation of the rank order (see Figure 2). They decided what score was appropriate for these representations (8/20 for the last one and 18/20 for the first one). Subsequently, they scored the rest of the representations following the rank order, in intervals of 0.5 point.

Figure 2: Rank order College of Europe Bruges
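
One way to picture this marking procedure is sketched below. The text only states the two anchor marks (18/20 and 8/20) and the 0.5-point intervals; spreading the marks evenly over the rank positions and rounding to the nearest half point is our own assumption, and the function name is hypothetical. The teachers may well have placed the intervals differently.

```python
def marks_from_rank_order(count, top_mark=18.0, bottom_mark=8.0, step=0.5):
    """Marks for representations ranked from best (index 0) to poorest.

    Assumes at least two representations. Marks are interpolated linearly
    over the rank positions and rounded to the nearest multiple of `step`.
    """
    marks = []
    for position in range(count):
        exact = top_mark - (top_mark - bottom_mark) * position / (count - 1)
        marks.append(round(exact / step) * step)
    return marks

marks = marks_from_rank_order(84)  # 84 representations, as in Table 1
```

Because there are more representations (84) than half-point steps between 8 and 18, several neighbouring representations in the rank order share the same mark under this assumption.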

When asked about their experiences with D-PAC, the teachers of the College of Europe in Bruges were very positive. D-PAC was perceived as clear and easy to use. According to the teachers, the method of comparative judgement was straightforward and appropriate for their assessment task. However, the teachers felt the need to provide more information than they could and suggested including a pass/fail (or ‘very good’/‘very bad’) button and an ‘I cannot choose!’ button.

The teachers perceived the time investment of the assessment via D-PAC as more or less the same as in previous assessments using other methods. However, they considered the time spent in D-PAC a better investment, given that it resulted in a reliable rank order.

Altogether, when asked whether they would use D-PAC again for similar assessments in the future, the teachers all agreed: ‘Yes!’ To conclude: the try-out with the College of Europe in Bruges turned out to be fruitful, both in terms of research findings and in terms of rolling out D-PAC in practice!