Judging Eurovision: how to fairly compare incomplete ranks & scores
The Eurovision Song Contest is an expression of centuries of European geo-politics and rivalry disguised as a friendly song competition, bizarrely also featuring Israel, Azerbaijan and Australia.
A friend (definitely not me) recently hosted a Eurovision party, where merrymakers completed official BBC Eurovision 2017 Party Pack Grand Final scorecards. The question now: how do we compare our scores against the final results?
Seven people have entered the competition: SB, JMS, Chris, Lea, Jon, M1 & M2. M1 and M2 are mystery people who were so embarrassed by the concept that they couldn’t bring themselves to write their names.
The most obvious method of scoring is to compare each person's top three with the top three in the final results. Sadly, only one person has a match, so we can declare him the winner. Hurray.
This is not a very good method because a) it only considers 3 of the 26 results, and b) it only uses the rankings and ignores the scores.
Winner: JMS (me). Hurray!
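As a rough sketch of this first method, the snippet below counts how many of a person's top three sit in exactly the same position as the final results. The post doesn't pin down what counts as a "match", so treating it as a same-position match is an assumption, and the country names and scores are made-up placeholders rather than real scorecard data.

```python
def top_n(scores, n=3):
    """Return the n highest-scored entries, best first; None (unscored) entries are skipped."""
    scored = [(name, s) for name, s in scores.items() if s is not None]
    scored.sort(key=lambda item: item[1], reverse=True)
    return [name for name, _ in scored[:n]]


def exact_matches(person_scores, final_scores, n=3):
    """Count positions in the top n where the person's pick equals the final result.

    Assumption: a "match" means the same country in the same position.
    """
    mine = top_n(person_scores, n)
    truth = top_n(final_scores, n)
    return sum(1 for a, b in zip(mine, truth) if a == b)


# Hypothetical data for illustration only (not the real scorecards).
final_scores = {"A": 50, "B": 40, "C": 30, "D": 20, "E": 10}
person_scores = {"A": 9, "B": 5, "C": 7, "D": None, "E": 2}

print(exact_matches(person_scores, final_scores))  # 1 (only first place agrees)
```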
Top three picks in ground truth top five?
It feels fairer to see how many of the true top three songs were in each person's top five. This is similar to the scoring of the classification aspect of the ImageNet competitions.
[Table: In top 5 | Points]
I feel that this is fairer because we have widened the comparison to consider more positions. However, it still only uses the rankings.
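A minimal sketch of this top-five check, assuming each ranking is simply a list of countries ordered best first. The function name `top_k_hits` and the example rankings are hypothetical; how the hit count was turned into points isn't recoverable from the post, so the sketch stops at the count.

```python
def top_k_hits(person_ranking, true_ranking, n_true=3, k=5):
    """Count how many of the true top n_true entries appear in the person's top k.

    Both arguments are lists ordered best first. Position within the
    top k is ignored, much like a top-5 classification check.
    """
    true_top = set(true_ranking[:n_true])
    return len(true_top & set(person_ranking[:k]))


# Hypothetical rankings for illustration only.
true_ranking = ["A", "B", "C", "D", "E", "F"]
person_ranking = ["C", "F", "A", "E", "D", "B"]

print(top_k_hits(person_ranking, true_ranking))  # 2 ("A" and "C" are in the top five, "B" is not)
```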
To compare the number of votes against final position, I first normalised each person’s scorecard so that their average score was 10. This seemed like a fair way of dealing with null values, since a missing score then neither helps nor hurts anyone.
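One way this normalisation could look in code is sketched below. It assumes that nulls are left out of the average and that the remaining scores are rescaled by a single multiplicative factor; the post doesn't spell out either detail, and the scorecard here is invented.

```python
def normalise(scorecard, target_mean=10.0):
    """Rescale a scorecard so the mean of its non-null scores equals target_mean.

    Assumptions: None marks a song the person did not score, nulls are
    excluded from the mean, and scaling is multiplicative.
    """
    given = [s for s in scorecard.values() if s is not None]
    if not given:
        return dict(scorecard)  # nothing was scored, so nothing to rescale
    mean = sum(given) / len(given)
    if mean == 0:
        raise ValueError("cannot rescale a scorecard whose scores average zero")
    factor = target_mean / mean
    return {c: (None if s is None else s * factor) for c, s in scorecard.items()}


# Hypothetical scorecard for illustration only.
card = {"A": 8, "B": 4, "C": None, "D": 6}
normalised = normalise(card)
print(normalised)
```

Here the three given scores average 6, so each is multiplied by 10/6 and the unscored song stays null.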
We see that the real scores are heavily weighted towards the top few countries. This is clearer if we only look at the final scores (GT score), the jury scores and the public vote. You can see how the jury vote and the public vote are combined to make the final score:
Because of the way the Eurovision final score is calculated (the jury score and the public score are added together, and each of those is itself a tally of votes), I don’t think that the final score should be considered a fair representation of how good a song is: we shouldn’t conclude that Portugal’s entry was twice as good as Moldova’s.
Next time, we’ll consider some better methods for judging our scorers. Hopefully this will show that I won!