Judging Eurovision: how to fairly compare incomplete ranks & scores

The Eurovision Song Contest is an expression of centuries of European geopolitics and rivalry disguised as a friendly song competition, bizarrely also featuring Israel, Azerbaijan and Australia.

A friend (definitely not me) recently hosted a Eurovision party, where merrymakers completed official BBC Eurovision 2017 Party Pack Grand Final scorecards. The question now: how do we compare our scores against the final results?

Eurovision scorecards

Completed BBC scorecards, each using a different scale and with frequent missing scores.

Seven people have entered the competition: SB, JMS, Chris, Lea, Jon, M1 & M2. M1 and M2 are mystery people who were so embarrassed by the concept that they couldn’t bring themselves to write their names.


The most obvious method of scoring is to compare each person's top three picks with the true top three, position by position. Sadly, only one person has a match, so we can declare him the winner. Hurray.

Truth    | SB         | JMS         | Chris   | Lea    | Jon      | M1      | M2
Portugal | Azerbaijan | Portugal    | Romania | Sweden | Germany  | Romania | Sweden
Bulgaria | Sweden     | Armenia     | Ukraine | Spain  | Portugal | Moldova | Moldova
Moldova  | Portugal   | The N’lands | Belgium | Norway | Israel   | Norway  | Norway
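A minimal Python sketch of this position-by-position comparison, using the picks from the table. The function name `exact_matches` is my own, and I am assuming a "match" means the same country at the same rank.

```python
# Top three picks taken from the table; "The Netherlands" is the table's abbreviated entry.
TRUTH = ["Portugal", "Bulgaria", "Moldova"]
PICKS = {
    "SB": ["Azerbaijan", "Sweden", "Portugal"],
    "JMS": ["Portugal", "Armenia", "The Netherlands"],
    "Chris": ["Romania", "Ukraine", "Belgium"],
    "Lea": ["Sweden", "Spain", "Norway"],
    "Jon": ["Germany", "Portugal", "Israel"],
    "M1": ["Romania", "Moldova", "Norway"],
    "M2": ["Sweden", "Moldova", "Norway"],
}


def exact_matches(truth, picks):
    """Count the ranks at which a pick equals the true result (assumed meaning of 'match')."""
    return sum(t == p for t, p in zip(truth, picks))


scores = {name: exact_matches(TRUTH, top3) for name, top3 in PICKS.items()}
best = max(scores.values())
winners = [name for name, score in scores.items() if score == best]
print(scores)
print(winners)
```

Under that assumption only one scorecard gets a point, which is why the method separates the entrants so poorly.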

This is not a very good method because (a) it only considers 3 of the 26 results and (b) it only uses the rankings and ignores the scores.

Winner: JMS (me). Hurray!

Ground truth top three in each person's top five?

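It feels fairer to see how many of the true top three songs were in each person's top five. This is similar to the top-5 accuracy used to score the classification task in the ImageNet competitions, where a prediction counts as correct if the true label is among the five highest-ranked guesses.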

Person | In top 5          | Points
SB     | Portugal, Moldova | 2
JMS    | Portugal          | 1
Lea    | Portugal          | 1
Jon    | Portugal          | 1
M1     | Moldova           | 1
M2     | Moldova           | 1
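The top-five check can be sketched in Python as below. The helper `top_k_hits` is a name I made up, the example ranking is invented purely for illustration (it is not anyone's real scorecard), and scoring one point per hit is my assumption about how the points were awarded.

```python
def top_k_hits(truth_top, ranking, k=5):
    """Return the true top songs found in the first k places of a ranking, in truth order."""
    first_k = set(ranking[:k])
    return [song for song in truth_top if song in first_k]


TRUTH_TOP3 = ["Portugal", "Bulgaria", "Moldova"]

# Hypothetical ranking, best first; for illustration only.
example_ranking = ["Sweden", "Moldova", "Italy", "Portugal", "Norway", "Bulgaria"]

hits = top_k_hits(TRUTH_TOP3, example_ranking)
points = len(hits)  # assumed scoring: one point per true top-three song found
print(hits, points)
```

Here Bulgaria sits in sixth place in the made-up ranking, so it falls just outside the cut-off and earns nothing, which shows how sensitive the score is to the choice of k.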


I feel that this is fairer because we have widened the comparison to consider more positions. However, it still only uses the rankings.

Winner: SB

Series analysis

To compare the scores given against final position, I first normalised each person’s scorecard so that their average score was 10. This seemed like a fair way of dealing with null values.
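One way this normalisation could look in Python is sketched below. I am assuming that missing scores are simply left out of the average (rather than counted as zero) and that each card is rescaled by a constant factor; the function name `normalise` and the sample scores are invented for illustration.

```python
def normalise(scorecard, target_mean=10.0):
    """Rescale a scorecard so the mean of the scores actually given equals target_mean.

    Missing scores (None) are ignored when averaging and stay None in the result.
    """
    given = [score for score in scorecard.values() if score is not None]
    if not given or sum(given) == 0:
        raise ValueError("scorecard needs at least one non-zero score")
    factor = target_mean * len(given) / sum(given)
    return {
        country: None if score is None else score * factor
        for country, score in scorecard.items()
    }


# Made-up scorecard with one missing score.
card = {"Portugal": 8, "Bulgaria": None, "Moldova": 4}
normalised = normalise(card)
print(normalised)
```

Because the factor depends only on the scores that were filled in, someone who scored five songs and someone who scored all 26 end up on the same scale, at the cost of saying nothing about the songs they skipped.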

Eurovision 2017 normalised scores

Fairly poor correlation between the entrants’ scores and the real final score (GT score).

We see that the real scores are heavily weighted towards the top few countries. This is clearer if we only look at the final scores (GT score), the jury scores and the public vote. You can see how the jury vote and the public vote are combined to make the final score:


The final score (GT score), jury score (GT jury) and public vote (GT vote).

Because of the way the Eurovision final score is calculated (two scores added together, each of which is itself the result of a vote), I don’t think that the final score should be treated as a fair measure of how good a song is: we shouldn’t conclude that Portugal’s entry was twice as good as Moldova’s.

Next time, we’ll consider some better methods for judging our scorers. Hopefully this will show that I won!