2017 flights


No prizes for guessing which city I have to be in, and which city I want to be in! A special mention to Montreal for (just) taking me outside of UK & US!


My 2017 flights, slightly transparent pen so that repeated routes stand out.


Constable and Turner visit Nevada

Some recently unearthed masterpieces from J M W Turner and John Constable‘s visits to Nevada. The fountains at the Bellagio are older than I thought! See below for a short explanation of neural style transfer.


Turner (left) and Constable must have sat side-by-side at the Bellagio. It was a good compromise between Turner’s love of the sea and Constable’s love of musically synced fountains.

Constable later went up to Tahoe and was rightly inspired by the scenery.


Lake Tahoe. I don’t think Constable’s very good at doing skies when there aren’t many clouds.

Neural style transfer

Neural style transfer uses three images: C, a content image; S, a style image; and G, a generated image (which starts as C + noise). The loss function combines a content loss and a style loss. The content loss compares the activations of C and G at a chosen layer of a pre-trained CNN (here VGG-19). This measures the similarity of each image’s content. The style loss compares the correlations between filter activations (the Gram matrices) of G and S. This captures the look and feel of the image. The pixels of G are altered by gradient descent to minimise the combined loss.
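The pieces of that loss can be sketched in a few lines of numpy. The alpha/beta weighting and the use of a single layer per loss are simplifications (the real thing sums the style loss over several VGG-19 layers):

```python
import numpy as np

def gram_matrix(features):
    # features is (n_filters, h*w): correlations between filter activations
    return features @ features.T

def content_loss(a_C, a_G):
    # mean squared difference between activations at one chosen layer
    return np.mean((a_C - a_G) ** 2)

def style_loss(a_S, a_G):
    # mean squared difference between the Gram matrices
    return np.mean((gram_matrix(a_S) - gram_matrix(a_G)) ** 2)

def total_loss(a_C, a_S, a_G, alpha=1.0, beta=1e-3):
    # alpha and beta trade off content fidelity against style; values assumed
    return alpha * content_loss(a_C, a_G) + beta * style_loss(a_S, a_G)
```

In the full algorithm these activations come from VGG-19 feature maps, and it is the pixels of G (not the network weights) that gradient descent updates.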

The content images I used were The Fighting Temeraire tugged to her last berth to be broken up, Turner’s 1838 oil painting showing the inevitable march of progress, and Constable’s 1820 Salisbury Cathedral from Lower Marsh Close. I accidentally used Constable’s worst painting of Salisbury.


Source (by Constable), Content (by me), Generated


Source (by Turner), Content (by me), Generated

Learning to count (regression)

Last time we tried to count the number of white pixels on a black image. Using a classification approach fundamentally limits the counter to the number of classes (i.e. the number of output neurons). To get round this limitation I replaced the output layers with a single output node + ReLU activation.


A simple network trained on the numbers 0 to 9 is able to predict the numbers 0 to 59.
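This extrapolation isn’t surprising once you notice that counting white pixels is a linear function of the image, so there is an exact solution the regression head can converge to: weight 1 on every pixel. A minimal numpy sketch (the 8×8 image size is just for illustration):

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def count_regressor(image, w, b=0.0):
    # single output node + ReLU on the flattened image
    return relu(image.ravel() @ w + b)

# With all weights = 1, the node literally sums the 1-valued (white) pixels,
# so it predicts counts it never saw in training.
img = np.zeros((8, 8))
img[0, :5] = 1.0
w = np.ones(64)
count_regressor(img, w)  # 5.0
```

A trained network won’t land on exactly these weights, but any weight vector close to uniform gives the same unbounded counting behaviour.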

What if we make the images bigger? How high can we count? I tried one 3×3 filter (same padding) followed by three successive 3×3 filters with stride 3, which quickly reduced the dimensions down to a small flattened layer. This did OK, but was hardly the 100% accuracy I demand!


Mostly convolutional CNN counts to within a few pixels of the correct answer.
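The dimension reduction can be checked with the standard convolution output-size formula. A small sketch, assuming an 81×81 input for round numbers:

```python
def conv_out(n, kernel=3, stride=1, padding="valid"):
    # spatial size after one conv layer (standard output-size formula)
    if padding == "same":
        return -(-n // stride)  # ceil(n / stride)
    return (n - kernel) // stride + 1

def final_size(n):
    n = conv_out(n, padding="same")  # 3x3, same padding, stride 1
    for _ in range(3):               # three 3x3 convs with stride 3
        n = conv_out(n, stride=3)
    return n

final_size(81)  # 3, i.e. a 3x3 map per filter before flattening
```

So an 81-pixel-wide image collapses to 3×3 per filter in three stride-3 steps, which keeps the dense layers small.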

Of course as this is a trivial problem we could cheat:


AveragePooling + one output node = 100% accuracy!
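The cheat works because global average pooling of a binary image is exactly count / n_pixels, so a single linear output node only has to multiply by the pixel count. A sketch, assuming a 32×32 image:

```python
import numpy as np

def cheat_count(image):
    # global average pooling of a binary image = count / n_pixels,
    # so one output node with weight n_pixels recovers the count exactly
    pooled = image.mean()           # the AveragePooling layer
    return pooled * image.size      # the single linear output node

flat = np.zeros(1024)
flat[:57] = 1.0
img = flat.reshape(32, 32)
cheat_count(img)  # 57.0
```

This is exact for any count and any image size, which is why it hits 100% accuracy on this problem and why the problem is trivial.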

Next time: A harder problem, where average pooling won’t work.

Learning to count (classifier)

I aim to train a CNN (convolutional neural network) to count (let’s say up to 100), starting with counting the number of white pixels on a black image. I start by making a classifier, trained on images that contain 1, 2, 3, 4 or 5 white pixels.
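A sketch of how such training images can be generated. The 32×32 size and the one-hot label encoding are assumptions for illustration, not necessarily what I used:

```python
import numpy as np

def make_image(n_white, size=32, rng=None):
    # black image with exactly n_white randomly placed white pixels
    rng = rng or np.random.default_rng()
    img = np.zeros(size * size, dtype=np.float32)
    img[rng.choice(size * size, n_white, replace=False)] = 1.0
    return img.reshape(size, size)

def one_hot(n, n_classes=5):
    # label for the classifier: index 0 means one white pixel
    v = np.zeros(n_classes)
    v[n - 1] = 1.0
    return v
```

Each training pair is then (make_image(n), one_hot(n)) for n drawn from 1 to 5.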

Results after training for 10 epochs on 12,800 images are pretty good – y is the true value (the label), y_hat is the prediction.


Classifier trained on 12,800 images. These are from the 500 image test set. Test accuracy: 100%

Sadly this does less well on a test set that includes higher numbers (below). Dr R suggested I adopt a “1, 2, 3, 4, many” approach, which would give good accuracy, but I think would be a bit unsatisfying.


Performance less impressive on higher numbers.

Network architecture

I used one 3×3 convolutional filter, then a couple of dense layers. I expected the filter to converge to one high value surrounded by low values: the perfect shape for picking out white pixels surrounded by dark ones. This wasn’t what I found, as shown by these three examples, which have been stretched so that abs(max(W)) = 100.

I guess that it doesn’t really matter which convolution you use, provided you understand the output!


Making Your Mind Up


Judging Eurovision: how to fairly compare incomplete ranks & scores

The Eurovision Song Contest is an expression of centuries of European geo-politics and rivalry disguised as a friendly song competition, bizarrely also featuring Israel, Azerbaijan and Australia.

A friend (definitely not me) recently hosted a Eurovision party, where merrymakers completed official BBC Eurovision 2017 Party Pack Grand Final scorecards. The question now: how do we compare our scores against the final results?

Eurovision scorecards

Completed BBC scorecards, each using a different scale and with frequent missing scores.

Seven people have entered the competition: SB, JMS, Chris, Lea, Jon, M1 & M2. M1 and M2 are mystery people who were so embarrassed by the concept that they couldn’t bring themselves to write their names.


The most obvious method of scoring is to compare the top three scores of each person. Sadly only one person has a match so we can declare him the winner. Hurray.

Truth     SB          JMS          Chris    Lea     Jon       M1       M2
Portugal  Azerbaijan  Portugal     Romania  Sweden  Germany   Romania  Sweden
Bulgaria  Sweden      Armenia      Ukraine  Spain   Portugal  Moldova  Moldova
Moldova   Portugal    The N’lands  Belgium  Norway  Israel    Norway   Norway

This is not a very good method because a) it only considers 3 out of the 26 results, b) it only uses the rankings and ignores the scores.

Winner: JMS (me). Hurray!

Top three picks in ground truth top five?

It feels fairer to see how many of the true top three songs were in each person’s top five. This is similar to the scoring of the classification aspect of the ImageNet competitions.

       In top 5            Points
SB     Portugal, Moldova   2
JMS    Portugal            1
Lea    Portugal            1
Jon    Portugal            1
M1     Moldova             1
M2     Moldova             1
I feel that this is fairer because we have widened the comparison to consider more positions, however it still only uses the rankings.

Winner: SB
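This scoring rule is easy to state in code: count the overlap between the true top three and each person’s top five. SB’s first three picks below come from the scorecards above; the fourth and fifth (“Moldova”, “Italy”) are hypothetical placeholders to pad the list out to five:

```python
def top_k_score(truth, picks, n_truth=3, k=5):
    # how many of the true top-n songs appear in a person's top-k picks
    return len(set(truth[:n_truth]) & set(picks[:k]))

truth = ["Portugal", "Bulgaria", "Moldova"]
sb_top5 = ["Azerbaijan", "Sweden", "Portugal", "Moldova", "Italy"]
top_k_score(truth, sb_top5)  # 2
```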

Series analysis

Considering the number of votes vs final position, I first normalised each person’s scorecard so that their average score was 10. This seemed like a fair way of dealing with null values.
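The normalisation step, ignoring missing scores, can be sketched as:

```python
import numpy as np

def normalise(scores, target_mean=10.0):
    # rescale one person's scores so their mean is target_mean,
    # leaving missing entries (NaN) as missing
    scores = np.asarray(scores, dtype=float)
    return scores * target_mean / np.nanmean(scores)
```

Because each scale factor is computed from only the scores a person actually gave, someone who skipped half the songs isn’t penalised relative to someone who scored everything.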

Eurovision 2017 normalised scores

Fairly poor correlation between the entrants and the real final score (GT score).

We see that the real scores are heavily weighted towards the top few countries. This is clearer if we only look at the final scores (GT score), the jury scores and the public vote. You can see how the jury vote and the public vote are combined to make the final score:


The final score (GT score), jury score (GT jury) and public vote (GT vote).

Due to the method by which the Eurovision final score is calculated (the jury score and the public vote, each itself a rank-based vote, are simply added together), I don’t think that the final score should be considered a fair representation of how good a song is: we shouldn’t conclude that Portugal’s entry was twice as good as Moldova’s.

Next time, we’ll consider some better methods for judging our scorers. Hopefully this will show that I won!

The jamesgeo well-travelled map



After years of deliberation, months of data wrangling and several reviews by family, I have finally finished version 1 of the jamesgeo well-travelled map. This map aims to answer the question “how well travelled am I?”. Unsurprisingly the answer is not very.

The map is made by taking every point that I’ve been to on the Earth’s surface and buffering by 100 km, without letting this cross any international borders, unless I did. I have chosen 100 km as I argue that culture, geography, geology etc change significantly over about this distance.
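A sketch of the buffering step using shapely. The lat/lon points, the crude degrees-per-km conversion, and the single pre-merged polygon of visited countries are all simplifying assumptions; a proper version would buffer in a projected CRS and handle each border crossing individually:

```python
from shapely.geometry import Point, Polygon
from shapely.ops import unary_union

def well_travelled(points, countries_visited, buffer_km=100.0):
    # buffer each visited point by ~100 km, then clip the union to the
    # countries actually visited, so blobs never cross an uncrossed border
    buffer_deg = buffer_km / 111.0  # rough degrees per km at the equator
    blobs = unary_union([Point(lon, lat).buffer(buffer_deg)
                         for lon, lat in points])
    return blobs.intersection(countries_visited)
```

Clipping with intersection is what makes the map honest: a point near a border I never crossed produces a blob that stops dead at the line.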


Well-travelled map, showing the Earth’s surface that I’ve experienced.


Well-travelled map, without the context, paints a somewhat dismal view.