Pub crawl

Building a matrix of the distance to the nearest pub using FSA and google data.

There are 71 pubs in Cambridge, which makes plotting the shortest possible route is going to be difficult. If I wanted to start at home, and visit all 71 pubs there would be $(72-1)! = 8.5\times 10^{101}$ potential pub crawl routes. That kind of problem is often called the Travelling Salesman Problem, which was a popular English parlour game in the 1800’s1. Pub crawls tend to be a straight route though, so I need to solve a hamilton path through the pubs.

While there is an R package called TSP devoted to solving these problems, I took a simplistic approach based off a Shiny app I hope to replicate at a later date.

The route

The plot below is a 30.3km route through all 71 pubs starting at the Wrestlers and ending at the Lord Bryon that was derived in about 40 seconds via simulated annealing.

A 30.3km pub crawl route through all 71 Cambridge pubs.

Solving the problem

Simulated annealing is a technique inspired by annealing in metallurgy. In solving my pub crawl problem, the algorithm will initially be skewed toward picking longer routes, and as it continues to iterate it will slowly ‘cool’ and become more and more likely to choose a shorter route. This helps ensure the algorithm doesn’t get caught up in a local optimum route early. Sadly - just eyeballing - it appears the plot above is probably a local maximum. It looks like some travel time could be saved if the route visited the Mitre and Baron of Beef from the Pickerel, instead of it’s current diversion from the Brewhouse. I reran the model multiple times, and it usually found a route between 30 and 36km.

The gif below is another run, but it shows the map, the cooling curve, the distance travelled, and a histogram of all the distances recorded across the iterations as the model runs. This example also illustrates how the algorithm can fail. Here it’s pretty obvious that starting at the Lord Byron Inn and ending at the Tally Ho is not going to be the optimum route.

My favourite pubs

The last gif is 5.6km route through my favourite pubs,

  1. The Cambridge Brew House
  2. The Eagle Public House
  3. The Regal
  4. The Flying Pig
  5. The Cambridge Blue
  6. The Elm Tree
  7. Old Spring Public House
  8. King Street Run Public House

Once again, a quick visual inspection shows that some travel time could be saved with minor tweaks to the last graph. While I still have long term plans to make a Cambridge interactive pub crawl app via R (and shiny), for now the following XKCD plot sums up what this has taught me - I just found a really complicated way to plot a pub crawl route which can be easily beaten by a person, as long as you’re planning on visiting less than 20ish pubs.

An XKCD cartoon on the Travelling Salesman Problem.

My function

I wrapped up the code2 in a function, which can be called from github.

Load the function

install.packages('devtools')
library('devtools')
source_gist("https://gist.github.com/epijim/8f4be4dae598e479add0")

ggmap is required to run this function.

Input variables

v_pubs - either a list of pubs c(pub1,pub2) or a dataset with the latitude in the first column and the longitude in the second. If feeding in a dataset, you also need to set cam_pubs=FALSE.

crow_distances - defaults to FALSE. If set to FALSE, and cam_pubs=TRUE (the default), the function will calculate the best route using Google maps based distance or time to walk. If crow_distances=TRUE, the function will use straight line distances (taking into account the Earth’s curvature). If cam_pubs=FALSE, the function will always use straight line (as the crow flies data).

units - Defaults to "minutes", which makes the function calculate the route based on the Google maps derived walking time. Can also be set to "metres", which will make the function use and report the distance of the crawl in metres based on Google maps directions. This option is only evaluated if crow_distances=FALSE and you are using the inbuilt pub data.

v_location - defaults to "Cambridge, UK". This value is given to ggmap when pulling the base map. Only really needed if feeding in a different dataset. If using a different basemap, v_zoom will allow the zoom on the base map to be set.

listpubs - defaults to FALSE. If set to TRUE the function will print the list of pubs and then exit the function (ignoring all other options and not running the model). I added this as the pub names need to be entered in perfectly into v_pubs for it to work.

Use

The function, when loaded, will pull data from another gist which has the pubs in it. You can see the names of the pubs by typing jb_pubdistance(listpubs=T).

A typical call to the function would be:

results <- jb_pubdistance(v_pubs=c("The Maypole P.H.","The Eagle Public House",
                                   "Pickerel Inn","Baron Of Beef"))

In the example above, we will get the default format for the results, which is based on what google claims is the default walking time. See Input variables above on how to feed in custom data, or get back the distance in metres in actual walking routes, or as the crow flies.

So by setting <-, we created an object called results. The following results are stored.

  • results$distance - the distance of the pub crawl
  • results$pubs_inorder - the pubs in order. If using my pubs data, it will give some info on the pubs. If feeding in custom lats and lons, it will be the original dataset in trip order.
  • results$temperature - the temperature values used over the iterations

The function will also return a plot showing the route.


  1. The Icosian game was a peg based game invented in 1857. ↩︎

  2. This function just gives the final route, not the gifs of how the model was fit. ↩︎

James Black
James Black
PhD (Cantab)

James Black. Kiwi | Epidemiologist | Data Scientist | Engineering enthusiast.

comments powered by Disqus

Related