What leads to happy patients for a dental practice?

I was thinking about changing my dental practice and was curious to know whether data could help me choose one. Luckily the NHS publish a lot of dental practice data and some of this has been aggregated. The problem for me was that there are many measures and I don’t know how one measure trades off against another. This post describes a way to combine the measures into an estimate of typical patient happiness at a practice with those measured attributes.

For busy readers, here’s a potential approach to choose a dental practice:

  • Look at Care Quality Commission reports
  • Look at practice level survey information but note that for many practices the sample size will be small and there’s a large range of variability
  • Consider the proportion of different treatment courses carried out. More Band 1 (non-urgent) treatments seem to be good
  • Look at the proportion of adults at the practice and the 3-month revisit rate. See below for more details but in contrast to the NHS’s view, a high revisit rate seems to be a good thing.

You may find my combined measure described below useful for combining some of these different measures.



Seeing ships in rain radar

I’m starting to play around with “nowcasting” – doing weather forecasts based on recent rain radar images.  The Met Office provides a feed of radar images you can get from their Datapoint service.

The first plot I made of the data was number of rainy 15 minute periods during May (shown with and without a base map):

[Figures: counts of rainy 15 minute periods from the raw radar data, and the same counts over a base map]
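For readers who want to reproduce this kind of count, here is a minimal Python sketch. It assumes the radar images have already been downloaded and decoded into 2-D grids of rain values, one grid per 15 minute period; the function name `count_rainy_periods` and the `threshold` parameter are hypothetical, and fetching and decoding the Datapoint images is not covered.

```python
def count_rainy_periods(frames, threshold=0):
    """Count, for every pixel, how many frames show rain above `threshold`.

    `frames` is assumed to be an iterable of equally sized 2-D grids
    (lists of rows), one per 15 minute radar image.
    """
    counts = None
    for frame in frames:
        if counts is None:
            # Start a zeroed grid with the same shape as the first frame.
            counts = [[0] * len(row) for row in frame]
        for y, row in enumerate(frame):
            for x, value in enumerate(row):
                if value > threshold:
                    counts[y][x] += 1
    return counts if counts is not None else []


# Toy example: two 2x2 frames of made-up values.
toy_frames = [
    [[0, 1], [2, 0]],
    [[0, 3], [0, 0]],
]
toy_counts = count_rainy_periods(toy_frames)
```

What counts as "rainy" (here, any value above the threshold) is a choice you would need to make for the real data.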


There are many interesting features here.  Firstly a few probably true observations about rain:

  • You can clearly distinguish land and sea – there is more rain over land.
  • You can see hilly regions with yet more rain.

However there are many other features that I guess aren’t true observations about rain:

  • You can see the two shipping lanes in the English Channel with the separation zone between them.
  • You can see the individual radar sites don’t give uniform coverage in all directions.
  • There are other linear features whose origin I’m not sure of – any ideas?

All these features suggest that I can’t treat this data as perfect truth data for nowcasting.  😦

Who wins The Hunger Games?

I’ve just been watching The Hunger Games and noticed that Katniss managed to win by killing only a small proportion of the participants, and I wondered whether this is a typical outcome.  Given the “Careers” who are trained for the event I had imagined something more like Kill Bill where the victor would have had to fight through many people to win.  Let’s build a statistical model for The Hunger Games to see which outcomes are typical.

We model participants as meeting randomly for one-on-one fights to the death.  This isn’t a perfect model but hopefully it’s a useful one.  The random meeting assumption seems OK as it is hard for an individual to find a particular individual in the Hunger Games arena.  We imagine that group activity can be broken down into steps in which one person administers the coup de grâce to one victim.  This makes the Hunger Games look like a random binary tree with 24 leaf nodes.

We consider two ways of determining who wins a one-on-one fight to the death: “random” and “deterministic”.  In “random” we flip a coin to determine the victor – this randomness seems realistic and could include the risk of death by infection or any of the other uncertainties that are in the arena.  “Deterministic” is the opposite extreme and we know in advance a strength order for individuals which determines the winner of any fight; this model may more closely model the “Careers” approach.

I couldn’t find a closed-form solution for the outcome of these models so I simulated them in R.  I show below the probability mass function and the cumulative distribution function for the number of fights that the victor took part in (and won).
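As an illustration, here is a minimal sketch of this kind of simulation. It is written in Python rather than R and is not the original code; the function names and the number of simulated games are my own assumptions. Two survivors are picked uniformly at random for each fight, and the winner is chosen either by a coin flip (“random”) or by a fixed strength order (“deterministic”).

```python
import random
from collections import Counter


def simulate_games(n_players=24, mode="random", rng=random):
    """Play one Games; return how many fights the victor took part in (and won)."""
    # Players are labelled 0..n-1; in "deterministic" mode a lower label
    # means a stronger player (an arbitrary convention for this sketch).
    alive = list(range(n_players))
    wins = [0] * n_players
    while len(alive) > 1:
        a, b = rng.sample(alive, 2)  # two survivors meet at random
        if mode == "random":
            winner, loser = (a, b) if rng.random() < 0.5 else (b, a)
        elif mode == "deterministic":
            winner, loser = (a, b) if a < b else (b, a)
        else:
            raise ValueError("mode must be 'random' or 'deterministic'")
        wins[winner] += 1
        alive.remove(loser)
    return wins[alive[0]]


def fight_distribution(mode, n_games=20000, seed=0):
    """Estimate the probability mass function of the victor's fight count."""
    rng = random.Random(seed)
    counts = Counter(simulate_games(mode=mode, rng=rng) for _ in range(n_games))
    return {k: counts[k] / n_games for k in sorted(counts)}


def cumulative(pmf):
    """Turn a PMF dict (sorted by key) into a cumulative distribution dict."""
    total, cdf = 0.0, {}
    for k in sorted(pmf):
        total += pmf[k]
        cdf[k] = total
    return cdf
```

Calling `fight_distribution("random")` and `fight_distribution("deterministic")` gives estimated probability mass functions that can be plotted side by side, and `cumulative` turns either into a cumulative distribution.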

Probability mass function

Cumulative distribution

You can see that in the “random” case the modal number of fights is pretty low at 3.  95% of outcomes lie in the range [1,6].  The “deterministic” case typically gives more fights, with a modal number of fights of 5, and 95% of outcomes lie in the range [3,9].  In terms of a random binary tree these two models can be thought of as the number of branches from the root to take at random until you reach a leaf node and the depth of a random node respectively.

Anyway, Suzanne Collins seems to have got things looking statistically typical. You could count Katniss’s encounters in various ways but to me they all lie within the 95% ranges of both models.  Did Suzanne Collins have a statistician on board when writing the books?

Q: Can you advise on how to solve this problem analytically rather than by simulation?

Film making around the world

I was wondering where films are made around the world.  Luckily we have IMDB as a data source about films and cartograms as a visualisation technique.  A cartogram is a map where areas have been adjusted in proportion to a quantity of interest.  In this case we’ll adjust the area of each country so that the resultant area is proportional to the number of films in IMDB from that country.


So watching a lot of films from the US is probably to be expected but are you up enough on European and Japanese cinema?

[For info I used the Rcartogram package to make this plot.]

Shipping 1750-1855 visualised

Inspired by Spatial Analysis’s blog post I thought I would look further at CLIWOC’s dataset of historic shipping.  In particular the previous plots don’t show the direction of travel so you can’t understand triangular trade or the nature of the trade winds. This struck me as a nice opportunity to experiment with semi-transparent plots in R, as transparency on paths allows the eye to see aggregate behaviour.

We can use the cyclic nature of the colour wheel to view the month and direction of travel.  The key is in the middle of the plot.  By month, red means the month of January and cyan July.  By direction, red is north, west is dark purple, south is cyan and east is greeny-yellow.
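A colour key like this can be sketched by mapping a cyclic quantity onto the hue of the colour wheel. The Python sketch below is an illustration, not the code behind the plot: it assumes hue equals the compass bearing in degrees (north = 0°, east = 90°) and that months are spread evenly around the wheel starting from January. The function names and the `alpha` default are hypothetical.

```python
import colorsys


def bearing_to_rgba(bearing_deg, alpha=0.1):
    """Map a compass bearing (degrees clockwise from north) to an RGBA colour.

    Assumes hue = bearing, so north is red and south is cyan.
    """
    hue = (bearing_deg % 360) / 360
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return (r, g, b, alpha)  # a low alpha lets overlapping paths build up


def month_to_rgba(month, alpha=0.1):
    """Map a month number (1 = January ... 12 = December) to an RGBA colour.

    Assumes months are evenly spaced around the wheel, so January is red
    and July is cyan.
    """
    hue = ((month - 1) % 12) / 12
    r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
    return (r, g, b, alpha)
```

Under these assumptions east comes out as a yellow-green and west as a violet, which roughly matches the key described above. The exact shades in the plot may differ.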

A view of shipping 1750-1855

So what can we see?

In both the Spanish and French shipping you can see the effect of the trade winds, which mean that ships head towards the poles to pick up a westerly wind to drive them east towards home.  You can also see that ships seem to set off from the West Indies around June/July (does that correspond with harvests out there?).  You can also see the Spanish ships reaching into South America, unlike the French ships.

The Dutch and British shipping reach out east too.  Similarly you can see the effect of the trade winds as ships go far south to go east but take the shortest path to come back west.  British shipping is doing a lot with India and east Africa whereas the Dutch shipping is concentrated on the Dutch East Indies.  The time of year story looks a little less clear but it looks like Dutch ships come home around January.

Also on the Dutch shipping you can clearly see triangular trade from Europe, down to Africa over to the West Indies and back to Europe.