I was thinking about changing dental practices and was curious whether data could help me choose one. Luckily the NHS publish a lot of dental practice data, some of which has been aggregated together. The problem for me was that there are many measures and I didn’t know how one measure trades off against another. This post describes a way to combine the measures into an estimate of typical patient happiness at a practice with those measured attributes.
For busy readers, here’s a potential approach to choose a dental practice:
- Look at Care Quality Commission reports
- Look at practice level survey information but note that for many practices the sample size will be small and there’s a large range of variability
- Consider the proportion of different treatment courses carried out. More Band 1 (non-urgent) treatments seem to be good
- Look at the proportion of adults at the practice and the 3-month revisit rate. See below for more details but in contrast to the NHS’s view, a high revisit rate seems to be a good thing.
You may find my combined measure described below useful for combining some of these different measures.
There’s a huge range in the sizes of practices. We want to focus on practices that primarily offer NHS dentistry, rather than offering it as an add-on to private dentistry. There’s no clear way to do this in the data, so, somewhat arbitrarily, we’ll filter to practices delivering more than 2000 NHS dental courses in a year, which works out to roughly 8 or more NHS dental courses per working day.
The NHS Dental Services survey a small number of patients per practice. Patients are asked “How satisfied are you with the NHS dentistry you received?” and can answer “Completely satisfied”, “Fairly satisfied”, “Fairly dissatisfied” or “Very dissatisfied”. Most answers are “Completely satisfied”, at 79% of responses. Our aim is to use these responses to understand the properties of a practice that lead to “Completely satisfied” responses. In other words, what leads to “happy” patients?
A challenge to overcome is that there are very few surveys per practice. A typical practice has 7 patients surveyed (median) and on average there are 14 patients surveyed (mean). Given these small samples we cannot score a practice by its proportion of positive responses, as such an estimate would be subject to significant noise.
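To see why, here’s a quick back-of-envelope sketch of the binomial standard error at the national 79% satisfaction rate:

```python
import math

# A rough sketch of the sampling noise: with the national "Completely
# satisfied" rate of about 79%, the standard error of the proportion
# observed in a small survey sample is large.
p = 0.79
se_median = math.sqrt(p * (1 - p) / 7)   # typical (median) practice: 7 responses
se_mean = math.sqrt(p * (1 - p) / 14)    # average (mean) practice: 14 responses
print(f"7 responses:  se ≈ {se_median:.3f}")
print(f"14 responses: se ≈ {se_mean:.3f}")
```

With only 7 responses a ±2 standard error interval spans roughly ±0.31, far too wide to distinguish one practice from another on the survey alone.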
So we turn to all the other data about a practice released by NHS Dental Services about the delivery of dental courses (FP17s). We’re looking at practices with more than 2000 FP17s, so these measures are likely to reflect the underlying proportions reasonably accurately. The question is then: can we use these measures to model practice patient happiness?
As a starting point for quality of dental treatment NHS Dental Services report: “The number of FP17s involving an adult for the same patient identity (surname, initial, gender and date of birth) where the previous course of treatment for that patient identity at the same contract ended 3 months or less prior to the most recent course of treatment. … In general, a patient who has completed a course of treatment that renders him or her “dentally fit” should not need to see a dentist again within the next three months. A high rate would indicate that further treatment has been provided outside the recall interval but could include urgent treatment etc.” We can create a logistic regression model to see how this factor (and the equivalent for children) influences happiness. As we might expect this produces a model that tells us that an increasing rate of prompt revisits does lead to lower happiness.
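The NHS data itself isn’t bundled with this post, but the shape of such a model can be sketched on synthetic data. The variable names below are illustrative stand-ins, not the real NHS field names:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data: each row is a surveyed patient, with their
# practice's adult and child prompt-revisit rates as predictors.
rng = np.random.default_rng(0)
n = 5000
adult_revisit = rng.uniform(0.0, 0.3, n)
child_revisit = rng.uniform(0.0, 0.2, n)
# Simulate satisfaction that falls as prompt-revisit rates rise.
logit = 1.5 - 4.0 * adult_revisit - 2.0 * child_revisit
happy = rng.random(n) < 1 / (1 + np.exp(-logit))

X = np.column_stack([adult_revisit, child_revisit])
fit = LogisticRegression().fit(X, happy)
print(fit.coef_)  # both coefficients negative: more prompt revisits, lower happiness
```

The fitted coefficients recover the simulated relationship; on the real data the same model structure gives the negative revisit effect described above.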
They also publish many other metrics. Of particular interest to me is the proportion of “band 1” minor treatments. “A Band 1 course of treatment covers an examination, diagnosis (including X-rays), advice on how to prevent future problems, a scale and polish if needed, and application of fluoride varnish or fissure sealant.” It seems likely that people will be happier if their dentist can confine themselves to minor work. A logistic regression model shows this to be the case: an increasing proportion of band 1 treatments leads to increasing happiness.
What’s then interesting is that if we create a model with both the proportion of band 1 treatments and the proportion of prompt revisits, prompt revisits become a positive effect: more revisits make people happier. So it seems people like being able to come in for regular check-ups.
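This kind of sign flip is easy to reproduce on synthetic data: if satisfaction genuinely rises with both the band 1 proportion and the revisit rate, but the two are negatively correlated across practices, the revisit coefficient comes out negative on its own and positive once band 1 is controlled for. A hypothetical sketch:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic illustration of a coefficient sign flip. Band 1 proportion and
# prompt-revisit rate are negatively correlated across practices, while
# satisfaction truly rises with both.
rng = np.random.default_rng(0)
n = 5000
band1 = rng.uniform(0.2, 0.8, n)
revisit = 0.35 - 0.4 * band1 + rng.normal(0, 0.05, n)
logit = -3 + 6.0 * band1 + 5.0 * revisit   # both true effects positive
happy = rng.random(n) < 1 / (1 + np.exp(-logit))

alone = LogisticRegression().fit(revisit.reshape(-1, 1), happy)
joint = LogisticRegression().fit(np.column_stack([band1, revisit]), happy)
print(alone.coef_[0, 0])  # negative: revisits look bad on their own
print(joint.coef_[0, 1])  # positive once the Band 1 proportion is included
```

The numbers here are invented; the point is only that a confounded single-variable model can reverse the sign of the conditional effect.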
So there are interesting interactions going on between variables. There are also some extreme values that might make predictions from a linear model suspect. To produce our final model we turn to regression trees (specifically, gradient boosted machines) using the gbm package in R. We include the following parameters based on courses of treatment:
- Proportion Band 1 treatment
- Proportion Band 2 treatment
- Proportion Band 3 treatment
- Proportion Band 1 urgent treatment
- Proportion Prescription
- Proportion Arrest of Bleeding
- Proportion Adult
- Proportion adult revisits within 3 months
- Proportion child revisits within 3 months
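The model itself is fitted with R’s gbm package. As a rough equivalent, here’s how the same kind of model could be sketched in Python with scikit-learn, again on synthetic stand-in data (only three of the nine measures are simulated, to keep it short):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.inspection import partial_dependence

# Synthetic stand-ins for three of the practice measures listed above.
rng = np.random.default_rng(1)
n = 4000
band1 = rng.uniform(0.1, 0.7, n)
band2 = rng.uniform(0.1, 0.5, n)
adult_revisit = rng.uniform(0.0, 0.3, n)
X = np.column_stack([band1, band2, adult_revisit])
logit = -0.5 + 3.0 * band1 - 1.5 * band2 + 1.0 * adult_revisit
happy = rng.random(n) < 1 / (1 + np.exp(-logit))

gbm = GradientBoostingClassifier(n_estimators=200, max_depth=3,
                                 learning_rate=0.05).fit(X, happy)
print(gbm.feature_importances_)  # relative influence, as in R's summary(gbm)
pdp = partial_dependence(gbm, X, features=[0])  # dependence on Band 1 proportion
```

The feature importances play the role of gbm’s relative influence ranking, and `partial_dependence` provides the raw numbers behind the partial dependence plots shown below.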
We start by showing partial dependence plots for the variables ordered by relative influence (left-to-right, top-to-bottom):
The central range of 95% of the data is shown by red lines. Some of the variables are approximately linear in this range but many non-linear sections can be seen.
In terms of happiness there are negative effects from treatment bands 2 and 3 and positive effects from treatment band 1. Adult and child revisits within 3 months are positive factors (the same potential surprise as we saw above). Note there’s interesting behaviour with the proportion of adults: around 86% adults appears ideal.
We can calculate model predictions for any particular practice by fitting models to bootstrap resamples and then using out-of-bag predictions.
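One possible shape for that procedure (a sketch with a hypothetical helper name, not the post’s actual R code):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def bootstrap_oob_predictions(X, y, n_boot=50, seed=0):
    """Fit a model on each bootstrap resample; each practice collects
    predictions only from models whose training sample excluded it
    (out-of-bag), giving a per-practice distribution of predictions."""
    rng = np.random.default_rng(seed)
    n = len(y)
    preds = [[] for _ in range(n)]
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)       # bootstrap resample
        oob = np.setdiff1d(np.arange(n), idx)  # practices left out this round
        model = GradientBoostingClassifier(n_estimators=100).fit(X[idx], y[idx])
        for i, p in zip(oob, model.predict_proba(X[oob])[:, 1]):
            preds[i].append(p)
    return preds

# 95% interval for practice i: np.percentile(preds[i], [2.5, 97.5])
```

Percentiles of each practice’s out-of-bag prediction distribution then give the bootstrap confidence intervals reported at the end of the post.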
To see the quality of fit, we divide the practices into deciles by the model’s predictions and report the mean surveyed happiness for each decile, firstly weighting each practice equally and secondly weighting each response equally. We generally hit the right prediction range in the latter case but are less accurate in the former. This suggests there may be some unexplained variability in practices with small numbers of responses (these tend to be smaller practices).
| Decile | Prediction range | By practice | By response |
|--------|------------------|-------------|-------------|
| 1 | [0.572, 0.740) | 0.709 | 0.707 |
| 2 | [0.740, 0.765) | 0.718 | 0.745 |
| 3 | [0.765, 0.778) | 0.775 | 0.774 |
| 4 | [0.778, 0.787) | 0.764 | 0.782 |
| 5 | [0.787, 0.795) | 0.789 | 0.792 |
| 6 | [0.795, 0.803) | 0.801 | 0.793 |
| 7 | [0.803, 0.810) | 0.798 | 0.803 |
| 8 | [0.810, 0.817) | 0.825 | 0.817 |
| 9 | [0.817, 0.825) | 0.826 | 0.822 |
| 10 | [0.825, 0.859] | 0.843 | 0.841 |
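The decile summary itself is straightforward to compute; a sketch with hypothetical column names and simulated values standing in for the real per-practice file:

```python
import numpy as np
import pandas as pd

# Hypothetical per-practice columns: a model prediction, a surveyed happy
# proportion, and the number of survey responses (all simulated here).
rng = np.random.default_rng(2)
n = 1000
df = pd.DataFrame({
    "pred": rng.uniform(0.57, 0.86, n),
    "n_resp": rng.integers(1, 40, n),
})
df["happy"] = np.clip(df["pred"] + rng.normal(0, 0.05, n), 0, 1)

df["decile"] = pd.qcut(df["pred"], 10, labels=False) + 1
by_practice = df.groupby("decile")["happy"].mean()  # each practice weighted equally
by_response = df.groupby("decile").apply(
    lambda g: np.average(g["happy"], weights=g["n_resp"]))  # weighted by responses
print(by_practice.round(3))
```

Comparing the two weightings per decile is what reveals the extra unexplained variability among practices with few responses.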
A quick glance at the data geographically (not shown) suggests that practices with low-end predictions are concentrated towards cities.
Despite knowing there’s unexplained variability in the data I found the model predictions useful to help me choose a practice. In case you do too, the 95% bootstrap confidence intervals from our model by practice are here.
Q: What else can we do with the data? Are there any other data sources so we can factor out patient demographics in some way? Are there some measures for patient healthiness we can use rather than happiness?
Disclaimer: This post only addresses correlations not causal relationships. The properties observed could be due to demographics of the patients rather than anything to do with the practice. There is unexplained variation in the data this model doesn’t capture. Happiness is not the same as healthiness. Do look at other sources of information in particular the Care Quality Commission reports on inspections.
Data licence: “NHSBSA DS Data Warehouse, NHSBSA Copyright 2014”. This information is licensed under the terms of the Open Government Licence: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3