Revisiting the display of the Clarke grid
The Clarke error grid was defined in 1987. Initially developed to evaluate the accuracy of the BG meters in use at that time against reference values, it has become one of the "gold standards" (along with MARD) used to evaluate the performance of glucose measurement methods. Because it was initially defined from a clinical point of view, it may seem a bit arbitrary today: in the neighborhood of a reference value of 240 mg/dl and a measured value of 180 mg/dl, zones A, B and D are extremely close. In my opinion, the attribution of a zone to any value pair in that area should be taken with a grain of salt.
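To make that corner concrete, the zone boundaries can be written out as a short classification routine. This is a sketch based on the commonly published piecewise definitions of the 1987 grid, not any particular implementation; the function name is mine, values are in mg/dl, and the boundary conventions (<= vs <) vary slightly between published versions.

```python
def clarke_zone(reference, measured):
    """Return the Clarke error grid zone ('A'-'E') for one value pair (mg/dl).

    Sketch based on the commonly published 1987 zone definitions;
    boundary conventions (<= vs <) differ slightly between implementations.
    """
    # Zone A: measured within 20% of reference, or both values hypoglycemic
    if (reference <= 70 and measured <= 70) or \
       (0.8 * reference <= measured <= 1.2 * reference):
        return "A"
    # Zone E: readings that would lead to treating the opposite condition
    if (reference >= 180 and measured <= 70) or \
       (reference <= 70 and measured >= 180):
        return "E"
    # Zone C: readings that would lead to overcorrecting acceptable values
    if (70 <= reference <= 290 and measured >= reference + 110) or \
       (130 <= reference <= 180 and measured <= (7.0 / 5.0) * reference - 182):
        return "C"
    # Zone D: dangerous failure to detect hypo- or hyperglycemia
    if (reference >= 240 and 70 <= measured <= 180) or \
       (reference <= 175.0 / 3.0 and 70 <= measured <= 180) or \
       (175.0 / 3.0 <= reference <= 70 and measured >= (6.0 / 5.0) * reference):
        return "D"
    # Zone B: everything else (benign errors)
    return "B"
```

With these boundaries, the corner mentioned above really is razor-thin: a pair of (240, 180) lands in zone D, (239, 180) falls through to zone B, and (225, 180) sits exactly on the edge of zone A.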
It's full of stars (or numbers)!

For the next 20 years, from 1987 to 2007, diabetics were doing a few checks every day and only rarely had reference values at their disposal - maybe a few laboratory checks now and then. Researchers in clinical studies had more data, but for shorter periods of time. A quick black and white Clarke grid was all that was needed for a visual analysis of their tests. With the arrival of CGMs, we are now literally drowning in data. And too much data usually means visualization trouble. Take a look at a conventional black and white Clarke error grid when it is used to compare CGM data to meter calibration points: somewhat useful, but a bit crowded. We can see that the CGM performs relatively well compared to the blood glucose meter, and that's about it.
Adding a bit of color and transparency makes the grid much easier to interpret and provides interesting additional information. We can now see the frequency of calibrations in certain zones. And, assuming one calibrates at fairly representative moments of the day, one can almost visualize the average glucose value for the range of dates considered (the actual calculated average is 108 mg/dl). As an added bonus, the use of colors has allowed me to identify minor bugs in a couple of "standard" Clarke grid implementations. Since the grid background code is independent from the routine that plots each data pair and attributes its zone, any mis-attributed data point will appear with the wrong color for its zone.
Here's an example of such a slightly buggy plot using a "reference algorithm".
As you can see, some of the data pairs classified as being in zone A should actually be counted as zone B at the top end, and some of the data pairs classified as zone B should actually be in zone A. The error comes from a single variable comparison done in the wrong order. [programmer's rant - it seems that the more competent developers are - or think they are - the more they use extremely short variable names so their code looks neat and tight. Longer, more explicit names don't look as cool, but they are much easier to keep track of.]
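I won't reproduce the offending code, but swapping the two operands in the zone A comparison is one plausible way to get exactly this symptom, so here is an illustrative reconstruction (both functions and all names are mine, not the actual buggy source). The correct test takes a ±20% band around the reference value; with the variables in the wrong order, the band is effectively taken around the measured value instead, and the two bands diverge more and more as the values grow.

```python
def in_zone_a(reference, measured):
    """Correct zone A test: measured within +/-20% of the REFERENCE value."""
    return 0.8 * reference <= measured <= 1.2 * reference


def in_zone_a_buggy(reference, measured):
    """Same comparison with the two variables swapped: the 20% band is
    effectively computed around the MEASURED value instead, i.e. the
    accepted range becomes [reference / 1.2, reference / 0.8]."""
    return 0.8 * measured <= reference <= 1.2 * measured


# At a reference of 300 mg/dl the correct A band is [240, 360] while the
# buggy one is [250, 375], so points swap between A and B at the top end:
# (300, 245) is true zone A but the buggy test pushes it into zone B, and
# (300, 370) is true zone B but the buggy test accepts it as zone A.
```

The absolute gap between the two bands grows with the glucose values themselves, which is consistent with the mismatched colors showing up only at the top end of the grid.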
And here is the visual representation I am now using.
Does it even matter?

For the typical user, keeping track of his calibration accuracy is probably overkill. However, visualizing his calibrations as a whole can help him develop an optimal calibration strategy and assess the progress made as that strategy is implemented. In my opinion, CGM accuracy will be an essential parameter to track when an artificial pancreas becomes available, if it still relies on a user-calibrated CGM input.
And I would be remiss not to add the usual caveat: all these results are likely to be biased. Some of the biases coming into play here are:
- BG meter accuracy: I have a good understanding of our BG meters' accuracy, but some meters are somewhat unreliable and/or used improperly.
- by comparing calibration points against the pre-calibration CGM values, I am, in theory, biasing the results against the CGM, since one could expect that it is precisely when the CGM needs a calibration that it is not performing optimally.
- conversely, by using a calibration strategy that avoids tricky situations (fast changes, very low values, etc.), I am biasing the results in favor of the CGM.