Wednesday, August 2, 2017

Clean, but shorter, Dexcom G4 (505) Freestyle Libre comparison

Since my previous post has triggered a few private reactions, here’s another comparison of a fairly standard situation, with clean data: clocks are in perfect synchronisation, there are climbs (pre-game carb loading) and falls, including a severe low (delayed hypo).

On the left, the data as downloaded. On the right, the data shifted for the best correlation (which basically means that the Dexcom data is rolled back in time to erase the delay). That post-mortem analysis is both realistic and a bit unfair to the Dexcom. Realistic because the Libre raw data matches historical data quite well. A bit unfair because the Libre only provides delayed and adjusted historical data. Adjusted relative to what? The spot checks. As I have shown many times on this blog, spot checks are typically even faster than the Dexcom in practice, with the drawback that they are really inaccurate at times, especially on the high side.

lookingatsensors

In this case, the best correlation is found with a shift of 5-6 minutes (Libre ahead of the Dexcom by 5-6 minutes). This is fairly typical of what we see with the Libre vs the 505 when everything works well for both sensors. That’s the tricky part in practice, of course: adhesion issues and desynchronisation between insertions (i.e. comparing a fresh Dexcom to a Libre in its second week) all play a role.
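For readers who want to try the "shift for the best correlation" idea on their own exports, here is a minimal sketch. It is not the code used for the plots in this post. It assumes both signals have already been resampled onto the same 1-minute grid (real Dexcom and Libre exports use different intervals and would need interpolation first). The helper names `pearson` and `best_shift` are made up for the illustration, and the demo data is synthetic.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def best_shift(leading, lagging, max_shift):
    """Hypothetical helper: try rolling `lagging` back by 0..max_shift samples
    and return (shift, r) for the shift with the highest correlation."""
    best = (0, -1.0)
    for s in range(max_shift + 1):
        # Pair leading[i] with lagging[i + s]: the lagging sensor shows at
        # sample i + s what the leading sensor showed at sample i.
        a = leading[: len(leading) - s]
        b = lagging[s:]
        r = pearson(a, b)
        if r > best[1]:
            best = (s, r)
    return best

# Synthetic demo (made-up numbers, not the data from this post):
# a smooth curve on a 1-minute grid and a copy of it delayed by 6 minutes.
libre = [120 + 60 * math.sin(t / 25.0) for t in range(300)]
dexcom = [libre[0]] * 6 + libre[:-6]

shift, r = best_shift(libre, dexcom, max_shift=20)
print(shift, round(r, 3))  # the recovered shift is 6 samples (6 minutes)
```

On real data the correlation peak is usually broader than in this toy example, which is one reason to quote a range such as 5-6 minutes rather than a single value.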

Broadly speaking, the sensors see the same thing. The 505 data is a bit more bumpy: that is a consequence of the adaptive 505 algorithm and, of course, of the smoothing introduced by the Libre historical data.

One important point: as you can see in the left Bland Altman plot, two well working sensors can show very significant differences based on timing and rate of change.
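To make that point concrete, here is a small sketch of the usual Bland Altman quantities (per-pair mean, per-pair difference, bias and 1.96 SD limits of agreement). It assumes two series on a common 1-minute grid. The function name `bland_altman` and the data are made up for the illustration: the "Dexcom" series is simply the "Libre" series delayed by 6 minutes, so both toy sensors are perfectly accurate and differ only in timing.

```python
import statistics

def bland_altman(a, b):
    """Return per-pair means, per-pair differences (a - b), the bias
    (mean difference) and the 1.96 SD limits of agreement."""
    means = [(x + y) / 2 for x, y in zip(a, b)]
    diffs = [x - y for x, y in zip(a, b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return means, diffs, bias, (bias - 1.96 * sd, bias + 1.96 * sd)

# Synthetic demo (made-up numbers): glucose rises by 2 mg/dL per minute
# for 30 minutes, then stays flat. The second series lags by 6 minutes.
libre = [100 + 2 * min(t, 30) for t in range(60)]
dexcom = [libre[max(t - 6, 0)] for t in range(60)]

means, diffs, bias, limits = bland_altman(libre, dexcom)
print(max(diffs))   # 12 mg/dL: rate of change (2 mg/dL/min) times delay (6 min)
print(diffs[-1])    # 0: the difference vanishes once glucose is stable
```

In this toy case the whole spread of the plot comes from delay multiplied by rate of change, not from any accuracy problem in either series.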

Regardless of the absolute magnitude of the differences, a consistent behavior emerges: the Libre overshoots highs compared to the Dexcom and undershoots lows to a lesser (absolute) extent. This type of behavior could be the consequence of the calibration slope of the BGM used to calibrate the Dexcom, but we have observed the same behaviors with different BGMs (Menarini GlucoMen LX, Roche Accu-Chek Mobile, Abbott’s Libre BGM). If you are interested in that behavior, the 2014 and 2015 posts on this blog provide additional insight.

The third screen is a log/log plot privately suggested by L. and is basically a Bland Altman on steroids that amplifies the visualization of the differences in behavior in a way that is less dependent on absolute differences. (I am sure I will be corrected if I didn’t get that right).


Beautifying the data


Now, let’s look at the old Clarke plot of the Dexcom vs the Libre (yes, I know, Clarke plots are out of fashion, but I have had the function for ages, so why not…).

First the un-shifted data plot.

beforeshift


Quite a decent match: you would not have killed yourself by relying on either device.

Now, the delay corrected data plot.
aftershift

Isn’t that something? We have gained almost 8% in the A zone.
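The effect is easy to reproduce on made-up data. The sketch below is not the function behind the plots above. It only checks the standard Clarke zone A rule (estimate within 20% of the reference, or both values below 70 mg/dL) and counts the share of pairs that pass. Which sensor plays the "reference" role is an arbitrary assumption here, and the helper names are hypothetical.

```python
def in_zone_a(ref, est):
    """Clarke zone A test for one pair of values in mg/dL."""
    if ref < 70 and est < 70:
        return True
    return abs(est - ref) <= 0.2 * ref

def zone_a_percent(ref, est):
    """Percentage of (ref, est) pairs that fall in zone A."""
    pairs = list(zip(ref, est))
    return 100.0 * sum(in_zone_a(r, e) for r, e in pairs) / len(pairs)

# Synthetic demo (made-up numbers): glucose falls by 4 mg/dL per minute,
# and the estimate is the same curve delayed by 6 minutes.
ref = [250 - 4 * t for t in range(40)]
est = [ref[max(t - 6, 0)] for t in range(40)]

before = zone_a_percent(ref, est)           # pairs taken as recorded
after = zone_a_percent(ref[:-6], est[6:])   # estimate rolled back by 6 minutes
print(before, after)  # 82.5 100.0
```

Both toy series carry exactly the same information. Only the pairing in time changes, and the zone A share moves by more than 17 points.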

Now, this doesn’t mean anything in absolute terms. For all we know, the Dexcom could have been right and the Libre could have been overshooting. Only one thing is certain: the delay.

But this tells us something else: it is extremely easy to tweak test results to your liking. Something as simple as asking patients to test 2 hours after a meal vs asking them to test 1.5 hours after a meal, something seemingly as innocuous as using standard meals or standard sport sessions can have a drastic impact on the numbers. In a market where T1D fanboys love to argue about the 1% MARD advantage of their sensor (while at the same time losing 10% MARD or more through home made hacks), a couple of percent of differences can mean a huge amount of good publicity…
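As a rough illustration of how much the choice of sampling moment matters, here is a toy MARD calculation (mean absolute relative difference, in percent of the reference value). Everything in it is an assumption made for the example: the reference curve, the 6-minute delay, and the helper names `mard` and `delayed`. The simulated sensor has no error other than its delay.

```python
def mard(ref, est):
    """Mean absolute relative difference, in percent of the reference."""
    return 100.0 * sum(abs(e - r) / r for r, e in zip(ref, est)) / len(ref)

def delayed(series, minutes):
    """Hypothetical delayed sensor on a 1-minute grid: holds the first
    value for `minutes` samples, then repeats the series with that lag."""
    return [series[0]] * minutes + series[: len(series) - minutes]

# Made-up reference curve: 60 minutes steady at 110 mg/dL,
# then a 60-minute rise of 2 mg/dL per minute.
steady = [110] * 60
rising = [110 + 2 * t for t in range(60)]
ref = steady + rising
est = delayed(ref, 6)

flat_mard = mard(ref[:60], est[:60])   # pairs taken while glucose is flat
rise_mard = mard(ref[60:], est[60:])   # pairs taken during the rise
print(flat_mard, round(rise_mard, 1))
```

The same simulated sensor scores a MARD of 0 in the flat hour and several percent during the rise, simply because of when the pairs were taken.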

Tuesday, August 1, 2017

Non clean Dexcom vs Libre comparison

Real life has interfered – that would probably be a good “psychological burden of chronic disease” post if I were in the mood – and, while the blog hasn’t been updated, it isn’t dead yet.
Here’s a new comparison between the Libre and the Dexcom 505. Unlike one of the previous comparisons posted here, this one is utterly “unclean”. In short:
  • this was a tennis tournament week, with frequent games.
  • Max forgot to scan with the Libre, or simply forgot the Libre reader. The straight green lines are those no data periods.
  • both sensors were on the arms: we experienced several adhesion issues and patched as we went.
  • variability is much higher than usual because we were “pre-loading” a bit for games (not very useful, but better than starting too low anyway) and experienced severe delayed hypos on a couple of occasions, despite minimal Levemir doses (5U / 24 hours)
In other words, ultra messy real life…

ERRATUM: G4 505 vs Libre - legend copy paste error. Thanks to KS for spotting it.
image
While I would not draw too many conclusions from such an awful data set, here are some comments:

It is good to have a backup. We lost a Dexcom sensor almost at once (not shown here) and the Libre started dangling after a few days. Interestingly, the Libre started to read a bit too low and its sensing delay increased a lot. The yellow marker on the above chart marks the moment the sensor was nearly lost. When Max noticed (or paid attention), we used a bit of Opsite to stabilize the sensor and normal operation resumed.

The Libre remains, in general, faster than the Dexcom 505 algorithm, and even more so if one looks at spot checks (with the drawback that those can be off when the trend changes suddenly). We now have a year or so of side by side data and experience and the result is always the same. Yes, on occasion the Dexcom will pick up a trend before the Libre does (as reflected in historical data) but I don’t remember seeing it pick up a trend before Libre spot checks. Depending on the data set, the optimal correlation between the two signals consistently gives a 6 to 10 minute advantage to the Libre.

Note: I am not really that interested in collecting additional very clean data. In order to make a rigorous comparison, we need to sync the device clocks on a regular basis, we need precise reference points such as “timecode” BG tests, we need mechanically stable sensors, reminders to scan at least once every 8 hours, etc… All of this adds to the management burden of a teen T1D and that is something I don’t really need.

In practice, that speed advantage needs to be taken with some caution:
  • the Libre historical data is computed and corrected a posteriori (as shown here). It is not useful in real time.
  • the Libre spot checks are typically faster than historical data, but the delay compensation (combined with the so-so temperature compensation) often introduces overshoots.
Still, the Libre remains our favorite sensor for sports.
Excluding the excursions introduced by the interpolation, the Bland Altman plot is relatively flat. Still, I wouldn’t draw any conclusion in terms of absolute slopes/biases because the G4 505 depends to a large extent on the calibration it receives (the nasty non-linearity of the original G4 has been reduced in the current sensors/algorithm combo).

I realize quite a few issues I addressed here need a more detailed discussion, more data and detailed examples. Please treat this post as a simple keep-alive ping.