SMU Data Science Review


Millions of people live with diabetes worldwide [7]. To mitigate some of the many symptoms associated with diabetes, an estimated 350,000 people in the United States rely on insulin pumps [17]. For many of these people, how effectively their insulin pump performs is the difference between sleeping through the night and life-threatening emergency treatment at a hospital. Three programmed therapy settings govern effective insulin pump function: Basal Rate (BR), Insulin Sensitivity Factor (ISF), and Insulin-to-Carbohydrate Ratio (ICR). For many people using insulin pumps, these settings do not match their physiological needs. While existing reinforcement learning models can predict actual physiological values for these settings, they require iteration and can be slow.

The primary contribution of this research is a pipeline capable of providing instant predictions close to a patient's actual physiological ISF, ICR, and BR from 30 days' worth of data. In theory, this reduces patient waiting periods from the roughly 6-8 weeks required by existing reinforcement learning models to 30 days. The pipeline can serve as an aid in recommending pump therapy settings.

Data used in this study include 1,000 simulated multivariate insulin pump time series. These time series were generated by a proprietary simulator developed by Tandem Diabetes Care, and they also integrate simulated continuous glucose monitor (CGM) data.

This research proposes a pipeline for predicting actual patient BR, ISF, and ICR. Feature engineering, a component of this pipeline, included contextual consensus time series motif analysis. Models in the pipeline include time series native techniques, such as Deep Convolutional Neural Networks (DNN) with Long Short-Term Memory (LSTM) input layers, and aggregation-based models, such as Ridge regression and Lasso.

Aggregation-based ridge regression showed the most promising results, outperforming a naive model and a DNN model. For the data evaluated and with a 20% holdout test set, aggregation-based ridge regression predicted the following normalized patient pump settings: ISF with a Mean Absolute Error of roughly 9.0%, ICR with a Mean Absolute Error of roughly 5%, and BR with a Mean Absolute Error of roughly 6%. This performance is likely due to the reduction that aggregation-based methods perform on each patient time series, collapsing each one into a single tuple, which makes these methods less susceptible to noise and sparse signals.
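
As an illustration only, the aggregation idea described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the pipeline used in the study: the data are random synthetic series rather than output of the Tandem simulator, the two channels ("glucose" and "insulin"), the summary statistics (mean and standard deviation), the penalty alpha, and every function name are hypothetical choices made for this example. It shows each patient series being reduced to a single tuple, a closed-form ridge fit on an 80/20 split, and a comparison against a naive model that always predicts the training mean.

```python
import random
import statistics


def aggregate(series):
    """Collapse one multivariate series (a list of channels) into a single tuple."""
    features = []
    for channel in series:
        features.append(sum(channel) / len(channel))  # channel mean
        features.append(statistics.pstdev(channel))   # channel spread
    return tuple(features)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_ridge(X, y, alpha=1.0):
    """Closed-form ridge regression; the intercept (first weight) is not penalized."""
    Xb = [[1.0] + list(row) for row in X]
    p = len(Xb[0])
    XtX = [[sum(r[i] * r[j] for r in Xb) for j in range(p)] for i in range(p)]
    for i in range(1, p):
        XtX[i][i] += alpha
    Xty = [sum(r[i] * t for r, t in zip(Xb, y)) for i in range(p)]
    return solve(XtX, Xty)


def predict(w, X):
    return [w[0] + sum(wi * xi for wi, xi in zip(w[1:], row)) for row in X]


def mean_absolute_error(y_true, y_pred):
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def run_demo(n_patients=200, n_steps=100, seed=0):
    """Synthetic stand-in data: NOT the simulator or features used in the study."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n_patients):
        setting = rng.random()  # hidden normalized setting in [0, 1)
        glucose = [100 + 50 * setting + rng.gauss(0, 10) for _ in range(n_steps)]
        insulin = [1 + rng.gauss(0, 0.2) for _ in range(n_steps)]
        X.append(aggregate([glucose, insulin]))  # one tuple per patient
        y.append(setting)
    split = int(0.8 * n_patients)  # 80% train, 20% holdout
    X_train, X_test = X[:split], X[split:]
    y_train, y_test = y[:split], y[split:]
    w = fit_ridge(X_train, y_train, alpha=1.0)
    ridge_mae = mean_absolute_error(y_test, predict(w, X_test))
    train_mean = sum(y_train) / len(y_train)
    naive_mae = mean_absolute_error(y_test, [train_mean] * len(y_test))
    return ridge_mae, naive_mae


ridge_mae, naive_mae = run_demo()
print(f"ridge MAE: {ridge_mae:.3f}, naive MAE: {naive_mae:.3f}")
```

Because the target in this sketch is normalized to the unit interval, the Mean Absolute Error can be read as a fraction of the setting's range. The averaging inside the aggregation step is what suppresses per-sample noise here, which is consistent with the explanation offered above, although the synthetic signal is far simpler than real or simulated pump data.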

One limitation of this study is that the simulated data assume constant values of ISF, ICR, and BR over 24-hour periods for people with diabetes. In practice, this is not the case; ISF, ICR, and BR fluctuate throughout the course of a day. A future consideration would be to use simulated data with non-constant 24-hour ISF, ICR, and BR profiles.

Insulin pumps greatly improve management and outcomes for people with diabetes. Ideally, by instantly improving programmed values of ISF, ICR, and BR, people relying on insulin pumps can spend less time worrying about their pump working ineffectively, and sleep through the night knowing they are less likely to suffer a diabetes-related medical emergency. To this end, the researchers hope that the ideas, pipelines, and inference presented here are further explored and tested.

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial 4.0 License