Comparing Non-EEG Deep Learning Seizure Forecaster to a Rate-Matched Random Forecaster
Abstract number: 1.12
Submission category: 2. Translational Research / 2D. Models
Year: 2019
Submission ID: 2421116
Source: www.aesnet.org
Presentation date: 12/7/2019 6:00:00 PM
Published date: Nov 25, 2019, 12:14 PM
Authors: Daniel M. M. Goldenholz, Beth Israel Deaconess Medical Center; Robert Moss, SeizureTracker LLC; Haoqi Sun, Massachusetts General Hospital; M. Brandon Westover, Massachusetts General Hospital
Rationale: The unpredictable nature of seizures contributes to poor quality of life for people with epilepsy. Forecasting seizure risk using electronic diaries alone, without EEG, has been proposed as a potential technique to decrease this unpredictability. Here we present such a tool and demonstrate that it performs substantially better than chance, as measured by a rate-matched random forecaster.
Methods: Data were obtained from SeizureTracker.com from 6,803 patients who had sufficient valid data. A deep learning approach (multilayer artificial neural network) sequentially used electronic patient-reported diaries from 84 days in the past to forecast the risk of a seizure occurring within the next 24 hours (hereafter: AI forecaster). The model was trained on 4,947 patients' data using k-fold internal cross-validation to optimize hyperparameters. A 7-layer network comprising 53,601 free parameters was selected. Then, a holdout set of 1,856 patients that the optimized network had not been trained on was used for testing. For comparison, the average daily seizure rate from each 84-day history was used to forecast the chance of a seizure within the coming 24 hours (hereafter: rate-matched random forecaster) in the same holdout set. Forecasts were produced from both models for all available diary data from each patient. Bootstrapping whole patient diaries with replacement 5,000 times was used to obtain the confidence interval of the estimates. To quantify the calibration of forecasts (how well the probabilities predicted by the model match the observed event frequencies), we used Brier scores (0 is best) and Brier skill scores (positive means valid, 1 is best). To quantify performance when forecasts are compared to a threshold optimized to make binary predictions, we calculated the area under the receiver operating characteristic curve (AUC; 1 is best).
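The evaluation described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes each diary is a list of daily 0/1 flags (1 = at least one seizure that day), that the rate-matched forecast is the fraction of seizure days in the preceding 84 days, and that the confidence interval is a 95% percentile interval over patient-level resamples; the abstract does not specify these details. All function names are hypothetical.

```python
import random
from statistics import mean

HISTORY_DAYS = 84  # look-back window stated in the abstract


def rate_matched_forecasts(diary):
    """Forecast each day's seizure probability from the prior 84 days.

    Assumption: `diary` is a list of 0/1 flags, and the forecast is the
    fraction of seizure days in the preceding window.
    Returns (forecasts, outcomes) as parallel lists.
    """
    forecasts, outcomes = [], []
    for day in range(HISTORY_DAYS, len(diary)):
        window = diary[day - HISTORY_DAYS:day]
        forecasts.append(sum(window) / HISTORY_DAYS)
        outcomes.append(diary[day])
    return forecasts, outcomes


def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast and outcome (0 is best)."""
    return mean([(f - o) ** 2 for f, o in zip(forecasts, outcomes)])


def brier_skill_score(score_model, score_reference):
    """Relative improvement over a reference forecaster (1 is best)."""
    return 1.0 - score_model / score_reference


def bootstrap_brier_interval(per_patient, n_boot=5000, seed=0):
    """Percentile interval for the Brier score, resampling whole patients.

    `per_patient` is a list of (forecasts, outcomes) tuples, one per patient.
    Assumption: a 95% percentile interval; the abstract does not say which
    interval was reported.
    """
    rng = random.Random(seed)
    scores = []
    for _ in range(n_boot):
        # Draw patients with replacement, keeping each diary intact.
        sample = rng.choices(per_patient, k=len(per_patient))
        pooled_f = [f for fs, _ in sample for f in fs]
        pooled_o = [o for _, os_ in sample for o in os_]
        scores.append(brier_score(pooled_f, pooled_o))
    scores.sort()
    lower = scores[int(0.025 * n_boot)]
    upper = scores[int(0.975 * n_boot) - 1]
    return lower, upper
```

Resampling whole patients rather than individual days keeps the days within a diary together, which respects the correlation between days from the same patient. The same scoring functions could be applied to the AI forecaster's outputs so that both forecasters are compared on identical holdout days.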
Results: Histograms were generated showing the forecast values obtained during true seizure and true no-seizure days for each forecaster. These histograms demonstrate that the AI forecasts outperform rate-matched random forecasts. A calibration curve comparing the two forecasting methods shows the AI forecaster to be much closer to ideal than the alternative. The Brier score was 0.094+/-0.004 for AI and 0.131+/-0.005 for random, demonstrating that AI is superior. Computing a modified Brier score for only true positives yielded 0.377+/-0.013 for AI and 0.675+/-0.008 for random, showing that AI is more accurate in forecasting true positives. A modified Brier score for only true negatives yielded 0.029+/-0.001 for AI and 0.007+/-0.000 for random, showing that random is more accurate at forecasting true negatives. The Brier skill score was 0.284+/-0.016, suggesting that AI is a valid improvement over random. The AUC for binary prediction was 0.869+/-0.016 for AI and 0.831+/-0.009 for random, again showing AI to be superior to random.
Conclusions: The deep learning seizure forecasting model substantially outperforms a rate-matched random forecasting method on a variety of metrics when applied to a holdout set of patients. This suggests that the method provides a valid 24-hour seizure risk forecast. Future studies will be needed to determine the clinical applicability of this technique. It may be of value to use these forecasts as Bayesian priors for biosensor-based forecasting techniques on short timescales.
Funding: This study was funded in part by NINDS grant T32NS048005 and by a BIDMC Department of Neurology grant.