fitgrid: A Python package for multi-channel event-related time series regression modeling

Thomas Urbach, Andrey Portnoy
2021 Journal of Open Source Software  
Electrical brain activity related to external stimulation and internal mental events can be measured at the scalp as tiny time-varying electric potential waveforms (electroencephalogram; EEG), typically a few tens of microvolts peak to peak (Berger, 1930). Even tinier brain responses, too small to be seen by the naked eye in the EEG, can be detected by repeating the stimulation, aligning the EEG recordings to the triggering event, and averaging them at each time point (Dawson, 1951, 1954).
Under assumptions that the brain response (signal) is the same in each recording and the ongoing background EEG (noise) varies randomly, averaging improves the estimate of the "true" brain response at each time point as the random variation cancels. The average event-related brain potential (ERP) and its counterpart for event-related magnetic fields (ERFs) are cornerstones of experimental brain research in human sensation, perception, and cognition (Luck & Kappenman, 2013).

Smith and Kutas pointed out that the average ERP at each time $t$ is mathematically identical to the estimated constant $\hat{\beta}_0(t)$ for the regression model $y(t) = \beta_0(t) + \varepsilon(t)$, fit by minimizing squared error (Smith & Kutas, 2015a). The average ERP can be viewed as a time series of model parameter estimates. Generalizing to more complex models such as multiple regression $y = \beta_0 + \beta_1 X_1 + \ldots + \beta_p X_p + \varepsilon$ likewise produces time series of estimates for the constant and each regressor coefficient, the $\hat{\beta}_0(t), \hat{\beta}_1(t), \ldots, \hat{\beta}_p(t)$ dubbed regression ERP (rERP) waveforms (see Smith & Kutas, 2015a, 2015b for discussion of related approaches). This holds for straight-line fits ("slope" rERPs) as well as models of curvilinear relationships such as spline regression (Smith & Kutas, 2015b).

Besides the estimated coefficient rERPs, the approach also produces time series for all the basic and derived quantities of the fitted model: coefficient standard errors, residuals, likelihood, Akaike information criterion (AIC), and so forth. With the shift from averaging to regression modeling, however, comes a new problem: fitting, diagnosing, comparing, evaluating and interpreting large numbers of regression models.

Statement of need
Interpreting recordings of brain responses and drawing inferences from patterns of systematic variation is based on statistical comparison and evaluation of candidate models.
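The equivalence between the average ERP and the intercept-only regression estimate noted by Smith and Kutas can be checked numerically. The following is a minimal sketch using NumPy on simulated data, not the fitgrid API; the array sizes, the simulated waveform, and the predictor `x` are all assumptions made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 200 event-aligned recordings (epochs), 50 time points each.
n_epochs, n_times = 200, 50

# Made-up "true" response plus a made-up effect of a continuous predictor x.
signal = np.sin(np.linspace(0.0, np.pi, n_times))
x = rng.normal(size=n_epochs)          # one predictor value per epoch
effect = 0.5 * signal                  # how strongly x modulates the response
noise = rng.normal(scale=5.0, size=(n_epochs, n_times))
eeg = signal + np.outer(x, effect) + noise   # shape (n_epochs, n_times)

# Classic average ERP: mean across epochs at each time point.
average_erp = eeg.mean(axis=0)

# Intercept-only model y(t) = b0(t) + e(t), fit by least squares.
# Passing a 2-D right-hand side fits every time point in one call.
X0 = np.ones((n_epochs, 1))
beta0, *_ = np.linalg.lstsq(X0, eeg, rcond=None)    # shape (1, n_times)

# The estimated constant is the average ERP (up to floating point error).
assert np.allclose(beta0[0], average_erp)

# Intercept plus slope: each row of `betas` is one rERP waveform.
X1 = np.column_stack([np.ones(n_epochs), x])
betas, *_ = np.linalg.lstsq(X1, eeg, rcond=None)    # shape (2, n_times)
intercept_rerp, slope_rerp = betas
```

Each row of `betas` is a time series of coefficient estimates, which is what the text calls an rERP waveform. Note that once `x` is in the model, the intercept row matches the plain average only if `x` is mean-centered.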
Whereas fitting a regression model is straightforward on current scientific computing platforms, informative modeling, by contrast, is a laborious process that iterates cycles of data quality control, fitting, data diagnosis, model evaluation, comparison, selection, and interpretation, with numerous decision points that require thought and judgment. Modeling digitized multichannel EEG data as regression ERPs at each time point and data channel multiplies the iterative cycles in a combinatorial explosion of times × channels × models × comparisons. For instance, at a digital sampling rate of 250 samples per second, in 3 seconds of 32-channel EEG data there are 24,000 data sets (= 3 × 250 × 32). Fitting a set of three candidate models requires 72,000 separate model fits, where the size of each ...

* Corresponding author.
Urbach et al. (2021). fitgrid: A Python package for multi-channel event-related time series regression modeling. Journal of Open Source Software, 6(63), 3293. https://doi.
doi:10.21105/joss.03293 fatcat:4w7awtglcbbphk6icnyosa6p4m
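The model count given in the Statement of need follows directly from the grid dimensions, and the grid itself can be fit in a few vectorized calls. The sketch below uses plain NumPy on simulated noise as a stand-in and does not show fitgrid's interface; the number of epochs, the predictors `x1` and `x2`, and the three candidate designs are hypothetical.

```python
import numpy as np

# Figures taken from the text.
sampling_rate_hz = 250
duration_s = 3
n_channels = 32
n_models = 3

n_times = sampling_rate_hz * duration_s      # 750 time points
n_datasets = n_times * n_channels            # one data set per time x channel cell
n_fits = n_datasets * n_models
print(n_datasets, n_fits)                    # 24000 72000

# Hypothetical simulated data: 40 epochs of pure noise.
rng = np.random.default_rng(1)
n_epochs = 40
eeg = rng.normal(size=(n_epochs, n_times, n_channels))
x1 = rng.normal(size=n_epochs)
x2 = rng.normal(size=n_epochs)
ones = np.ones(n_epochs)

# Three nested candidate models (an assumption for illustration).
designs = {
    "1": np.column_stack([ones]),
    "1 + x1": np.column_stack([ones, x1]),
    "1 + x1 + x2": np.column_stack([ones, x1, x2]),
}

# Flatten the time x channel grid so each column is one data set.
Y = eeg.reshape(n_epochs, n_datasets)

# Residual sum of squares for every model at every grid cell.
sse = {}
for name, X in designs.items():
    betas, *_ = np.linalg.lstsq(X, Y, rcond=None)   # shape (n_coefs, n_datasets)
    resid = Y - X @ betas
    sse[name] = (resid ** 2).sum(axis=0).reshape(n_times, n_channels)
```

Each entry of `sse` is a times × channels grid of a fitted-model quantity, one value per model fit, so the three grids together hold the 72,000 fits from the example. Ordinary least squares vectorizes this way because every cell shares the same design matrix; models that must be fit iteratively cell by cell would not.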