The timing of exploratory decision-making revealed by single-trial topographic EEG analyses

Athina Tzovara, Micah M. Murray, Nicolas Bourdaud, Ricardo Chavarriaga, José del R. Millán, Marzia De Lucia
2012 NeuroImage  
Decision-making in an uncertain environment is driven by two major needs: exploring the environment to gather information or exploiting acquired knowledge to maximize reward. The neural processes underlying exploratory decision-making have been studied mainly by means of functional magnetic resonance imaging, overlooking any information about the time when decisions are made. Here, we carried out an electroencephalography (EEG) experiment in order to detect the time when the brain generators responsible for these decisions have been sufficiently activated to lead to the next decision. Our analyses, based on a classification scheme, extract time-unlocked voltage topographies during reward presentation and use them to predict the type of decision made on the subsequent trial. Classification accuracy, measured as the area under the receiver operating characteristic (ROC) curve, was on average 0.65 across 7 subjects. Classification accuracy reached a plateau for each of the subjects after ~510 ms on average. We speculate that decisions were already made before this critical period, as confirmed by a positive correlation with reaction times across subjects. On an individual subject basis, distributed source estimations were performed on the extracted topographies to statistically evaluate the neural correlates of decision-making. For trials leading to exploration, there was significantly higher activity in dorsolateral prefrontal cortex and the right supramarginal gyrus, areas responsible for modulating behavior under risk and deduction. No area was more active during exploitation. We show for the first time the temporal evolution of differential patterns of brain activation in an exploratory decision-making task on a single-trial basis.

... the subjects' decisions already from the presentation of reward, at the level of averages across trials and subjects (Cohen and Ranganath, 2007). Modulations of the EEG responses following reward presentation are also present at the single-trial level (Philiastides et al., 2010), allowing discrimination between switch/stay decisions, although the temporal aspects of this discrimination have not yet been explored. In the present study, in order to investigate fine-grained temporal information, we carried out an EEG experiment while subjects were facing the 4-armed bandit problem with four classes (Daw et al., 2006; Bourdaud et al., 2008).
In such a high-level cognitive task, inter-subject variability cannot be neglected, as individual subjects employ different strategies (Daw et al., 2006), an effect also linked to genetic polymorphisms (Frank et al., 2009). We therefore carried out analyses at the single-subject level, using a classification scheme that allows us to discover the neural correlates of decision-making that best predict subjects' behavior (see Hampton and O'Doherty, 2007, and Bourdaud et al., 2008, for similar approaches based on functional magnetic resonance imaging (fMRI) and EEG, respectively). The main difference here is that prediction is not the goal of the study per se, as it was in Bourdaud et al. (2008), but rather a strategy for statistically evaluating when enough information is available in the EEG to accurately classify future decisions. Without making explicit assumptions about the neural underpinnings of decision-making, we consider voltage topographies that best discriminate exploratory and exploitatory behavior in a time-unlocked manner. Classification based on voltage topographies has been reported in lower-level tasks in the visual and auditory domains (De Lucia et al., 2007; Murray et al., 2009; Tzovara et al., 2011). Here, we show in a more challenging context that EEG topographies can accurately predict behavior, with the advantage of being neurophysiologically interpretable: any change in them is the result of a change in the underlying brain generators.

Materials and Methods

Experimental paradigm

Participants

Seven healthy individuals (2 females), aged from 25 to 27 years (mean age 26.4 years), participated. Data from these individuals have been previously published in an investigation of the role of EEG oscillatory activity at single electrodes during the exploration-exploitation task (Bourdaud et al., 2008). In the present study we further analyze the temporal aspects of these data, in relation to reward evaluation.
Procedure and Task

The experimental protocol was adapted from a similar fMRI study (Daw et al., 2006). Participants sat in front of a computer screen on which four squares were displayed, representing four slot machines (Figure 1a), where each machine corresponds to a bandit arm. They were instructed to fixate on a red dot at the center of the screen to reduce ocular artifacts. On each trial participants had to choose one machine by pressing a key with the index or middle finger of the corresponding hand (left hand for machines 1 and 3, and right hand for machines 2 and 4). The payoff of the selected machine was displayed one second after the key press and remained on display for another second, followed by the beginning of a new trial. Participants were asked to select the machines so as to maximize their total gain (i.e., the sum of individual payoffs) over a session of 400 trials. Three sessions were recorded for each participant. The payoff of each machine, a numerical value between 0 and 100, was drawn from a Gaussian distribution whose mean changed slowly across the experiment. Before the experiment, each participant was shown nine random examples of the payoff evolution for all the machines; these examples were the same for all participants (for such an example see Figure 1b). Participants, knowing that the machines' payoffs were not static, had to regularly update their knowledge about them and were therefore encouraged to explore.

EEG acquisition

Continuous 64-channel EEG was acquired through a Biosemi Active II system with a sampling rate of 2048 Hz and was referenced to the CMS-DRL ground, which functions as a feedback loop driving the average potential across the electrode montage to the amplifier zero. EEG recordings were not performed inside a Faraday cage, so as to ease reproducibility of any findings for possible future online applications in real-life conditions.
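As a rough illustration of the payoff schedule described under Procedure and Task, the following Python sketch simulates four machines whose Gaussian payoff means drift slowly over a 400-trial session. The text only states that payoffs lie between 0 and 100 and are drawn from a Gaussian whose mean changes slowly; the random-walk drift, the values of `drift_sd` and `noise_sd`, the starting range of the means, and the function name `simulate_payoffs` are all assumptions made for illustration, not the parameters used in the experiment.

```python
import random


def simulate_payoffs(n_trials=400, n_arms=4, drift_sd=2.0, noise_sd=4.0, seed=0):
    """Simulate slowly drifting Gaussian payoffs for a multi-armed bandit.

    drift_sd, noise_sd and the 20-80 starting range are illustrative
    assumptions, not values reported in the paper.
    Returns (means, payoffs), each a list of n_trials lists of n_arms values.
    """
    rng = random.Random(seed)
    # Hypothetical starting means, one per machine.
    means = [[rng.uniform(20.0, 80.0) for _ in range(n_arms)]]
    for _ in range(n_trials - 1):
        previous = means[-1]
        # Assumed slow drift: a Gaussian random walk clipped to [0, 100].
        means.append(
            [min(100.0, max(0.0, m + rng.gauss(0.0, drift_sd))) for m in previous]
        )
    # Observed payoff: Gaussian around the current mean, rounded and clipped to 0-100.
    payoffs = [
        [min(100, max(0, round(rng.gauss(m, noise_sd)))) for m in trial_means]
        for trial_means in means
    ]
    return means, payoffs


means, payoffs = simulate_payoffs()
# Index of the machine with the highest latent mean on each trial.
best_arm = [max(range(len(m)), key=lambda a: m[a]) for m in means]
# How often the best machine changes: a non-static environment rewards exploration.
n_changes = sum(1 for a, b in zip(best_arm, best_arm[1:]) if a != b)
print("trials:", len(payoffs), "changes of best machine:", n_changes)
```

Under these assumed parameters the identity of the best machine changes over a session, which is the property of the task that encourages participants to keep exploring.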
The acquired signal was filtered offline by an eighth-order low-pass Chebyshev Type I filter with a cutoff frequency of 205 Hz and down-sampled to 512 Hz. The filters were applied in both the forward and reverse directions to remove all phase distortion, effectively doubling the filter order. In addition, the electrooculogram was recorded using two electrodes located below and at the outer canthus of the right eye.

Preprocessing

Trials were extracted with respect to the display of the payoff, spanning 100 ms before the display and 780 ms post-stimulus onset (Figure 1a, red thick line). Trials with blinks or ...

Cohen JD, McClure SM, Yu AJ. (2007). Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration. Phil. Trans. R. Soc. B. 362(1481):933-942.
Cohen MX, Ranganath C. (2007). Reinforcement learning signals predict future decisions. J Neurosci. 27(2):371-8.
Daw ND, O'Doherty JP, Dayan P, Seymour B, Dolan RJ. (2006). Cortical substrates for exploratory decisions in humans. Nature. 441(7095):876-9.
De Lucia M, Michel CM, Clarke S, Murray MM. (2007). Single-trial topographic analysis of human EEG: A new 'image' of event-related potentials. Proceedings Information Technology Applications in Biomedicine.
Einhäuser W, Koch C, Carter OL. (2010). Pupil dilation betrays the timing of decisions. Front Hum Neurosci. 4:18.
Ernst M, Bolla K, Mouratidis M, Contoreggi C, Matochik JA, Kurian V, Cadet JL, Kimes AS, London ED. (2002). Decision-making in a risk-taking task: a PET study. Neuropsychopharmacology. 26(5):682-91.
Frank MJ, Woroch BS, Curran T. (2005). Error-related negativity predicts reinforcement learning and conflict biases. Neuron. 47(4):495-501.
doi:10.1016/j.neuroimage.2012.01.136 pmid:22342874 fatcat:zpuidx5udbgc5j7555fssqbrrm