Cerebellar deep nuclei involvement in cognitive adaptation and automaticity
Learning & memory (Cold Spring Harbor, N.Y.)
To determine the role of the interpositus nuclei of cerebellum in rule-based learning and optimization processes, we studied (1) successive transfers of an initially acquired response rule in a cross maze and (2) behavioral strategies in learning a simple response rule in a T maze in interpositus lesioned rats (neurotoxic or electrolytic lesions). Even though lesioned animals showed no impairment in learning the initial stimulus-response association, they had difficulties in transferring the
... uired adapted response rule, and in optimizing their response strategy. These results add information on the role of interpositus nuclei in adaptation to environmental changes.
Over the past decades, cerebellar circuits have been shown to play a major role in the process by which an action becomes automatic with practice in humans (Jenkins et al. 1994; Doyon et al. 1998, 2002; Hubert et al. 2007), and thus in the automatic execution of over-learned cognitive routines. Moreover, whereas the initial acquisition guided by reward or other feedback stimuli seems to be normal in cerebellar patients, reversal learning or transfer of an acquired response rule to a new task is impaired (... Ramnani 2008, 2011; Thoma et al. 2008). In animal models of cerebellar dysfunction, rule-learning deficits and impaired optimization were similarly observed. A deficit in learning the procedural component of spatial tasks was reported after hemicerebellectomy (for reviews, see Petrosini et al. 1998; Rondi-Reig and Burguière 2004) and in L7-PKCI transgenic mice (Burguière et al. 2005). Deep nuclei lesions were shown to particularly affect acquisition and reversal learning (Joyal et al. 2001). More recently, lesions of the interpositus nuclei were shown to prevent the development of habit-based behavior with overtraining, without altering the initial goal-directed behavior, suggesting that deep nuclei are part of the network involved specifically in optimization processes and habit formation (Callu et al. 2007). Similarly, using the mouse mutants L7-PKCI, Burguière et al. (2010) showed that parallel fiber-Purkinje cell long-term depression (LTD) is not required for the initial goal-directed behavior but is required for optimization of motor responses.
To determine the role of the cerebellum in rule-based learning and optimization processes, we first compared interpositus lesioned rats (IN) with Sham-operated rats in learning a simple response rule in a T maze and then transferring this rule to different maze configurations. Second, we used Packard and McGaugh's paradigm (1996), in which animals (IN lesioned and Sham) were trained in a T maze to determine the preferential strategy by which they resolved the task after training or overtraining. Specifically, in the first experiment, bilateral electrolytic lesions of interpositus nuclei were stereotaxically performed, whereas in the second experiment, the IN lesioned group received bilateral microinjections of Colchicine. Details of surgery methods have been reported previously (Callu et al. 2007, 2009a). Briefly, (1) bilateral electrolytic lesions were stereotaxically performed using a 2-mA direct current lasting 15 sec through a varnished stainless steel 150-µm microelectrode at the following sites relative to lambda: AP −2.3 mm, ML ±2.4 mm, and DV −4.2 mm (Paxinos and Watson 1986); (2) bilateral microinjections of Colchicine (2 µg in 0.2 µL of tridistilled water) were stereotaxically delivered into the interpositus nuclei through a glass micropipette, which was left in place for 5 min after injection, at the following two sites in the cerebellum relative to lambda: AP −2.3 mm, ML ±2.2 mm, DV −5.2 mm, and AP −2.6 mm, ML ±2.0 mm, DV −5.2 mm (Paxinos and Watson 1986). After completion of the tasks, rats were transcardially perfused with 4% PFA in PBS, and their brains were extracted, cryosectioned (40-µm thick), and stained with thionine. In the first experiment, the lesion extended from −2 to −2.8 mm relative to lambda, with a maximum of destroyed tissue at AP = −2.5 and partial destruction of the overlying fibers in some animals (Fig. 1A).
Histological analyses of rats' brains from experiment 2 showed an antero-posterior neurotoxic extension from AP = −2.6 mm to AP = −1.8 mm, with a maximum at −2.3 mm. In some animals, the fastigial and/or the lateral nuclei were partially lesioned. There was no major difference in the lesion extension between the two experiments (Fig. 2A).
We first asked whether interpositus nuclei of cerebellum were involved in rule-based learning. After 3 d of habituation to a black Plexiglas cross maze (each arm 10 cm wide × 50 cm long, 52 cm off the floor) and to the reward (Choco rice), rats were trained to find the reward by turning 90° to the right from the starting point for four consecutive trials. Three different steps were carried out. During step 1, two possible starting points (north or south) and two possible choices (left or right) were used. During step 2, four possible starting points (north, east, west, or south) were used, but still with two possible choices (left or right). For step 3, four possible starting points (north, east, west, or south) were used with three possible choices (left, straight, or right). Entries into the unbaited arms were scored as incorrect responses. When rats reached four correct responses on two consecutive days, they moved on to the next step of the experiment. Rats that failed to reach the criterion after 12 training days were eliminated. IN-lesioned and Sham animals reached the learning criterion in a similar number of sessions (F < 1), and the same percentage of rats per group reached the criterion on each session (step 1, Fisher Exact Test P's > 0.05 for every session and all comparisons) (Fig. 1B). Moreover, latencies of correct responses decreased with training during the last four sessions of this initial learning step for both groups (Sham, F(3,18) = 9.96, P < 0.01; IN, F(3,21) = 4.55, P < 0.05) (Fig. 1C).
There was no statistical effect of lesion (F(1,13) = 2.67, ns) or Lesion × Day interaction during these four days (F(3,39) = 2.30, ns). On steps 2 and 3, Sham rats decreased their number of sessions to reach the criterion from the first to the third step (F(2,12) = 46.35, P < 0.001). IN rats needed to relearn the task at steps 2 and 3, with a decrease in performance between the last session of step 1 and the first criterion session of step 2 (Fisher Exact Test, Sham group, ns; IN group, P < 0.01) and between the last session of step 2 and the first criterion session of step 3 (Fisher Exact Test, Sham, ns; IN, P < 0.01). Sham rats' performance was significantly better than IN rats' performance during the first 3 d of step 3 (Fisher Exact Test, P's < 0.05), and IN rats needed more sessions to reach the criterion than Sham rats (F(1,13) = 5.72, P < 0.05) (histograms in Fig. 1B). Mean response latencies still decreased during step 2 and 3 sessions for all rats (step 2, F(1,7) = 7.29, P < 0.05; step 3, F(1,7) = 7.19, P < 0.05). Interpositus lesioned animals were thus impaired either at elaborating an appropriate response rule as rapidly as control animals, or at transferring the learning rule acquired in a certain context to a new, more complex experimental situation.
Even though fixed relationships between start and goal arms were intended to evaluate the praxic components of navigation, the exact strategy used by the animals could not ultimately be determined. To assess the preferential strategy used by IN and Sham rats in the maze, we used Packard and McGaugh's paradigm (1996), in which animals were trained or overtrained in a wooden cross maze made of four arms (north, south, east, and west; 79 cm long and 12 cm wide) of an eight-arm radial maze (the other arms were closed with white boxes).
After 3 d of habituation to the maze and reward (Choco Pops), rats were placed in the start arm (south arm) facing the outside of the maze and were allowed to reach the end of the goal arm, where four or five pieces of chocolate cereal were placed in the food cup. A correction procedure was used on the first 3 d, and then rats entered only one arm. Entries into the unbaited arm of the cross maze (east) were scored as incorrect responses during the training trials, and entries into the baited arm of the cross maze (west) were scored as correct responses. Rats were trained for four trials per day to a criterion of 7/8 correct responses over two consecutive days. Once they reached this criterion, they were subjected to a nonreinforced probe trial on the following day, during which rats were placed in the arm opposite (north) to the usual start arm (south) and the entrance to the south arm was blocked. Rats entering the west arm were designated "place" learners (i.e., animals going to the place where food was located during training) and animals entering the east arm were designated "response" learners (i.e., animals making the same body turn as during training). Immediately after the probe trial, a rewarded trial was run in the same condition as during training to eliminate animals that had responded by chance during this probe trial. Animals that gave an incorrect response were eliminated from the experiment. A second probe trial was given to animals after five consecutive days of overtraining. On the following day, rats were trained in a single reversal session during which the start arm was the same as during training (south arm) but the rewarded goal arm was the east arm (formerly unbaited) and the incorrect arm was the west arm (formerly baited). The number of trials necessary to give the new correct response was noted.
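The scoring rules of this second task can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names `first_criterion_day` and `classify_probe` and the input layout (one count of correct responses per training day) are hypothetical, and it assumes that every training day counts equally toward the criterion. The probe classification relies only on which arm was baited during training (west): a rat returning to that arm went to the place where food had been, whereas a rat entering the other arm repeated its body turn.

```python
from typing import List, Optional

TRIALS_PER_DAY = 4       # from the text: four trials per day
CRITERION_CORRECT = 7    # from the text: 7/8 correct over two consecutive days
BAITED_ARM = "west"      # arm rewarded during training (from the text)
UNBAITED_ARM = "east"    # arm scored as incorrect during training


def first_criterion_day(daily_correct: List[int]) -> Optional[int]:
    """Return the 1-based day on which criterion is first met, or None.

    daily_correct[i] is the number of correct responses (0..4) on day i + 1.
    Hypothetical helper: criterion is met on the second day of the first
    pair of consecutive days totalling at least 7 correct out of 8 trials.
    """
    for day in range(1, len(daily_correct)):
        if daily_correct[day - 1] + daily_correct[day] >= CRITERION_CORRECT:
            return day + 1
    return None


def classify_probe(entered_arm: str) -> str:
    """Label a probe-trial choice made from the north (opposite) start arm."""
    if entered_arm == BAITED_ARM:
        return "place"       # went to where the food was during training
    if entered_arm == UNBAITED_ARM:
        return "response"    # repeated the body turn made during training
    raise ValueError(f"unexpected arm: {entered_arm!r}")
```

For example, a rat scoring 2, 3, 4, 3 correct responses on successive days would first meet the 7/8 criterion on day 3 and would be given its probe trial on day 4.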
Behavioral performance showed that Sham animals needed more days to reach the criterion than IN-lesioned animals (Student's t-test, P < 0.05). On the fifth day of training, significantly more IN-lesioned animals reached the criterion than Sham animals (Fisher Exact Test, P < 0.02). During the first probe trial (Fig. 2B), two Sham rats (one place learner, one response learner) and three IN-lesioned rats (two place learners and one response learner) were discarded because their response was incorrect on the rewarded trial immediately following the probe trial. During this test, Sham animals preferentially used the place strategy (binomial test, P < 0.001), whereas lesioned rats showed no preference (P > 0.05). After overtraining (Fig. 2C), Sham animals shifted from place strategy to response strategy, whereas no behavioral change was observed for lesioned rats (McNemar's corrected χ² test, Sham, P < 0.04; IN, ns). More Sham rats changed from spatial to response strategy than IN rats (P < 0.05) (Fig. 2D). During the reversal session, lesioned rats needed more trials than Sham animals to reverse their response (univariate Student's t-test, IN/Sham, P < 0.02).
During the first task, Sham-operated rats showed a very good transfer of the response rule from the first to the second and third shifts, with little decrease in performance on the first sessions. In contrast, even though lesioned rats acquired the first task as rapidly as control rats, thus showing a normal ability to associate cues (egocentric or allocentric) with a reward, they were unable to transfer this initial learning: during step 2, only 20% of the IN rats were able to reach criterion on the first two sessions, and only 10% reached it on the first sessions of step 3, indicating no real gain in performance across the different transfer steps. Therefore, it would appear that while Sham-operated rats learned the rule of getting the food reward by turning 90° right and could apply it to new configurations of the maze, rats with damage to the interpositus nuclei were unable to use this rule and most likely improved their performance by using spatial information, associating distal cues with food. The perseverative behavior of lesioned rats when the response rule was reversed (turn 90° left instead of turn 90° right) in experiment 2 suggests that the lack of response rule generalization could be due to difficulties in giving up the previously acquired stimulus-response (S-R) association, preventing animals from adapting their response to new environmental cues. It has previously been shown that the cerebellum is involved in the organization of procedural learning, defined as the learning of skills and rules (Pascual-Leone et al. 1993; Molinari et al. 1997; Leggio et al. 1999).
Figure 1. (A) Diagrams of coronal sections (in reference to bregma) of the cerebellum representing the extent of cell loss observed after bilateral electrolytic lesions of the interpositus nuclei and overlying fibers. (B) Performance during task acquisition (step 1) and during generalization (steps 2 and 3) of the egocentric-based task for Sham-operated rats (black squares) and IN-lesioned rats (clear triangles). (C) Response latencies of Sham and IN-lesioned rats during the different steps of the task. (C1) First criterion day; (C2) second criterion day; (C-1) 1 d before C1; (C-2) 2 d before C1. (*) P < 0.05. Data are expressed as means ± S.E.M. Note that there was no difference in learning the initial task (step 1), whereas IN-lesioned animals showed difficulties transferring the response rule when the starting point was changed (steps 2 and 3).
In water maze tasks, cerebellar lesioned rats were unable to adapt a first learned task to a novel situation (Leggio et al. 1999; Colombel et al. 2004). Using a maze procedure, Mandolesi et al. (2001) pointed out a behavioral inflexibility in hemicerebellectomized animals, which perseverated with the same strategy without modifying it even after an error. Similarly, hemicerebellectomized rats were significantly impaired in adapting to ever-changing response rules, being unable to develop a "learning to learn" ability as control rats did (De Bartolo et al. 2009). The present results add to this literature by showing that interpositus nuclei are a key node within the cerebellum, necessary for the development of a "learning to learn" capability, and thus for normal behavioral adaptation.
In contrast to the first experiment, in which the start arms and, consequently, the goal arms changed from trial to trial, thereby pushing rats to base their behavior exclusively on egocentric proprioceptive cues, the second task used a fixed start-fixed arrival procedure. It allowed the parallel and/or successive use of allocentric cues necessary for a spatial strategy of navigation and of egocentric cues for a praxic strategy of navigation. The first strategy is classically considered as evaluating declarative memory and the second strategy as evaluating aspects of procedural learning (Schenk and Morris 1985). In line with previous studies, Sham rats trained in this experiment displayed a place strategy during early training and then predominantly a response strategy after extensive training. IN-lesioned rats did not show a massive use of the spatial strategy during the first probe test, nor did they show the classic shift to an egocentric-based behavior after overtraining. The low percentage of spontaneous alternation exhibited by IN rats, shown in a previous study (Callu et al. 2009a), as well as the greater number of trials needed to change the response during the reversal test, suggest that IN-lesioned rats do not choose a strategy based on a selection of particular cues (spatial or egocentric), but base their response on the first reinforced trials and continue to use this first successful choice without any evolution or change with training. This could explain why IN rats surprisingly reached the learning criterion more rapidly during the initial phase of learning. These results suggest a deficit in the selection process allowing response rule optimization with overtraining.
These findings extend previous results showing that, in animals, discrete lesions of the interpositus nuclei prevented rats from showing the normal evolution from a goal-directed behavior to a habit-based behavior in a simple S-R operant learning task (Callu et al. 2007) and that, in patients, deep nuclei lesions are major risk factors for cognitive impairments (Callu et al. 2009b; Tedesco et al. 2011). It has been postulated that increasing experience induces important cerebellar-frontal interactions (Torriero et al. 2007; Balsters and Ramnani 2011), suggesting that the process of automation may require the more flexible (but slower and less efficient) cortical systems to establish the routines, and then to relinquish control of these routines to cerebellar circuits. Moreover, transfer of learning, which involves retrieval and modification of previously acquired internal models, induces brain activations similar to those observed during the late phase of learning, showing that transfer of learning involves a reduction in the contribution of early learning circuits and increased reliance on the cerebellum (Seidler and Noll 2008; Seidler 2010). Lesioning of interpositus nuclei could interrupt these optimization processes by disrupting the cerebello-cortical circuit.
Interpositus nuclei, by virtue of their central interacting position, seem to be crucial for this type of optimization process and for behavioral adaptation.