SIMULATION AND EXPLANATION IN NEUROPSYCHOLOGY AND BEYOND

Randall C. O'Reilly
1999 Cognitive Neuropsychology  
Introduction

Like our colleagues Young and Burton (in press) (YB), we believe that good models explain a wide range of data, in ways that are motivated by independent theoretical considerations, and bad models explain a narrow range of data, by the ad hoc fitting of the model to the data, divorced from any more general theoretical considerations. Alas, YB's commentary demonstrates the difficulty of applying these seemingly straightforward principles to real models in a given research area. One needs an understanding of both empirical and computational issues before one can meaningfully judge "wide" versus "narrow" and "principled" versus "ad hoc." For example, accounting for a number of highly similar tasks should not be taken as evidence for "wide" explanatory scope, nor should explanations based on general computational principles be judged "ad hoc" because their independent motivation is not drawn from the realm of existing psychological models. We will argue that YB's preference for the IAC model (and localist models more generally) over our Farah, O'Reilly, and Vecera (1993) (FOV) model (and distributed models more generally) is based on a mistaken accounting of breadth of applicability and a neglect of fundamental computational principles, along with more prosaic errors such as a number of apparent mistakes in implementing simulations and a failure to note that several basic predictions of their model are disconfirmed by the available evidence. Underlying this broad pattern of failure to appreciate and attend to computational issues in modeling (from technical issues of implementation to theoretical issues of model predictions and neurobiological plausibility) is a fundamentally different view of the role of computation in psychological explanation. YB deny that features of the computation (such as the distributedness of the representations) are part of the model proper, and can play an explanatory role, instead relegating the computational aspects of psychological models to theory-irrelevant implementation.

We have organized our response into three parts that parallel YB's, addressing their three questions: 1) Which model gives the most complete account of covert recognition in prosopagnosia? 2) Which model has wider applicability to related phenomena in the literature on face recognition? 3) What are the relative merits of the different modeling styles?
We begin by introducing and clarifying a central point of contention between the two modeling styles, the use of localist versus distributed representations. We return to this issue again in part 3. In response to the first question, we point out that YB's model initially explained only a narrow range of data, and their new model explains two new phenomena only by using basic features of FOV, including distributed representations. Further, while their model now captures the two additional covert recognition phenomena, its predictions conflict with other more basic findings about prosopagnosia. FOV, on the other hand, provides a principled explanation of a disparate set of covert recognition tasks, including tasks that YB incorrectly state are beyond its scope, and also accounts naturally for several other features of prosopagnosia. In response to the second question, we show that their critique of our model is based on a number of mistaken beliefs about the capabilities of the FOV model and distributed representation more generally. In response to the third and most general question, we identify a basic difference of approach to modeling that appears to underlie the many other differences between YB's views and our own. Whereas YB regard computation as a tool for simulating already-articulated psychological theories, we view computation itself as potentially explanatory. We then present a sample of the overwhelming body of empirical and computational evidence supporting the reality of, and explanatory value of, distributed representation in human cognition. To YB's lament that distributed systems are more difficult to understand than local, we say "perhaps so," but while this is a relevant criterion for workers in the field of Human-Computer Interaction, it is not relevant for scientists selecting among theories of the natural world. 
O'Reilly & Farah

Explaining the Overt/Covert Dissociation

The strength of the FOV model, in our view, was that it explained the overt/covert dissociation in three fundamentally different tasks on the basis of some very general properties of distributed network computation. Thus a reasonably wide scope of data was explained without invoking any assumptions specifically for that purpose, but rather by showing that they are a natural consequence of independently motivated and commonly used assumptions concerning computation by neural networks. YB's characterization of FOV as a case of "the ad hoc development of models to account for specific phenomena" thus misses both the principled basis of the model's success (e.g., distributed representations were not invented by us for the purpose of explaining covert recognition) and the generality of its scope (three very different manifestations of covert recognition). As we will detail below, the model is also successful in simulating various additional types of tasks, including sequential associative priming, overt familiarity judgments, forced choice cued recognition, and provoked overt recognition, demonstrating an even wider explanatory scope. There is no possibility of ad hoc fitting in these cases, as we simulated these tasks only in response to YB's allegation that it could not be done, and produced successful simulations that depend on the same small set of principles as in our original model.

In contrast, the original IAC model explained behavior on just one general type of covert recognition task, which we originally termed "priming." Furthermore, as we will explain below, it did so by the ad hoc application of a decision criterion, external to the model itself and invoked only for overt tasks. Although the new version accounts for the two other tasks modeled by FOV, it does so with the help of two other features, for a 1:1 ratio of data to assumptions.
With these features, shared with FOV (distributed representations and a learning mechanism), IAC can account for almost the same scope of covert recognition data as FOV. However, even with these features it makes a number of wrong predictions about prosopagnosia.

The FOV Account of Overt/Covert Dissociations

YB argue that the FOV model fails to account for three important aspects of overt/covert dissociations that the IAC model can account for: covert associative priming, overt familiarity judgments, and cued recognition. Here we show this to be wrong in all three cases. Further, we show that a fourth aspect, provoked recognition, which cannot be accounted for by the IAC model, can be simulated with the FOV model.

Sequential Associative Priming

When a face precedes a name by some short time interval, and the two are semantically related (e.g., both members of Britain's royal family), judgments about the name, such as a "famous" versus "not famous" judgment, can be made more quickly than with no face. This finding, called by YB "sequential associative priming," is also shown by some prosopagnosics, and is therefore a form of evidence for covert face recognition. In our 1993 article, we simulated a similar task involving simultaneous face and name presentations, and obtained associative priming as well as interference (delayed response to a name when the face is semantically dissimilar). Because the effects were so similar we grouped them together as one simulation of "priming". As far as covert face recognition is concerned, there is no reason to distinguish between priming by an unrecognized face presented simultaneously with a name, and priming by an unrecognized face presented a few seconds before. In contrast, YB credit IAC with broad scope partly for its ability to simulate both effects, and allege that FOV cannot account for sequential associative priming.
Setting aside, momentarily, the question of whether FOV is really unable to simulate sequential associative priming, consider precisely what YB say FOV cannot do. They do not call our attention to a problem in priming a name judgment with a face, which FOV has already simulated. Nor do they report a failure to obtain priming per se when the face precedes the name. Rather, they were unable to obtain any response to the name preceded by a face, and so could not determine whether a face would or would not prime a subsequent name. YB correctly point out that FOV's attractor states are so strong that subsequent inputs have little effect, making it impossible to simulate any task involving sequentially presented stimuli. This is a well-known problem for attractor networks, and would likely be a problem for our brains as well if not for such factors as the discrete spiking nature of real neurons as compared to the continuous, real-valued outputs of model units, and neuronal fatigue. Fatigue can easily be captured in the model by introducing activation decay after the network settles into an activation state. This is commonly done when networks are used to model sequential processes (e.g., Burgess, 1995; Dayan, 1998). Because we had not set out to simulate any sequential tasks, we did not originally incorporate decay. However, when FOV's activations were decayed after the presentation of the face stimulus, the subsequent presentation of the name stimulus was able to propagate through the network, and the presence or absence of sequential associative priming could be tested. Using the original FOV model, we did not find a substantial priming effect, presumably because of the relatively tiny difference in amount of overlap between the semantic representations of people from the same category and different categories (only one unit).
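The decay manipulation described above can be illustrated with a small sketch. The function name `decay_activations`, the `decay` and `rest` parameters, and the proportional form of the decay are assumptions made for illustration only; they are not taken from the FOV implementation, whose actual parameters are given in Appendix 1.

```python
def decay_activations(acts, decay, rest=0.0):
    """Move each activation a fraction `decay` of the way back toward `rest`.

    decay = 0.0 leaves the settled state untouched (the next input must fight
    the full attractor); decay = 1.0 wipes the state completely (no residual
    activation is left to prime the next input).
    """
    if not 0.0 <= decay <= 1.0:
        raise ValueError("decay must lie in [0, 1]")
    return [a + decay * (rest - a) for a in acts]


# Hypothetical settled semantic state after the face stimulus.
settled = [0.9, 0.1, 0.8, 0.0]

# Partial decay: activations shrink but keep their relative ordering,
# so a trace of the first stimulus is still present when the name arrives.
residual = decay_activations(settled, decay=0.8)
print(residual)
```

With an intermediate `decay` value, the residual pattern is a scaled-down copy of the settled state, which is what lets a later input propagate while still being biased by the earlier one.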
We therefore altered the patterns learned by the model to include semantic representations in which the members of the same category (e.g., "actors") all had overlapping distributed representations constructed as random variations of a common prototype. With this greater within-category semantic overlap, the network exhibited significant sequential associative priming at levels of damage up to 75%, the same degree of damage at which the system performs at chance on an overt task. Note that changing the patterns in this way would not be expected to affect the qualitative pattern of results in any of the previously reported simulations. We confirmed this by replicating the results from the original model. Appendix 1 gives the modeling parameters and results.

Familiarity

Many of YB's criticisms of our simulation of the overt/covert dissociation hinge on tasks involving familiarity judgments. We intentionally avoided simulating such tasks, because they require the modeler to take a stance on the mechanistic basis for familiarity judgments. Although familiarity seems to be a simple concept, the ways in which subjects make familiarity decisions are anything but simple. Perusal of the psychological literature from memory research (e.g., Jacoby, 1991) to lexical decision (which concerns familiarity decisions about letter strings; e.g., Seidenberg, Waters, Sanders, & Langer, 1984) makes clear the variety of factors that come into play, including automatic processes of both a perceptual and conceptual nature, and strategic processes. Modelers have made various attempts to find reasonable computational interpretations of familiarity (Plaut ...), but no consensus has emerged at this point. YB attempt a very easy solution, simply stipulating that familiarity is PIN activation. Rather than incorporate questionable assumptions into our model, we prefer to remain agnostic about the mechanisms of familiarity judgment.
What we give up is the possibility of attempting to simulate some overt-covert dissociations, specifically those designed to include familiarity judgments. Sensible people can disagree, and YB apparently place greater value than we do on simulating all variants of the overt-covert dissociation, as opposed to representative results from each type of task (relearning, priming, and speed of perception), even at the price of incorporating additional assumptions into a model. Therefore they attempted to simulate familiarity judgments with FOV, by assigning settling speed of visual or semantic units the interpretation of familiarity. They report that their simulations using this implementation of familiarity in FOV failed to capture the overt-covert dissociation. Given our reservations about the possibility of any simple implementation of familiarity, we were not surprised to learn of this failure. We were therefore doubly surprised when we could not replicate their reported failure with FOV! Contrary to our own conservatism regarding the computational tractability of familiarity, and also contrary to the reported simulation results of YB, we easily found the overt-covert dissociation when speed of settling in semantic units was used as an overt familiarity measure in our model. Specifically, at 50% damage to the face hidden units, a level of damage at which the various covert measures simulated by us show positive evidence of covert recognition, the averaged results from 50 random samples of forced-choice familiarity decisions showed settling time for familiar faces was not significantly different than the settling time for unfamiliar faces; indeed it was nonsignificantly longer.
Further, when we used a different measure of familiarity known as the goodness (aka negative energy) of the network's activation state, which has been used in several other models (Becker et al., 1997; Borowsky & Masson, 1996; Rueckl, 1995) , we found that the advantage for trained ("familiar") faces over unfamiliar ones disappeared at only 25% damage. Thus, consistent with our reservations, the overt familiarity behavior of the model depends substantially on which familiarity measure is used. Nevertheless, two candidates for a familiarity measure both yielded the desired dissociation. Further details of these simulations are included in Appendix 2. We do not know why YB did not obtain the same result using the settling time familiarity measure. Although they state that they have "attempted to capture loss of familiarity in forced-choice tests" they say nothing about their simulation attempts and at least part of their conclusion is based not on simulation but on the reasoning that "Whenever there is any residual effect of learning, the model will favor a known over an unknown pattern," which they support empirically by reference to our finding (FOV, 1993, Simulation 1) of faster settling in visual units for familiar patterns. This reasoning reveals a misunderstanding of the behavior of distributed interactive networks. The visual units may settle faster with familiar patterns after damage, but as long as they are settling into incorrect states, the inputs to the semantic units for familiar patterns may be as far from well-structured semantic attractor basins as the inputs for unfamiliar patterns. In conclusion, we remain agnostic concerning the correlates of familiarity in neural networks, and therefore assign little weight to the success of the overt/covert simulations using either PIN activation or semantic settling time as a measure of familiarity. 
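For readers unfamiliar with the goodness measure mentioned above, the following sketch shows one common way it is defined for a network with symmetric weights: the sum over unit pairs of weight times the product of the two activations, plus a bias term. The function name `goodness`, the toy weight matrix, and the omission of an external-input term are assumptions for illustration; the measure actually used in our simulations is specified in Appendix 2.

```python
def goodness(weights, acts, bias=None):
    """Goodness (negative energy) of an activation state.

    weights: symmetric square matrix (list of lists), zero diagonal assumed.
    acts:    list of unit activations.
    bias:    optional per-unit bias terms (defaults to zero).
    Each pair of units is counted once (i < j).
    """
    n = len(acts)
    if bias is None:
        bias = [0.0] * n
    pair_term = sum(
        weights[i][j] * acts[i] * acts[j]
        for i in range(n)
        for j in range(i + 1, n)
    )
    bias_term = sum(b * a for b, a in zip(bias, acts))
    return pair_term + bias_term


# Hypothetical 3-unit network: units 0 and 1 support each other,
# and both are inhibited by unit 2.
W = [
    [0.0, 1.0, -1.0],
    [1.0, 0.0, -1.0],
    [-1.0, -1.0, 0.0],
]

consistent = goodness(W, [1.0, 1.0, 0.0])    # mutually supporting units on
inconsistent = goodness(W, [1.0, 1.0, 1.0])  # competing unit also on
print(consistent, inconsistent)
```

States that satisfy more of the constraints encoded in the weights receive higher goodness, which is why a trained ("familiar") pattern would be expected to score higher than an untrained one in an undamaged network.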
But to the extent that semantic settling time, or goodness, are reasonable candidates for familiarity in a neural network, the FOV model easily accounts for the dissociations in question. We cannot explain why YB did not obtain this result empirically, but note that their a priori reasoning was flawed concerning the impossibility of dissociating semantic settling time from covert measures.

Forced Choice Cued Recognition

Cued recognition is another form of covert recognition, in which prosopagnosics can make a correct forced choice decision between two names while viewing a face, even though they cannot overtly judge the face familiar or unfamiliar, or name the face. Contrary to the claim that FOV cannot be made to simulate this phenomenon, our model explains it very naturally, and we are glad for the opportunity to demonstrate FOV's success in another qualitatively different type of task. The important thing to note is that the forced choice cued recognition paradigm provides a strong source of additional constraint on the settling process in the form of the name input to the semantic layer via an intact pathway. This name input is capable of producing the correct corresponding semantic representation by itself, whereas the face input via the damaged pathway is not capable of producing the correct semantic representation by itself. One must be careful not to confuse the behavior of the network with only the weak constraint provided by the damaged face input with that when both this weak constraint and the strong name input are provided. In the latter case (i.e., forced choice cued recognition), the weak additional constraints provided by the face input can have a measurable impact because the network is brought into a region of greater sensitivity to this input by virtue of the strong name input.
In other words, the weak face input by itself produces something like a floor effect, and the additional name input brings the system off this floor so that the damaged face input can now have measurable effects. This reasoning was confirmed by simulation using the FOV model. Using either semantic settling time or goodness as a measure of familiarity, we were able to simulate this cued recognition effect without any additional changes to the model, as described in Appendix 3. For example, at 75% damage to the face hidden units, where overt familiarity measures had long since failed, the system retained the ability to distinguish between correct and incorrect names for the faces.

Provoked Recognition

Provoked recognition is another form of preserved face recognition in prosopagnosia, in which the subject ultimately experiences overt recognition. After seeing a number of faces from a single semantic category, such as actors, faces can be named and, reportedly, experienced as familiar. YB assert that neither IAC nor FOV can account for this finding, but in fact the phenomenon is compatible with a distributed constraint satisfaction architecture and can be simulated by FOV. The gist of the explanation is that repeated presentations of different faces with common semantic subpatterns will result in a build-up of residual activation primarily in that subpattern. This activation will sometimes provide the needed additional constraint to make up for the loss of constraints coming from damaged face representations to allow for successful semantic retrieval and naming. In order to test this interpretation, we presented a set of face input patterns all from the same semantic category (with the same decay manipulation between inputs as used in sequential associative priming), and recorded measures of naming and familiarity as before.
We found that overt recognition was more likely to occur after viewing multiple faces from the same category, as measured by greater familiarity and, at some levels of damage, greater success in naming. Factors contributing to the size of the effect include the amount of semantic pattern overlap and the amount of decay used. Our face-semantic-name patterns were not optimally designed for this simulation, with a common semantic subpattern of only 2 units, and because the model was not originally set up to simulate sequential effects, a relatively large amount of decay was necessary to overcome the strong attractor dynamics of the network, which reduced the level of accumulated activation in those units. Despite these limitations, the effect is reliable. See Appendix 4 for simulation details and results.

So far we have seen that the FOV model is capable of explaining the full range of overt-covert dissociations discussed by YB, and that it does so in a natural way, without alterations designed solely for this purpose. We now turn to the IAC model, which differs both in failing to account for some of the key data on overt and covert recognition in prosopagnosia, and in relying on an ad hoc addition to the IAC model itself to account for the overt-covert dissociation.

Ad hoc Nature of the IAC Account

The original IAC model accounted for priming-based covert measures (associative priming and interference) by assuming that prosopagnosia uniformly attenuates weights from the face recognition units (FRUs) to the person identity nodes (PINs), and that overt task performance such as familiarity judgment requires that a threshold on the activation of the PINs be exceeded. The first time we read this, we assumed that this threshold was of the standard type used in neural network models, and could see how this might play a role in such a dissociation.
But upon rereading, we realized that the explanatory work in this model is being done by a type of threshold that is unlike others discussed in the neural network literature: their threshold serves absolutely no computational purpose within the network, and its function is solely as an overt-covert-dissociation-maker. What does it normally mean for a unit to have a threshold? Units in neural networks summate input activation and also pass on output activation to other units. In many networks, units only pass on activation if the summated activation exceeds a certain value, the unit's threshold. Real neurons also have thresholds in this sense of the word. In the IAC model, however, activation is continuously cascaded between units during both overt and covert tasks. Thus, their overt-covert threshold is not about determining when enough activation has accumulated to be propagated onwards. This applies to all of the units in the IAC model, including the PINs, and indeed it is the continued output from the PINs, despite their attenuated inputs, that underlies the preservation of the covert priming tasks. The PIN "threshold" that underlies the overt-covert dissociation in the IAC model is not part of the IAC model proper. Instead, it is a decision criterion applied to PIN activation levels only when they are used for overt familiarity judgments, and is external to the model, affecting none of the model's activations or weights. In the authors' own words, "Note that these threshold values are (of course) arbitrary. However, the exact thresholds chosen do not affect the processing of the model in any way. Activation is continually passed in a cascade fashion, and the threshold affects only the decision criterion" (Young & Burton, in press).
It is because, and only because, the overt task of familiarity judgment has been stipulated to involve a decision criterion, using this external-to-the-model, arbitrary "threshold," that the IAC model dissociates overt and covert recognition. The essence of the IAC account of the overt-covert dissociation is this: "If one form of recognition is impaired after damage and another is spared, then hypothesize that an arbitrary criterion for minimal quality of processing is required just for the impaired ability and not for the spared one." This account is unsatisfying in the same way that accounts of overt-covert dissociations that feature a "consciousness box" are unsatisfying: While both account for the basic dissociation in a straightforward way, it is just a little too easy to explain a selective impairment in one type of task by postulating a special component of the mind (consciousness system or decision criterion) that happens to be required only for the impaired task, without any other, independent motivation for including that component in the model or involving it in just the impaired tasks. Indeed, although YB seem to regard the IAC account as an improvement over the earlier hypothesis that face recognition had been disconnected from a consciousness system, we do not. A "decision criterion" may sound more mechanistic than a "consciousness system," but we have already shown that it in fact plays no mechanistic role in the behavior of the model, whereas there is at least ample independent precedent for hypothesizing systems involved in conscious awareness. Finally, we note that the ability of the IAC model to account for the two other tasks originally modeled by FOV depends on two additional assumptions, for a 1:1 ratio of model features to effects explained.
A learning mechanism was added to model savings in relearning, and distributed face representations were added to account for familiarity effects in face matching, bringing the IAC account closer to FOV. Even with these features, however, the two models are not equally successful. In the four sections that follow, we review four of the IAC model's predictions about prosopagnosia that are clearly wrong.

IAC Predicts Intact Forced Choice Overt Recognition in Prosopagnosia

When psychologists suspect that performance in a task is limited by a decision criterion that prohibits subthreshold knowledge from being expressed, they turn to a forced-choice paradigm. Instead of asking the subject "Is this an X, yes or no?" they show the subject an X and a Y and ask "Which of these is an X?" Although the strength of the X-hood signal for the X might be below the criterion for deciding "Yes," it could still be greater for the X than for the Y. For this reason, accuracy in forced-choice tasks is sometimes called a criterion-free measure of subjects' ability (Green & Swets, 1966). As we have already seen, the IAC model dissociates overt and covert recognition through the use of a criterion for PIN activation in an overt "yes/no" familiarity task. However, when we switch to this criterion-free forced-choice measure of overt performance, the IAC model always produces perfect performance, even with "prosopagnosic" levels of damage (FRU-PIN link attenuation). Evidence for this can be found in YB's Figure 7a, which shows that a familiar face will always cause higher PIN activation than an unfamiliar or less familiar face. Similarly, YB's Figure 4b shows that although the activation in name units might be too low to exceed a decision criterion after FRU-PIN link attenuation, the unit for the correct name will always be more active than the units for incorrect names, predicting accurate forced choice among names.
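The logic of this argument can be made concrete with a deliberately simplified sketch. It is not the IAC model: PIN activation is replaced by a toy linear function of input strength and link attenuation, and every name and number below (`pin_activation`, `CRITERION`, the input strengths, the attenuation level) is a hypothetical value chosen for illustration.

```python
def pin_activation(input_strength, attenuation):
    """Toy linear stand-in for PIN activation after FRU-PIN attenuation."""
    return attenuation * input_strength


def yes_no_familiar(activation, criterion):
    """Overt yes/no judgment: respond 'familiar' only above the criterion."""
    return activation >= criterion


def forced_choice(act_a, act_b):
    """Criterion-free judgment: pick whichever item is more active."""
    return "a" if act_a > act_b else "b"


FAMILIAR_INPUT = 1.0     # hypothetical input strength for a known face
UNFAMILIAR_INPUT = 0.2   # hypothetical input strength for an unknown face
CRITERION = 0.5          # arbitrary external decision criterion
ATTENUATION = 0.1        # uniform "prosopagnosic" scaling of FRU-PIN links

familiar = pin_activation(FAMILIAR_INPUT, ATTENUATION)
unfamiliar = pin_activation(UNFAMILIAR_INPUT, ATTENUATION)

# The yes/no task fails for the familiar face after attenuation...
print(yes_no_familiar(familiar, CRITERION))
# ...but forced choice still selects the familiar face ("a").
print(forced_choice(familiar, unfamiliar))
```

Because uniform attenuation rescales every activation by the same factor, it preserves their rank order, so any measure that depends only on which item is more active is untouched by the damage.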
Of course, we know that the overt recognition impairment of prosopagnosia is just as evident on forced-choice tests as on "yes/no" and naming tests. Figure 3 in our original 1993 paper shows that FOV reaches chance levels of performance on a forced choice task above 50% damage while continuing to manifest covert recognition by a number of measures.

IAC Predicts Prosopagnosia is All-or-none

Neuropsychological disorders can be mild or severe, and may change their level of severity over time. After an acute injury, the disorder may be severe and then gradually recover, either partially or completely. In degenerative conditions, the reverse may be seen. There is no neuropsychological impairment that is seen only in full-blown form or not at all. In particular, prosopagnosia can exist in mild, moderate or severe forms. Yet the IAC model predicts that patients are either normal at overt face recognition or totally unable to recognize any faces overtly. This problem results directly from the way the model accounts for covert recognition, namely the combination of a threshold on local PINs and uniform weight reduction between face recognition units (FRUs) and PINs. As the weights are attenuated, overt performance will remain unchanged until the familiarity threshold is reached. At that point performance will drop to chance levels, and remain there. One might try to fix this problem by making the weight reductions nonuniform. For example, the most realistic way of implementing damage in a network would be to eliminate some connections altogether while leaving others intact, allowing levels of overt recognition performance to fall anywhere between perfect performance and chance depending on the proportion of connections eliminated.
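The contrast between uniform attenuation and connection removal in a localist scheme can be sketched as follows. This is a toy illustration under strong assumptions, not a reimplementation of IAC: each face has a single FRU-PIN link, an intact link yields activation 1.0, and the function names, the criterion, and the seeded random lesion are all hypothetical.

```python
import random


def overt_rate_uniform(n_faces, attenuation, criterion):
    """Proportion of faces passing the criterion when every link is
    scaled by the same attenuation factor."""
    acts = [attenuation * 1.0 for _ in range(n_faces)]
    return sum(a >= criterion for a in acts) / n_faces


def overt_rate_lesion(n_faces, proportion_cut, criterion, seed=0):
    """Proportion of faces passing the criterion when a random subset
    of links is removed outright and the rest are left intact."""
    rng = random.Random(seed)
    n_cut = round(proportion_cut * n_faces)
    cut = set(rng.sample(range(n_faces), n_cut))
    acts = [0.0 if i in cut else 1.0 for i in range(n_faces)]
    return sum(a >= criterion for a in acts) / n_faces


CRITERION = 0.5  # arbitrary decision criterion

# Uniform attenuation: performance is a step function of damage.
for attenuation in (0.9, 0.6, 0.4, 0.1):
    print("uniform", attenuation, overt_rate_uniform(100, attenuation, CRITERION))

# Connection removal: performance is graded, tracking the intact proportion.
for proportion in (0.1, 0.3, 0.7):
    print("lesion", proportion, overt_rate_lesion(100, proportion, CRITERION))
```

In this sketch the first function shows the all-or-none pattern, and the second corresponds to the connection-removal scheme just described, in which each face is either fully intact or fully lost.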
Unfortunately, this implementation of damage eliminates the overt-covert dissociation: Some faces will be recognized both overtly and covertly, because their FRU-PIN connections are intact, and others will not be recognized either overtly or covertly, because their FRU-PIN connections have been severed. In light of this problem, one might aim for intermediate overt performance by varying the degree of attenuation of FRU-PIN connections without eliminating connections. This would have the desirable result of overt performance measures intermediate between perfect performance and chance, with the possibility of covert recognition for faces not overtly recognized. Unfortunately, it has the undesirable result of predicting perfect test-retest reliability, that is, certain faces always recognized and all other faces never recognized. Weak item effects may be seen with some prosopagnosics, but it is not the case that certain faces are reliably recognized, across different depictions, whereas others are never recognized. A final solution is to combat the perfect consistency of the model by directly adding variability to the activation values of the units. Although this could produce intermediate overt performance without strong item effects, it would be yet another ad hoc addition to the model.

IAC Predicts Prosopagnosia Affects only Familiar Faces

Although the literature contains claims of selective impairment of familiar face processing in prosopagnosia, whenever the perception of unfamiliar faces has been carefully tested it has been found to be impaired (see Farah, 1990; Shuttleworth, Syring, & Allen, 1982, for reviews). Young, Newcombe, de Haan, Small, and Hay (1993) have shown that apparent dissociations between the processing of familiar and unfamiliar faces disappear when time to perform perceptual tests with unfamiliar faces is taken into account; patients may achieve a good accuracy score by abnormally slow and slavish checking of facial features.
Indeed, case PH, in whom covert face recognition has been demonstrated by preserved familiarity effects in face matching, performs simple perceptual face matching poorly (16% errors) and slowly (almost 3 seconds on average) even when the faces are unfamiliar (de Haan, Young, & Newcombe, 1987). In contrast, the IAC model is based on the assumption that the impairment in prosopagnosia lies downstream from the perceptual representation of faces, in a part of the system that exists only for familiar faces, namely connections between the perceptual FRUs and the PINs. The IAC model could be defended by hypothesizing that, for reasons of anatomical proximity, visual face representations are also likely to be damaged in cases of prosopagnosia, and have so far invariably been damaged. The FOV model has the advantage, however, of not requiring coincidental damage to two parts of the system; it is based on the assumption that perception of faces, familiar and unfamiliar, is impaired in prosopagnosia.

IACL Predicts Prosopagnosia is Temporary

The addition of a learning mechanism to the IAC model, resulting in the IACL model, creates another problem: it commits the modelers to the prediction that prosopagnosia is temporary, in that it can be entirely overcome by relearning. Given the way damage and relearning are simulated in IACL, there is nothing that requires the relearning to stop short of perfect performance. Indeed, comparing the results shown in their Figures 4a, b and c, one can see that after 5 trials of relearning, the network has completely recovered to an unlesioned level of performance. This would predict that prosopagnosic patients could recover all of their lost knowledge by simply studying all the faces they once knew for some (apparently relatively short) period of time!
In contrast, relearning in FOV has a low asymptote, because a reduced number of units and weights are available to store the new knowledge: the network, like prosopagnosics, has actually suffered irreparable damage.

IAC's Incorrect Predictions Follow from Theory-Relevant Features

Of course, for every scientific model, some features are theory-relevant and some are not. We have highlighted several ways in which the predictions of the IAC model fail to accord with reality. An important question to ask is whether the IAC model's incorrect predictions result from theory-relevant or theory-irrelevant features. In all cases, the failures derive directly from theory-relevant features, and directly or indirectly from the use of local representations. In both models, covert recognition is the result of a partially functioning system. With FOV's distributed representations, the "partiality" of the system's knowledge of faces consists of a subset of the weights that originally embodied knowledge of the faces' appearance. There is no equivalent way of damaging face representations with IAC's local representations, and so the partiality of functioning instead results from attenuated connection strengths between FRUs and PINs. This difference in the way partial functioning can take place in distributed and local systems accounts for all of the problems that the IAC model encounters in simulating prosopagnosia. The attenuation of FRU-PIN connections cannot account for impaired overt recognition without the imposition of a decision criterion external to the model, but this leaves the model unable to account for impaired overt recognition in criterion-free tasks. The choice between all-or-none prosopagnosia and strong item effects is forced upon the IAC model by its use of local representations, in conjunction with the criterion needed to create the overt-covert dissociation.
Either weights are uniformly attenuated, giving rise to the all-or-none problem, or they are nonuniformly attenuated, giving rise to the perfect test-retest problem. There is no natural way to obtain a gradient of performance with varied amounts of damage other than having one specific face at a time drop from the "always recognized" to the "never recognized" category, without building in variability explicitly for this purpose. In contrast, with distributed representations, each of the units and weights participates in the representation of many faces, and damage to each unit or weight therefore affects many faces. And because each face is represented by many units and weights, damage to each unit or weight has only a moderate effect on recognition of that face. Increasing damage therefore results in a gradient of performance for all faces, and because any particular lesion may by chance affect more of the units and weights involved in one face's representation than another's, there may be weak item effects. The different locations of the lesions in the IAC and FOV models, and their consequent predictions concerning unfamiliar face processing in prosopagnosia, can also be traced to the difficulty of implementing partial or graded performance in systems of local representation. For the reasons just stated, distributed face
doi:10.1080/026432999380979 fatcat:p6rrd4g6ibfcnexa725r3atkpe