Probabilistic models of cognition: exploring representations and inductive biases

Thomas L. Griffiths, Nick Chater, Charles Kemp, Amy Perfors, Joshua B. Tenenbaum
2010 Trends in Cognitive Sciences  
Cognitive science aims to reverse-engineer the mind, and many of the engineering challenges the mind faces involve induction. The probabilistic approach to modeling cognition begins by identifying ideal solutions to these inductive problems. Mental processes are then modeled using algorithms for approximating these solutions, and neural processes are viewed as mechanisms for implementing these algorithms, with the result being a top-down analysis of cognition starting with the function of cognitive processes. Typical connectionist models, by contrast, follow a bottom-up approach, beginning with a characterization of neural mechanisms and exploring what macro-level functional phenomena might emerge. We argue that the top-down approach yields greater flexibility for exploring the representations and inductive biases that underlie human cognition.

Strategies for studying the mind

Most approaches to modeling human cognition agree that the mind can be studied on multiple levels. David Marr [1] defined three such levels: a 'computational' level characterizing the problem faced by the mind and how it can be solved in functional terms; an 'algorithmic' level describing the processes that the mind executes to produce this solution; and a 'hardware' level specifying how those processes are instantiated in the brain. Cognitive scientists disagree over whether explanations at all levels are useful, and over the order in which levels should be explored. Many connectionists advocate a bottom-up or 'mechanism-first' strategy (see Glossary), starting by exploring the problems that neural processes can solve. This often goes with a philosophy of 'emergentism' or 'eliminativism': higher-level explanations do not have independent validity but are at best approximations to the mechanistic truth; they describe emergent phenomena produced by lower-level mechanisms. By contrast, probabilistic models of cognition pursue a top-down or 'function-first' strategy, beginning with abstract principles that allow agents to solve problems posed by the world (the functions that minds perform) and then attempting to reduce these principles to psychological and neural processes. Understanding the lower levels does not eliminate the need for higher-level models, because the lower levels implement the functions specified at the higher levels.
Opinion

Glossary

Backpropagation: a gradient-descent-based algorithm for estimating the weights in a multilayer perceptron, in which each weight is adjusted based on its contribution to the errors produced by the network.

Bottom-up/mechanism-first explanation: a form of explanation that starts by identifying neural or psychological mechanisms believed to be responsible for cognition, and then tries to explain behavior in those terms.

Emergentism: a scientific approach in which complex behavior is viewed as emerging from the interaction of simple elements.

Gradient-descent learning: learning algorithms based on minimizing the error of a system (or maximizing the likelihood of the observed data) by modifying the parameters of the system based on the derivative of the error.

Hypothesis space: the set of hypotheses assumed by a learner, made explicit in Bayesian inference and potentially implicit in other learning algorithms.

Inductive biases: factors that lead a learner to favor one hypothesis over another that are independent of the observed data. When two hypotheses fit the data equally well, inductive biases are the only basis for deciding between them. In a Bayesian model, these inductive biases are expressed through the prior distribution over hypotheses.

Inductive problem: a problem in which the observed data are not sufficient to unambiguously identify the process that generated them. Inductive reasoning requires going beyond the data to evaluate different hypotheses about the generating process, while maintaining uncertainty.

Likelihood: the component of Bayes' rule that reflects the probability of the data given a hypothesis, p(d|h). Intuitively, the likelihood expresses the extent to which the hypothesis fits the data.

Posterior distribution: a probability distribution over hypotheses reflecting the learner's degree of belief in each hypothesis in light of the information provided by the observed data. This is the outcome of applying Bayes' rule, p(h|d).

Prior distribution: a probability distribution over hypotheses reflecting the learner's degree of belief in each hypothesis before observing data, p(h). The prior captures the inductive biases of the learner, because it contributes to the learner's belief in each hypothesis independently of the observed data.

Top-down/function-first explanation: a form of explanation that starts by considering the function that a particular aspect of cognition serves, explaining behavior in terms of performing that function.
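The relationship among the glossary's Bayesian terms can be made concrete with a minimal sketch of Bayes' rule over a two-hypothesis space. The hypotheses, prior values, and coin-flip data below are invented purely for illustration; the point is only how the prior (the learner's inductive bias) and the likelihood p(d|h) combine into the posterior p(h|d).

```python
# Minimal illustration of Bayes' rule with an invented hypothesis space:
# two hypotheses about a coin, 'fair' vs. 'biased' toward heads.
hypotheses = {"fair": 0.5, "biased": 0.8}  # P(heads | h)

# Prior distribution p(h): the learner's inductive bias before seeing data.
prior = {"fair": 0.7, "biased": 0.3}

# Observed data d: a short sequence of coin flips (hypothetical).
data = ["H", "H", "T", "H"]

def likelihood(h, flips):
    """p(d | h): probability of the observed flips under hypothesis h."""
    p_heads = hypotheses[h]
    result = 1.0
    for flip in flips:
        result *= p_heads if flip == "H" else 1.0 - p_heads
    return result

# Bayes' rule: p(h | d) is proportional to p(d | h) p(h),
# normalized over the hypothesis space.
unnormalized = {h: likelihood(h, data) * prior[h] for h in hypotheses}
total = sum(unnormalized.values())
posterior = {h: unnormalized[h] / total for h in hypotheses}
```

With these invented numbers both hypotheses fit the data reasonably well, so the prior does real work: the posterior favors 'fair' largely because the inductive bias did. Setting the prior to be uniform would let the likelihoods alone decide, which is exactly the sense in which the prior expresses the learner's inductive biases.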
doi:10.1016/j.tics.2010.05.004 pmid:20576465