Deep Learning Architectures for Amortized Bayesian Inference in Cognitive Modeling

Stefan Radev
2021
I am thankful to my colleagues-turned-friends who accompanied me through this stage of life: Annika Stump, who represents the non-linearity of a creative life, who basically defined the taste of whiskey and always has an open ear for me; Marco D'Alessandro, the all-round Italian genius with an unceasing well of ideas in his pocket; Mischa von Krause, the master of work-life-balance; and Lukas Schumacher, my fellow investor, who reminded me of the unceasing enthusiasm and curiosity a scientist ... to carry in his or her heart. Extra credits go to my friend-turned-colleague, Maximilian Knapp, and his amazing family. In one way or another, I am indebted to all my international friends, collaborators, student assistants, and the entire SMiP community (especially David Izydorczyk and Martin Schnürch, who know how and what to drink during workshops and conferences).

At a whole different level, I am deeply grateful to my loving family, who never really let me feel away, and especially to my mother, who bravely struggled through all my boring papers and did not miss a single error. Last but not least, this whole journey would have been unthinkable without Karin Prillinger, who was a bit of everything and everywhere.

What I cannot create, I do not understand.
- Richard Feynman

Mathematical models are becoming increasingly important for describing, explaining, and predicting human behavior in terms of underlying mechanisms and systems of mechanisms. Although the ontology of such mechanisms remains largely unknown, their epistemic value and inferential power are now widely acknowledged throughout the behavioral sciences. Broadly speaking, whenever an assumed mechanism transforms information into behavior, it is referred to as a cognitive process. Cognitive processes are the conceptual fabric used to fill the explanatory gap between the mysterious firing of neurons and the mundane recognition of a long-forgotten acquaintance in the morning train. Consequently, modelers of cognitive processes earn their livelihood in an attempt to make the "ghost in a machine" tractable by replacing the ghost with hidden parameters embedded in an abstract functional framework. The purpose of such parametric models is twofold. On the one hand, they can be viewed as formal expedients for understanding the messy and noisy human data in much the same way as the models physicists employ to make sense of the data coming from spiral galaxies and interstellar clouds.
On the other hand, parametric models can be viewed as behavioral simulators and used to mimic the output of cognitive processes by generating synthetic behavior. Interestingly, there is a strange asymmetry in the challenges surrounding these two goals. Simulating behavior requires only specifying a cognitive model as a computer program and running the program with a desired parameter configuration. It is thus a generative process mainly constrained by the creativity and imagination of individual modelers. In contrast, reverse engineering human data to recover hidden parameters is hampered by two external factors: the resolution and abundance of data, and the availability of universal and efficient inferential methods. As for the latter, behavioral scientists have often sacrificed fidelity and complexity in order to adjust their models not to reality but to the limitations of existing inferential methods. Such a strategy is definitely viable in the early (often linear and beguilingly clear) stages of scientific inquiry, but it does not live up to the challenges and questions posed by later (often non-linear and disconcertingly fuzzy) stages.

The main argument of this thesis is that questions of inferential tractability are of secondary importance for enhancing our understanding of the processes under study. Accordingly, the core purpose of this thesis is to develop frameworks which leave such questions to specialized "black-box" artificial neural networks and enable researchers to focus on developing and validating faithful "white-box" models of cognition. Instead of a ready-made solution, the thesis explores the beginning of a solution. It presents a potentially fruitful coupling between human and artificial intelligence, an approach which is expected to gain more and more momentum as the world fills with artificial agents. Ultimately, this thesis strives to increase creativity by embracing complexity.
• Chapter introduces our general BayesFlow framework for solving the task of amortized Bayesian parameter estimation. We demonstrate how to perform inference on data sets with different sizes and probabilistic structure by using specialized network architectures which preserve the probabilistic symmetry of the target Bayesian posterior. We formally derive a training procedure which ensures that neural networks in our framework recover the true target posteriors under perfect convergence of the optimization algorithm. We end the chapter with a simulation study demonstrating the utility of our method.
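
The idea of an architecture that handles data sets of different sizes while preserving a probabilistic symmetry can be illustrated with a small sketch. It assumes i.i.d. observations, for which the posterior does not depend on the order of the data, and uses mean pooling as one common way to obtain a permutation-invariant summary. This is not the BayesFlow API and not necessarily the architecture used in the thesis: the names `make_layer`, `dense_tanh`, and `summary_network` are hypothetical, and the weights are random and untrained.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Create a random dense layer (untrained; for illustration only)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense_tanh(layer, x):
    """Apply a dense layer followed by a tanh non-linearity to one vector."""
    weights, biases = layer
    return [
        math.tanh(math.fsum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(weights, biases)
    ]


def summary_network(data, phi, rho):
    """Map a data set with any number of observations to a fixed-length vector.

    phi embeds each observation separately, the embeddings are mean-pooled
    over observations (which discards their order), and rho transforms the
    pooled vector into the final summary.
    """
    embedded = [dense_tanh(phi, obs) for obs in data]
    n_obs = len(embedded)
    pooled = [math.fsum(column) / n_obs for column in zip(*embedded)]
    return dense_tanh(rho, pooled)


rng = random.Random(0)
phi = make_layer(2, 8, rng)  # per-observation embedding: 2 -> 8
rho = make_layer(8, 4, rng)  # post-pooling transform: 8 -> 4

# Two synthetic data sets of different sizes, each observation 2-dimensional.
small = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(10)]
large = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(250)]

# The same small data set with its observations in a different order.
shuffled = small[:]
rng.shuffle(shuffled)

s_small = summary_network(small, phi, rho)
s_large = summary_network(large, phi, rho)
s_shuffled = summary_network(shuffled, phi, rho)
```

Both data sets are reduced to summary vectors of the same length, and reordering the observations leaves the summary unchanged up to floating-point precision. In an amortized setting, such a summary would then be passed to a separate network that approximates the posterior; that part is omitted here.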
doi:10.11588/heidok.00030807 fatcat:6vdp3u3buzec3pc32napfche5u