How Do Interruptions Affect Productivity?
Duncan P. Brumby, Christian P. Janssen, Gloria Mark
Rethinking Productivity in Software Engineering
Introduction

When was the last time you were interrupted at work? If you use a computer for work and if it has been more than a couple of minutes, count your blessings and be prepared for an upcoming interruption. Modern information work is punctuated by a constant stream of interruptions. These interruptions can be from external events (e.g., a colleague asking you a question, a message notification from a mobile device), or they can be self-initiated interruptions (e.g., going back and
... orth between two different computer applications to complete a task). A recent observational study of IT professionals found that some people interrupt themselves after just 20 seconds of settling into focused work. Given the omnipresence of interruptions in the modern workplace, researchers have asked what impact these have on productivity. This question has been studied in many application domains, from the hospital emergency room to the open-planned office, using a variety of different research methods.

In this chapter, we provide a brief overview of three prominent and complementary research methods that have been used to study interruptions. The methods we review are as follows:

• Controlled experiments that demonstrate that interruptions take time to recover from and lead to errors
• Cognitive models that offer a theoretical framework for explaining why and how interruptions are disruptive
• Observational studies that give a rich description of the kinds of interruptions that people experience in the workplace

For each of these three research approaches, we will explain the aim of the method, why it is relevant to the study of interruptions, and some of the key findings. Our aim is not to offer a comprehensive review of all studies in this area but rather an introduction focusing on our own past research, which spans each of these three methods. We direct the interested reader to more comprehensive reviews of the interruptions literature [28, 44, 45].

Controlled Experiments

There is a long tradition of experiments being conducted to learn about the effect of interruptions on task performance. The earliest studies were conducted in the 1920s and focused on how well people remembered tasks that they had previously worked on. In these experiments, Zeigarnik demonstrated that people were better at recalling the details of incomplete or interrupted tasks than tasks that had been finished.
Since the advent of the computer revolution, research has focused on investigating the impact that interruptions have on task performance and productivity. This shift was probably spurred on by people's annoyances with poorly designed computer notification systems that interrupted them to attend to incoming e-mails or perform software updates while trying to work on other important tasks. Experiments offer a suitable research method to address the question of whether these feelings of being annoyed by interruptions and notifications translate into systematic and observable decrements in task performance.

Before we review what has been learned from interruption experiments, it is worth taking a moment to reflect on the purpose of an experiment. Experiments are designed to test a hypothesis. For example, do people work slower when interrupted compared to when they have not been interrupted? To test this hypothesis, the researcher manipulates a feature of interest (the independent variable), which in our case might be the presence or absence of an interrupting task. The researcher wants to learn whether this manipulation has an effect on an outcome measure (the dependent variable), which in our case might be how quickly a task is completed.

Experiments are designed to test the causal relationship between variables. To do this, the researcher will attempt to control all other extraneous variables. This is why experiments are usually conducted in a controlled setting using a fixed set of instructions and tasks given to all participants who take part in the experiment. In doing so, the researcher wants to be able to isolate whether a change in the independent variable has a reliable (i.e., statistically significant) effect on the dependent variable. If an effect exists, then it should show up time and again through the independent replication of results.
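The logic of testing such a hypothesis can be illustrated with a small analysis script. What follows is only a minimal sketch: it assumes a between-participants design, uses made-up completion times (they are not results from any study), and applies a permutation test as one possible way of checking whether a difference is reliable. The chapter itself does not prescribe a particular statistical test, and the function name `permutation_test` is our own.

```python
import random
from statistics import mean


def permutation_test(control, treatment, n_permutations=5000, seed=0):
    """One-sided permutation test for mean(treatment) > mean(control)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = mean(treatment) - mean(control)
    pooled = list(control) + list(treatment)
    n_control = len(control)
    at_least_as_large = 0
    for _ in range(n_permutations):
        # Randomly reassign the condition labels and recompute the difference.
        rng.shuffle(pooled)
        diff = mean(pooled[n_control:]) - mean(pooled[:n_control])
        if diff >= observed:
            at_least_as_large += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (at_least_as_large + 1) / (n_permutations + 1)
    return observed, p_value


# Hypothetical task completion times in seconds (dependent variable),
# grouped by the independent variable: interrupted or not.
uninterrupted = [41.2, 38.5, 44.0, 39.9, 42.3, 40.1]
interrupted = [47.8, 45.1, 50.3, 44.6, 49.0, 46.2]

effect, p_value = permutation_test(uninterrupted, interrupted)
print(f"mean slowdown: {effect:.2f} s, p = {p_value:.4f}")
```

The independent variable appears here only as the grouping of the two lists, and the dependent variable as the numbers inside them; everything else about the procedure is assumed to be held constant across participants.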
As we will learn in a moment, experiments have consistently shown that interruptions negatively impact task performance.

In a typical interruptions experiment, the researcher will ask a participant to work on a contrived task that they have designed. For example, the participant might be asked to use a computer interface to order some tasty doughnuts. The cover story is provided to give some context to the task that the participant has been asked to work on, and it can be easily adjusted to suit the target domain of the study. For example, naval researchers have asked participants to place orders for the construction of ships, and healthcare researchers have asked participants to place orders for prescription medicines. Regardless of the domain, the researcher gives the participant detailed instructions on how to complete the task using the interface and plenty of opportunities to practice it before starting the main part of the experiment.

In the main part of the experiment, participants will be asked to complete a number of tasks (e.g., place ten orders for doughnuts) using the instructed procedure. While the participant is working on this task, the researcher will occasionally interrupt them and ask them to work on a secondary task instead. The secondary task might require the participant to solve some mental arithmetic problems or use a mouse to track a moving cursor on the screen. In these experiments, the arrival of this interrupting task is carefully controlled by the experimenter, and the participant is often given no choice but to switch from the primary task to the interrupting task. This is because the researcher wants to learn whether the interrupting task affects the quality and pace of the work produced on the primary task.

How Is Disruptiveness of an Interruption Measured?

This discussion leads us to consider how we measure the impact of an interruption on task performance.
The primary measure that has been used is the time it takes a participant to resume work on the primary task after dealing with an interruption. This time-based measure is referred to in the literature as the resumption lag [4, 45]. The resumption lag measures the time it takes a person to re-engage with a task following an interruption. A longer resumption lag following an interruption reflects a general decrease in productivity: people are taking more time to complete a task, even when the time spent working on the interrupting task is deducted. In this way, the resumption lag is taken to reflect the time that is needlessly "wasted" as a consequence of being interrupted and later having to resume an unfinished task.

Over recent years a number of experiments have been reported that use the resumption lag measure to carefully unpack which features of an interrupting task make it disruptive. Experiments have investigated whether longer interruptions are more disruptive than shorter interruptions, finding that longer interruptions result in longer resumption lags [19, 39]. Studies have also been conducted to learn whether there are better or worse points in a task to be interrupted: shorter resumption lags are found when interruptions occur at natural breakpoints in a task, such as the completion of a subtask [2, 7]. The content of an interrupting task also matters: interruptions that are relevant to the primary task are less disruptive than interruptions that have nothing to do with the primary task [17, 21]. As we will discuss, the resumption lag has been explained by assuming that interruptions interfere with people's ability to remember what they were doing prior to the interruption.

Interruptions Cause Errors

When a person resumes a task following an interruption, it often matters whether they get it right or make a mistake.
Previous research has shown that interruptions increase the likelihood of errors being made on a task, in that important components of the task are either repeated or missed [9, 32, 46]. This finding has been taken as evidence to support the idea that following an interruption people fail to remember what they were doing in a task prior to being interrupted.

It has also been informative to consider whether there is a link between how quickly a task is resumed and the likelihood that an error is made. As discussed, interruption researchers have generally considered a longer resumption lag to be a bad thing, reflecting time needlessly wasted following an interruption. In contrast, Brumby et al. found that longer resumption lags following an interruption were in fact beneficial in terms of reducing the occurrence of errors. This has important practical implications for the design of systems to encourage more reflective task resumption behavior in situations where interruptions are commonplace. Based on these findings, Brumby et al. developed and tested a post-interruption interface lockout that allowed users to look at the task interface but prevented them from taking any actions. This interface lockout led to a significant reduction in resumption errors because it encouraged users to take the time to cognitively re-engage with a task before diving back into it and making a mistake.

Moving Controlled Experiments Out of the Lab

A criticism that is often leveled at the kind of interruption experiments that we've reviewed is that the controlled setting in which they are conducted bears little resemblance to people's actual work environments and how they manage the interruptions that they experience at work. In other words, our experiments can lack ecological validity because an important aspect of the phenomena that we are attempting to investigate is missing.
This is an important concern because it means that the results of these interruption experiments might be of limited practical value or that they might not be valid at all when taken away from the controlled setting of the lab and applied to an actual work setting.

How might an interruption experiment lack ecological validity? Interruption experiments are often conducted in controlled environments in which the researcher actively works to remove unwanted distractions and interruptions (e.g., participants will be asked to turn off their phone and give their complete attention to the researcher's task). The reason for this is that the experimenter wants to carefully control the nature and the timing of any interruptions so as to learn how they affect performance. Ironically, this desire for control presents a major threat to the ecological validity of the experiment. This is because most of the everyday interruptions that we experience are not forced but are instead discretionary. For example, an e-mail notification might appear on a screen, but we can choose whether to act on it or ignore it. By using enforced interruptions that participants have to attend to, interruption experiments can fail to capture this important aspect of the phenomena that they are attempting to study in the lab.

To overcome concerns about low ecological validity, Gould et al. have taken an approach that relaxes experimental control over the environment in which participants work, in order to study how naturally occurring interruptions affect performance. To do this, Gould et al. used an online crowdsourcing platform, Amazon's Mechanical Turk, to host an interruptions experiment. Just like in a regular interruptions experiment, participants were asked to use a browser-based task interface to place orders for prescription medicines.
But unlike a traditional lab experiment, participants worked on this task in their regular everyday environment: an office, a coffee shop, or their home. These are naturalistic environments that are filled with everyday interruptions and distractions. In addition, workers on crowdsourcing platforms, like Amazon's Mechanical Turk, often work on multiple tasks at the same time; the environment is designed to encourage workers to complete as many tasks as possible so as to maximize their pay. This means that a competing (interrupting) task is often present, vying for the participant's attention.

By running an interruptions experiment on a crowdsourcing platform, Gould et al. found that workers switched to other tasks once every five minutes. This was revealed by window switching events and pauses in progression through the task. These interruptions were not inserted by the experimenter but were naturally occurring and at the discretion of the participant. Interestingly, this rate of interruptions corresponds to that seen in observational studies. While these interruptions tended to be quite brief (around 30 seconds on average), Gould et al. found that they were sufficient to negatively impact performance on the primary task: participants who interrupted themselves more often were considerably slower at completing the task, even after accounting for the time spent not working on the task. We know this only because the primary task interface was under the control of the researchers; this was not a naturalistic observation study.

Gould et al.'s study provides a bridge between controlled experiments and observational studies; it provides evidence that the disruptiveness of interruptions can be readily detected out in the field and that it is not an artificial product of the controlled setting used in interruption experiments.
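The measures discussed in this section (time away from the task, the resumption lag, and task time net of interruptions) can be derived from a timestamped log of the task interface. The sketch below is illustrative only: the event names `"action"`, `"leave"`, and `"return"` and the function `summarize_log` are hypothetical, and the log is made up. It does not reproduce the logging format used in any of the studies cited here.

```python
def summarize_log(events):
    """Summarize a chronologically ordered list of (time_s, kind) events.

    kind is one of:
      "action" - the participant does something in the primary task
      "leave"  - the task window loses focus (an interruption starts)
      "return" - the task window regains focus (the interruption ends)
    """
    away_durations = []
    resumption_lags = []
    left_at = None       # when the current interruption started, if any
    returned_at = None   # when the participant came back, until they act
    for time_s, kind in events:
        if kind == "leave":
            left_at = time_s
            returned_at = None
        elif kind == "return" and left_at is not None:
            away_durations.append(time_s - left_at)
            left_at = None
            returned_at = time_s
        elif kind == "action" and returned_at is not None:
            # Resumption lag: time from coming back to the first action.
            resumption_lags.append(time_s - returned_at)
            returned_at = None
    total_time = events[-1][0] - events[0][0]
    return {
        "away_durations": away_durations,
        "resumption_lags": resumption_lags,
        # Task time with the time spent on interruptions deducted.
        "time_on_task": total_time - sum(away_durations),
    }


# Hypothetical log for one participant (times in seconds).
log = [
    (0.0, "action"), (2.0, "action"),
    (4.0, "leave"), (34.0, "return"), (37.5, "action"),
    (39.0, "action"),
    (41.0, "leave"), (61.0, "return"), (63.0, "action"),
    (65.0, "action"),
]
summary = summarize_log(log)
print(summary)
```

Note that `time_on_task` still includes the resumption lags. This mirrors the argument in the text: even after the time spent on the interrupting activity is deducted, the time needed to re-engage with the task remains as a cost.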
Summary: Controlled Experiments

By conducting controlled experiments, researchers have been able to establish that task interruptions take time to recover from and lead to errors. Experiments offer an empirical approach for systematically testing whether the manipulation of an independent variable (e.g., the duration of a task interruption) has an effect on a dependent variable (e.g., the duration of the post-interruption resumption lag).

Establishing whether the manipulation of an independent variable has an effect on the dependent variable is of both practical and theoretical value. In practical terms, knowledge is developed about what makes an interruption disruptive, allowing practical interventions to be developed and tested. For example, Brumby et al. established that when people made faster task resumptions, they were more likely to make an error. Learning about this prompted the development of an interface lockout mechanism that stopped users from resuming a task quickly following an interruption, reducing task errors. In theoretical terms, experiments support the development of theories that seek to explain why longer interruptions result in a longer resumption lag. What is the mechanism that causes this? How can it be explained? In the next section, we turn our attention to reviewing efforts to develop theory using cognitive models.

Cognitive Models

Once findings have been made in experiments, the data and results can be used to develop theories about human behavior and thought. Cognitive models can be used to formalize the cumulative knowledge that is gained from experiments into formal theories (e.g., mathematical equations) that can generate predictions for future situations. For example, a mathematical model can be used to predict the likelihood that an error will be made on a task based on the duration of an interruption [4, 7]. Stated differently, cognitive models help to explain why and how interruptions are disruptive.
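To make the idea of a mathematical model concrete, here is a toy example that maps the duration of an interruption onto a predicted error likelihood and a predicted time to recall the suspended task. It is a sketch under stated assumptions, not the model used in the cited work: it borrows the standard activation, recall-probability, and retrieval-latency equations from the ACT-R cognitive architecture, assumes the task goal was encoded only once and never rehearsed, and uses arbitrary parameter values that have not been fitted to any data.

```python
import math


def activation(seconds_since_encoding, decay=0.5):
    """Base-level activation of a memory encoded once: ln(t ** -d)."""
    return -decay * math.log(seconds_since_encoding)


def recall_probability(act, threshold=-2.0, noise=0.4):
    """Chance that a memory with activation `act` is retrieved at all."""
    return 1.0 / (1.0 + math.exp((threshold - act) / noise))


def retrieval_latency(act, latency_factor=1.0):
    """Time in seconds to retrieve a memory with activation `act`."""
    return latency_factor * math.exp(-act)


def error_probability(interruption_s):
    """Predicted chance of failing to recall the suspended task goal."""
    return 1.0 - recall_probability(activation(interruption_s))


# Input: time away from the task. Output: predicted error rate and latency.
for duration in (5, 15, 30, 60):
    act = activation(duration)
    print(f"{duration:>3} s away: "
          f"P(error) = {error_probability(duration):.2f}, "
          f"recall time = {retrieval_latency(act):.1f} s")
```

In this toy version, longer interruptions always yield a higher predicted error likelihood and a slower recall, because activation only decays. A fuller model would also need to account for factors such as rehearsal and cues in the task environment, which are deliberately left out here.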
What Are Cognitive Models?

An important characteristic of cognitive models is that they generate an exact prediction (i.e., generate a number) as an outcome (e.g., likelihood of an error), given an input (e.g., time away from the main task), and a formal description of how input is transformed into output (i.e., a computer program that captures a theory of the process of forgetting). Other more conceptual theories of interruptions or multitasking also provide insight into human behavior and thought but typically tend to miss at least one of these three components (output, input, or transformation step) or describe them in less formal terms, such that the details that are needed to give an exact prediction are not available.

The value of cognitive models lies in their ability to predict aspects of human behavior and thought in detail. Cognitive modeling aims to unravel human thought by uncovering the details and making those details open for scientific debate.

As an example, take the Memory for Goals theory of forgetting, which has been applied to explain the results of interruption experiments. The model can be used to make a prediction for how quickly tasks will be resumed after an interruption. To do so, the model uses a mathematical function, derived from psychological theory, to determine how quickly a person will be able to recall what they were doing prior to dealing with an interruption based on the strength of this memory. The value of the model is that it gives a prediction for how quickly someone will resume a task (i.e., the resumption lag). Moreover, the general theory of memory retrieval that underpins this model helps explain why these resumption lags occur (namely, because of forgetting).

Since the inception of the basic Memory for Goals theory, the theory has been refined in many ways.
Examples include the prediction of errors due to interruptions, the prediction of task switching performance, and the prediction of concurrent multitasking performance. The initial modeling effort was crucial in this regard: by specifying a theory (of forgetting) in detail, it allowed researchers to make predictions regarding how memory impacts other settings, which could then be tested. In the end, these new experiments led to further refinements of the theory and to an even broader understanding of the cognitive mechanisms involved in recovering from an interruption.

Although the value of cognitive models lies in the details, this is also their Achilles' heel. If a model is to be used to make predictions for a new task, then a researcher or practitioner needs to be able to specify those details ahead of time. To then specify those details, they also need to have a detailed understanding of the modeling framework and how these details should be specified within it. This is not feasible for every researcher and practitioner.

Fortunately, building on a long tradition in human-computer interaction research, more and more tools are being made to allow for predictions in applied settings, including dynamic settings such as driving [8, 43]. Moreover, in some cases not all details might be needed to make a prediction. For example, based on the mathematical equations behind Memory for Goals theory, recent work by Fong, Hettinger, and Ratwani was able to predict the likelihood that emergency physicians resumed their original task after an interruption on their everyday emergency ward.

What Can Cognitive Models Predict About the Impact of Interruptions on Productivity?
One of the main insights to come from modeling work using the Memory for Goals theory is that the longer an interruption, the more likely errors are to occur, including forgetting to resume the task altogether (and for specific cases, the models can give even more specific and exact predictions). Therefore, the implication of this work is that there is value in avoiding being interrupted.

Models can also be used to inform our understanding of discretionary self-interruptions. Previous studies have found that people often choose to interrupt themselves, switching between different activities every few minutes [16, 18]. For example, an information worker who is focusing on a particular work activity will still likely choose to monitor and check their e-mail regularly, switching back and forth between application windows. How often should the person switch between these two different activities?

In our own research, we have used cognitive models to examine how the demands of a task affect the benefit of different switching strategies (i.e., how long to focus on one task before switching back to another task). We studied this in the context of a dual-task experiment in which participants had to control a dynamic task while performing a text-entry task [13, 26, 27]. We used a cognitive model to identify the best possible strategy for dividing attention between these two tasks and then compared this to what people actually chose to do in the experiments. Across several studies, we found that people were very quick at locating the best possible strategy for dividing their time between tasks. We learn from this work that people are actually pretty good at multitasking, when the relative importance of each task is made clear to them. Cognitive modeling was a vital step in this work as it was used to identify the best possible switching strategy; without this, it would not have been possible to objectively benchmark how well people were multitasking.
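The benchmarking idea described above (enumerate the possible switching strategies, score each one with a model, and compare the best to what a person actually did) can be sketched in a few lines. The cost function below is a deliberately simple stand-in that we have invented for illustration: it assumes a typing task of fixed length, a fixed time cost per switch, and a penalty on the dynamic task that grows with the square of the time it is left unattended. None of the parameter values or function names come from the cited studies.

```python
def strategy_cost(chars_per_visit, n_chars=20, sec_per_char=0.3,
                  switch_cost=0.5, drift_weight=1.0):
    """Total cost of typing n_chars in visits of chars_per_visit each."""
    cost = 0.0
    remaining = n_chars
    while remaining > 0:
        chunk = min(chars_per_visit, remaining)
        time_away = chunk * sec_per_char  # dynamic task left unattended
        # Time spent typing, plus the cost of switching back, plus a
        # penalty for letting the dynamic task drift while unattended.
        cost += time_away + switch_cost + drift_weight * time_away ** 2
        remaining -= chunk
    return cost


def best_strategy(**task):
    """Return the chars-per-visit value with the lowest modeled cost."""
    n_chars = task.get("n_chars", 20)
    return min(range(1, n_chars + 1),
               key=lambda k: strategy_cost(k, **task))


best = best_strategy()
observed = 4  # hypothetical strategy chosen by a participant
efficiency = strategy_cost(best) / strategy_cost(observed)
print(f"best: {best} chars per visit; "
      f"observed strategy is {efficiency:.0%} as efficient")
```

Changing the relative importance of the two tasks changes the best strategy: with `drift_weight=0.0` the dynamic task no longer matters, and the lowest-cost strategy becomes typing everything in a single visit. This is the sense in which a model gives an objective benchmark against which observed behavior can be compared.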
Summary: Cognitive Models

Cognitive models develop our understanding of why and how interruptions are disruptive. They do this by instantiating theory using mathematical models and simulations. This puts into practice the ideas we have for what is causing an interruption to impact performance. Through this line of research, Memory for Goals has emerged as an important theory. The core idea is that when dealing with an interruption, people forget what it is they were working on. Resuming a task therefore involves remembering what one was doing before the interruption. By casting this as a memory retrieval process, the Memory for Goals theory is able to draw on general theories about the nature of human memory. In practical terms, cognitive models can be used to both explain existing data and make predictions about what will happen in novel situations or settings.

Observational Studies

Whereas controlled experiments and cognitive models enable a focus on testing specific variables while controlling other factors, observational studies (also referred to as in-situ studies) offer ecological validity. For example, in the laboratory, a study of the effects of interruptions may focus on a single interruption type from a single task. In a real-world environment, people generally work on multiple tasks, receiving interruptions from a range of sources. In-situ studies can serve to uncover reasons for people's behavior (i.e., the "why" of people's practices). There is, however, a trade-off between generalizability and ecological validity. Observational studies can be very labor-intensive, limiting the scope and scale of study. Yet, with the current revolution in sensor technologies and wearables, researchers are beginning to leverage these technologies to conduct observational studies at a larger scale. Nevertheless, sensors still introduce limitations on what can be observed and how the data can be interpreted.