Controlled experimentation in continuous experimentation: Knowledge and challenges

Florian Auer, Rasmus Ros, Lukas Kaltenbrunner, Per Runeson, Michael Felderer
2021 Information and Software Technology, vol. 134, article 106551
Context: Continuous experimentation and A/B testing is an established industry practice that has been researched for more than 10 years. Our aim is to synthesize the conducted research. Objective: We wanted to find the core constituents of a framework for continuous experimentation and the solutions that are applied within the field. Finally, we were interested in the reported challenges and benefits of continuous experimentation. Methods: We applied forward snowballing on a known set of papers
and identified a total of 128 relevant papers. Based on this set of papers we performed two qualitative narrative syntheses and a thematic synthesis to answer the research questions. Results: The framework constituents for continuous experimentation include experimentation processes as well as supportive technical and organizational infrastructure. The solutions found in the literature were synthesized into nine themes, e.g. experiment design, automated experiments, or metric specification. Concerning the challenges of continuous experimentation, the analysis identified cultural, organizational, business, technical, statistical, ethical, and domain-specific challenges. Further, the study concludes that the benefits of experimentation are mostly implicit in the studies. Conclusion: The research on continuous experimentation has yielded a large body of knowledge on experimentation. The synthesis of published research presented within includes recommended infrastructure and experimentation process models, guidelines to mitigate the identified challenges, and what problems the various published solutions solve.
[...] comparing different variants of the product to the unmodified product (i.e. A/B testing). This is done by exposing different users to different product variants and collecting data about their behavior on the individual variants. Thereafter, the gathered information allows making data-driven decisions and thereby reducing the amount of guesswork in the decision making. In 2007, Kohavi et al. [1] published an experience report on experimentation at Microsoft and provided guidelines on how to conduct so-called controlled experiments. It is the seminal paper about continuous experimentation and thus represents the start of the academic discussion on the topic. Three years later, a talk by the Etsy engineer Dan McKinley [5] added momentum to the discussion.
In the talk, the term continuous experimentation was used to describe Etsy's experimentation practices. Other large organizations, like Facebook [33] and Netflix [34], which adopted data-driven decision making [35], shared their experiences [36] and lessons learned [37] about experimentation over the years with the research community. In addition, researchers from industry as well as academia developed methods, models and optimizations of techniques that advanced the knowledge on experimentation. After more than ten years of research, numerous works have been published in the field of continuous experimentation, including work on problems such as the definition of an experimentation process [38], how to build infrastructure for large-scale experimentation [39, 159], how to select or develop metrics [40], and the considerations necessary for various specific application domains [41].
doi:10.1016/j.infsof.2021.106551 fatcat:bygtzpqotjc5jipg64dcelk4km