Anti-Patterns in Infrastructure as Code
2018 IEEE 11th International Conference on Software Testing, Verification and Validation (ICST)
In continuous deployment, practitioners use infrastructure as code (IaC) scripts to create and manage an automated deployment pipeline that enables information technology (IT) organizations to release their software changes rapidly at scale. Low-quality IaC scripts can have serious consequences, potentially leading to widespread system outages and service discrepancies. By systematically identifying which characteristics are correlated with low-quality IaC scripts, we can identify
... patterns, i.e., recurring practices with negative consequences, which may help practitioners take informed actions in creating and maintaining defect-free IaC scripts. The goal of this thesis is to help practitioners increase the quality of IaC scripts by identifying development and security anti-patterns in the development of IaC scripts. Using open source repositories, we conduct five research studies and identify (i) defect categories in IaC scripts; (ii) operations that characterize defective IaC scripts; (iii) code properties that correlate with defective IaC scripts; (iv) development anti-patterns for IaC scripts; and (v) security anti-patterns in IaC scripts that are indicative of security weaknesses. From our empirical studies, we observe that the defect category distribution of IaC scripts differs from that of general-purpose programming languages. We observe three operations that characterize defective IaC scripts, namely file system operations, infrastructure provisioning, and user account management. We identify 10 source code properties that correlate with defective scripts, of which size and hard-coded strings as configuration values show the strongest correlation with defective scripts. We identify 13 development anti-patterns, i.e., development activities that correlate with defective scripts. We also identify 9 security anti-patterns, i.e., coding patterns that are indicative of security weaknesses. We hope the outcomes of this thesis, i.e., findings, tools, and datasets, will facilitate further research in the area of IaC.
DEDICATION
To my mother, Dr. Parveen Akhter, the greatest mentor of my life, whose teachings and endless sacrifices inspired me to all my success.
ACKNOWLEDGEMENTS
First of all, I thank the Almighty. Next, I thank my parents, especially my mother, Dr. Parveen Akhter, for her endless sacrifices. I thank my brothers, Partho and Babu, and my wife, Effat, for their support and inspiration.
I want to thank my adviser, Laurie Williams, one of the nicest human beings I have met in my life. Special thanks to my dissertation committee members: Chris Parnin, Tim Menzies, and Jonathan Stallings. In particular, my research and technical skills expanded significantly after I worked with Tim and Chris on multiple research projects. I thank my colleagues from the RealSearch, Alt-Code, RAISE, WSPR, and DLF research groups at North Carolina State University, from whom I learned a lot. I want to thank my Bangladeshi friends from Bangladesh University of Engineering and Technology, University of Connecticut, and North Carolina State University for their support. Special thanks to Tauhidur Rahman of University of Alabama Huntsville, who guided me throughout my PhD career as an unofficial mentor and a brother. I also thank my colleagues at ABB Corporate Research, RedHat, IBM, and IBM Research, who helped me learn about industry perspectives on software engineering research. During my PhD tenure, I authored or co-authored 23 publications with 23 co-authors. I thank all my co-authors, whose feedback made my research better.