Comprehending large code bases - the skills required for working in a "brown fields" environment

Tony Clear
2005 ACM SIGCSE Bulletin  
In the search for answers to the effective teaching of programming at the beginner level, we are now seeing broader programs of research investigate the distinctions between reading, comprehending and writing small programs [1], [2]. In New Zealand we have joined this work with the "Bracelet" project, in which multiple institutions will investigate how students comprehend small computer programs. We hope this may help answer critical teaching and assessment questions. A contrasting stream of research [8], [9], [10] has been investigating how professional programmers comprehend the often large and complex software artefacts which they must maintain. The importance of this work is demonstrated by the figures provided in [8], whose authors assert that "program comprehension is a major part of software development...up to 70% of lifecycle costs are consumed within the maintenance phase and that up to 50% of maintenance costs relate to comprehension alone".
As I prepare to teach our undergraduate software engineering course this semester, I find myself grappling with the question of how to effectively convey to students the twin notions, critical to Software Engineering, of scale and complexity. Our SE course is a mid-degree course in the three-year AUT Bachelor of Computer and Information Science, taken in the semester prior to the final-year capstone. The course attempts to simulate reality by adopting an "authentic learning" approach [12], and by providing a project context to which the concepts taught in the accompanying lecture program may be related. The single-semester duration and size of the course naturally constrain the scope and complexity of the tasks that may be assigned. The challenge is to select a project of a suitable scale and complexity to enable SE processes and practices to be sensibly exercised, while having an assignment that can be successfully completed within the allotted time. Invariably we find ourselves in the situation where perceived complexity is insufficient for students to actively adopt the relevant practices, e.g. configuration management by use of a source control tool; careful use of work breakdown structures in the assignment of roles, tasks and responsibilities; selection of a suitable OO architecture and relevant design patterns; planning for quality assurance and risk management strategies and techniques; and regular monitoring and recalculating of estimates.
One alternative to this approach that we have been considering is to have the teams work on an existing, large code base to produce a manageable extension module. This, at first glance, encapsulates all the aspects that we are seeking to include in the course: scope; complexity; use of others' code; integration into an existing architecture; possible refactoring of designs; the motivation of a possible contribution to a code base for an existing open source application; exposure to an existing set of development practices and standards; and a clear demonstration of the
doi:10.1145/1083431.1083439 fatcat:t63267avqjbjdn2s26clbx5kv4