High-performance benchmarking with Web Polygraph

Alex Rousskov, Duane Wessels
Software: Practice and Experience, 2004
This paper presents the design and implementation of Web Polygraph, a tool for benchmarking HTTP intermediaries. We discuss various challenges involved in simulating Web traffic and in developing a portable, high-performance tool for generating such traffic. Polygraph's simulation models, as well as our experiences with developing and running the benchmark, may be useful for Web proxy developers, performance analysts, and researchers interested in Web traffic simulation. We dedicate a significant amount of effort to the development of industry-standard workloads. Standard workloads, developed in cooperation with caching companies and research groups, make product comparisons feasible and meaningful.

These standard workloads are also the recommended starting point for new Polygraph users. A novice user can configure a sophisticated, standardized test* by specifying just a few key parameters, such as peak request rate and cache size. Experienced testers can fine-tune standard workload parameters or define new workloads from scratch, usually as a result of many test trials and errors. Development of Polygraph, and introduction of new workloads, is an iterative process. The complexity of real-world traffic (and hence our models) drives the development of the Polygraph software. Applying the workloads to real proxies gives essential feedback and generates desire to add new features.

* The words "test," "run," and "experiment" are used interchangeably in this paper.

This paper covers about five years of Polygraph development and experimentation. Web Polygraph was the benchmark of choice for several industry-wide benchmarking events. The collection of standardized Polygraph-based results is already quite comprehensive and continues to grow. The benchmark is routinely used by companies that market HTTP intermediaries and by network engineers around the world. It is important to note that political and organizational issues related to developing and maintaining a benchmark of this scale did affect some of our design decisions, but they are beyond the scope of this paper.

Web Polygraph's Contribution

A good benchmark must generate traffic with realistic fundamental characteristics, such as the distribution of file sizes and request inter-arrival times. Extracting important parameters and patterns from various sources of real traffic constitutes a well-established Web characterization activity (see Section 5). We do not claim to have made any contribution in that area, but simply use known characterization results in parameterizing Web Polygraph models. For example, standard Polygraph workloads use a mix of content types ("markup," "images," "downloads," and others), with various distributions of object sizes, including heavy-tailed distributions (see the sketch at the end of this section).

This paper discusses the problems we have encountered while developing a comprehensive performance benchmark, and describes our solutions to those problems. Our contribution is in integrating basic isolated results, adjusting known simulation models, and making them work in a real, high-performance production environment. As we learned, integrating simple models is often more complex than characterizing or modeling isolated traffic patterns. In fact, direct application of existing models is often impossible due to conflicts with other models or incurred performance penalties.

A good benchmark must come with a collection of well-designed and tested workloads, as well as a database of past results. We have started building the set of workloads to be used with Polygraph. Designing and testing new workloads is a complex and time-consuming process that deserves a stand-alone research study. In this paper, we will discuss several workload-related problems and their possible solutions. We will focus on the relation between desired workload properties and their simulation in a high-performance benchmark.
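For illustration only, the following minimal Python sketch shows one way such a content-type mix with heavy-tailed object sizes can be sampled. The type names, probabilities, mean sizes, and the Pareto shape parameter are assumptions made up for this example; they are not the published Polygraph workload parameters, and the real benchmark implements its models in its own code and configuration language.

# Illustrative sketch only: sampling a content-type mix with heavy-tailed
# (Pareto) object sizes. All numbers below are assumed, not Polygraph's.
import random

def pareto_size(mean_bytes, shape=1.5):
    # Pareto variate scaled so the distribution has roughly the requested
    # mean (valid for shape > 1); heavy tail produces occasional huge objects.
    scale = mean_bytes * (shape - 1) / shape
    return int(random.paretovariate(shape) * scale)

# Hypothetical mix: (content type, probability, size sampler).
CONTENT_MIX = [
    ("markup",   0.35, lambda: pareto_size(10_000)),
    ("image",    0.50, lambda: pareto_size(5_000)),
    ("download", 0.05, lambda: pareto_size(300_000)),
    ("other",    0.10, lambda: pareto_size(25_000)),
]

def sample_object():
    """Pick a content type according to the mix, then draw its size."""
    r = random.random()
    cumulative = 0.0
    for ctype, prob, size in CONTENT_MIX:
        cumulative += prob
        if r <= cumulative:
            return ctype, size()
    ctype, _, size = CONTENT_MIX[-1]
    return ctype, size()

if __name__ == "__main__":
    for _ in range(5):
        print(sample_object())

The point of the sketch is only the structure: a discrete type mix layered over per-type size distributions, with heavy-tailed sizes making the occasional very large response a routine event for the device under test.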
Benchmark Architecture

This section presents a high-level overview of the major Polygraph components. Before delving into design details, we outline the objectives that guided our work.

Design Goals

Several characteristics can be considered standard for any quality performance benchmark: realistic workloads, repeatable experiments, meaningful and comprehensive measurements, and reproducible results. These characteristics are well understood and create the foundation of Polygraph's design. The specifics of Web proxy benchmarking, and our ambition to develop an industry standard, led to the following additional goals:

Scalability: One should be able to test any single Web cache unit without changing workload parameters, except for claimed peak request levels and/or cache capacity. Individual caching units may support anywhere from ten to ten thousand requests per second (a rough sketch of such scaling follows the architectural overview below).

Flexibility: The tool must be able to produce a wide range of workloads, from low-level micro-tests to comprehensive macro-benchmarks. The tool should easily support new workloads related to proxy caching, such as workloads for server accelerators (a.k.a. surrogates or reverse proxies) and load-balancing L7 switches.

Portability: The tool should be usable in a variety of environments, such as different operating systems and hardware platforms.

Efficiency: The tool should utilize the available hardware to the greatest extent possible. Some benchmarking software requires excessive amounts of hardware in order to generate sufficient load, which limits the number of users who can perform their own tests.

As we shall see, these design goals were paramount in most of our implementation decisions.

Architectural Overview

The Web Polygraph benchmark consists of virtual clients and servers glued together with an experiment configuration file. Clients (a.k.a. robots) generate HTTP requests for simulated objects. Polygraph uses a configurable mix of HTTP/1.0 and HTTP/1.1 protocols, optionally encrypted with SSL or TLS. Requests may be sent directly to the servers or through an intermediary (proxy cache, load balancer, etc.). As Polygraph runs, measurements and statistics are saved to log files for detailed postmortem analysis. Polygraph generates synthetic workloads rather than using real client or proxy traces. We feel that the use of trace-based workloads does not allow us to meet our scalability and flexibility goals.
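As a rough illustration of the scalability goal and of the robot-based architecture described above, the Python sketch below derives a simulated robot population and working-set size from the two user-supplied parameters (claimed peak request rate and cache size) and draws Poisson inter-request gaps. The per-robot request rate, the fill factor, and the formulas are assumptions made for this example, not Polygraph's actual rules.

# Illustrative sketch only: scaling a workload to a device under test by
# changing just the peak request rate and cache capacity.
import random

# Assumed constant per-robot request rate; a fixed per-robot rate means the
# robot population, not the workload shape, changes between tests.
REQUESTS_PER_ROBOT_PER_SEC = 0.4

def robot_count(peak_req_rate):
    """Robots needed to sustain the claimed peak request rate."""
    return max(1, round(peak_req_rate / REQUESTS_PER_ROBOT_PER_SEC))

def working_set_bytes(cache_size_bytes, fill_factor=2.0):
    """Hypothetical rule of thumb: generate enough distinct objects to
    overflow the cache under test by the given factor."""
    return int(cache_size_bytes * fill_factor)

def next_request_delay(per_robot_rate=REQUESTS_PER_ROBOT_PER_SEC):
    """Poisson arrivals: exponentially distributed inter-request gaps."""
    return random.expovariate(per_robot_rate)

if __name__ == "__main__":
    peak = 1000          # requests per second claimed for the device
    cache = 80 * 2**30   # 80 GB cache
    print("robots:", robot_count(peak))
    print("working set (GB):", working_set_bytes(cache) / 2**30)
    print("sample inter-request gap (s):", round(next_request_delay(), 2))

Under these assumptions, a device rated at ten requests per second and one rated at ten thousand are driven by the same workload definition; only the robot population and the object universe grow with the claimed capacity.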
doi:10.1002/spe.576