Empirical evaluation of hash functions for multipoint measurements

Christian Henke, Carsten Schmoll, Tanja Zseby
2008 Computer communication review  
A broad spectrum of network measurement applications demand passive multipoint measurements in which data from multiple observation points has to be correlated. Examples are the passive measurement of one-way delay or the observation of the path that a packet takes through a network. Nevertheless, due to high data rates and the need for fine granular measurements, the resource consumption for passive measurements can be immense. Furthermore, the resource consumption depends on the traffic in
more » ... network, which usually is highly dynamic. Packet and flow-selection methods provide a solution to reduce and control the resource consumption for passive measurements. In order to apply such techniques to multipoint measurements the selection processes need to be synchronized. Hash-based selection is a deterministic packet selection based on a hash function computed on selected parts of the packet content. This selection decision is consistent throughout the network and enables packet tracing and the measurement of delay between network nodes. Because the selection is based on deterministic function it can introduce bias which leads to wrong estimation of traffic characteristics. In this paper we define a set of quality criteria and select methods to investigate which hash function is most suitable for hash-based packet selection. We analyze 23 non-cryptographic and 2 cryptographic hash functions. Experiments are performed with real traffic traces from different networks. Based on the results we recommend 2 fast hash functions which show low bias and sample a representative subset of the population.
doi:10.1145/1384609.1384614 fatcat:y7bbide2qzarpcykr5pak2bw4u