On Some Techniques for Streaming Data: A Case Study of Internet Packet Headers

Edward J Wegman, David J Marchette
2003 Journal of Computational and Graphical Statistics
We consider the implications of streaming data for data analysis and data mining. Streaming data are becoming widely available from a variety of sources. In our case we consider the implications arising from Internet traffic data. By implication, streaming data are unlikely to be time homogeneous, so standard statistical and data mining procedures do not necessarily apply. Because it is essentially impossible to store streaming data, we consider recursive algorithms, algorithms which are ... tive and discount the past, and also algorithms that create finite pseudo-samples. We also suggest some evolutionary graphics procedures that are suitable for streaming data. We begin with a discussion of Internet traffic in order to give the reader some sense of the time and data scale and visual resolution needed for such problems.
doi:10.1198/1061860032625 fatcat:6uu3qxje75hyrbrptvika4rtgq
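
The abstract mentions recursive algorithms that discount the past. As a rough illustration of that general idea, and not the authors' actual procedure, the sketch below keeps a running mean in constant memory and geometrically down-weights older observations. The class name `DiscountedMean` and the `discount` parameter are hypothetical names chosen for this example.

```python
class DiscountedMean:
    """Recursive mean that geometrically down-weights older observations.

    Illustrative only: the name and the `discount` parameter are assumptions,
    not taken from the paper.
    """

    def __init__(self, discount: float = 0.9) -> None:
        if not 0.0 < discount <= 1.0:
            raise ValueError("discount must be in (0, 1]")
        self.discount = discount
        self.weight = 0.0  # running sum of discounted weights
        self.mean = 0.0    # current discounted mean

    def update(self, x: float) -> float:
        # Age the accumulated weight, then fold in the new observation.
        self.weight = self.discount * self.weight + 1.0
        self.mean += (x - self.mean) / self.weight
        return self.mean


if __name__ == "__main__":
    tracker = DiscountedMean(discount=0.5)
    for value in [0.0, 4.0]:
        tracker.update(value)
    print(tracker.mean)
```

With `discount=1.0` the update reduces to the ordinary running mean; smaller values make the estimate track recent observations more closely, which is one simple way to cope with a stream that is not time homogeneous. Each observation is used once and then discarded, so nothing from the stream needs to be stored.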