Data Stream Algorithms for Large Graphs and High Dimensional Data [thesis]

Hoa Vu
In contrast to the traditional random access memory computational model where the entire input is available in the working memory, the data stream model only provides sequential access to the input. The data stream model is a natural framework to handle large and dynamic data. In this model, we focus on designing algorithms that use sublinear memory and a small number of passes over the stream. Other desirable properties include fast update time, query time, and post processing time. In this
more » ... ng time. In this dissertation, we consider different problems in graph theory, combinatorial optimization, and high dimensional data processing. The first part of this dissertation focuses on algorithms for graph theory and combinatorial optimization. We present new results for the problems of finding the densest subgraph, counting the number of triangles, finding max cut with bounded components, and finding the maximum $k$ set coverage. The second part of this dissertation considers problems in high dimensional data streams. In this setting, each stream item consists of multiple coordinates corresponding to different attributes. We consider the problem of testing or learning about the relationships among the attributes, and the problem of finding heavy hitters in subsets of attributes.
doi:10.7275/12760106 fatcat:yr4zwodzh5gxpdg5kv5zx3cr5e