A large time-aware web graph

Paolo Boldi, Massimo Santini, Sebastiano Vigna
2008 SIGIR Forum  
We describe the techniques developed to gather a series of twelve snapshots of the .uk web domain and to distribute them in a highly compressed, yet accessible, form. Ad hoc compression techniques made it possible to store the twelve snapshots using just 1.9 bits per link, with constant-time access to temporal information. Our collection makes it possible to study the temporal evolution of link-based scores (e.g., PageRank), the growth of online communities, and, more generally, time-dependent phenomena related to the link structure.
doi:10.1145/1480506.1480511 fatcat:gbvwqvvbbrfj7fenfmrp3qkvkm