How Large Is the World Wide Web? [chapter]

Adrian Dobra, Stephen E. Fienberg
2004 Web Dynamics  
The problem of assessing the size of the World Wide Web is extremely difficult because sampling directly from the Web is not possible. Several groups of researchers have invested considerable effort to develop sound sampling schemes which involve submitting a number of queries to several major search engines. In this paper we present a statistical approach for the analysis of datasets collected by query-based sampling, utilizing a hierarchical Bayes formulation of the Rasch model for multiple ... st population estimation. We show that our procedures accord with the real-world constraints and consequently they let us make credible inferences about the size of the World Wide Web.
doi:10.1007/978-3-662-10874-1_2 fatcat:kr3kfqsgvrfpzedpdsxiudhqii