A fair history of the Web? Examining country balance in the Internet Archive

Mike Thelwall, Liwen Vaughan
2004 Library & Information Science Research  
The Internet Archive, an important initiative that maintains a record of the evolving Web, has the promise of being a key resource for historians and those who study the Web itself. The Archive's goal is to index the whole Web without making any judgments about which pages are worth saving. The potential importance of the Archive for longitudinal and historical Web research leads to the need to evaluate its coverage. This article focuses upon whether there is an international bias in its ... The results show that there are indeed large national differences in the Archive's coverage of the Web. A subsequent statistical analysis found differing national average site ages and hyperlink structures to be plausible explanations for this uneven coverage. Although the bias is unintentional, researchers using the Archive in the future need to be aware of this problem.
doi:10.1016/j.lisr.2003.12.009 fatcat:rur7mtb7kbc27posduma46uxcy