Search the past with the Portuguese web archive

Daniel Gomes, David Cruz, João Miranda, Miguel Costa, Simão Fontes
2013 Proceedings of the 22nd International Conference on World Wide Web - WWW '13 Companion  
The web was invented to quickly exchange data between scientists, but it became a crucial communication tool to connect the world. However, the web is extremely ephemeral. Most of the information published online becomes quickly unavailable and is lost forever. There are several initiatives worldwide that struggle to archive information from the web before it vanishes. However, search mechanisms to access this information are still limited and do not satisfy their users who demand performance similar to live-web search engines. This demo presents the Portuguese Web Archive, which enables search over 1.2 billion files archived from 1996 to 2012. It is the largest full-text searchable web archive publicly available [17]. The software developed to support this service is also publicly available as a free open source project at Google Code, so that it can be reused and enhanced by other web archivists. A short video about the Portuguese Web Archive is available at The service can be tried live at
doi:10.1145/2487788.2487934 dblp:conf/www/GomesCMCF13 fatcat:wraquyhr4jbbhfln2aj5o4gbly