VizByWiki

Allen Yilun Lin, Joshua Ford, Eytan Adar, Brent Hecht
2018 Proceedings of the 2018 World Wide Web Conference on World Wide Web - WWW '18  
Data visualizations in news articles (e.g., maps, line graphs, bar charts) greatly enrich the content of news articles and result in well-established improvements to reader comprehension. However, existing systems that generate news data visualizations either require substantial manual effort or are limited to very specific types of data visualizations, thereby greatly restricting the number of news articles that can be enhanced. To address this issue, we define a new problem: given a news article, retrieve relevant visualizations that already exist on the web. We show that this problem is tractable through a new system, VizByWiki, that mines contextually relevant data visualizations from Wikimedia Commons, the central file repository for Wikipedia. Using a novel ground truth dataset, we show that VizByWiki can successfully augment as many as 48% of popular online news articles with news visualizations. We also demonstrate that VizByWiki can automatically rank visualizations according to their usefulness with reasonable accuracy (nDCG@5 of 0.82). To facilitate further advances on the "news visualization retrieval problem", we release our ground truth dataset and make our system source code publicly available.
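For readers unfamiliar with the ranking metric quoted above, the following is a minimal sketch of how nDCG@5 is commonly computed. It is not taken from the VizByWiki source code. The function names, the example usefulness scores, and the choice of the linear-gain variant of DCG are assumptions made for illustration; the abstract does not say which variant the authors used.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k items (linear gain variant).

    `relevances` lists graded usefulness scores in the order the system
    ranked the items. The item at 0-based position i is discounted by
    log2(i + 2), so the first item is undiscounted.
    """
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k=5):
    """DCG@k divided by the DCG@k of the ideal (descending) ordering."""
    ideal_dcg = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no useful item at all; define the score as 0
    return dcg_at_k(relevances, k) / ideal_dcg


# Hypothetical usefulness scores for six visualizations retrieved for one
# article, listed in the order a ranker returned them.
system_ranking = [2, 3, 0, 1, 3, 0]
score = ndcg_at_k(system_ranking, k=5)
```

A score of 1.0 means the top five items are in the best possible order; lower values mean that more useful visualizations were ranked below less useful ones.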
doi:10.1145/3178876.3186135 dblp:conf/www/LinFAH18 fatcat:jcvdpdolznadvkp3hpbucggcoi