Quality assurance in document conversion

Christoph Becker
2011 Proceedings of the 4th ACM workshop on Online books, complementary social media and crowdsourcing - BooksOnline '11  
This paper discusses challenges and opportunities of using human computation and crowdsourcing for the task of quality assurance in document conversion processes and proposes a hybrid computer-human system approach. Digital content is never presented to a user directly, but always needs an intermediate presentation that is generated through an algorithm (such as a document viewer) that interprets data. When converting data such as documents, the question of authenticity of the derived ... tion of these documents requires a comparison of the intellectually perceivable outcome of different interpretations. Such Quality Assurance is a key obstacle to scalability in document conversion processes. Currently, there is a severe lack of scalable techniques. We argue that this comparison is a Human Intelligence Task (HIT). To investigate the feasibility, potential pitfalls and key challenges in leveraging the wisdom of the crowd for this task, we have conducted several pilot experiments. We describe and discuss these experiments, and identify a number of key challenges that need to be addressed. In particular, we discuss the questions of motivation; task semantics; presentation and interaction design; and quality control. Finally, we outline a proposal to address these challenges in a hybrid computer-human system.
doi:10.1145/2064058.2064061 dblp:conf/cikm/Becker11 fatcat:4vlcn634nfbj7mhzh4jet54phi