Business Forms Classification Using Earth Mover's Distance

Syed Saqib Bukhari, Markus Ebbecke, Michael Gillmann
2014 11th IAPR International Workshop on Document Analysis Systems
Form classification has received little attention over the last decade. Unfortunately, the algorithms published mainly in the 1980s and 1990s do not meet the requirements of our present commercial document analysis projects. There we are confronted with conditions and requirements unanticipated by that research, such as fax distortions and, even worse, form variations. In this work we introduce a new color-coded, pixel-based form classification method using Earth Mover's Distance (EMD) that is robust ... fax distortions and content variations. Experimental results demonstrate the effectiveness of the presented method. It achieved more than 90% classification accuracy on a real-world business forms dataset, which is significantly better than the competing state-of-the-art methods.
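
To give a feel for the distance measure named in the abstract, here is a minimal sketch of Earth Mover's Distance on one-dimensional histograms, used for nearest-template classification. It is not the authors' color-coded pixel-based method, which the abstract does not detail. The function names `emd_1d` and `classify`, the template labels, and the histogram values are all hypothetical, and the sketch assumes equal-length histograms with unit spacing between bins.

```python
from itertools import accumulate


def emd_1d(p, q):
    """Earth Mover's Distance between two 1-D histograms.

    Assumption: bins are equally spaced with unit ground distance.
    Both histograms are normalized to total mass 1, in which case the
    EMD equals the sum of absolute differences of their cumulative sums.
    """
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("histograms must have positive total mass")
    cdf_p = accumulate(x / total_p for x in p)
    cdf_q = accumulate(x / total_q for x in q)
    return sum(abs(a - b) for a, b in zip(cdf_p, cdf_q))


def classify(query, templates):
    """Return the label of the template histogram closest to `query`."""
    return min(templates, key=lambda label: emd_1d(query, templates[label]))


# Hypothetical template histograms, one per form class.
templates = {"invoice": [4, 1, 0], "order": [0, 1, 4]}
label = classify([3, 2, 0], templates)
```

Because EMD measures how far mass has to move rather than comparing bins independently, a small shift of content between neighbouring bins changes the distance only slightly, which is one plausible reason such a measure tolerates distortions.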
doi:10.1109/das.2014.59 dblp:conf/das/BukhariEG14 fatcat:ojbzfs72zfgc5fny2mx5uebv5e