Semi-supervised Persian font recognition

Maryam Bahojb Imani, Mohamad Reza Keyvanpour, Reza Azmi
2011 Procedia Computer Science  
Font recognition is one of the fundamental tasks in document recognition, because it is an important factor in optical character recognition. Classical supervised methods need a lot of labeled data to train a classifier. Since labeling large amounts of data is costly and time-consuming, it is useful to exploit unlabeled data sets, and many semi-supervised learning methods have been studied recently. Among them, self-training is one of the important algorithms: it classifies the unlabeled samples using a small amount of labeled ones and adds the most confident samples to the training set. In this paper, we apply a majority vote approach to divide the unlabeled data into reliable and unreliable classes. We then add the reliable data to the training set and classify the remaining data, including the unreliable data, in an iterative process. We test this method on features extracted from ten common Persian fonts. Experimental results indicate that the proposed method is effective and improves classification performance.
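
The self-training loop described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the base classifiers, the features, or the exact reliability rule, so the committee (a nearest-centroid rule plus 1-NN and 3-NN), the `min_votes` threshold, and all function names here are hypothetical choices.

```python
from collections import Counter
import math


def nearest_centroid(train, x):
    """Predict the label whose class mean is closest to x."""
    sums, counts = {}, Counter()
    for feats, label in train:
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, value in enumerate(feats):
            acc[i] += value
        counts[label] += 1
    return min(
        sums,
        key=lambda c: math.dist([s / counts[c] for s in sums[c]], x),
    )


def knn(k):
    """Build a k-nearest-neighbour predictor over (features, label) pairs."""
    def predict(train, x):
        nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
        return Counter(label for _, label in nearest).most_common(1)[0][0]
    return predict


def self_train(labeled, unlabeled, committee, min_votes, max_rounds=10):
    """Iteratively move 'reliable' unlabeled samples into the training set.

    Assumption: a sample is reliable when at least `min_votes` committee
    members agree on its label; all other samples stay in the pool and are
    re-classified in the next round with the enlarged training set.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        if not pool:
            break
        reliable, unreliable = [], []
        for x in pool:
            votes = Counter(clf(labeled, x) for clf in committee)
            label, count = votes.most_common(1)[0]
            if count >= min_votes:
                reliable.append((x, label))
            else:
                unreliable.append(x)
        if not reliable:  # nothing new was learned, so stop early
            break
        labeled.extend(reliable)
        pool = unreliable
    return labeled, pool
```

Calling `self_train(labeled, unlabeled, [nearest_centroid, knn(1), knn(3)], min_votes=3)` requires a unanimous vote; lowering `min_votes` to 2 gives a simple majority, which admits more samples per round at the cost of noisier pseudo-labels.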
doi:10.1016/j.procs.2010.12.057 fatcat:hsjonhfxqrfl3lpi5ihbnmxu3e