A Typeface Searching Technique Using Evaluation Functions for Shapes and Positions of Alphabets Used in Ancient Books for Image Searching

Jun-Ho Huh, Kyungryong Seo
2016 International Journal of Hybrid Information Technology  
Projects to digitize document and book inventories are under way around the world. At the same time, the use of IT/ICT technologies in paleographic work is attracting interest in historical and academic circles, especially in the Republic of Korea, where both Chinese characters and the native Korean alphabet were used in ancient writings. There are some subtle differences in the typefaces and Great Seals used in each dynasty or year
... on the writers or the popular writing style of the time. The typeface recognition technique proposed in this article combines a DTW (dynamic time warping) algorithm for analyzing alphabet shapes with a method for analyzing alphabet positions, in order to determine the periods and possible authors of old documents. The analysis showed that the former was more reliable than the latter. As future work, documents and literature from the period of the invention of Hunminjeongeum whose typeface characteristics are similar or identical (e.g., the original Hunminjeongeum Haeryebon, Yongbieocheonga, Seokbosangjeol and Wolin-seokbo) will be used. Editions and printed books that were written or printed in calligraphic penmanship and have a distinctive brush-stroke touch will also be used [e.g., Hunminjeongeum eonhaebon (King Sejong's invention), Oryunhaengsildo, Songgang-gasa, Banggakbon novels, and the writings of kings and noblemen (Jeong Cheol, Kim Jeong hee, Heo Mok and Yang Sa-heon)].

Formula 4

Each consonant and vowel is extracted by increasing the width and length by 1 pixel at a time, starting from the points where each separation starts, and moving on to the next line when non-letter space appears. The separation result of this method is shown in Figure 3.

Figure 3. Typeface Alphabet (Consonants and Vowels) Separation

By applying the DTW algorithm to the graph onto which each separated consonant and vowel was projected, the authors obtained the output shown in Figure 4. With the formula, the authors obtained 86.90789291 as the value of the form evaluation function for the final evaluation.

Alphabet Location Search Evaluation Function

Korean letters are typically formed from an initial consonant and a vowel, frequently with one or more additional consonants under the vowel. The position of each consonant and vowel can differ distinctively from writer to writer, so the authors used it to make comparisons. Figure 5 below indicates the characteristic points for two different styles of penmanship.
Here, looking at the letter "림", one can see that the sizes of the central spaces differ.
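
The shape comparison described above, in which DTW is applied to the projection of each separated consonant or vowel, can be illustrated with a minimal sketch. The excerpt does not give the projection direction, the local cost, or the formula that turns the DTW result into the form evaluation value, so everything here is an assumption: the column-wise ink count, the absolute-difference cost, the helper names `projection_profile` and `dtw_distance`, and the toy 4x4 glyphs are hypothetical and only show the general technique.

```python
def projection_profile(glyph, axis=0):
    """Count ink pixels per column (axis=0) or per row (axis=1) of a binary glyph.

    Assumption: the "projected graph" is a simple ink-count histogram.
    """
    if axis == 0:
        return [sum(col) for col in zip(*glyph)]
    return [sum(row) for row in glyph]


def dtw_distance(a, b):
    """Classic dynamic time warping between two 1-D sequences.

    Assumption: the local cost is the absolute difference of profile values.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical toy glyphs (1 = ink): the same shape shifted by one column.
glyph_a = [
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
]
glyph_b = [
    [0, 1, 1, 1],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 1],
]

profile_a = projection_profile(glyph_a)  # [4, 2, 2, 0]
profile_b = projection_profile(glyph_b)  # [0, 4, 2, 2]
distance = dtw_distance(profile_a, profile_b)
print(distance)  # 6.0
```

Because DTW allows the two profiles to stretch against each other, the shifted glyph scores 6.0 here, lower than the 8 given by a position-by-position sum of absolute differences. A smaller distance means more similar shapes; how the authors map such a distance to their reported evaluation value is not recoverable from this excerpt.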
doi:10.14257/ijhit.2016.9.9.27 fatcat:l4t4nrzuujhzphh4yrugfbifze