Normalized Table Matching Algorithm for Classifying News Articles

Taeho Jo
In this research, we propose encoding texts into normalized tables for categorizing texts, automatically. Previously , the table based approach was proposed, but the categorical scores indicating how much the text is relevant to the given category may be overestimated or underestimated by the given text length. As the solution to the problem, in this research, we encode texts into fixed sized tables, define the operation for computing the similarity between two tables as a normalized value, and
more » ... characterize it mathematically. As the benefits from this research, we are able to compute category scores independently of a given text length, consider weights from both texts, and expect the more stable performance. We validate empirically the proposed approach with respect to the performance and the stability by comparing it with the traditional approaches.