Comparison of Distance Measurement Methods on K-Nearest Neighbor Algorithm For Classification

Taca ROSA, Rifkie PRIMARTHA, Adi WIJAYA
2020 Proceedings of the Sriwijaya International Conference on Information Technology and Its Applications (SICONIAN 2019)   unpublished
K-Nearest Neighbor (k-NN) is a non-parametric classification algorithm that does not build an explicit model from the training data or rely on initial assumptions in the calculation process. The quality of k-NN classification results depends strongly on the distance between objects and on the specified value of k, so the choice of distance measurement method affects the classification results. This study compares several distance measurement methods, namely Euclidean distance, Manhattan distance, Tchebychev distance and Cosine distance, to see which works best with the k-Nearest Neighbor algorithm. The choice of k also affects the classification results, so it needs to be considered as well. The data used in this study is a cervical cancer dataset. The highest accuracy, 92.559%, was obtained using the Cosine distance with k = 9. Based on the compared accuracy values, the best-performing distance measurement method is Cosine distance with k = 9, even though it also has the highest computing time, 0.898 seconds.
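The four distance measures named in the abstract, plugged into a majority-vote k-NN classifier, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names and the toy points are hypothetical, and the cervical cancer dataset used in the study is not reproduced here.

```python
import math
from collections import Counter


def euclidean(a, b):
    # Straight-line (L2) distance.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    # Sum of absolute coordinate differences (L1).
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    # Largest absolute coordinate difference (L-infinity); "Tchebychev" in the abstract.
    return max(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    # One minus cosine similarity; assumes neither vector is all zeros.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def knn_predict(train_X, train_y, query, k, distance):
    # Rank training points by distance to the query, then take a majority vote
    # among the k closest labels.
    ranked = sorted(zip(train_X, train_y), key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data (assumption): two well-separated classes in 2D.
TOY_X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 5.0), (7.0, 7.0), (6.5, 6.0)]
TOY_Y = ["a", "a", "a", "b", "b", "b"]

if __name__ == "__main__":
    for name, fn in [("euclidean", euclidean), ("manhattan", manhattan),
                     ("chebyshev", chebyshev), ("cosine", cosine_distance)]:
        print(name, knn_predict(TOY_X, TOY_Y, (1.2, 1.1), 3, fn))
```

Cosine distance depends only on the angle between vectors and ignores their magnitude, so it can rank neighbors quite differently from the other three measures on the same data. That is one reason a comparison like the one in this study is worth running per dataset.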
doi:10.2991/aisr.k.200424.054 fatcat:hcpcdbgqljhkrh5jbl4lfkp27q