Term Clustering Using a Corpus-Based Similarity Measure [chapter]

Goran Nenadić, Irena Spasić, Sophia Ananiadou
2002 Lecture Notes in Computer Science  
In this paper we present a method for automatic term clustering. The method uses a hybrid similarity measure to cluster terms automatically extracted from a corpus by the C/NC-value method. The measure combines contextual, functional and lexical similarity, and it is used to fill the cells of a similarity matrix. The clustering algorithm uses either the nearest neighbour method or Ward's method to calculate the distance between clusters. The approach has been tested and evaluated in the domain of molecular biology, and the results are presented.
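The pipeline outlined in the abstract (a hybrid similarity filling a similarity matrix, followed by agglomerative clustering) might be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the three similarities are combined, so the equal-weight linear combination, the `1 - similarity` distance, and all function names here are hypothetical. Only the nearest neighbour (single-linkage) variant is shown; Ward's method is omitted.

```python
from itertools import combinations


def hybrid_similarity(contextual, functional, lexical, weights=(1/3, 1/3, 1/3)):
    """Combine three similarity scores in [0, 1] into one value.

    ASSUMPTION: a weighted sum with equal weights; the abstract does not
    specify the actual combination.
    """
    a, b, c = weights
    return a * contextual + b * functional + c * lexical


def similarity_matrix(terms, components, weights=(1/3, 1/3, 1/3)):
    """Build a symmetric term-by-term similarity matrix.

    `components` maps a pair (terms[i], terms[j]) with i < j to a tuple
    (contextual, functional, lexical). This layout is hypothetical.
    """
    n = len(terms)
    sim = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in combinations(range(n), 2):
        cs, fs, ls = components[(terms[i], terms[j])]
        sim[i][j] = sim[j][i] = hybrid_similarity(cs, fs, ls, weights)
    return sim


def nearest_neighbour_clusters(sim, k):
    """Agglomerative clustering with nearest neighbour (single) linkage.

    The distance between two clusters is the smallest (1 - similarity)
    over all cross-cluster term pairs. Merging stops at k clusters.
    Returns a list of clusters, each a list of term indices.
    """
    clusters = [[i] for i in range(len(sim))]
    while len(clusters) > k:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            d = min(1.0 - sim[i][j] for i in clusters[a] for j in clusters[b])
            if best is None or d < best[0]:
                best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]  # merge b into a (a < b)
        del clusters[b]
    return clusters
```

To try it, pass a list of term strings and a dictionary of made-up component scores for every term pair to `similarity_matrix`, then call `nearest_neighbour_clusters` on the result with the desired number of clusters. The naive pairwise search keeps the sketch short; for more than a few hundred terms a library implementation of hierarchical clustering would be the more practical choice.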
doi:10.1007/3-540-46154-x_20 fatcat:o7yphqivivc37ekkn5knyevvse