Document Retrieval in Consideration of the Amount of Term Frequencies

Hiroshi Umemoto, Tadanobu Miyauchi, Yoshihiro Ueda
2001 NTCIR Conference on Evaluation of Information Access Technologies  
We propose a document retrieval method that evaluates the degree of similarity between a query and a document by considering not only term weights but also the amount of term frequencies. Unlike tf-idf term-weighting schemes, the proposed scheme never reflects term frequency in calculating the term weight. We carried out a retrieval performance evaluation experiment using a subset of NTCIR-1. The results show that the appropriate parameters for calculating the similarity depend on the number of query terms, and that the proposed scheme is superior to well-known tf-idf schemes in retrieval performance.
dblp:conf/ntcir/UmemotoMU01 fatcat:ucuoc3ufefam7da7thnu6w33v4