Addressing Overgeneration Error: An Effective and Efficient Approach to Keyphrase Extraction from Scientific Papers

Haofeng Jia, Erik Saule
2018 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval  
Keyphrases provide a concise summary of a document and play an important role for many other tasks like searching and clustering. With the large and increasing amount of online documents, automatic keyphrase extraction has attracted much attention. Existing unsupervised methods suffer from overgeneration error, since they typically identify key "words" and then return phrases that contain keywords as keyphrases. To alleviate this problem, we propose an unsupervised ranking scheme directly on
... rases" by exploring essential properties of keyphrases such as informativeness and positional preference. Experiments on two datasets show that our approach significantly alleviates the overgeneration error and improves performance over state-of-the-art keyphrase extraction approaches.
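The overgeneration error described in the abstract can be made concrete with a toy sketch. This is not the authors' method: the function names, the frequency-based scoring, and the candidate list are all illustrative assumptions, chosen only to contrast word-level scoring with scoring phrases as units.

```python
from collections import Counter


def keyword_then_phrase(candidates, top_k_words=1):
    """Word-level baseline (illustrative): score single words by frequency,
    then return every distinct candidate phrase that contains a top word."""
    word_counts = Counter(w for phrase in candidates for w in phrase.split())
    top_words = {w for w, _ in word_counts.most_common(top_k_words)}
    matching = (p for p in candidates if top_words & set(p.split()))
    # dict.fromkeys removes duplicates while preserving first-seen order
    return list(dict.fromkeys(matching))


def phrase_level(candidates, top_k=1):
    """Phrase-level alternative (illustrative): score each phrase as a unit,
    here simply by how often the whole phrase occurs."""
    phrase_counts = Counter(candidates)
    return [p for p, _ in phrase_counts.most_common(top_k)]


# Hypothetical candidate phrases extracted from one document
candidates = [
    "keyphrase extraction",
    "keyphrase extraction",
    "extraction pipeline",
    "manual extraction",
]

overgenerated = keyword_then_phrase(candidates)
ranked = phrase_level(candidates)

print(overgenerated)
print(ranked)
```

In this toy input the single word "extraction" scores highest, so the word-level baseline returns every phrase containing it, while the phrase-level ranking returns only the phrase that itself recurs. The properties the abstract names (informativeness and positional preference) are not modelled here.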
dblp:conf/sigir/JiaS18 fatcat:jounq4cj3fax3elknrxuqmfcui