A patent keywords extraction method using TextRank model with prior public knowledge

Zhaoxin Huang, Zhenping Xie
2021 Complex & Intelligent Systems  
Abstract: For large collections of patent texts, extracting keywords in an unsupervised way is an important problem. Existing methods analyze only the information contained in the patent texts themselves. This study proposes an improved TextRank model that makes effective use of prior public knowledge. Specifically, two networks are first constructed: (1) a TextRank network for each patent text, and (2) a prior knowledge network based on public dictionary data, in which network edges represent the prior interpretation relationships among dictionary words in dictionary entries. Then, an improved node rank evaluation formula is designed for the TextRank networks of patent texts, into which prior interpretation information from the prior knowledge network is introduced. Finally, patent keywords are extracted by selecting the top-k node words with the highest node rank values. In the experiments, a patent text clustering task is used to examine the performance of the proposed method, and several comparison experiments are carried out. The results demonstrate that the new method achieves markedly better performance than existing methods on the unsupervised patent keyword extraction task.
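The pipeline described in the abstract (per-document word network, prior knowledge network, modified rank formula, top-k selection) can be illustrated with a minimal Python sketch. The abstract does not give the paper's actual rank formula, so everything specific here is an assumption made for illustration: the co-occurrence window, the rule that an edge is boosted by a factor of (1 + alpha) when its two words are linked in the dictionary network, and the hypothetical names `build_cooccurrence_graph`, `apply_prior`, `textrank`, `top_k_keywords`, `prior_edges`, and `alpha`.

```python
from collections import defaultdict


def build_cooccurrence_graph(tokens, window=2):
    """Undirected word graph; edge weight = number of co-occurrences within the window."""
    weights = defaultdict(lambda: defaultdict(float))
    for i, u in enumerate(tokens):
        for v in tokens[i + 1 : i + 1 + window]:
            if u != v:
                weights[u][v] += 1.0
                weights[v][u] += 1.0
    return weights


def apply_prior(weights, prior_edges, alpha=1.0):
    """Boost edges whose endpoints are linked in the prior knowledge network.

    ASSUMPTION: the multiplicative (1 + alpha) boost is illustrative only;
    it is not the formula from the paper.
    """
    boosted = defaultdict(dict)
    for u, neighbours in weights.items():
        for v, w in neighbours.items():
            linked = v in prior_edges.get(u, ()) or u in prior_edges.get(v, ())
            boosted[u][v] = w * (1.0 + alpha) if linked else w
    return boosted


def textrank(weights, damping=0.85, iterations=50):
    """Weighted TextRank (PageRank-style) power iteration on an undirected graph."""
    nodes = list(weights)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    out_sum = {n: sum(weights[n].values()) for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # The graph is symmetric, so the neighbours of n are also its in-neighbours.
            incoming = sum(rank[m] * weights[m][n] / out_sum[m] for m in weights[n])
            new_rank[n] = (1.0 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank


def top_k_keywords(tokens, prior_edges, k=3, alpha=1.0):
    """Return the k highest-ranked words (ties broken alphabetically)."""
    graph = apply_prior(build_cooccurrence_graph(tokens), prior_edges, alpha)
    rank = textrank(graph)
    return sorted(rank, key=lambda n: (-rank[n], n))[:k]


# Toy input: tokens of one patent text and a hand-made dictionary network in which
# "battery" is assumed to be explained using "electrode" and "cell".
tokens = ["battery", "electrode", "lithium", "battery", "cell", "electrode"]
prior_edges = {"battery": {"electrode", "cell"}}
keywords = top_k_keywords(tokens, prior_edges, k=2)
```

Setting `alpha=0` reduces the sketch to plain weighted TextRank, which gives a simple baseline for checking how much the prior network changes the ranking.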
doi:10.1007/s40747-021-00343-8 fatcat:lphjxxj72ncwnhqj56zmfxjxkm