Joint Hierarchical Semantic Clipping and Sentence Extraction for Document Summarization

Wanying Yan, Junjun Guo
2020 Journal of Information Processing Systems  
Extractive document summarization aims to select a few sentences from a given document while preserving its main information, but current extractive methods do not address the problem of repeated sentence information, especially in news document summarization. Given both the importance and the redundancy of news text information, we propose a neural extractive summarization approach with joint sentence semantic clipping and selection, which can effectively solve the problem of news summary sentence repetition. Specifically, a hierarchical selective encoding network is constructed for both sentence-level and document-level representations, and the content carrying important information is extracted from the news text; a sentence extractor strategy is then adopted for joint scoring and redundant information clipping. In this way, our model strikes a balance between important information extraction and redundant information filtering. Experimental results on both the CNN/Daily Mail dataset and a Court Public Opinion News dataset that we built show the effectiveness of the proposed approach in terms of ROUGE metrics, especially for redundant information filtering.
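The idea of scoring sentences jointly for importance and redundancy can be illustrated with a small sketch. This is not the authors' neural model: it swaps in a classic greedy, MMR-style selection over bag-of-words vectors, and the function names (`bag_of_words`, `cosine`, `select_sentences`) and the `redundancy_weight` parameter are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence):
    """Lowercased word counts; a crude stand-in for a learned sentence encoding."""
    return Counter(re.findall(r"[a-z0-9]+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_sentences(sentences, k=2, redundancy_weight=0.7):
    """Greedily pick up to k sentences, trading importance against redundancy.

    Importance is similarity to the whole document; redundancy is the highest
    similarity to any sentence already selected (both are assumptions of this
    sketch, not quantities defined in the paper).
    """
    vectors = [bag_of_words(s) for s in sentences]
    doc_vector = sum(vectors, Counter())
    chosen = []
    while len(chosen) < min(k, len(sentences)):
        best_index, best_score = None, -math.inf
        for i, vec in enumerate(vectors):
            if i in chosen:
                continue
            importance = cosine(vec, doc_vector)
            redundancy = max((cosine(vec, vectors[j]) for j in chosen), default=0.0)
            score = importance - redundancy_weight * redundancy
            if score > best_score:
                best_index, best_score = i, score
        chosen.append(best_index)
    return sorted(chosen)  # indices in document order


sentences = [
    "The court issued a ruling on the appeal today.",
    "The court issued a ruling on the appeal today.",
    "Local residents welcomed the decision.",
]
summary_indices = select_sentences(sentences, k=2)
```

With the redundancy penalty active, the verbatim repeat of the first sentence loses to the third sentence; setting `redundancy_weight=0.0` reduces the procedure to importance-only ranking, which would pick the duplicate.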
doi:10.3745/jips.04.0181 dblp:journals/jips/YanG20 fatcat:4b755cvpvzemndotvbw5nxje3a