The impact of document structure on keyphrase extraction

Katja Hofmann, Manos Tsagkias, Edgar Meij, Maarten de Rijke
2009 Proceeding of the 18th ACM conference on Information and knowledge management - CIKM '09  
Document structure may contain useful information about which parts or phrases of a document are important, but has rarely been considered as a source of information for keyphrase extraction.  ...  Keyphrases are short phrases that reflect the main topic of a document.  ...  Despite the large amount of structured and semi-structured documents available on the web and in organizations, there is little work on exploiting document structure for keyphrase extraction.  ... 
doi:10.1145/1645953.1646215 dblp:conf/cikm/HofmannTMR09 fatcat:3bjz3q5xgzh7bmui5usoaecnpm

Addressing Overgeneration Error: An Effective and Efficient Approach to Keyphrase Extraction from Scientific Papers

Haofeng Jia, Erik Saule
2018 Annual International ACM SIGIR Conference on Research and Development in Information Retrieval  
With the large and increasing amount of online documents, automatic keyphrase extraction has attracted much attention.  ...  Experiments on two datasets show our approach significantly alleviates the overgeneration error and obtains improvement in performance over state-of-the-art keyphrase extraction approaches.  ...  Acknowledgments This material is based upon work supported by the National Science Foundation under Grant No. 1652442.  ... 
dblp:conf/sigir/JiaS18 fatcat:jounq4cj3fax3elknrxuqmfcui

How Document Pre-processing affects Keyphrase Extraction Performance [article]

Florian Boudin, Hugo Mougard, Damien Cram
2016 arXiv   pre-print
In previous work, a wide range of document preprocessing techniques were described but their impact on the overall performance of keyphrase extraction models is still unexplored.  ...  The SemEval-2010 benchmark dataset has brought renewed attention to the task of automatic keyphrase extraction.  ...  to sophisticated document logical structure detection on richly-formatted documents recovered from Google Scholar.  ... 
arXiv:1610.07809v1 fatcat:7h25w3225ze5zhnltqovubdzri

A Review of Unsupervised Keyphrase Extraction Methods Using Within-Collection Resources

Chengyu Sun, Liang Hu, Shuai Li, Tuohang Li, Hongtu Li, Ling Chi
2020 Symmetry  
In this paper, the mainstream unsupervised methods to extract keyphrases are summarized, and we analyze in detail the reasons for the differences in the performance of methods then provided some solutions  ...  An essential part of a text generation task is to extract critical information from the text.  ...  Structural features: The difficulty of keyphrase extraction will be reduced by their fixed structures.  ... 
doi:10.3390/sym12111864 fatcat:izjth5vddfg7hp6ql4ybfygw7q

Simple Unsupervised Keyphrase Extraction using Sentence Embeddings

Kamil Bennani-Smires, Claudiu Musat, Andreea Hossmann, Michael Baeriswyl, Martin Jaggi
2018 Proceedings of the 22nd Conference on Computational Natural Language Learning  
Keyphrase extraction is the task of automatically selecting a small set of phrases that best describe a given free text document.  ...  Supervised keyphrase extraction requires large amounts of labeled training data and generalizes very poorly outside the domain of the training data.  ...  The results also show that the choice of document embeddings has a high impact on the keyphrase quality.  ... 
doi:10.18653/v1/k18-1022 dblp:conf/conll/Bennani-SmiresM18 fatcat:7jijbce5g5du7ccwmnyepvx2vm

Query-Based Keyphrase Extraction from Long Documents

Martin Dočekal, Pavel Smrž
2022 Proceedings of the ... International Florida Artificial Intelligence Research Society Conference  
This paper overcomes this issue for keyphrase extraction by chunking the long documents while keeping a global context as a query defining the topic for which relevant keyphrases should be extracted.  ...  The presented results show that a shorter context with a query overcomes a longer one without the query on long documents.  ...  The computation used the infrastructure supported by the Ministry of Education, Youth and Sports of the Czech Republic through the e-INFRA CZ (ID:90140).  ... 
doi:10.32473/flairs.v35i.130737 fatcat:azo7psgumveqdlojeaicmgdgsy

Simple Unsupervised Keyphrase Extraction using Sentence Embeddings [article]

Kamil Bennani-Smires, Claudiu Musat, Andreea Hossmann, Michael Baeriswyl, Martin Jaggi
2018 arXiv   pre-print
Keyphrase extraction is the task of automatically selecting a small set of phrases that best describe a given free text document.  ...  Supervised keyphrase extraction requires large amounts of labeled training data and generalizes very poorly outside the domain of the training data.  ...  The results also show that the choice of document embeddings has a high impact on the keyphrase quality.  ... 
arXiv:1801.04470v3 fatcat:wbbamh2unzcnbfo3qo25gziyfu

Select, Extract and Generate: Neural Keyphrase Generation with Layer-wise Coverage Attention [article]

Wasi Uddin Ahmad and Xiao Bai and Soomin Lee and Kai-Wei Chang
2021 arXiv   pre-print
However, one of the major challenges in neural keyphrase generation is processing long documents using deep neural networks.  ...  The experimental results on seven keyphrase generation benchmarks from scientific and web documents demonstrate that SEG-Net outperforms the state-of-the-art neural generative methods by a large margin  ...  Since POS tagging requires learning the syntactic structure of the keyphrases, we hypothesize that it can guide the decoder to generate structurally coherent keyphrases.  ... 
arXiv:2008.01739v2 fatcat:mqnhe3cjkva6nojewf7r2cpq2e

Unsupervised Keyphrase Extraction by Jointly Modeling Local and Global Context [article]

Xinnian Liang and Shuangzhi Wu and Mu Li and Zhoujun Li
2021 arXiv   pre-print
In terms of the local view, we first build a graph structure based on the document where phrases are regarded as vertices and the edges are similarities between vertices.  ...  Embedding based methods are widely used for unsupervised keyphrase extraction (UKE) tasks.  ...  This work was supported in part by the National Natural Science Foundation of China (Grant Nos. U1636211, 61672081, 61370126), the 2020 Tencent Wechat Rhino-Bird Focused Research Program, and the Fund of  ... 
arXiv:2109.07293v1 fatcat:2ht4y4a7nvg4phhibdb3qtuwgu

Accurate Keyphrase Extraction from Scientific Papers by Mining Linguistic Information

Mounia Haddoud, Aïcha Mokhtari, Thierry Lecroq, Saïd Abdeddaïm
2015 International Conference on Scientometrics and Informetrics  
In this paper we investigate the impact of candidate terms filtering using linguistic information on the accuracy of automatic keyphrase extraction from scientific papers.  ...  We estimated experimentally the accuracy of a keyphrase extraction system using different noun phrase filters in order to determine which noun phrase definition yields to the best results.  ...  In this paper we investigate the impact of candidate terms filtering using linguistic information on the accuracy of automatic keyphrase extraction.  ... 
dblp:conf/issi/HaddoudMLA15 fatcat:k2umsojdvffxpb67qeotxpox2i

Hyperbolic Relevance Matching for Neural Keyphrase Extraction [article]

Mingyang Song, Yi Feng, Liping Jing
2022 arXiv   pre-print
Keyphrase extraction is a fundamental task in natural language processing and information retrieval that aims to extract a set of phrases with important information from a source document.  ...  Identifying important keyphrase is the central component of the keyphrase extraction task, and its main challenge is how to represent information comprehensively and discriminate importance accurately.  ...  of candidate phrases in the keyphrase extraction task.  ... 
arXiv:2205.02047v1 fatcat:7ao27o6nv5cw3a6s2rfk5rhupu

A New Keyphrases Extraction Method Based on Suffix Tree Data Structure for Arabic Documents Clustering

Issam SAHMOUDI, Hanane FROUD, Abdelmonaime LACHKAR
2013 International Journal of Database Management Systems  
To overcome this problem, in this paper, we propose a new and efficient Keyphrases extraction method based on the Suffix Tree data structure (KpST), the extracted Keyphrases are then used in the clustering  ...  The obtained results show that our method for extracting Keyphrases increases the quality of the clustering results.  ...  Figure 5 presents the same experimental steps as Figure 4, using the stemming process before Keyphrases extraction in order to study its impact on the clustering results.  ... 
doi:10.5121/ijdms.2013.5602 fatcat:4eecve5x7bhi5mehogmjuyfi3i

Multiple Domain Answering by Analysing Semantic Relationship in QA Forum

J. Johnson Rajasingh
2018 International Journal for Research in Applied Science and Engineering Technology  
Though, previous techniques were introduced a key phrase extraction model, still the issues like word mismatching, misidentification of the words are not yet focused.  ...  Key phrases is the subfield that contains metadata that summarizes and characterize the documents.  ...  Feature Selection Structural features encode how different instances of a candidate keyphrase are located in different parts of a document.  ... 
doi:10.22214/ijraset.2018.2001 fatcat:xowfhw5ojzb2xjlukqjetvwvry

Keyphrase Extraction Using Semantic Networks Structure Analysis

Chong Huang, Yonghong Tian, Zhi Zhou, Charles Ling, Tiejun Huang
2006 IEEE International Conference on Data Mining. Proceedings  
Structural dynamics of these networks can easily identify key nodes, which can be used to extract keyphrases unsupervisedly.  ...  Keyphrases play a key role in text indexing, summarization, and categorization. However, most of the existing keyphrase extraction approaches require human-labeled training sets.  ...  Granted that the structure of a network represents the structure of its source document and the edges represent syntactic relation, to extract a keyphrase is to find a key node in the network.  ... 
doi:10.1109/icdm.2006.92 dblp:conf/icdm/HuangTZLH06 fatcat:ebmhu3gdnfdd7gyyxwam74uyuy

Text Preprocessing using Annotated Suffix Tree with Matching Keyphrase

Ionia Veritawati, Ito Wasito, T Basaruddin
2015 International Journal of Electrical and Computer Engineering (IJECE)  
Content of text is represented by keyphrases, which consist of one or more meaningful words. Keyphrases can be extracted from text through several steps of processing, including text preprocessing.  ...  Annotated Suffix Tree (AST) built from the documents collection itself is used to extract the keyphrase, after basic text preprocessing that includes removing stop words and stemming are applied.  ...  Boris Mirkin for his contributions to this research. This research was supported partially by Grant from Directorate of Higher Education of Indonesia.  ... 
doi:10.11591/ijece.v5i3.pp409-420 fatcat:xm42wcqaufckfbvacogct2x7cm