Mining the Web for Lexical Knowledge to Improve Keyphrase Extraction: Learning from Labeled and Unlabeled Data [article]

Peter D. Turney
2002 arXiv   pre-print
I present experiments that show that the new features result in improved keyphrase extraction, although they are neither domain-specific nor training-intensive.  ...  of training documents in the given domain, with manually assigned keyphrases).  ...  GNU General Public License, and for sharing their results with me.  ... 
arXiv:cs/0212011v1 fatcat:23berap4sfbphaesdbnfoiepxm

KeaKAT: An Online Automatic Keyphrase Assignment Tool

Rabia Irfan, Sharifullah Khan, Irfan Ali Khan, Muhammad Asif Ali
2012 2012 10th International Conference on Frontiers of Information Technology  
However using Kea++ and its refinement as a system for assigning keyphrases to documents is not simple for users of a domain other than computing. The system needs to be installed and configured.  ...  The extended refinement methodology was developed to fine tune the results of Kea++ for multiple domains.  ...  Extraction phase uses the model to assign keyphrases to the document. Keyphrase assignment task can also be performed by Maui [13] along with keyphrase extraction and tagging.  ... 
doi:10.1109/fit.2012.14 dblp:conf/fit/IrfanKKA12 fatcat:3mb6qt2ehzdxncrvi4sya4ayp4

Improved Keyword and Keyphrase Extraction from Meeting Transcripts

J. I. Sheeba, K. Vivekanandan
2012 International Journal of Computer Applications  
In addition of traditional frequency or position-based clues, term specificity features, decision-making sentence-related features, as well as a group of features derived from summary sentences.  ...  Identifying Keywords Using Feature Extraction N-gram based Keyphrase Extraction Keyphrases are the combination of 2 or more words which describe a meaningful and important content in a document.  ... 
doi:10.5120/8260-1800 fatcat:oa6qozzoyrdt7cbxg62mbe34iq

Exploiting and Evaluating a Supervised, Multilanguage Keyphrase Extraction Pipeline for Under-Resourced Languages

Marco Basaldella, Muhammad Helmy, Elisa Antolli, Mihai Horia Popescu, Giuseppe Serra, Carlo Tasso
2017 RANLP 2017 - Recent Advances in Natural Language Processing Meet Deep Learning  
This paper evaluates different techniques for building a supervised, multilanguage keyphrase extraction pipeline for languages which lack a gold standard.  ...  Starting from an unsupervised English keyphrase extraction pipeline, we implement pipelines for Arabic, Italian, Portuguese, and Romanian, and we build test collections for languages which lack one.  ...  Features Extraction After the identification of the candidate keyphrases we assign to each of them seven features.  ... 
doi:10.26615/978-954-452-049-6_012 dblp:conf/ranlp/BasaldellaHAPST17 fatcat:x5qlfaivkffvho3eapxhsxtyo4

A Review of Keyphrase Extraction [article]

Eirini Papagiannopoulou, Grigorios Tsoumakas
2019 arXiv   pre-print
Keyphrase extraction is a textual information processing task concerned with the automatic extraction of representative and characteristic phrases from a document that express all the key aspects of its  ...  This article introduces keyphrase extraction, provides a well-structured review of the existing work, offers interesting insights on the different evaluation approaches, highlights open issues and presents  ...  Keyphrase Extraction Software Both commercial and free software is developed for keyphrase extraction.  ... 
arXiv:1905.05044v2 fatcat:xeweqtrjrfbefi2h5g42uld4pe

Keyphrase Extraction using Sequential Labeling [article]

Sujatha Das Gollapalli, Xiao-li Li
2016 arXiv   pre-print
Several unsupervised techniques and classifiers exist for extracting keyphrases from text documents.  ...  In addition to a more natural modeling for the keyphrase extraction problem, we show that tagging models yield significant performance benefits over existing state-of-the-art extraction methods.  ...  Ranking approaches were also investigated for keyphrase extraction for specific domains where preference information among keyphrases is available (Jiang et al., 2009).  ... 
arXiv:1608.00329v2 fatcat:xjxmry4ae5eg7doek277i3dvtm

Learning Algorithms for Keyphrase Extraction

Peter D. Turney
2012 Information retrieval (Boston)  
We developed the GenEx algorithm specifically for automatically extracting keyphrases from text.  ...  We approach the problem of automatically extracting keyphrases from text as a supervised learning task.  ...  Thanks to Elaine Sin of the University of Calgary for creating the keyphrases for the email message corpus.  ... 
doi:10.1023/a:1009976227802 fatcat:jmsmm3tgb5gh5flo6z3vcezhwa

DKPro Keyphrases: Flexible and Reusable Keyphrase Extraction Experiments

Nicolai Erbs, Pedro Bispo Santos, Iryna Gurevych, Torsten Zesch
2014 Proceedings of 52nd Annual Meeting of the Association for Computational Linguistics: System Demonstrations  
DKPro Keyphrases is a keyphrase extraction framework based on UIMA. It offers a wide range of state-of-the-art keyphrase experiments approaches.  ...  At the same time, it is a workbench for developing new extraction approaches and evaluating their impact. DKPro Keyphrases is publicly available under an open-source license.  ...  Although Maui provides training data along with their software, this training data is highly domain-specific.  ... 
doi:10.3115/v1/p14-5006 dblp:conf/acl/ErbsSGZ14 fatcat:h5pdpbd6iney5jpoppj6qhndru

Automatic keyphrase annotation of scientific documents using Wikipedia and genetic algorithms

Arash Joorabchi, Abdulhussain E. Mahdi
2013 Journal of information science  
We have devised a set of twenty statistical, positional, and semantical features for candidate phrases to capture and reflect various properties of those candidates which have the highest keyphraseness  ...  Topical annotation of documents with keyphrases is a proven method for revealing the subject of scientific and research documents to both human readers and information retrieval systems.  ...  In some domains or datasets more general keyphrases are preferred by human annotators, whereas in others more specific keyphrases are preferred.  ... 
doi:10.1177/0165551512472138 fatcat:scx3iyeum5ds3onhow47uunj7i

A Distributed Framework for NLP-Based Keyword and Keyphrase Extraction From Web Pages and Documents

Paolo Nesi, Gianni Pantaleo, Gianmarco Sanesi
2015 Proceedings of the 21st International Conference on Distributed Multimedia Systems  
Each database record is populated with an extracted keyword or keyphrase, its corresponding POS-tag (or a different custom tag if it is a keyphrase), TF-IDF value and the source web domain. IV.  ...  subsequent estimation of extracted keywords/keyphrases relevance at web domain level (as later described).  ... 
doi:10.18293/dms2015-024 dblp:conf/dms/NesiPS15 fatcat:ieenhxagojenfdivup23wt42h4

Learning Algorithms for Keyphrase Extraction [article]

Peter D. Turney
2002 arXiv   pre-print
We developed the GenEx algorithm specifically for automatically extracting keyphrases from text.  ...  We approach the problem of automatically extracting keyphrases from text as a supervised learning task.  ...  Thanks to Elaine Sin of the University of Calgary for creating the keyphrases for the email message corpus.  ... 
arXiv:cs/0212020v1 fatcat:figcbj33vnd2jld2m7z2i3xioy

Machine Learning Based Keyphrase Extraction: Comparing Decision Trees, Naïve Bayes, and Artificial Neural Networks

Kamal Sarkar, Mita Nasipuri, Suranjan Ghose
2012 Journal of Information Processing Systems  
The three machine learning based keyphrase extraction methods that we use for experimentation have been compared with a publicly available keyphrase extraction system called KEA.  ...  The paper presents three machine learning based keyphrase extraction methods that respectively use Decision Trees, Naïve Bayes, and Artificial Neural Networks for keyphrase extraction.  ...  They used a domain specific glossary database to determine the domain specificity of the candidate phrases and integrated the domain specific feature and the traditional term frequency feature to rank  ... 
doi:10.3745/jips.2012.8.4.693 fatcat:bzsl6zau45f3hcwlvbfuoz5fre

Identifying important concepts from medical documents

Quanzhi Li, Yi-Fang Brook Wu
2006 Journal of Biomedical Informatics  
The latter assigns weights to extracted noun phrases for a medical document based on how important they are to that document and how domain specific they are in the medical domain.  ...  KIP combines two functions: noun phrase extraction and keyphrase identification. The former automatically extracts noun phrases from medical literature as keyphrase candidates.  ...  KIP's algorithm KIP is a domain-specific keyphrase extraction program, not a keyphrase assignment program, which means the generated keyphrases must occur in the document text.  ... 
doi:10.1016/j.jbi.2006.02.001 pmid:16545986 fatcat:mysdq64h55hylduaihta3lgcni

Building a Scientific Concept Hierarchy Database (SCHBase)

Eytan Adar, Srayan Datta
2015 Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)  
We present SCHBASE, a hierarchical database of keyphrases extracted from large collections of scientific literature.  ...  Extracted keyphrases can enhance numerous applications ranging from search to tracking the evolution of scientific discourse.  ...  Acknowledgments The authors thank the Microsoft Academic team, Jaime Teevan, Susan Dumais, and Carl Lagoze for providing us with data and advice.  ... 
doi:10.3115/v1/p15-1059 dblp:conf/acl/AdarD15 fatcat:yjyvp2wayrf3jhqd3jsiqukw6a

Representing Documents via Latent Keyphrase Inference

Jialu Liu, Xiang Ren, Jingbo Shang, Taylor Cassidy, Clare R. Voss, Jiawei Han
2016 Proceedings of the 25th International Conference on World Wide Web - WWW '16  
In this paper, we propose a data-driven model named Latent Keyphrase Inference (LAKI) that represents documents with a vector of closely related domain keyphrases instead of single words or existing concepts  ...  Compared with the state-of-art document representation approaches, LAKI fills the gap between bag-of-words and concept-based models by using domain keyphrases as the basic representation unit.  ...  Extract domain keyphrases from a domain-focused document corpus; and 2.  ... 
doi:10.1145/2872427.2883088 pmid:28229132 pmcid:PMC5318165 dblp:conf/www/LiuRSCVH16 fatcat:7bnq3lg7areatavtfkjaw5pane