
A study on automatically extracted keywords in text categorization

Anette Hulth, Beáta B. Megyesi
2006 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the ACL - ACL '06  
This paper presents a study on if and how automatically extracted keywords can be used to improve text categorization.  ...  In summary we show that a higher performance, as measured by micro-averaged F-measure on a standard text categorization collection, is achieved when the full-text representation is combined with the automatically  ...  Acknowledgments: The authors are grateful to the anonymous reviewers for their valuable suggestions on how to improve the paper.  ... 
doi:10.3115/1220175.1220243 dblp:conf/acl/HulthM06 fatcat:lgelsjrxtja5rhymmdrkyrrbg4

Nuclear Exports Control System Using Semi-Automatic Keyword Extraction

Uihyun Kim
2014 International Journal of Information and Electronics Engineering  
Index Terms: Nuclear exports control system, case-based reasoning, text categorization, keyword extraction.  ...  Among the three methods, the semi-automatic approach is the most efficient and effective in extracting keywords, demonstrating that the combination of machine and human is a promising solution that can effectively  ...  Acknowledgment: Research reported in this paper was supported by Korea Institute of Nuclear Nonproliferation and Control (KINAC) and financially supported by Nuclear Safety and Security Commission (NSSC  ... 
doi:10.7763/ijiee.2014.v4.451 fatcat:b6mcmdci5zghzces52zmufo3em

A Collective Study for Document Recommendation Using Textual Conversation Keywords

2016 International Journal of Science and Research (IJSR)  
Keyword extraction is an important technique in many areas of document processing such as text clustering, text summarization, and text retrieval.  ...  In this paper, a survey of keyword extraction techniques is presented that can be applied to extract the keywords that uniquely identify a document.  ...  Keyword extraction from text is a tool commonly used by search engines and indexes alike to quickly categorize and locate specific data based on explicitly or implicitly supplied keywords.  ... 
doi:10.21275/v5i1.nov152670 fatcat:76y73lkoyzbczccfjy36qzsu3a

The use of additional evidence in mining user-created descriptions for content structural design

Yan Wu, Nader Asnafi
2018 MATEC Web of Conferences  
The study collected the data and ran the text mining analysis with text analysis, clustering and topic extraction.  ...  The use of a text mining approach for fully automatic taxonomy creation for content management has proven to have serious limitations.  ...  Many studies have been carried out using data mining techniques to extract topics and subtopics from the text documents and to automatically generate taxonomy directly from the document texts [3, 4].  ... 
doi:10.1051/matecconf/201818903004 fatcat:e6narwmsubfz5f4dtiyv5lwwiq

A survey on different text categorization techniques for text filtration

Shashank H. Yadav, Balu L. Parne
2015 2015 IEEE 9th International Conference on Intelligent Systems and Control (ISCO)  
In this paper, we have done a survey on different text categorization techniques and algorithms used for text categorization.  ...  We started our survey by studying various text categorization techniques used for recognizing offensive texts.  ...  For doing this we have done a survey on various text categorization techniques used for text filtration in NLP.  ... 
doi:10.1109/isco.2015.7282375 fatcat:k3js44zqwbgplckcay74igoaey

Automatic categorization and summarization of documentaries

Kezban Demirtas, Nihan Kesim Cicekli, Ilyas Cicekli
2010 Journal of information science  
In this paper, we propose automatic categorization and summarization of documentaries using subtitles of videos. We propose two methods for video categorization.  ...  The second has the same extraction steps but uses a learning module to categorize. Experiments with documentary videos give promising results in discovering the correct categories of videos.  ...  [4] performed automatic news video story categorization based on the closed-captioned text.  ... 
doi:10.1177/0165551510382070 fatcat:gzotetnhprdsjjvsvd3pyc5cba

Using CISMeF MeSH "Encapsulated" terminology and a categorization algorithm for health resources

Aurélie Névéol, Lina F. Soualmia, Magaly Douyère, Alexandrina Rogozan, Benoı̂t Thirion, Stefan J. Darmoni
2004 International Journal of Medical Informatics  
We evaluate this algorithm on a random set of 123 resources extracted from the CISMeF catalogue.  ...  Material and methods: A two-step categorization process consisting of mapping resource keywords to CISMeF metaterms and ranking metaterms by decreasing coverage in the resource has been developed.  ...  Bodenreider [2] then conducted an automatic text categorization study similar to the one presented here, using the Unified Medical Language System (UMLS) metathesaurus semantics, after a mapping of index  ... 
doi:10.1016/j.ijmedinf.2003.09.004 pmid:15036079 fatcat:bte3sdzkmzet7dnzlvyr4dcf6y

Improved Unsupervised Framework for Solving Synonym, Homonym, Hyponymy & Polysemy Problems from Extracted Keywords and Identify Topics in Meeting Transcripts

J I Sheeba
2012 International Journal of Computer Science Engineering and Applications  
one. In this proposed framework, a dataset has been designed to solve the above-mentioned four problems automatically.  ...  A keyword is an important item in a document that provides efficient access to the content of the document.  ...  The text categorization by boosting automatically from extracted concepts by Cai and Hoffman [14] is almost certainly the study most related to this framework.  ... 
doi:10.5121/ijcsea.2012.2508 fatcat:4wpztnz76bbpbhrrdewxa2jb4e

Text Document Classification by using WordNet Ontology and Neural Network

Manisha Gawade, Tejashree Mane, Dhanashree Ghone, Prasad Khade, Nihar Ranjan
2018 International Journal of Computer Applications  
We propose a method of automatic text classification using a Convolutional Neural Network based on the disambiguation of word meaning; we use the WordNet ontology and a word embedding algorithm to  ...  Automated text classification consists of automatically organizing clustered data.  ...  One of the promising areas is automatic text categorization.  ... 
doi:10.5120/ijca2018918229 fatcat:kjf7cugzwjhrnn7fiw7dhaswva

Keyphrase Extraction using supervise learning
IJARCCE - Computer and Communication Engineering

2014 IJARCCE  
Text categorization is a kind of "supervised" learning where the categories are known beforehand and determined in advance for each training document.  ...  Text mining is a knowledge-intensive process in which a user communicates with a collection of documents.  ...  The task of automatic keyword extraction is to identify a set of words representative of a document.  ... 
doi:10.17148/ijarcce.2014.31135 fatcat:l5mr6rbxbfg6hciqjgizm5fmgq

Automatic knowledge extraction from manufacturing research publications

P. Boonyasopon, A. Riel, W. Uys, L. Louw, S. Tichkiewitch, N. du Preez
2011 CIRP annals  
This paper presents the results of a study of the application of document retrieval and text mining techniques to extract knowledge from CIRP research papers.  ...  One is based on Latent Dirichlet Allocation of a huge document set, the other uses Wikipedia to discover significant words in papers.  ...  Section 2 explains the target in greater detail, section 3 introduces a particular text mining tool for automatic topic identification. Section 4 introduces a Wikipedia-based keyword extraction tool.  ... 
doi:10.1016/j.cirp.2011.03.043 fatcat:ecob3mbalvfppj5opwo7fivegi

Construction of a Domain Dictionary for Fundamental Vocabulary and its Application to Automatic Blog Categorization Using Dynamically Estimated Domains of Unknown Words

Chikara Hashimoto, Sadao Kurohashi
2014 Journal of Natural Language Processing  
As a task-based evaluation of the domain dictionary, we categorized blogs by assigning a domain for each word in a blog article and categorizing it as the most dominant domain.  ...  ., those not listed in the domain dictionary), resulting in our blog categorization achieving an accuracy of 94.0% (564/600).  ...  In Section 8, we compare our study with previous ones, and in Section 9, we conclude the paper.  ... 
doi:10.5715/jnlp.21.817 fatcat:g225xr6lzbe4fb6bv7njz32doi

News Classification: A Data Mining Approach

Dipak Ramchandra Kawade, Kavita S. Oza
2016 Indian Journal of Science and Technology  
A comparative study of these algorithms is done based on Accuracy, Time, Errors and ROC to predict the best algorithm for news data set classification.  ...  Objectives: Text classification is one of the important applications of data mining.  ...  An automatically extracted keyword technique is used to improve text categorization and also to identify the impact of keywords on text categorization [4].  ... 
doi:10.17485/ijst/2016/v9i46/84444 fatcat:lrxpfbsgtnap5mcxtungtctw5y

Text Mining Business Policy Documents

Marco Spruit, Drilon Ferati
2020 International Journal of Business Intelligence Research  
This framework relies on three natural language processing techniques, namely information extraction, automatic summarization, and automatic keyword extraction.  ...  In a time when the employment of natural language processing techniques in domains such as biomedicine, national security, finance, and law is flourishing, this study takes a deep look at its application  ...  using meeting transcripts (F Liu et al., 2009) NA NA 19.6% A study on automatically extracted keywords in Text Categorization (Hulth & Megyesi, 2006) 92.89% 72.94% 81.72% Automatic keyword extraction  ... 
doi:10.4018/ijbir.20200701.oa1 fatcat:e2ru2ewfhrhr5mz6rkcaxc2w3e

Vietnamese Text Classification with TextRank and Jaccard Similarity Coefficient

Hao Tuan Huynh, Nghia Duong-Trung, Dinh Quoc Truong, Hiep Xuan Huynh
2020 Advances in Science, Technology and Engineering Systems  
However, it is time-consuming to consider all words in a text; several key tokens can be considered instead.  ...  Text classification is considered one of the most fundamental and essential problems that deal with automatically classifying textual resources into pre-defined categories.  ...  The task of automatic text categorization has been studied by comparing the performance of several term weighting schemes rather than analyzing the actual classification task [17].  ... 
doi:10.25046/aj050644 fatcat:zc6nfgbc3zef7festggk45yola