3,087 Hits in 4.3 sec

Enhancing text clustering by leveraging Wikipedia semantics

Jian Hu, Lujun Fang, Yang Cao, Hua-Jun Zeng, Hua Li, Qiang Yang, Zheng Chen
2008 Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval - SIGIR '08  
Then, we develop a unified framework to leverage these semantic relations in order to enhance traditional content similarity measure for text clustering.  ...  Most traditional text clustering methods are based on "bag of words" (BOW) representation based on frequency statistics in a set of documents.  ...  In this paper, we show that by fully leveraging the structural relationship information in Wikipedia, we can enhance the clustering result by obtaining a more accurate distance measure.  ... 
doi:10.1145/1390334.1390367 dblp:conf/sigir/HuFCZLYC08 fatcat:fdtcfrunzbb57jfmkhmjzugerq

Named entity disambiguation by leveraging wikipedia semantic knowledge

Xianpei Han, Jun Zhao
2009 Proceeding of the 18th ACM conference on Information and knowledge management - CIKM '09  
Based on the constructed semantic network, a novel similarity measure is proposed to leverage Wikipedia semantic knowledge for disambiguation.  ...  By leveraging Wikipedia's semantic knowledge like social relatedness between named entities and associative relatedness between concepts, we can measure the similarity between occurrences of names more  ...  Social networks can enhance similarity measure by leveraging social relatedness between named entities.  ... 
doi:10.1145/1645953.1645983 dblp:conf/cikm/HanZ09 fatcat:iolptgpztbbvlhgwplndnlpyqe

An Efficient Approach For Semantically-Enhanced Document Clustering By Using Wikipedia Link Structure

Iyad AlAgha, Rami Nafee
2014 International Journal of Artificial Intelligence & Applications  
This paper presents a new approach to enhance document clustering by exploiting the semantic knowledge contained in Wikipedia.  ...  Traditional techniques of document clustering do not consider the semantic relationships between words when assigning documents to clusters.  ...  In this work, we proposed an approach for leveraging Wikipedia link structure to improve text clustering performance.  ... 
doi:10.5121/ijaia.2014.5605 fatcat:adhak2f6gbgcnf22wfkflijaom

Transferring auxiliary knowledge to enhance heterogeneous web service clustering

Gang Tian, Chengai Sun, Ke qing He, Xiang min Ji
2016 International Journal of High Performance Computing and Networking  
To solve this problem, we propose a new service clustering approach based on transfer learning from auxiliary long text data obtained from Wikipedia.  ...  Most existing clustering approaches are designed to handle long text documents.  ...  Dirichlet allocation (LDA) (Rosen-Zvi et al., 2004), to enhance web service clustering by incorporating auxiliary long texts and seamlessly leveraging the tagging information.  ... 
doi:10.1504/ijhpcn.2016.074669 fatcat:5c5eei6ylfh25f7phicwew4pmu

Graph-Based Text Similarity Measurement By Exploiting Wikipedia As Background Knowledge

Lu Zhang, Chunping Li, Jun Liu, Hui Wang
2011 Zenodo  
In this paper, we propose a novel text similarity measurement which goes beyond VSM and can find semantic affinity between documents.  ...  Text similarity measurement is a fundamental issue in many textual applications such as document clustering, classification, summarization and question answering.  ...  The main contributions of our work are: (1) We propose a unified framework of graph-based text similarity measurement by leveraging Wikipedia as background knowledge, which can overcome the semantic sensitivity  ... 
doi:10.5281/zenodo.1083541 fatcat:3zjj2m6z4jg23ikthsovaqmp3i

Efficient wikipedia-based semantic interpreter by exploiting top-k processing

Jong Wook Kim, Ashwin Kashyap, Dekai Li, Sandilya Bhamidipati
2010 Proceedings of the 19th ACM international conference on Information and knowledge management - CIKM '10  
Proper representation of the meaning of texts is crucial for enhancing many data mining and information retrieval tasks, including clustering, computing semantic relatedness between texts, and searching  ...  A key obstacle, however, for using Wikipedia as a semantic interpreter is that the sheer size of the concepts derived from Wikipedia makes it hard to efficiently map texts into concept-space.  ...  [Hu et al. 2009] proposes a general framework that leverages Wikipedia to enhance clustering performance.  ... 
doi:10.1145/1871437.1871736 dblp:conf/cikm/KimKLB10 fatcat:2zbhftqjoncx7kdjxhxhrclxjq

Leverage Label and Word Embedding for Semantic Sparse Web Service Discovery

Chengai Sun, Liangyu Lv, Gang Tian, Qibo Wang, Xiaoning Zhang, Lantian Guo
2020 Mathematical Problems in Engineering  
Information retrieval-based Web service discovery approach suffers from the semantic sparsity problem caused by lacking of statistical information when the Web services are described in short texts.  ...  The results also suggest that leveraging external information is useful for semantic sparse Web service discovery.  ...  For example, Hu et al. proposed to enhance the short text cluster by leveraging world knowledge [17].  ... 
doi:10.1155/2020/5670215 fatcat:ihzi5ugzz5cgbdy22w7tnvwbri

Exploiting Wikipedia as external knowledge for document clustering

Xiaohua Hu, Xiaodan Zhang, Caimei Lu, E. K. Park, Xiaohua Zhou
2009 Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '09  
In this paper, we present a novel text clustering method to address these two issues by enriching document representation with Wikipedia concept and category information.  ...  In traditional text clustering methods, documents are represented as "bags of words" without considering the semantic information of each document.  ...  Moreover, we will explore how to utilize the link structure among Wikipedia concepts for document clustering.  ... 
doi:10.1145/1557019.1557066 dblp:conf/kdd/HuZLPZ09 fatcat:l7puoy6mbjbafpt7xlkzkmki7e

Conceptual feature generation for textual information using a conceptual network constructed from Wikipedia

Amir H. Jadidinejad, Fariborz Mahmoudi, M. R. Meybodi
2015 Expert systems  
Furthermore, semantic annotator gets a fragment of natural language text and initiates a random walk to generate conceptual features that represent topical semantic of the input text.  ...  In this paper, a novel semantic annotator is presented to generate conceptual features for text documents.  ...  Gabrilovich and Markovitch (2009) proposed explicit semantic analysis that leverages concepts explicitly defined by humans.  ... 
doi:10.1111/exsy.12133 fatcat:poxgphrhsjfovbmprilm6gljhm

Excavating the mother lode of human-generated text: A systematic review of research that uses the wikipedia corpus

Mohamad Mehdi, Chitu Okoli, Mostafa Mesgari, Finn Årup Nielsen, Arto Lanamäki
2017 Information Processing & Management  
Semantic relations have been shown to enhance the performance of clustering algorithms. Hu et al.  ...  The results of the experiments proved that this approach improves the clustering accuracy [7, p. 788]. The Wikipedia knowledge base was used by Carmel et al. [15] to enhance cluster labeling.  ... 
doi:10.1016/j.ipm.2016.07.003 fatcat:qgjeatizfzbyjkbo4rsuxea76y

Probabilistic semantic similarity measurements for noisy short texts using Wikipedia entities

Masumi Shirakawa, Kotaro Nakayama, Takahiro Hara, Shojiro Nishio
2013 Proceedings of the 22nd ACM international conference on Conference on information & knowledge management - CIKM '13  
Explicit Semantic Analysis (ESA), a popular Wikipedia-based method, solves these problems by summing the weighted vectors of related entities.  ...  Our method adds related Wikipedia entities to a short text as its semantic representation and uses the vector of entities for computing semantic similarity.  ...  Thus, enriching the semantics of short texts using external knowledge, such as Wikipedia, is essential. Most of the work leveraged Wikipedia for specific tasks on short texts. Ferragina et al.  ... 
doi:10.1145/2505515.2505600 dblp:conf/cikm/ShirakawaNHN13 fatcat:cdcfeypvu5bfbpx4d7ay75kjvu

Exploiting internal and external semantics for the clustering of short texts using world knowledge

Xia Hu, Nan Sun, Chao Zhang, Tat-Seng Chua
2009 Proceeding of the 18th ACM conference on Information and knowledge management - CIKM '09  
In this paper, we propose a novel framework to improve the performance of short text clustering by exploiting the internal semantics from the original text and external concepts from world knowledge.  ...  semantic knowledge bases - Wikipedia and WordNet.  ...  [25] addressed the data sparseness by leveraging web search results to provide greater context for short texts.  ... 
doi:10.1145/1645953.1646071 dblp:conf/cikm/HuSZC09 fatcat:je6zmajetbbg7or72m2poajlcu

Leveraging Wikipedia concept and category information to enhance contextual advertising

Zongda Wu, Guandong Xu, Rong Pan, Yanchun Zhang, Zhiwen Hu, Jianfeng Lu
2011 Proceedings of the 20th ACM international conference on Information and knowledge management - CIKM '11  
Last, we evaluate our approach by using real ads, pages, as well as a great number of concepts and categories of Wikipedia.  ...  In this paper, we present a new contextual advertising approach to overcome the problems, which uses Wikipedia concept and category information to enrich the content representation of an ad (or a page)  ...  For solving the problems of homonymy and polysemy etc., the Wikipedia matching was proposed, whose main idea is to leverage the Wikipedia as an intermediate reference model to enhance the semantic representation  ... 
doi:10.1145/2063576.2063901 dblp:conf/cikm/WuXPZHL11 fatcat:pcfk6c53kfdzpjbslaenui6c7i

Blognoon

Maria Grineva, Maxim Grinev, Dmitry Lizorkin, Alexander Boldakov, Denis Turdakov, Andrey Sysoev, Alexander Kiyko
2011 Proceedings of the 20th international conference companion on World wide web - WWW '11  
It enhances navigation over the Blogosphere with faceted interfaces and recommendations.  ...  We demonstrate Blognoon, a semantic blog search engine with the focus on topic exploration and navigation.  ...  There is a large body of work on using Wikipedia to enhance text processing [3].  ... 
doi:10.1145/1963192.1963292 dblp:conf/www/GrinevaGLBTSK11 fatcat:xgbrm7jhzzbltftkcwpf57f3aa

Study of Ontology or Thesaurus Based Document Clustering and Information Retrieval

G. Bharathi, D. Venkatesan
2012 Journal of Engineering and Applied Sciences  
Clustering text data faces a number of new challenges. Among others, the volume of text data, dimensionality, sparsity and complex semantics are the most important ones.  ...  These characteristics of text data require clustering techniques to be scalable to large and high dimensional data, and able to handle sparsity and semantics.  ...  Therefore, in order to enhance text clustering by leveraging ontology semantics, two issues need to be addressed: an ontology which can cover the topical domain of individual document collections as completely  ... 
doi:10.3923/jeasci.2012.342.347 fatcat:qsdkroainjc4ljxmfunfiqwonm