
Excavating the mother lode of human-generated text: A systematic review of research that uses the wikipedia corpus

Mohamad Mehdi, Chitu Okoli, Mostafa Mesgari, Finn Årup Nielsen, Arto Lanamäki
2017 Information Processing & Management  
Turdakov and Velikhov [125] presented a similarity measure based on Dice's measure to compute the semantic relatedness between Wikipedia articles.  ...  [118] used Wikipedia to compare different models for paragraph similarity analysis, and to automatically generate similar smaller corpora.  ... 
doi:10.1016/j.ipm.2016.07.003 fatcat:qgjeatizfzbyjkbo4rsuxea76y

Wikipedia Mining

Kotaro NAKAYAMA, Masahiro ITO, Maike ERDMANN, Masumi SHIRAKAWA, Tomoyuki MICHISHITA, Takahiro HARA, Shojiro NISHIO
2009 Transactions of the Japanese society for artificial intelligence  
In the past few years, a considerable number of researches have been conducted in various areas such as semantic relatedness measurement, bilingual dictionary construction, and ontology construction.  ...  Because of these characteristics, Wikipedia has become a promising corpus and a new frontier for research.  ... 
doi:10.1527/tjsai.24.549 fatcat:o7e5grmn5rgctf67y7lhfppq6i

Adaptive Concept Resolution for document representation and its applications in text mining

Lidong Bing, Shan Jiang, Wai Lam, Yan Zhang, Shoaib Jameel
2015 Knowledge-Based Systems  
Existing ontology-based document representation methods are static so that the selected semantic concepts for representing a document have a fixed resolution.  ...  We also present a method to integrate Wikipedia entities into an expert-edited ontology, namely WordNet, to generate an enhanced ontology named WordNet-Plus, and its performance is also examined under the  ...  WordNet, and construct an enriched ontology, called WordNet-Plus. Consider a Wikipedia entity, with the category information of the entity as clues.  ... 
doi:10.1016/j.knosys.2014.10.003 fatcat:zfffyzjtlbc7jhwikzgzx3qtfy

Mining meaning from Wikipedia

Olena Medelyan, David Milne, Catherine Legg, Ian H. Witten
2009 International Journal of Human-Computer Studies  
Wikipedia is a goldmine of information; not just for its many readers, but also for the growing community of researchers who recognize it as a resource of exceptional scale and utility.  ...  It focuses on research that extracts and makes use of the concepts, relations, facts and descriptions found in Wikipedia, and organizes the work into four broad categories: applying Wikipedia to natural  ...  Acknowledgements We warmly thank Evgeniy Gabrilovich, Rada Mihalcea, Dan Weld, Sören Auer, Fabian Suchanek and the YAGO team for their valuable comments on a draft of this paper.  ... 
doi:10.1016/j.ijhcs.2009.05.004 fatcat:mzxszf4jlfcizbgxuemgdwzdiy

Ontology enhancement and concept granularity learning

Shan Jiang, Lidong Bing, Bai Sun, Yan Zhang, Wai Lam
2011 Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD '11  
To keep WordNet current with folk wisdom, we propose a method to enhance WordNet automatically by merging Wikipedia entities into WordNet, and construct an enriched ontology, named as WorkiNet.  ...  The learning process takes the characteristics of the given document collection into consideration and the semantic concepts in the tailor-made collection can be used as new features for document representation  ...  CONCLUSIONS In this paper, we construct an enhanced ontology, WorkiNet, by integrating the information of Wikipedia into the expert-edited on-  ... 
doi:10.1145/2020408.2020597 dblp:conf/kdd/JiangBSZL11 fatcat:ranqpc4i5fd3dduo2pi3oazj5u

Creating an Extended Named Entity Dictionary from Wikipedia

Ryuichiro Higashinaka, Kugatsu Sadamitsu, Kuniko Saito, Toshiro Makino, Yoshihiro Matsuo
2012 International Conference on Computational Linguistics  
In our method, we derive a large number of features for Wikipedia titles and train a multiclass classifier by supervised learning.  ...  We devise an extensive list of features for the accurate classification into the ENE types, such as those related to the surface string of a title, the content of the article, and the meta data provided  ...  Here the semantic categories are those defined in the Japanese Goi-Taikei ontology (Ikehara et al., 1997). There are 2715 semantic categories in all.  ... 
dblp:conf/coling/HigashinakaSSMM12 fatcat:bfxh34v6fjccvpdnezjm63swki

Towards an enhanced and adaptable ontology by distilling and assembling online encyclopedias

Shan Jiang, Lidong Bing, Yan Zhang
2013 Proceedings of the 22nd ACM international conference on Conference on information & knowledge management - CIKM '13  
In this paper, we investigate the problem of making better use of semantic knowledge obtained from different encyclopedia sources.  ...  Finally as a demonstration, a Chinese semantic knowledge repository named JNet is constructed based on this framework.  ...  There are proposed methods based on link analysis between Wikipedia categories [4], using sentence analyzer [6] and taking learning algorithm [23].  ... 
doi:10.1145/2505515.2505597 dblp:conf/cikm/JiangBZ13 fatcat:iebhyxcqzve7veuuebronz6joa

Wikipedia HTML Structure Analysis for Ontology Construction

Rim Zarrad, Narjes Doggaz, Ezzedine Zagrouba
2018 Knowledge organization  
Her scientific research and production activities focus on ontology learning, data retrieval and knowledge extraction.  ...  She obtained her master's degree from the University of Nancy1, France, in 1987 and her PhD from the University of Nancy1, France, in 1992. She works on multi-agent systems and knowledge extraction.  ...  We apply these measures not only for lexical entities (concepts) but also for the evaluation of ontological relations.  ... 
doi:10.5771/0943-7444-2018-2-108 fatcat:ulhgw2j6tnfd7oqisfg6l347ha

Constraints Based Taxonomic Relation Classification

Quang Do, Dan Roth
2010 Conference on Empirical Methods in Natural Language Processing  
Toyota and Honda) is an essential component of textual inference in NLP applications such as Question Answering, Summarization, and Recognizing Textual Entailment.  ...  learning-based approach that makes use of existing resources.  ...  Acknowledgments The authors thank Mark Sammons, Vivek Srikumar, James Clarke and the anonymous reviewers for their insightful com-  ... 
dblp:conf/emnlp/DoR10 fatcat:w5jg3xyu7vh3paqif656bqf7ye

Acquiring Semantic Relations Using the Web for Constructing Lightweight Ontologies [chapter]

Wilson Wong, Wei Liu, Mohammed Bennamoun
2009 Lecture Notes in Computer Science  
Common techniques for acquiring semantic relations rely on static domain and linguistic resources, predefined patterns, and the presence of syntactic cues.  ...  We propose a hybrid approach which brings together established and novel techniques in lexical simplification, word disambiguation and association inference for acquiring coarse-grained relations between  ...  of Chemical Engineering, Curtin University of Technology.  ... 
doi:10.1007/978-3-642-01307-2_26 fatcat:pdee6t3x5vdnvgyydv6rddrmji

Lifetree: Building and Comparison based on User's Tweets

Seyedmahmoud Talebi, Manoj K., G. Hemantha, Nima Nosrati
2018 International Journal of Computer Applications  
Hence, a novel approach for representing users' data has been proposed, which makes the process of recommendation easier and more accurate.  ...  The various uses of the Lifetree included an overall picture of particular users' interests and further helps in event allocation, ads customization, etc...  ...  We used these measurements for five users to do more analysis and comparison with keyword similarities.  ... 
doi:10.5120/ijca2018917897 fatcat:lc3zbcwxkvh5xonyl3u3b6rdvu

Classifying tags using open content resources

Simon Overell, Börkur Sigurbjörnsson, Roelof van Zwol
2009 Proceedings of the Second ACM International Conference on Web Search and Data Mining - WSDM '09  
We describe the implementation of our method on Wikipedia using WordNet categories as our classification schema and ground truth.  ...  Two structural patterns found in Wikipedia are used for training and classification: categories and templates. We apply our system to classifying Flickr tags.  ...  This research is partially supported by the European Union under contract FP6-045032, "Search Environments for Media -SEMEDIA" (  ... 
doi:10.1145/1498759.1498810 dblp:conf/wsdm/OverellSZ09 fatcat:vdcs4wihkbdrddyuhq2ugx7whm

Extracting Semantic Concept Relations from Wikipedia

Patrick Arnold, Erhard Rahm
2014 Proceedings of the 4th International Conference on Web Intelligence, Mining and Semantics (WIMS14) - WIMS '14  
knowledge as provided by repositories such as WordNet is of critical importance for linking or mapping ontologies and related tasks.  ...  Our approach uses a comprehensive set of semantic patterns, finite state machines and NLP techniques to process Wikipedia definitions and to identify semantic relations between concepts.  ...  The approach focuses on the analysis of the definition sentence of Wikipedia articles and uses finite state machines to extract semantic relation patterns and their operands to discover semantic relations  ... 
doi:10.1145/2611040.2611079 dblp:conf/wims/ArnoldR14 fatcat:fzh2v2gqerbfrg3wv7ivtuvzr4

DBpedia - A crystallization point for the Web of Data

Christian Bizer, Jens Lehmann, Georgi Kobilarov, Sören Auer, Christian Becker, Richard Cyganiak, Sebastian Hellmann
2009 Journal of Web Semantics  
The DBpedia project is a community effort to extract structured information from Wikipedia and to make this information accessible on the Web.  ...  Over the last year, an increasing number of data publishers have begun to set data-level links to DBpedia resources, making DBpedia a central interlinking hub for the emerging Web of Data.  ...  Ultimately, the DBpedia and Semantic MediaWiki have similar goals.  ... 
doi:10.1016/j.websem.2009.07.002 fatcat:eaus7na2vjf3nnzuygqyngbdta

Automatic Generation of Language-Independent Features for Cross-Lingual Classification [article]

Sarai Duek, Shaul Markovitch
2018 arXiv pre-print
We tested our method, using Wikipedia as our ontology, on the most commonly used test collections in cross-lingual text categorization, and found that it outperforms existing methods.  ...  We also present a method for exploiting the hierarchical structure of the ontology to create virtual supporting documents for languages that do not have them.  ...  Constructing the Feature Generator Our method for generating language-independent features for a document in a specific language l is based on Explicit Semantic Analysis (ESA) (Gabrilovich and Markovitch  ... 
arXiv:1802.04028v1 fatcat:cjlxikzxqfaoxfbxz2jqdkudp4