A Semantic Metadata Enrichment Software Ecosystem Based on Topic Metadata Enrichments

Ronald Brisebois, Alain Abran, Apollinaire Nadembega, Philippe N'techobo
2017 International Journal of Data Mining & Knowledge Management Process  
As existing computer search engines struggle to understand the meaning of natural language, semantically enriched metadata may improve interest-based search engine capabilities and user satisfaction. This paper presents an enhanced version of the ecosystem, focusing on semantic topic metadata detection and enrichment. It builds on a previous paper describing a semantic metadata enrichment software ecosystem (SMESE). Using text analysis approaches for topic detection and metadata enrichment, this paper proposes an algorithm to enhance search engine capabilities and consequently help users find content according to their interests. It presents the design, implementation and evaluation of the SATD (Scalable Annotation-based Topic Detection) model and algorithm, using metadata from the web, linked open data, concordance rules, and bibliographic record authorities. It includes a prototype of a semantic engine that uses keyword extraction, classification and concept extraction to generate semantic topics through text and multimedia document analysis with the proposed SATD model and algorithm. The performance of the proposed ecosystem is evaluated in a number of prototype simulations by comparing it to existing metadata enrichment techniques (e.g., AlchemyAPI, DBpedia, Wikimeta, Bitext, AIDA, TextRazor). The SATD algorithm was found to support more attributes than the other algorithms. The results show that the enhanced platform and its algorithm enable greater understanding of documents related to user interests.

Recently, considerable research has gone into developing topic detection approaches using a number of information extraction techniques (IET), such as lexicon, sliding window, and boundary techniques. Many of these techniques [14, 15, 17, 8] rely heavily on simple keyword extraction from text. For example, Sayyadi and Raschid [14] proposed a keyword-based topic detection approach called KeyGraph, inspired by the keyword co-occurrence graph and efficient graph analysis methods. The main steps in the KeyGraph approach are as follows:

Philippe started with three years of training as a computer expert at the Leonardo da Vinci institute in Italy. Then, he joined the University of Parma, where he obtained his bachelor's degree in Computer Engineering with honors.
He was then admitted to the Polytechnic of Milan, one of the most prestigious engineering schools (ranked 24th in the world for engineering), for a master's degree in computer engineering. After his first year, he won a scholarship for a double-degree exchange program with the Polytechnic School of Montreal to obtain a second master's degree, more focused on research in Natural Language Processing. In the last two years, he has worked as a research scientist for Ecole Polytechnique de Montreal, Bibliomondo and Nuance Communications.
doi:10.5121/ijdkp.2017.7301 fatcat:dhztklu3oratbonywo5bkkofou