Enriching Taxonomies With Functional Domain Knowledge

Nikhita Vedula, Patrick K. Nicholson, Deepak Ajwani, Sourav Dutta, Alessandra Sala, Srinivasan Parthasarathy
2018 The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval - SIGIR '18  
The rising need to harvest domain specific knowledge in several applications is largely limited by the ability to dynamically grow structured knowledge representations, due to the increasing emergence of new concepts and their semantic relationships with existing ones. Such enrichment of existing hierarchical knowledge sources with new information to better model the "changing world" presents two-fold challenges: (1) Detection of previously unknown entities or concepts, and (2) Insertion of the
more » ... new concepts into the knowledge structure, respecting the semantic integrity of the created relationships. To this end we propose a novel framework, ETF, to enrich large-scale, generic taxonomies with new concepts from resources such as news and research publications. Our approach learns a high-dimensional embedding for the existing concepts of the taxonomy, as well as for the new concepts. During the insertion of a new concept, this embedding is used to identify semantically similar neighborhoods within the existing taxonomy. The potential parent-child relationships linking the new concepts to the existing ones are then predicted using a set of semantic and graph features. Extensive evaluations of ETF on large, real-world taxonomies of Wikipedia and WordNet showcase more than 5% F1-score improvements compared to state-of-the-art baselines. We further demonstrate that ETF can accurately categorize newly emerging concepts and question-answer pairs across different domains.
doi:10.1145/3209978.3210000 dblp:conf/sigir/VedulaNADS018 fatcat:bhauejtucze4lozvpfitlwxcxy