Automatic Entity Recognition and Typing from Massive Text Corpora

Xiang Ren, Ahmed El-Kishky, Chi Wang, Jiawei Han
2015 Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining - KDD '15  
In today's computerized, information-based society, we are inundated with vast amounts of text data, ranging from news articles, scientific publications, and product reviews to a wide variety of textual information from social media. To unlock the value of this unstructured text data from various domains, it is of great importance to gain an understanding of entities and their relationships. In this tutorial, we introduce data-driven methods to recognize typed entities of interest in massive, domain-specific text corpora. These methods can automatically identify token spans as entity mentions in documents and label their types (e.g., people, product, food) in a scalable way. We demonstrate on real datasets, including news articles and tweets, how these typed entities aid in knowledge discovery and management.
doi:10.1145/2783258.2789988 pmid:26705508 pmcid:PMC4688010 dblp:conf/kdd/RenEWH15 fatcat:2z3gzinvrbawpkgziwpsxcl6p4