A search based approach to entity recognition

Michal Laclavik, Marek Ciglan, Alex Dorman, Stefan Dlugolinsky, Sam Steingold, Martin Šeleng
2014, Proceedings of the first international workshop on Entity recognition & disambiguation - ERD '14, ACM Press
ERD 2014 was a research challenge focused on recognizing and disambiguating knowledge base entities in short and long texts. This write-up describes the Magnetic-IISAS team's approach to entity recognition in search queries, with which we participated in the ERD 2014 challenge. Our approach combines information retrieval techniques, gazetteer-based annotation and entity link graph analysis to identify and disambiguate candidate entities. We built a search index with multiple structured fields extracted from Wikipedia, Freebase and DBPedia. When processing a query, we first retrieve the top matching entities from the index. For all retrieved entities, we gather plausible verbalizations (surface forms) by which those entities may be referred to. We then match the gathered surface forms against the original query to confirm each entity's relevance to the query. Finally, we exploit the Wikipedia link graph to assess the similarity of candidate entities for the purpose of disambiguation and further candidate filtering. In the paper we discuss both successful and unsuccessful attempts to improve the quality of the system's results that we made during the course of the challenge.
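The four steps named in the abstract (retrieve, gather surface forms, confirm against the query, disambiguate via the link graph) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the toy `KB` dictionary stands in for the search index built from Wikipedia, Freebase and DBPedia, token overlap stands in for index retrieval, and Jaccard overlap of link sets stands in for whatever link graph similarity the paper actually uses. All function names are hypothetical.

```python
import re

# Hypothetical toy knowledge base (assumption): each entity has indexed text,
# known surface forms, and outgoing links to other entities.
KB = {
    "Eiffel Tower": {
        "text": "eiffel tower wrought iron lattice tower in paris france",
        "surface_forms": ["eiffel tower", "tour eiffel"],
        "links": {"Paris", "France", "Gustave Eiffel"},
    },
    "Paris": {
        "text": "paris capital city of france",
        "surface_forms": ["paris"],
        "links": {"France", "Eiffel Tower", "Seine"},
    },
    "Paris, Texas": {
        "text": "paris city in lamar county texas",
        "surface_forms": ["paris"],
        "links": {"Texas", "Lamar County"},
    },
    "Paris Hilton": {
        "text": "paris hilton american media personality",
        "surface_forms": ["paris hilton"],
        "links": {"Hilton Hotels"},
    },
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def retrieve(query, k=10):
    """Step 1: rank entities by token overlap with the query (stand-in for a search index)."""
    q = set(tokenize(query))
    scored = [(len(q & set(tokenize(e["text"]))), name) for name, e in KB.items()]
    return [name for score, name in sorted(scored, reverse=True) if score > 0][:k]


def confirm(query, candidates):
    """Steps 2-3: keep candidates whose surface form occurs in the query, grouped by that form."""
    padded = " " + " ".join(tokenize(query)) + " "
    by_form = {}
    for name in candidates:
        for form in KB[name]["surface_forms"]:
            if f" {form} " in padded:  # whole-token match only
                by_form.setdefault(form, []).append(name)
    return by_form


def relatedness(a, b):
    """Jaccard overlap of the two entities' link neighbourhoods (each includes the entity itself)."""
    na = KB[a]["links"] | {a}
    nb = KB[b]["links"] | {b}
    return len(na & nb) / len(na | nb)


def disambiguate(by_form):
    """Step 4: per surface form, keep the candidate most related to candidates of the other forms."""
    result = {}
    for form, names in by_form.items():
        context = [n for f, ns in by_form.items() if f != form for n in ns]
        result[form] = max(names, key=lambda n: sum(relatedness(n, c) for c in context))
    return result


def annotate(query):
    """Run the whole sketch: retrieve, confirm by surface form, then disambiguate."""
    return disambiguate(confirm(query, retrieve(query)))
```

For the query `"eiffel tower paris tickets"`, retrieval returns all four toy entities. The surface form check drops "Paris Hilton" because "paris hilton" does not occur in the query, and the link overlap with "Eiffel Tower" selects "Paris" over "Paris, Texas" for the form "paris".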
doi:10.1145/2633211.2634352, dblp:conf/sigir/LaclavikCDDSS14, fatcat:cltolzcuezbf3jnv3a2gxajemu