Fine Grained Named Entity Recognition via Seq2seq Framework

Huiming Zhu, Chunhui He, Yang Fang, Weidong Xiao
2020 IEEE Access  
Fine-grained named entity recognition (NER) is crucial to natural language processing (NLP) applications such as relation extraction and knowledge graph construction. Most existing fine-grained NER systems are inefficient because they rely on manually annotated training datasets. To address this issue, our NER system automatically generates datasets from Wikipedia under the distant supervision paradigm by mapping hyperlinks in Wikipedia documents to Freebase. In addition, previous NER models cannot effectively process fine-grained label sets with more than 100 types. We therefore introduce a 'BIO' tagging strategy that identifies the position and type attributes of an entity simultaneously. This tagging scheme turns the NER problem into a sequence-to-sequence (seq2seq) problem. We propose a seq2seq framework that models the input sentence as a whole. Specifically, we adopt a Bi-LSTM as the encoder so that the past and future information of the input is processed equally. We then add a self-attention mechanism to handle the long-term dependency problem in long sequences. To classify the entity tags, we choose a CRF model because it adds constraints that prevent logically inconsistent position tags. Experiments are performed on large-scale datasets for fine-grained NER tasks. Experimental results verify the effectiveness of FSeqC, which outperforms other state-of-the-art alternatives consistently and significantly.
INDEX TERMS Named entity recognition, fine-grained, seq2seq framework.
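As an illustration of the 'BIO' tagging idea described in the abstract, the following minimal Python sketch encodes entity spans as one tag per token, so that each tag carries both position (B, I, O) and type. It also checks the kind of position constraint a CRF layer can learn to enforce, namely that an I- tag only follows a B- or I- tag of the same type. The span format, the helper names (`bio_encode`, `is_valid_transition`, `is_valid_sequence`), the example sentence, and the type labels are all assumptions made for illustration; they are not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical span format: (start_token, end_token_exclusive, fine_grained_type).
Span = Tuple[int, int, str]


def bio_encode(num_tokens: int, spans: List[Span]) -> List[str]:
    """Turn entity spans into one BIO tag per token."""
    tags = ["O"] * num_tokens
    for start, end, label in spans:
        tags[start] = f"B-{label}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the entity
    return tags


def is_valid_transition(prev: str, curr: str) -> bool:
    """An I-X tag may only follow B-X or I-X; every other tag is unconstrained."""
    if curr.startswith("I-"):
        label = curr[2:]
        return prev in (f"B-{label}", f"I-{label}")
    return True


def is_valid_sequence(tags: List[str]) -> bool:
    """Check a whole tag sequence, treating the sentence start as 'O'."""
    prev = "O"
    for tag in tags:
        if not is_valid_transition(prev, tag):
            return False
        prev = tag
    return True


# Illustrative sentence and type labels (assumed, not from the paper's datasets).
tokens = ["Barack", "Obama", "visited", "Paris"]
spans = [(0, 2, "person/politician"), (3, 4, "location/city")]
tags = bio_encode(len(tokens), spans)

for token, tag in zip(tokens, tags):
    print(f"{token}\t{tag}")
```

With more than 100 fine-grained types, the tag vocabulary under this scheme grows to roughly two tags per type plus `O`, which is one reason a structured output layer with transition constraints can be useful.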
doi:10.1109/access.2020.2980431 fatcat:fr5axng35rhj5jtsjeml4wzcxa