An Ontological Framework for Information Extraction From Diverse Scientific Sources

Gohar Zaman, Hairulnizam Mahdin, Khalid Hussain, Atta-Ur-Rahman, Jemal Abawajy, Salama A. Mostafa
2021 IEEE Access  
Automatic information extraction from scientific documents published online is useful in applications such as tagging, web indexing, and search engine optimization. As a result, it has become one of the most active research areas in text mining. Although various information extraction techniques have been proposed in the literature, their efficiency depends on domain-specific documents with a static, well-defined format. Furthermore, their accuracy suffers when the format is modified even slightly. To overcome these issues, a novel ontological framework for information extraction (OFIE) using a fuzzy rule base (FRB) and word sense disambiguation (WSD) is proposed. The proposed approach is validated on a significantly wider range of document domains, sourced from well-known publishers such as IEEE, ACM, Elsevier, and Springer. We have also compared the proposed information extraction approach against state-of-the-art techniques. The experimental results show that the proposed approach is less sensitive to changes in document format and achieves a significantly better average accuracy of 89.14% and an F-score of 89%.
INDEX TERMS Information extraction, semi-structured scientific documents, fuzzy rule base, word sense disambiguation, ontological framework.
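The abstract reports accuracy and F-score without defining them. As a minimal sketch, assuming the standard definitions and a field-level comparison of extracted metadata against a hand-labelled reference, the F-score could be computed as below. The function names, the field names, and the sample record are hypothetical illustrations and are not taken from the paper, whose exact evaluation protocol is not described here.

```python
from typing import Dict, Tuple


def field_counts(predicted: Dict[str, str], gold: Dict[str, str]) -> Tuple[int, int, int]:
    """Count true positives, false positives, and false negatives per field.

    Assumption: a field is correct only if its extracted value matches the
    reference value exactly.
    """
    tp = sum(1 for key, value in predicted.items() if gold.get(key) == value)
    fp = len(predicted) - tp  # extracted fields that are wrong or not in the reference
    fn = sum(1 for key, value in gold.items() if predicted.get(key) != value)
    return tp, fp, fn


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Standard precision, recall, and F1 (harmonic mean of the two)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical record: the reference has four fields, the extractor returned three.
gold = {"title": "A", "authors": "B", "venue": "C", "year": "2021"}
predicted = {"title": "A", "authors": "B", "venue": "X"}

tp, fp, fn = field_counts(predicted, gold)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this sketch a wrong value counts both as a false positive (something incorrect was extracted) and as a false negative (the correct value was missed), which is one common convention; other evaluations count it only once.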
doi:10.1109/access.2021.3063181 fatcat:bsoz4v7ndvb7xeltpmvo2wmkym