Ontology based annotation of text segments

Samhaa R. El-Beltagy, Maryam Hazman, Ahmed Rafea
2007 Proceedings of the 2007 ACM symposium on Applied computing - SAC '07  
This work exploits the logical structure of information rich texts to automatically annotate text segments contained within them using a domain ontology. The underlying assumption behind this work is that segments in such documents embody self contained informative units. Another assumption is that segment headings coupled with a document's hierarchical structure offer informal representations of segment content; and that matching segment headings to concepts in an ontology/thesaurus can result
more » ... in the creation of formal labels/meta-data for these segments. When an encountered heading can not be matched with any concepts in the ontology, the hierarchical structure of the document is used to infer where a new concept represented by this heading should be added in the ontology. So, in this work the bootstrap ontology is also enriched by new concepts encountered within input documents. This paper also presents issues/problems related to matching textual entities to concepts in an incomplete ontology. The approach presented in this paper was applied to a set of agricultural extension documents. The results of carrying out this experiment demonstrates that the proposed approach is capable of automatically annotating segments with concepts that describe a segment's content with a high degree of accuracy.
doi:10.1145/1244002.1244296 dblp:conf/sac/El-BeltagyHR07 fatcat:gugobs6cpzettls2w7jk3sn7he