A machine learning approach to acronym generation

Yoshimasa Tsuruoka, Sophia Ananiadou, Jun'ichi Tsujii
2005 Proceedings of the ACL-ISMB Workshop on Linking Biological Literature, Ontologies and Databases: Mining Biological Semantics - ISMB '05 (unpublished)
This paper presents a machine learning approach to acronym generation. We formalize the generation process as a sequence labeling problem on the letters in the definition (expanded form) so that a variety of Markov modeling approaches can be applied to this task. To construct the data for training and testing, we extracted acronym-definition pairs from MEDLINE abstracts and manually annotated each pair with positional information about the letters in the acronym. We have built an MEMM-based ... er using this training data set and evaluated the performance of acronym generation. Experimental results show that our machine learning method gives significantly better performance than that achieved by the standard heuristic rule for acronym generation and enables us to obtain multiple candidate acronyms together with their likelihoods represented in probability values.
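To make the sequence labeling formulation concrete, the sketch below shows one way an acronym could be represented as a label per character of its definition, alongside a first-letter heuristic as a simple baseline. It illustrates the representation only, not the MEMM model described in the abstract. The label names (SKIP, UPPER, LOWER) and the functions `decode` and `first_letter_labels` are assumptions introduced here for illustration and are not taken from the paper.

```python
from typing import List

# Hypothetical label set, assumed for illustration only.
SKIP, UPPER, LOWER = "SKIP", "UPPER", "LOWER"


def decode(definition: str, labels: List[str]) -> str:
    """Build an acronym by applying one label to each character of the definition."""
    if len(definition) != len(labels):
        raise ValueError("need exactly one label per character")
    out = []
    for ch, label in zip(definition, labels):
        if label == UPPER:
            out.append(ch.upper())
        elif label == LOWER:
            out.append(ch.lower())
        elif label != SKIP:
            raise ValueError(f"unknown label: {label}")
    return "".join(out)


def first_letter_labels(definition: str) -> List[str]:
    """Baseline heuristic: label the first letter of every word UPPER, the rest SKIP."""
    labels = []
    prev_is_boundary = True  # start of string counts as a word boundary
    for ch in definition:
        if ch.isalpha() and prev_is_boundary:
            labels.append(UPPER)
        else:
            labels.append(SKIP)
        prev_is_boundary = not ch.isalnum()
    return labels


if __name__ == "__main__":
    definition = "deoxyribonucleic acid"
    # The heuristic keeps only word-initial letters.
    print(decode(definition, first_letter_labels(definition)))
    # A hand-written labeling can also pick a word-internal letter.
    manual = [SKIP] * len(definition)
    for i in (0, 9, 17):  # positions of 'd', 'n', 'a'
        manual[i] = UPPER
    print(decode(definition, manual))
```

Under this representation, the first-letter heuristic yields "DA" for "deoxyribonucleic acid", while a per-letter labeling can express "DNA". A trained sequence labeler would assign probabilities to such label sequences, which is what would allow several candidate acronyms to be ranked.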
doi:10.3115/1641484.1641488 fatcat:otmj4uj7tzgk3fm4dlq2lfzoga