A survey of named-entity recognition methods for food information extraction

Gorjan Popovski, Barbara Korousic Seljak, Tome Eftimov
2020 IEEE Access  
As a great amount of food-related information is presented in the form of heterogeneous textual data, computer-based methods are useful for extracting such information automatically. One way to do this is to use Named-Entity Recognition (NER) methods, which are broadly used in computer science for information extraction. Despite the existence of numerous well-established NER methods in the biomedical domain, the domain of food science remains scarcely resourced. In this paper, we provide an overview and a comparison of named-entity recognition methods in the food domain, which can be used for automated extraction of food information from text. Four methods are discussed: FoodIE, NCBO (SNOMED CT), NCBO (OntoFood), and NCBO (FoodON). We compare them using a benchmark data set that consists of 1000 manually annotated recipes initially obtained from Allrecipes, which is the largest social network focused on food. After analysing the results from the evaluation, it is evident that FoodIE obtains very promising results compared to the other food named-entity recognition methods taken into consideration.
INDEX TERMS Benchmarking, food information extraction, food ontology, named-entity recognition.
doi:10.1109/access.2020.2973502 fatcat:k2bg5fbujfdv7acbjtkpnsu6wi