Using language technology resources and tools to construct Swedish FrameNet

Dana Dannells, Karin Friberg Heppin, Anna Ehrlemark
2014 Proceedings of Workshop on Lexical and Grammatical Resources for Language Processing  
Having access to large lexical and grammatical resources when creating a new language resource is essential for its enhancement and enrichment. This paper describes the interplay and interactive use of different language technology tools and resources, in particular the Swedish lexicon SALDO and the Swedish Constructicon, in the creation of Swedish FrameNet. We show how integrating resources in a larger infrastructure yields much more than the sum of the parts.

Introduction

This paper describes how Swedish language technology resources are exploited to construct Swedish FrameNet (SweFN), a lexical-semantic resource that has been expanded from and constructed in line with Berkeley FrameNet (BFN). The resource has been developed within the framework of the theory of Frame Semantics (Fillmore, 1985). According to this theory, semantic frames, including their participants, represent cognitive scenarios as schematic representations of events, objects, situations, or states of affairs. The participants are called frame elements (FEs) and are described in terms of semantic roles such as AGENT, LOCATION, or MANNER. Frames are evoked by lexical units (LUs), which are pairings of lemmas and meanings.

To illustrate the notion of a semantic frame, consider the frame Vehicle landing. It has the following definition in BFN: "A flying VEHICLE comes to the ground at a GOAL in a controlled fashion, typically (but not necessarily) operated by an operator." VEHICLE and GOAL are the core elements that, together with the description, uniquely characterize the frame. Their semantic types are Physical object and Location. The non-core elements of the frame are: CIRCUMSTANCES, COTHEME, DEGREE, DEPICTIVE, EVENT DESCRIPTION, FREQUENCY, GOAL CONDITIONS, MANNER, MEANS, MODE OF TRANSPORTATION, PATH, PERIOD OF ITERATIONS, PLACE, PURPOSE, RE ENCODING, SOURCE, and TIME. The lexical units evoking the frame are: land.v, set down.v, and touch down.v. In addition, the frame contains a number of example sentences which are annotated in terms of LUs and FEs. These sentences carry valence information about the different syntactic realizations of the FEs and about their semantic characteristics.

Currently, SweFN contains around 1,150 frames with over 29,000 lexical units, of which 5,000 are verbs, as well as 8,300 semantically and syntactically annotated sentences selected from a corpus.
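The anatomy of a frame described above (a definition, core and non-core frame elements, and the lexical units that evoke it) can be made concrete with a small data structure. The following Python sketch encodes the Vehicle landing example using only the values quoted in the text. The class name `Frame`, its field names, and the helper methods are hypothetical illustrations, not the storage format of BFN or SweFN, and the pairing of VEHICLE with Physical object and GOAL with Location assumes the semantic types are listed in the same order as the core elements.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Frame:
    """Hypothetical container for one semantic frame (not the BFN/SweFN schema)."""
    name: str
    definition: str
    # Core frame elements mapped to their semantic types.
    core_elements: Dict[str, str]
    non_core_elements: List[str] = field(default_factory=list)
    lexical_units: List[str] = field(default_factory=list)

    def is_evoked_by(self, lexical_unit: str) -> bool:
        """Return True if the given lexical unit evokes this frame."""
        return lexical_unit in self.lexical_units

    def is_core(self, frame_element: str) -> bool:
        """Return True if the frame element is a core element of this frame."""
        return frame_element in self.core_elements


vehicle_landing = Frame(
    name="Vehicle landing",
    definition=(
        "A flying VEHICLE comes to the ground at a GOAL in a controlled "
        "fashion, typically (but not necessarily) operated by an operator."
    ),
    # Assumption: semantic types are paired with core elements in listed order.
    core_elements={"VEHICLE": "Physical object", "GOAL": "Location"},
    non_core_elements=[
        "CIRCUMSTANCES", "COTHEME", "DEGREE", "DEPICTIVE",
        "EVENT DESCRIPTION", "FREQUENCY", "GOAL CONDITIONS", "MANNER",
        "MEANS", "MODE OF TRANSPORTATION", "PATH", "PERIOD OF ITERATIONS",
        "PLACE", "PURPOSE", "RE ENCODING", "SOURCE", "TIME",
    ],
    lexical_units=["land.v", "set down.v", "touch down.v"],
)

if __name__ == "__main__":
    print(vehicle_landing.is_evoked_by("land.v"))
    print(vehicle_landing.is_core("MANNER"))
```

Keeping core elements in a mapping and non-core elements in a plain list mirrors the distinction drawn in the text: only the core elements are given semantic types there, so only they carry one in the sketch.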
SweFN has mainly been created manually, but in response to the ever-increasing complexity, volume, and specialization of textual evidence, its creation is enhanced with automated Natural Language Processing (NLP) techniques. In contrast to the construction of English resources, and of framenets for other languages, the resources used to construct SweFN are all linked in a unique infrastructure of language resources.

The development of framenets in other languages

FrameNet-like resources have been developed in several languages and have been exploited in a range of NLP applications such as semantic parsing (Das et al., 2014), information extraction (Moschitti et

This work is licensed under a Creative Commons Attribution 4.0 International License.
