Spoken language understanding

R. De Mori, F. Bechet, D. Hakkani-Tur, M. McTear, G. Riccardi, G. Tur
2008 IEEE Signal Processing Magazine  
Semantics deals with the organization of meanings and the relations between sensory signs or symbols and what they denote or mean [29]. Computational semantics performs a conceptualization of the world using computational processes for composing a meaning representation structure from available signs and their features present, for example, in words and sentences. Spoken language understanding (SLU) is the interpretation of signs conveyed by a speech signal. SLU and natural language understanding (NLU) share the goal of obtaining a conceptual representation of natural language sentences. Specific to SLU is the fact that the signs to be used for interpretation are coded into signals along with other information such as speaker identity. Furthermore, spoken sentences often do not follow the grammar of a language; they exhibit self-corrections, hesitations, repetitions, and other irregular phenomena. SLU systems contain an automatic speech recognition (ASR) component and must be robust to noise due to the spontaneous nature of spoken language and the errors introduced by ASR. Moreover, ASR components output a stream of words with no structural information such as punctuation and sentence boundaries. Therefore, SLU systems cannot rely on such markers and must perform text segmentation and understanding at the same time.
Obtaining meaning from speech is a complex process, and many different approaches and models have been proposed. Systems developed in the 1970s and the 1980s mostly performed syntactic analysis on the best sequence of words hypothesized by an ASR system and used nonprobabilistic rules for mapping syntactic structures into semantic ones expressed as logic formulas. An interesting discussion on computer structures for semantic representations considered in this period can be found in [29]. Meaning representation and approaches for obtaining these representations from words are discussed in this article. Basic related problems are reviewed in [15].
In the 1990s, the need emerged for testing SLU processes on large corpora that could also be used for automatically estimating some model parameters. Probabilistic finite-state interpretation models and grammars were also introduced for dealing with ambiguities introduced by model imprecision. Systems based on these approaches are discussed in this article and are also reviewed in chapter 14 of [6].
Some other approaches transform signals directly into basic semantic constituents to be further composed into semantic structures. This integration of the ASR and SLU processes, which is discussed in this article, generates multiple SLU hypotheses to be further validated using constraints imposed by the context in which a sentence is interpreted.
doi:10.1109/msp.2008.918413 fatcat:zq6vxulkjrbwhpve7cmcgdb3mi