Isolated and Connected Word Recognition—Theory and Selected Applications [chapter]

LAWRENCE R. RABINER, STEPHEN E. LEVINSON
1990 Readings in Speech Recognition  
The art and science of speech recognition have been advanced to the state where it is now possible to communicate reliably with a computer by speaking to it in a disciplined manner using a vocabulary of moderate size. It is the purpose of this paper to outline two aspects of speech-recognition research. First, we discuss word recognition as a classical pattern-recognition problem and show how some fundamental concepts of signal processing, information theory, and computer science can be ... to give us the capability of robust recognition of isolated words and simple connected word sequences. We then describe methods whereby these principles, augmented by modern theories of formal language and semantic analysis, can be used to study some of the more general problems in speech recognition. It is anticipated that these methods will ultimately lead to accurate mechanical recognition of fluent speech under certain controlled conditions.
IEEE TRANSACTIONS ON COMMUNICATIONS, VOL. COM-29, NO. 5, MAY 1981
... introduce the notion of list searching on an incomplete or corrupted key. Later we will formalize and generalize this approach. Finally, we describe a system which recognizes strings of connected digits. This system is substantially different from the other two in that it recognizes strings of words uttered without pauses between them. This is made possible by an important generalization of the temporal-alignment procedure discussed earlier. Each of these three systems represents an advance toward the ultimate goal of speech-recognition research, human/machine conversational-speech communication. Over the years, this goal has proven to be a most elusive one. Part of the reason for the difficulty lies in the fact that extrapolation of the pattern-recognition paradigm does not provide a sufficiently general model of the speech-communication process. Thus, in Section IV, we go on to consider some other disciplines which can be used to analyze some phenomena of speech not encompassed by the pattern-recognition model. We begin with a brief description of the human speech-communication process, including both the vocal and auditory apparatus, and an abstract definition of communication. We then outline some parts of the theories of formal languages and semantic analysis which can be applied to speech recognition.
In particular, we elaborate on a simple formalization of grammar which both dramatically increases the versatility and robustness of our speech-recognition machines and provides insights into the role of linguistic structure in speech recognition. Next, we show how these theories can be implemented and how they increase the capabilities of our experimental speech-recognition systems. We first return to the notion of list
doi:10.1016/b978-0-08-051584-7.50014-0 fatcat:hovnv5uyefgxthq2syytuwuf7u