Fluency Scoring of English Speaking Tests for Nonnative Speakers Using a Native English Phone Recognizer

Byeong-Yong Jang, Oh-Wook Kwon
2015 Phonetics and Speech Sciences
... requires an abundance of time and expense. In previous work on fluency, Fillmore defined four elements of fluency: the ability to talk at length with minimal pauses, the ability to talk cohesively and logically, the ability to talk in a wide range of contexts or situations, and the ability to use language creatively [4]. Crystal defined fluency as 'smooth, rapid, effortless use of language' [5]. Chamber established a definition of fluency in qualitative and quantitative terms and proposed the ... evaluation guide for foreign language speaking tests. Chamber's experiments showed that the important elements for fluency evaluation are the rate of speech, the frequency or position of pauses, and hesitations, all of which are temporal, quantitative features [2]. Kormos investigated the effects of temporal and lexical features on fluency evaluation and asserted that the important features are the speech rate, the phonation time ratio, the number of stressed words, and the accuracy [3]. In Deshmukh et al.'s study, 8 prosodic and 8 lexical features were extracted for fluency evaluation, and good performance was generally achieved with the lexical features, among which the ...

ABSTRACT
We propose a new method for automatic fluency scoring of English speaking tests taken by nonnative speakers in a free-talking style. Unlike previous methods, the proposed method does not require transcriptions of the spoken utterances. First, an input utterance is segmented into a phone sequence by a phone recognizer trained on native speech databases. For each utterance, a feature vector with 6 features is extracted from the segmentation results of the phone recognizer. A fluency score is then computed by applying support vector regression (SVR) to the feature vector; the SVR parameters are learned from the rater scores for the utterances. In computer experiments with 3 tests taken by 48 Korean adults, we show that speech rate, phonation time ratio, and smoothed unfilled pause rate are the best features for fluency scoring. The correlation between the rater score and the SVR score is 0.84, which is higher than the correlation of 0.78 among raters. Although this is slightly lower than the correlation of 0.90 obtained when the transcribed texts are given, it implies that the proposed method can be used as a preprocessing tool for fluency evaluation of speaking tests.
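To make the feature-extraction step concrete, the following is a minimal sketch of how three temporal features could be derived from a phone recognizer's segmentation. It is not the authors' implementation: the `Segment` type, the `"sil"` silence label, the 0.2 s minimum pause length, and the exact feature formulas are all assumptions made for illustration, and the smoothing applied to the unfilled pause rate is omitted because the abstract does not define it.

```python
from dataclasses import dataclass

SILENCE = "sil"        # hypothetical label the recognizer gives to non-speech segments
MIN_PAUSE_SEC = 0.2    # assumed minimum length of an unfilled pause; not from the paper


@dataclass
class Segment:
    label: str    # phone label or SILENCE
    start: float  # start time in seconds
    end: float    # end time in seconds


def temporal_features(segments):
    """Compute three temporal features from a time-ordered phone segmentation."""
    total = segments[-1].end - segments[0].start
    phones = [s for s in segments if s.label != SILENCE]
    speech_time = sum(s.end - s.start for s in phones)
    # Count every sufficiently long silence, including leading and trailing ones.
    pauses = [
        s for s in segments
        if s.label == SILENCE and (s.end - s.start) >= MIN_PAUSE_SEC
    ]
    return {
        "speech_rate": len(phones) / total,            # phones per second (assumed unit)
        "phonation_time_ratio": speech_time / total,   # fraction of time spent speaking
        "unfilled_pause_rate": len(pauses) / total,    # pauses per second, unsmoothed
    }


# Toy segmentation standing in for real recognizer output.
segments = [
    Segment("sil", 0.0, 0.5),
    Segment("HH", 0.5, 0.6),
    Segment("AH", 0.6, 0.8),
    Segment("sil", 0.8, 0.9),   # too short to count as a pause
    Segment("L", 0.9, 1.0),
    Segment("OW", 1.0, 1.5),
    Segment("sil", 1.5, 2.0),
]
features = temporal_features(segments)
print(features)
```

In a full pipeline, one such feature vector per utterance, paired with the rater score, would be the training data for a regressor; scikit-learn's `sklearn.svm.SVR` is one readily available option, though the abstract does not say which SVR implementation or kernel was used.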
doi:10.13064/ksss.2015.7.2.149 fatcat:eompjxieb5evbb2mfdwz6lqdfu