Classification-based spoken text selection for LVCSR language modeling

Vataya Chunwijitra, Chai Wutiwiwatchai
<span title="2017-10-17">2017</span> <i title="Springer Nature"> <a target="_blank" rel="noopener" href="https://fatcat.wiki/container/tzakietxejgppjzsrojed7bkke" style="color: black;">EURASIP Journal on Audio, Speech, and Music Processing</a> </i> &nbsp;
Large vocabulary continuous speech recognition (LVCSR) is naturally in demand for transcribing daily conversations, but developing spoken text data to train LVCSR is costly and time-consuming. In this paper, we propose a classification-based method to automatically select social media data for constructing a spoken-style language model for LVCSR. Three classification techniques, SVM, CRF, and LSTM, trained on words and parts of speech, are compared experimentally for identifying the degree of spoken style in each social media sentence. Spoken-style utterances are chosen by incremental greedy selection based on the score of the SVM or CRF classifier, or on the output classified as "spoken" by the LSTM classifier. With the proposed method, just 51.8, 91.6, and 79.9% of the utterances in a Twitter text collection are marked as spoken utterances by the SVM, CRF, and LSTM classifiers, respectively. A baseline language model is then improved by interpolating it with one trained on these selected utterances. The proposed model is evaluated on two Thai LVCSR tasks: social media conversations and a speech-to-speech translation application. Experimental results show that all three classification-based data selection methods clearly help reduce the overall spoken test set perplexities. In terms of LVCSR word error rate (WER), they achieve 3.38, 3.44, and 3.39% WER reduction, respectively, over the baseline language model, and 1.07, 0.23, and 0.38% WER reduction, respectively, over the conventional perplexity-based text selection approach.
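The selection-then-interpolation idea can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the toy English sentences, the classifier scores, the add-alpha unigram models (a real LVCSR system would use higher-order n-gram models), the fixed interpolation weight `lam`, and the acceptance rule, which keeps a candidate only if it lowers perplexity on a small development set. The helper names `unigram_lm`, `interpolate`, `perplexity`, and `greedy_select` are hypothetical.

```python
import math
from collections import Counter

def unigram_lm(sentences, vocab, alpha=1.0):
    """Add-alpha smoothed unigram model over a fixed vocabulary (toy stand-in for an n-gram LM)."""
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}

def interpolate(p_base, p_sel, lam):
    """Linear interpolation: lam * P_base(w) + (1 - lam) * P_sel(w)."""
    return {w: lam * p_base[w] + (1.0 - lam) * p_sel[w] for w in p_base}

def perplexity(lm, sentences):
    """Per-word perplexity: exp of the negative mean log-probability."""
    words = [w for s in sentences for w in s.split()]
    log_prob = sum(math.log(lm[w]) for w in words)
    return math.exp(-log_prob / len(words))

def greedy_select(scored, p_base, dev, vocab, lam=0.5):
    """Visit candidates in descending classifier-score order.

    Assumed acceptance rule: keep a candidate only if the interpolated
    model's perplexity on the development set decreases.
    """
    selected = []
    best = perplexity(p_base, dev)
    for _, sent in sorted(scored, key=lambda pair: -pair[0]):
        trial = selected + [sent]
        ppl = perplexity(interpolate(p_base, unigram_lm(trial, vocab), lam), dev)
        if ppl < best:
            selected, best = trial, ppl
    return selected, best

# Hypothetical toy data: one written-style baseline sentence, one spoken-style
# development sentence, and candidates paired with made-up classifier scores.
base_text = ["the committee approved the annual budget"]
dev_text = ["hey are you coming tonight"]
scored = [
    (0.9, "hey are you there"),
    (0.8, "you coming tonight"),
    (-0.7, "the report was approved"),
]

vocab = {w for s in base_text + dev_text + [s for _, s in scored] for w in s.split()}
p_base = unigram_lm(base_text, vocab)
base_ppl = perplexity(p_base, dev_text)
selected, best_ppl = greedy_select(scored, p_base, dev_text, vocab)

print("baseline perplexity:", base_ppl)
print("selected sentences:", selected)
print("interpolated perplexity:", best_ppl)
```

In this sketch the interpolation weight is fixed at 0.5 for simplicity; in practice it would typically be tuned on held-out data.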
<span class="external-identifiers"> <a target="_blank" rel="external noopener noreferrer" href="https://doi.org/10.1186/s13636-017-0121-5">doi:10.1186/s13636-017-0121-5</a> <a target="_blank" rel="external noopener" href="https://fatcat.wiki/release/44ah7tzjlfciznb4kbinpouaem">fatcat:44ah7tzjlfciznb4kbinpouaem</a> </span>