A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2017; you can also visit the original URL.
The file type is application/pdf.
Two-stream indexing for spoken web search
2011
Proceedings of the 20th international conference companion on World wide web - WWW '11
This paper presents two-stream processing of audio to index audio content for Spoken Web search. The first stream indexes the meta-data associated with a particular audio document. The meta-data is usually very sparse but accurate, which results in a high-precision, low-recall index. The second stream uses a novel language-independent speech recognition method to generate text to be indexed. Owing to the multiple languages and the noise in user-generated content on the Spoken Web, the
doi:10.1145/1963192.1963364
dblp:conf/www/AjmeraJMRSSS11
fatcat:maalq7pnpje6nid4tc64p4cv54