A copy of this work was available on the public web and has been preserved in the Wayback Machine. The capture dates from 2019; you can also visit the original URL.
End-to-end models, which directly output text given speech using a single neural network, have been shown to be competitive with conventional speech recognition models containing separate acoustic, pronunciation, and language model components. Such models do not require additional resources for decoding and are typically much smaller than conventional models. This makes them particularly attractive in the context of on-device speech recognition where both small memory footprint and low power
doi:10.21437/interspeech.2018-1025 dblp:conf/interspeech/PangSPGWZC18 fatcat:lv3la3qv45ab3mccbfkw27bfim