Improved name recognition with meta-data dependent name networks
2004 IEEE International Conference on Acoustics, Speech, and Signal Processing
A transcription system that requires accurate general name transcription is faced with the problem of covering the large number of names it may encounter. Without any prior knowledge, this requires a large increase in the size and complexity of the system due to the expansion of the lexicon. Furthermore, this increase will adversely affect the system performance due to the increased confusability. Here we propose a method that uses meta-data, available at runtime to ensure better name coverage
... tter name coverage without significantly increasing the system complexity. We tested this approach on a voicemail transcription task and assumed meta-data to be available in the form of a caller ID string (as it would show up on a caller ID enabled phone) and the name of the mailbox owner. Networks representing possible spoken realization of those names are generated at runtime and included in network of the decoder. The decoder network is built at training time using a class-dependent language model, with caller and mailbox name instances modeled as class tokens. The class tokens are replaced at test time with the name networks built from the meta-data. The proposed algorithm showed a reduction in the error rate of name tokens of 22.1%.